BayesEoR CLI workflows (ValSKA)

This page is a practical, “example gallery”-style guide to running BayesEoR validation workflows using ValSKA.

It is written to be:

copy/paste friendly (commands shown as indented code blocks)
HPC-friendly (explicit about dry-runs, SLURM submission, and dependencies)
reproducible (paths, templates, and variants recorded in manifests)

If you are new, start with Quick Start. If you are iterating on a validation campaign, use Detailed examples.

Quick Start

Before You Start

Before trying the workflow examples below, make sure you have completed the setup steps that ValSKA assumes:

ValSKA setup:

install the valska environment so the valska-bayeseor-* CLI commands are available. See the Installation section in the project README for environment setup details.

BayesEoR setup:

clone BayesEoR from GitHub into a local directory you control
create the BayesEoR conda environment using the environment file shipped with that BayesEoR checkout
make sure the BayesEoR checkout and conda environment used by batch jobs are compatible with this ValSKA version

Runtime configuration in your ValSKA checkout:

copy config/runtime_paths.example.yaml to config/runtime_paths.yaml in your ValSKA repository
edit config/runtime_paths.yaml for your system:
- set results_root
- set data.root if you want relative --data paths to resolve automatically
- set bayeseor.repo_path to your local BayesEoR clone
- set bayeseor.conda_sh and bayeseor.conda_env
- set CPU and GPU SLURM defaults for your site
obtain or generate the UVH5 dataset you want to analyse
if you are using one of the example commands below, replace the example --data value with a dataset path that actually exists on your system
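
A quick preflight check can catch a missing or half-edited runtime configuration before any run is prepared. The sketch below is a hypothetical helper, not a ValSKA command: it only confirms that the file exists and that the key names listed above appear in it, using a plain text search rather than a YAML parser.

```shell
# Hypothetical preflight helper (not part of ValSKA).
# Usage: check_runtime_config config/runtime_paths.yaml
check_runtime_config() {
  local cfg status key
  cfg=$1
  if [ ! -f "$cfg" ]; then
    echo "missing: $cfg (copy config/runtime_paths.example.yaml first)" >&2
    return 1
  fi
  status=0
  # Text search for key names only; this does not validate YAML structure
  # or check that the configured paths exist.
  for key in results_root repo_path conda_sh conda_env; do
    if ! grep -q "^[[:space:]]*${key}:" "$cfg"; then
      echo "key not found: $key" >&2
      status=1
    fi
  done
  return "$status"
}
```

A zero exit status means only that the keys are present; whether their values suit your site still needs a manual check.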

If you are unsure which command to start with, run:

valska-bayeseor-help

If you want copy/paste command sequences rather than a command map, jump to Detailed examples.

CLI quick reference

Setup And Submission

Command	Purpose	Detailed docs
`valska-bayeseor-help`	Print command index and common workflows	operations: command index
`valska-bayeseor-prepare`	Prepare one run directory and artefacts	valska-bayeseor-prepare
`valska-bayeseor-sweep`	Prepare and/or submit sweep points	valska-bayeseor-sweep
`valska-bayeseor-submit`	Submit CPU/GPU stages for one prepared run	valska-bayeseor-submit --stage cpu, valska-bayeseor-submit --stage gpu

Reporting And Health

Command	Purpose	Detailed docs
`valska-bayeseor-report`	Generate report tables/plots for one sweep	reporting: CLI usage
`valska-bayeseor-list-sweeps`	Discover available sweep directories	reporting: wrapper script usage
`valska-bayeseor-sweep-status`	Inspect per-point completeness for one sweep	reporting: sweep health helpers
`valska-bayeseor-validate-sweep`	Validate sweep integrity with exit-code semantics	reporting: sweep health helpers
`valska-bayeseor-sweep-audit`	Aggregate discovery + status + validation	reporting: sweep health helpers

Operations

Command	Purpose	Detailed docs
`valska-bayeseor-resume`	Generate exact submit commands for incomplete points	operations: resume incomplete sweep points
`valska-bayeseor-report-all`	Batch-generate reports across discovered sweeps	operations: batch reporting
`valska-bayeseor-compare-sweeps`	Compare metrics between two sweep summaries	operations: compare two sweep outcomes
`valska-bayeseor-cleanup`	Safe cleanup workflow (dry-run by default)	operations: cleanup

Quick Definitions

a single point is one prepared run directory for one perturbation value
a sweep is a collection of single-point runs across multiple perturbation values

Which command should I use?

Need a quick command map before you start? → valska-bayeseor-help
Need to create run inputs/scripts for one single-point run? → valska-bayeseor-prepare
Need to submit stages for one prepared run dir? → valska-bayeseor-submit
Need to prepare/submit a sweep across multiple perturbation values? → valska-bayeseor-sweep
Need to inspect one sweep health quickly? → valska-bayeseor-sweep-status
Need pass/fail validation semantics for one sweep? → valska-bayeseor-validate-sweep
Need campaign-wide health and validation overview? → valska-bayeseor-sweep-audit
Need restart suggestions for incomplete points? → valska-bayeseor-resume
Need reports for one sweep? → valska-bayeseor-report
Need reports for many sweeps? → valska-bayeseor-report-all
Need side-by-side metric comparison between two sweeps? → valska-bayeseor-compare-sweeps
Need maintenance cleanup (dry-run first)? → valska-bayeseor-cleanup

For command-local examples of helper CLIs, run each command with --help.

Public documentation note:

do not put personal filesystem paths into your committed runtime_paths.yaml
replace all example paths with paths that are valid for your own system or site

Replace:

achromatic_Gaussian with a beam-model label matching the data you are analysing
GLEAM with a sky-model label matching the data you are analysing
...uvh5 with your dataset
RUN_ID / SWEEP_ID with something meaningful

valska-bayeseor-help

If you want the shortest possible command map before you start:

valska-bayeseor-help

For topic-specific help:

valska-bayeseor-help --topic setup
valska-bayeseor-help --topic submission
valska-bayeseor-help --topic reporting

valska-bayeseor-prepare

What this command does:

creates a run directory containing:
BayesEoR config YAML(s)
SLURM submit scripts for CPU and GPU stages
a manifest recording provenance and resolved paths

Dry-run example:

valska-bayeseor-prepare \
  --beam achromatic_Gaussian \
  --sky GLEAM \
  --data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
  --run-id RUN_ID \
  --fwhm-perturb-frac 0.01 \
  --dry-run

To create the files for real, run the same command without --dry-run.

valska-bayeseor-sweep

A sweep prepares N run dirs (one per FWHM perturbation) and can optionally submit CPU/GPU stages across all points.

Prepare only (no submission):

valska-bayeseor-sweep \
  --beam achromatic_Gaussian \
  --sky GLEAM \
  --data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
  --run-id SWEEP_ID \
  --fwhm-fracs 0.01 0.0 \
  --submit none

For a first end-to-end run on a new system, prefer a single submission command:

valska-bayeseor-sweep \
  --beam achromatic_Gaussian \
  --sky GLEAM \
  --data /path/to/your/input.uvh5 \
  --run-id SWEEP_ID \
  --fwhm-fracs 0.01 0.0 \
  --submit all

This is the most reliable path because ValSKA submits CPU and GPU together per point and manages the dependency chain in one invocation.

If you want finer control, you can split the stages:

Submit CPU stage across all points:

valska-bayeseor-sweep \
  --beam achromatic_Gaussian \
  --sky GLEAM \
  --data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
  --run-id SWEEP_ID \
  --fwhm-fracs 0.01 0.0 \
  --submit cpu

Submit GPU stage across all points later (advanced / recovery workflow):

valska-bayeseor-sweep \
  --beam achromatic_Gaussian \
  --sky GLEAM \
  --data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
  --run-id SWEEP_ID \
  --fwhm-fracs 0.01 0.0 \
  --submit gpu

If you use the split CPU/GPU path, make sure:

your CPU stage has already produced the required precompute outputs
jobs.json exists for each point if you expect ValSKA to reuse recorded CPU job ids
your site-specific SLURM defaults in runtime_paths.yaml are correct before submission
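
The second check in the list above can be done by hand before a GPU-only submission. This is a minimal sketch with a hypothetical function name; it assumes points sit at <sweep_dir>/<variant>/<run_label>/ with no UTC timestamp level, as described under "What gets created where". The supported check is valska-bayeseor-validate-sweep with --require-jobs-json.

```shell
# Hypothetical helper: list sweep points that have no jobs.json yet.
# Assumes the layout <sweep_dir>/<variant>/<run_label>/jobs.json.
points_missing_jobs_json() {
  local sweep_dir missing point
  sweep_dir=$1
  missing=0
  for point in "$sweep_dir"/*/*/; do
    [ -d "$point" ] || continue        # glob matched nothing
    if [ ! -f "${point}jobs.json" ]; then
      echo "no jobs.json: $point"
      missing=$((missing + 1))
    fi
  done
  # Succeed only when every point has a jobs.json
  [ "$missing" -eq 0 ]
}
```

The function prints one line per point without jobs.json and returns non-zero if any are found, so it can gate a later --submit gpu call in a script.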

Dry-run submission (show sbatch commands but do not submit):

valska-bayeseor-sweep \
  --beam achromatic_Gaussian \
  --sky GLEAM \
  --data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
  --run-id SWEEP_ID \
  --fwhm-fracs 0.01 0.0 \
  --submit cpu \
  --submit-dry-run

valska-bayeseor-submit --stage cpu

If you are using the submit CLI:

valska-bayeseor-submit /path/to/run_dir --stage cpu

Or manually, using the script in the run_dir created by prepare:

sbatch /path/to/run_dir/submit_cpu_precompute.sh

valska-bayeseor-submit --stage gpu

Use this mode when you already have a prepared run directory and you want to launch GPU work separately from CPU precompute.

Important:

this split CPU-then-GPU workflow is useful for recovery and explicit control
on some SLURM sites, later GPU submission can be sensitive to how CPU job dependencies are handled
for a first end-to-end run, valska-bayeseor-sweep --submit all is usually the safest path

Using the submit CLI:

valska-bayeseor-submit /path/to/run_dir --stage gpu

Or manually:

sbatch --dependency=afterok:<CPU_JOBID> /path/to/run_dir/submit_signal_fit_gpu_run.sh
sbatch --dependency=afterok:<CPU_JOBID> /path/to/run_dir/submit_no_signal_gpu_run.sh
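
For the manual route, the CPU job id can be captured at submission time instead of being copied by hand. The sketch below uses sbatch's --parsable option to do that; the function name and the SBATCH override variable are illustrative assumptions, and job ids submitted this way may not be recorded in jobs.json.

```shell
# Sketch: submit CPU precompute, then both GPU jobs with an afterok dependency.
# SBATCH can point at a stub to exercise the function without SLURM.
SBATCH=${SBATCH:-sbatch}

submit_chain() {
  local run_dir cpu_id
  run_dir=$1
  # --parsable makes sbatch print "jobid" or "jobid;cluster" only.
  cpu_id=$("$SBATCH" --parsable "$run_dir/submit_cpu_precompute.sh") || return 1
  cpu_id=${cpu_id%%;*}   # drop the optional ";cluster" suffix
  "$SBATCH" --dependency="afterok:${cpu_id}" "$run_dir/submit_signal_fit_gpu_run.sh" &&
    "$SBATCH" --dependency="afterok:${cpu_id}" "$run_dir/submit_no_signal_gpu_run.sh"
}

# Usage (on a SLURM login node):
#   submit_chain /path/to/run_dir
```

With afterok, the GPU jobs start only if the CPU job exits successfully, which mirrors the dependency shown in the manual commands.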

Concepts

Beam / sky taxonomy (directory layout)

We organise results by the two “observation-defining” axes:

beam_model: instrument / beam model label (e.g. achromatic_Gaussian, chromatic_Gaussian, airy)
sky_model: sky model label (e.g. GLEAM, GSM, GLEAM_plus_GSM)

This keeps campaigns predictable when you explore multiple sky models and multiple beam models.

Template + variant concept (collision-free template differences)

Many BayesEoR runs differ only by template-level settings (chromatic vs achromatic, alternate priors, etc). To avoid collisions, we include a <variant> directory level.

If you do not specify --variant, it is derived from the template filename stem by removing the first _template.

Examples:

validation_v1d0_template.yaml → validation_v1d0
validation_v1d0_template_achromatic.yaml → validation_v1d0_achromatic
validation_achromatic_Gaussian.yaml → validation_achromatic_Gaussian

You can override the auto-derived value with --variant.
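
The naming rule can be reproduced with shell parameter expansion. This is a sketch of the rule as described here, not ValSKA's actual code, and derive_variant is a hypothetical name.

```shell
# Sketch of the rule: strip the extension, then remove the first "_template".
derive_variant() {
  local stem
  stem=$(basename "$1")
  stem=${stem%.*}                       # drop the ".yaml" extension
  case $stem in
    *_template*)
      # text before the first "_template" + text after it
      printf '%s%s\n' "${stem%%_template*}" "${stem#*_template}" ;;
    *)
      printf '%s\n' "$stem" ;;
  esac
}

derive_variant validation_v1d0_template.yaml
derive_variant validation_v1d0_template_achromatic.yaml
derive_variant validation_achromatic_Gaussian.yaml
```

The three calls print the same variants as the examples listed above.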

What gets created where

Canonical single-run directory:

<results_root>/bayeseor/<beam_model>/<sky_model>/<variant>/<run_label>/<run_id>[/<UTCSTAMP>]

Canonical sweep root and points:

<results_root>/bayeseor/<beam_model>/<sky_model>/_sweeps/<sweep_id>/<variant>/<run_label>[/<UTCSTAMP>]

Notes:

<run_label> is typically fwhm_<value> (e.g. fwhm_1.0e-02) and is auto-generated from FWHM frac.
--unique appends a UTC timestamp suffix (useful for one-off runs; usually not recommended for resumable sweeps).
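
To predict where a run or sweep point will land before preparing it, the layout can be sketched as below. The function names are illustrative, the optional UTC timestamp level is left out, and the run_label format is inferred from the fwhm_1.0e-02 and fwhm_0.0e+00 examples in this page rather than taken from ValSKA's code.

```shell
# Sketch of the directory layout above (hypothetical helper names).
run_label() {
  # One-decimal scientific notation, e.g. 0.01 -> fwhm_1.0e-02
  LC_ALL=C printf 'fwhm_%.1e\n' "$1"
}

single_run_dir() {
  # args: results_root beam_model sky_model variant fwhm_frac run_id
  printf '%s/bayeseor/%s/%s/%s/%s/%s\n' \
    "$1" "$2" "$3" "$4" "$(run_label "$5")" "$6"
}

sweep_point_dir() {
  # args: results_root beam_model sky_model sweep_id variant fwhm_frac
  printf '%s/bayeseor/%s/%s/_sweeps/%s/%s/%s\n' \
    "$1" "$2" "$3" "$4" "$5" "$(run_label "$6")"
}
```

The authoritative preview remains the run_dir line printed by --dry-run; treat this sketch as a convenience for scripts and notebooks.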

Lifecycle diagram

This is the “mental model” for the typical workflow.

+---------------------------+
| Choose beam + sky + data  |
| Choose template (optional)|
+-------------+-------------+
              |
              v
+---------------------------+
| PREPARE (per run)         |
| valska-bayeseor-prepare   |
| - writes run_dir          |
| - config_*.yaml           |
| - submit_*.sh             |
| - manifest.json           |
+-------------+-------------+
              |
              v
+---------------------------+
| CPU stage (precompute)    |
| valska-bayeseor-submit    |
|   --stage cpu             |
| or sbatch submit_cpu*.sh  |
| - records CPU job id      |
|   into jobs.json          |
+-------------+-------------+
              |
              v
+---------------------------+
| GPU stage (run analyses)  |
| valska-bayeseor-submit    |
|   --stage gpu             |
| - uses afterok:<CPU_JOBID>|
| - submits signal/no-signal|
| - records GPU job ids     |
+---------------------------+

Sweeps are a thin wrapper that repeats PREPARE across multiple FWHM fractions.

Recommended first run:

valska-bayeseor-sweep --submit all    (submit CPU+GPU in one go per point)

Other modes for setup checks, recovery, or tighter stage-by-stage control:

valska-bayeseor-sweep --submit none   (prepare all points)
valska-bayeseor-sweep --submit cpu    (submit CPU across points)
valska-bayeseor-sweep --submit gpu    (submit GPU across points; reuses completed CPU outputs or CPU job ids)

Detailed examples

The examples below are intentionally explicit, and many include abridged output snippets.

A) Prepare (dry-run)

valska-bayeseor-prepare \
  --beam achromatic_Gaussian \
  --sky GLEAM \
  --data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
  --run-id test_prepare1 \
  --fwhm-perturb-frac 0.01 \
  --dry-run

Example output (abridged):

[DRY RUN] Prepare would be executed with:
  results_root:       /share/.../validation_results/UKSRC
  beam_model:         achromatic_Gaussian
  sky_model:          GLEAM
  run_id:             test_prepare1
  run_label:          fwhm_1.0e-02
  template:           .../templates/validation_achromatic_Gaussian.yaml
  variant:            validation_achromatic_Gaussian
  data:               /share/.../gsm_plus_gleam...uvh5
  run_dir (preview):  /share/.../bayeseor/achromatic_Gaussian/GLEAM/validation_achromatic_Gaussian/fwhm_1.0e-02/test_prepare1
  ...
[DRY RUN] No files will be created.

B) Prepare (real)

valska-bayeseor-prepare \
  --beam achromatic_Gaussian \
  --sky GLEAM \
  --data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
  --run-id test_prepare1 \
  --fwhm-perturb-frac 0.01

Example output (abridged):

Run prepared:
  run_dir:      /share/.../bayeseor/achromatic_Gaussian/GLEAM/validation_achromatic_Gaussian/fwhm_1.0e-02/test_prepare1
  manifest:     /share/.../manifest.json
  beam_model:   achromatic_Gaussian
  sky_model:    GLEAM
  variant:      validation_achromatic_Gaussian
  run_label:    fwhm_1.0e-02
  run_id:       test_prepare1

Next steps:
  Option A) Submit via ValSKA (recommended):
    valska-bayeseor-submit /share/.../test_prepare1 --stage cpu
    valska-bayeseor-submit /share/.../test_prepare1 --stage gpu
  Option B) Manual submission:
    sbatch /share/.../submit_cpu_precompute.sh
    sbatch /share/.../submit_signal_fit_gpu_run.sh
    sbatch /share/.../submit_no_signal_gpu_run.sh

C) Sweep (dry-run with point directories)

valska-bayeseor-sweep \
  --beam achromatic_Gaussian \
  --sky GLEAM \
  --data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
  --run-id sweep_test2 \
  --fwhm-fracs 0.01 0.0 \
  --dry-run

Example output (abridged):

[DRY RUN] Sweep would be executed with:
  sweep_dir: /share/.../bayeseor/achromatic_Gaussian/GLEAM/_sweeps/sweep_test2
  variant:   validation_achromatic_Gaussian
  ...

[DRY RUN] Points:
  +0.010  fwhm_1.0e-02  ->  /share/.../_sweeps/sweep_test2/validation_achromatic_Gaussian/fwhm_1.0e-02
  +0.000  fwhm_0.0e+00  ->  /share/.../_sweeps/sweep_test2/validation_achromatic_Gaussian/fwhm_0.0e+00

D) Sweep (prepare only)

valska-bayeseor-sweep \
  --beam achromatic_Gaussian \
  --sky GLEAM \
  --data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
  --run-id sweep_test2 \
  --fwhm-fracs 0.01 0.0 \
  --submit none

This writes:

sweep manifest: .../_sweeps/<sweep_id>/sweep_manifest.json
per-point run dirs containing manifest.json, configs, and SLURM scripts

E) Submit CPU+GPU together (fresh sweep)

If you want the most reliable first end-to-end run, use:

valska-bayeseor-sweep \
  --beam achromatic_Gaussian \
  --sky GLEAM \
  --data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
  --run-id sweep_test2 \
  --fwhm-fracs 0.01 0.0 \
  --submit all

This is the recommended first-run path because ValSKA submits CPU and GPU together per point and manages the dependency chain in one invocation.

F) Submit CPU across sweep points

valska-bayeseor-sweep \
  --beam achromatic_Gaussian \
  --sky GLEAM \
  --data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
  --run-id sweep_test2 \
  --fwhm-fracs 0.01 0.0 \
  --submit cpu

Typical output includes a “Submission summary” listing the sbatch calls per point. On a real (non-dry-run) submission, the job ids should also be recorded in each point’s jobs.json.

To preview the sbatch commands without submitting:

valska-bayeseor-sweep \
  --beam achromatic_Gaussian \
  --sky GLEAM \
  --data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
  --run-id sweep_test2 \
  --fwhm-fracs 0.01 0.0 \
  --submit cpu \
  --submit-dry-run

G) Submit GPU across sweep points (after CPU)

GPU-only submission works when each point either:

has completed CPU precompute outputs already present, or
has a dependency job id available (typically from that point’s jobs.json)

valska-bayeseor-sweep \
  --beam achromatic_Gaussian \
  --sky GLEAM \
  --data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
  --run-id sweep_test2 \
  --fwhm-fracs 0.01 0.0 \
  --submit gpu

If you attempt GPU submission before CPU job ids exist, ValSKA should report an error explaining you must either:

submit CPU in the same invocation (--submit all), or
pass --depend-afterok <JOBID> (advanced), or
ensure jobs.json exists with a recorded CPU job id, or
wait for CPU precompute to finish so the required matrix stack exists under the run directory

Dry-run GPU submission (show commands, no jobs submitted):

valska-bayeseor-sweep \
  --beam achromatic_Gaussian \
  --sky GLEAM \
  --data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
  --run-id sweep_test2 \
  --fwhm-fracs 0.01 0.0 \
  --submit gpu \
  --submit-dry-run

Example output (abridged; dependency read from jobs.json):

sbatch --dependency=afterok:<CPU_JOBID> .../submit_signal_fit_gpu_run.sh
sbatch --dependency=afterok:<CPU_JOBID> .../submit_no_signal_gpu_run.sh

H) Advanced: per-point submission with valska-bayeseor-submit

Sometimes you only want to submit a subset of points or a single point, especially when testing.

Example: submit GPU for just one perturbation fraction (assuming CPU already submitted and recorded):

valska-bayeseor-sweep \
  --beam achromatic_Gaussian \
  --sky GLEAM \
  --data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
  --run-id sweep_test2 \
  --fwhm-fracs 0.01 \
  --submit gpu

Or, if you know the run_dir explicitly:

valska-bayeseor-submit /share/.../_sweeps/sweep_test2/validation_achromatic_Gaussian/fwhm_1.0e-02 --stage gpu

I) Monitoring jobs

Common SLURM checks:

squeue -u $USER
sacct -j <JOBID> --format=JobID,JobName,State,Elapsed,ExitCode
tail -n 200 /path/to/run_dir/slurm-<JOBID>.out

ValSKA also records submission information into:

per-point jobs.json
sweep-level sweep_manifest.json (including submit results)

J) Post-processing reports (tables + plots)

After sweep jobs complete (or partially complete), generate report artefacts with:

valska-bayeseor-report /path/to/_sweeps/<run_id>

To include extended outputs (plot_analysis_results and run_complete_bayeseor_analysis table/json):

valska-bayeseor-report /path/to/_sweeps/<run_id> \
  --include-plot-analysis-results \
  --include-complete-analysis-table

Wrapper equivalent (extended outputs enabled by default):

bash_scripts/valska-bayeseor-report-sweep.sh --sweep-dir /path/to/_sweeps/<run_id>

Airy helper convenience (prepare/submit sweep and auto-run reporting at the end):

bash_scripts/valska-bayeseor-sweep-airy_diam14m-GSM_plus_GLEAM.sh --submit all --report

Skip plot generation when auto-reporting:

bash_scripts/valska-bayeseor-sweep-airy_diam14m-GSM_plus_GLEAM.sh --submit all --report-no-plots

For full reporting options and failure-handling behavior, see:

BayesEoR reporting workflows

K) Sweep health/status checks

Inspect a sweep and summarize point completeness:

valska-bayeseor-sweep-status /path/to/_sweeps/SWEEP_ID

JSON mode (scripting/automation):

valska-bayeseor-sweep-status /path/to/_sweeps/SWEEP_ID --json

Validate and fail non-zero for incomplete sweeps:

valska-bayeseor-validate-sweep /path/to/_sweeps/SWEEP_ID

If partial completion is acceptable:

valska-bayeseor-validate-sweep /path/to/_sweeps/SWEEP_ID --allow-partial

If you also require jobs.json per point:

valska-bayeseor-validate-sweep /path/to/_sweeps/SWEEP_ID --require-jobs-json
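
Because validation signals failure through its exit code, it can gate later steps in a script. The sketch below runs the report only when validation passes with --allow-partial; the function name and the VALIDATE and REPORT override variables are assumptions added so the logic can be tried with stubs.

```shell
# Sketch: generate a report only if the sweep validates.
VALIDATE=${VALIDATE:-valska-bayeseor-validate-sweep}
REPORT=${REPORT:-valska-bayeseor-report}

report_if_valid() {
  local sweep_dir
  sweep_dir=$1
  # A zero exit status from validation is treated as "safe to report".
  if "$VALIDATE" "$sweep_dir" --allow-partial; then
    "$REPORT" "$sweep_dir"
  else
    echo "validation failed for $sweep_dir; skipping report" >&2
    return 1
  fi
}

# Usage:
#   report_if_valid /path/to/_sweeps/SWEEP_ID
```

Drop --allow-partial from the call if reports should be produced only for fully complete sweeps.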

L) Aggregate sweep audit

Run one command that discovers sweeps and evaluates status + validation:

valska-bayeseor-sweep-audit

Apply filters and output JSON:

valska-bayeseor-sweep-audit --beam airy --sky GSM_plus_GLEAM --json

Use non-zero exit if any audited sweep is invalid:

valska-bayeseor-sweep-audit --fail-on-invalid

M) Backwards compatibility: deprecated --scenario

Older scripts used --scenario as a single label that mixed multiple concepts.

ValSKA now prefers --beam and --sky explicitly.

--scenario is deprecated. If you must use it, the value must be unambiguous and take one of these forms:

--scenario <beam>/<sky>
--scenario <beam>__<sky>

Example:

valska-bayeseor-sweep \
  --scenario achromatic_Gaussian/GLEAM \
  --data ...uvh5 \
  --run-id sweep_oldstyle \
  --fwhm-fracs 0.01 0.0 \
  --submit none

Ambiguous older patterns like GLEAM_beam are rejected to prevent silent misrouting.
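
When migrating older scripts, it can help to split existing --scenario values into explicit --beam and --sky arguments. This is an illustrative parser for the two accepted forms only; split_scenario is a hypothetical name and the logic is not ValSKA's implementation, so its handling of edge cases may differ.

```shell
# Illustrative parser: prints "<beam> <sky>" for the two accepted forms,
# returns non-zero for anything else (e.g. GLEAM_beam).
split_scenario() {
  case $1 in
    ?*/?*)  printf '%s %s\n' "${1%%/*}" "${1#*/}" ;;    # <beam>/<sky>
    ?*__?*) printf '%s %s\n' "${1%%__*}" "${1#*__}" ;;  # <beam>__<sky>
    *)      return 1 ;;
  esac
}

split_scenario achromatic_Gaussian/GLEAM
split_scenario achromatic_Gaussian__GLEAM
split_scenario GLEAM_beam || echo "rejected: GLEAM_beam"
```

The first field of each accepted result maps to --beam and the second to --sky.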

Notes for UKSRC users

Keep beam/sky labels stable across campaigns. Your analysis notebooks, plots, and archiving will thank you.
Prefer --submit-dry-run before real submissions when testing new templates or SLURM settings.
For large sweeps, consider committing a standard bayeseor.sweep.fwhm_fracs set in runtime_paths.yaml and only override with --fwhm-fracs for special experiments.
If a job hits its walltime limit, MultiNest runs are typically resumable; ValSKA supports resubmission patterns (see --resubmit in your CLI help).
