BayesEoR CLI workflows (ValSKA)
This page is a practical, “example gallery”-style guide to running BayesEoR validation workflows using ValSKA.
It is written to be:
copy/paste friendly (commands shown as indented code blocks)
HPC-friendly (explicit about dry-runs, SLURM submission, and dependencies)
reproducible (paths, templates, and variants recorded in manifests)
If you are new, start with Quick Start. If you are iterating on a validation campaign, use Detailed examples.
Quick Start
Before You Start
Before trying the workflow examples below, make sure you have completed the setup steps that ValSKA assumes:
ValSKA setup:
install the valska environment so the valska-bayeseor-* CLI commands are available. See the Installation section in the project README for environment setup details.
BayesEoR setup:
clone BayesEoR from GitHub into a local directory you control
create the BayesEoR conda environment using the environment file shipped with that BayesEoR checkout
make sure the BayesEoR checkout and conda environment used by batch jobs are compatible with this ValSKA version
Runtime configuration in your ValSKA checkout:
copy config/runtime_paths.example.yaml to config/runtime_paths.yaml in your ValSKA repository
edit config/runtime_paths.yaml for your system:
set results_root
set data.root if you want relative --data paths to resolve automatically
set bayeseor.repo_path to your local BayesEoR clone
set bayeseor.conda_sh and bayeseor.conda_env
set CPU and GPU SLURM defaults for your site
obtain or generate the UVH5 dataset you want to analyse
if you are using one of the example commands below, replace the example --data value with a dataset path that actually exists on your system
If you are unsure which command to start with, run:
valska-bayeseor-help
If you want copy/paste command sequences rather than a command map, jump to Detailed examples.
CLI quick reference
Setup And Submission
| Command | Purpose | Detailed docs |
|---|---|---|
| valska-bayeseor-help | Print command index and common workflows | |
| valska-bayeseor-prepare | Prepare one run directory and artefacts | |
| valska-bayeseor-sweep | Prepare and/or submit sweep points | |
| valska-bayeseor-submit | Submit CPU/GPU stages for one prepared run | valska-bayeseor-submit --stage cpu, valska-bayeseor-submit --stage gpu |
Reporting And Health
| Command | Purpose | Detailed docs |
|---|---|---|
| valska-bayeseor-report | Generate report tables/plots for one sweep | |
| | Discover available sweep directories | |
| valska-bayeseor-sweep-status | Inspect per-point completeness for one sweep | |
| valska-bayeseor-validate-sweep | Validate sweep integrity with exit-code semantics | |
| valska-bayeseor-sweep-audit | Aggregate discovery + status + validation | |
Operations
| Command | Purpose | Detailed docs |
|---|---|---|
| valska-bayeseor-resume | Generate exact submit commands for incomplete points | |
| valska-bayeseor-report-all | Batch-generate reports across discovered sweeps | |
| valska-bayeseor-compare-sweeps | Compare metrics between two sweep summaries | |
| valska-bayeseor-cleanup | Safe cleanup workflow (dry-run by default) | |
Quick Definitions
a single point is one prepared run directory for one perturbation value
a sweep is a collection of single-point runs across multiple perturbation values
Which command should I use?
Need a quick command map before you start? → valska-bayeseor-help
Need to create run inputs/scripts for one single-point run? → valska-bayeseor-prepare
Need to submit stages for one prepared run dir? → valska-bayeseor-submit
Need to prepare/submit a sweep across multiple perturbation values? → valska-bayeseor-sweep
Need to inspect one sweep's health quickly? → valska-bayeseor-sweep-status
Need pass/fail validation semantics for one sweep? → valska-bayeseor-validate-sweep
Need a campaign-wide health and validation overview? → valska-bayeseor-sweep-audit
Need restart suggestions for incomplete points? → valska-bayeseor-resume
Need reports for one sweep? → valska-bayeseor-report
Need reports for many sweeps? → valska-bayeseor-report-all
Need a side-by-side metric comparison between two sweeps? → valska-bayeseor-compare-sweeps
Need maintenance cleanup (dry-run first)? → valska-bayeseor-cleanup
For command-local examples of helper CLIs, run each command with --help.
Public documentation note:
do not put personal filesystem paths into your committed runtime_paths.yaml
replace all example paths with paths that are valid for your own system or site
Replace:
achromatic_Gaussian with a beam-model label matching the data you are analysing
GLEAM with a sky-model label matching the data you are analysing
...uvh5 with your dataset
RUN_ID / SWEEP_ID with something meaningful
valska-bayeseor-help
If you want the shortest possible command map before you start:
valska-bayeseor-help
For topic-specific help:
valska-bayeseor-help --topic setup
valska-bayeseor-help --topic submission
valska-bayeseor-help --topic reporting
valska-bayeseor-prepare
What this command does:
creates a run directory containing:
BayesEoR config YAML(s)
SLURM submit scripts for CPU and GPU stages
a manifest recording provenance and resolved paths
Dry-run example:
valska-bayeseor-prepare \
--beam achromatic_Gaussian \
--sky GLEAM \
--data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
--run-id RUN_ID \
--fwhm-perturb-frac 0.01 \
--dry-run
To create the files for real, run the same command without --dry-run.
valska-bayeseor-sweep
A sweep prepares N run dirs (one per FWHM perturbation) and can optionally submit CPU/GPU stages across all points.
Prepare only (no submission):
valska-bayeseor-sweep \
--beam achromatic_Gaussian \
--sky GLEAM \
--data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
--run-id SWEEP_ID \
--fwhm-fracs 0.01 0.0 \
--submit none
For a first end-to-end run on a new system, prefer a single submission command:
valska-bayeseor-sweep \
--beam achromatic_Gaussian \
--sky GLEAM \
--data /path/to/your/input.uvh5 \
--run-id SWEEP_ID \
--fwhm-fracs 0.01 0.0 \
--submit all
This is the most reliable path because ValSKA submits CPU and GPU together per point and manages the dependency chain in one invocation.
If you want finer control, you can split the stages:
Submit CPU stage across all points:
valska-bayeseor-sweep \
--beam achromatic_Gaussian \
--sky GLEAM \
--data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
--run-id SWEEP_ID \
--fwhm-fracs 0.01 0.0 \
--submit cpu
Submit GPU stage across all points later (advanced / recovery workflow):
valska-bayeseor-sweep \
--beam achromatic_Gaussian \
--sky GLEAM \
--data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
--run-id SWEEP_ID \
--fwhm-fracs 0.01 0.0 \
--submit gpu
If you use the split CPU/GPU path, make sure:
your CPU stage has already produced the required precompute outputs
jobs.json exists for each point if you expect ValSKA to reuse recorded CPU job ids
your site-specific SLURM defaults in runtime_paths.yaml are correct before submission
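The jobs.json item in the checklist above can be checked with a few lines of shell before a GPU-only submission. This is a minimal sketch, not a ValSKA command: the function name points_missing_jobs_json is hypothetical, and it assumes sweep points live under <sweep_dir>/<variant>/<run_label>/ with run labels starting with fwhm_.

```shell
# Hypothetical pre-flight helper (not part of ValSKA).
# Assumes points live at <sweep_dir>/<variant>/fwhm_*/ and prints every
# point directory that does not yet contain a jobs.json file.
points_missing_jobs_json() {
    sweep_dir=$1
    for point in "$sweep_dir"/*/fwhm_*/; do
        [ -d "$point" ] || continue                      # glob matched nothing
        [ -f "${point}jobs.json" ] || printf '%s\n' "${point%/}"
    done
}

# Usage:
#   points_missing_jobs_json /path/to/_sweeps/SWEEP_ID
```

An empty result means every point has a jobs.json. It does not confirm that a CPU job id is recorded inside the file or that the CPU stage succeeded.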
Dry-run submission (show sbatch commands but do not submit):
valska-bayeseor-sweep \
--beam achromatic_Gaussian \
--sky GLEAM \
--data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
--run-id SWEEP_ID \
--fwhm-fracs 0.01 0.0 \
--submit cpu \
--submit-dry-run
valska-bayeseor-submit --stage cpu
If you are using the submit CLI:
valska-bayeseor-submit /path/to/run_dir --stage cpu
Or manually (inside the run_dir output from prepare):
sbatch /path/to/run_dir/submit_cpu_precompute.sh
valska-bayeseor-submit --stage gpu
Use this mode when you already have a prepared run directory and you want to launch GPU work separately from CPU precompute.
Important:
this split CPU-then-GPU workflow is useful for recovery and explicit control
on some SLURM sites, later GPU submission can be sensitive to how CPU job dependencies are handled
for a first end-to-end run, valska-bayeseor-sweep --submit all is usually the safest path
Using the submit CLI:
valska-bayeseor-submit /path/to/run_dir --stage gpu
Or manually:
sbatch --dependency=afterok:<CPU_JOBID> /path/to/run_dir/submit_signal_fit_gpu_run.sh
sbatch --dependency=afterok:<CPU_JOBID> /path/to/run_dir/submit_no_signal_gpu_run.sh
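If you go fully manual, the <CPU_JOBID> placeholder can be filled automatically by capturing the CPU job id at submission time. The sketch below assumes the script names shown above and uses the standard sbatch --parsable option, which prints the job id (optionally followed by ;<cluster>). The function name submit_point_manually is hypothetical. A manual chain like this presumably bypasses ValSKA's own jobs.json bookkeeping.

```shell
# Hypothetical helper (not part of ValSKA): submit CPU precompute, then both
# GPU jobs with an afterok dependency on the CPU job.
submit_point_manually() {
    run_dir=$1
    # --parsable makes sbatch print "<jobid>" or "<jobid>;<cluster>"
    cpu_out=$(sbatch --parsable "$run_dir/submit_cpu_precompute.sh") || return 1
    cpu_jobid=${cpu_out%%;*}          # keep only the job id
    sbatch --dependency=afterok:"$cpu_jobid" "$run_dir/submit_signal_fit_gpu_run.sh" &&
    sbatch --dependency=afterok:"$cpu_jobid" "$run_dir/submit_no_signal_gpu_run.sh"
}

# Usage (on a SLURM login node):
#   submit_point_manually /path/to/run_dir
```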
Concepts
Beam / sky taxonomy (directory layout)
We organise results by the two “observation-defining” axes:
beam_model: instrument / beam model label (e.g. achromatic_Gaussian, chromatic_Gaussian, airy)
sky_model: sky model label (e.g. GLEAM, GSM, GLEAM_plus_GSM)
This keeps campaigns predictable when you explore multiple sky models and multiple beam models.
Template + variant concept (collision-free template differences)
Many BayesEoR runs differ only by template-level settings (chromatic vs achromatic, alternate priors, etc).
To avoid collisions, we include a <variant> directory level.
If you do not specify --variant, it is derived from the template filename stem by removing the first _template.
Examples:
validation_v1d0_template.yaml → validation_v1d0
validation_v1d0_template_achromatic.yaml → validation_v1d0_achromatic
validation_achromatic_Gaussian.yaml → validation_achromatic_Gaussian
You can override the auto-derived value with --variant.
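The derivation rule can be reproduced in a few lines of POSIX shell, which is handy for predicting the directory name before you run anything. This sketch only illustrates the stated rule (drop the .yaml extension, then remove the first _template). It is not ValSKA's implementation, and derive_variant is a hypothetical name.

```shell
# Hypothetical helper (not part of ValSKA): derive a variant label from a
# template filename by stripping ".yaml" and the first "_template".
derive_variant() {
    stem=$(basename "$1" .yaml)                    # drop directory and extension
    printf '%s\n' "$stem" | sed 's/_template//'    # no "g" flag: first match only
}

derive_variant validation_v1d0_template.yaml             # → validation_v1d0
derive_variant validation_v1d0_template_achromatic.yaml  # → validation_v1d0_achromatic
derive_variant validation_achromatic_Gaussian.yaml       # → validation_achromatic_Gaussian
```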
What gets created where
Canonical single-run directory:
<results_root>/bayeseor/<beam_model>/<sky_model>/<variant>/<run_label>/<run_id>[/<UTCSTAMP>]
Canonical sweep root and points:
<results_root>/bayeseor/<beam_model>/<sky_model>/_sweeps/<sweep_id>/<variant>/<run_label>[/<UTCSTAMP>]
Notes:
<run_label> is typically fwhm_<value> (e.g. fwhm_1.0e-02) and is auto-generated from the FWHM fraction.
--unique appends a UTC timestamp suffix (useful for one-off runs; usually not recommended for resumable sweeps).
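To predict where a run will land, the two layouts above can be assembled by hand. This is a sketch under assumptions: the helper names are hypothetical, the optional <UTCSTAMP> suffix is omitted, and the %.1e label format is inferred from the example labels (fwhm_1.0e-02, fwhm_0.0e+00) rather than taken from ValSKA's code.

```shell
# Hypothetical helpers (not part of ValSKA) illustrating the layout above.

# Format a FWHM fraction as a run label; "%.1e" is an assumed format.
run_label() {
    awk -v frac="$1" 'BEGIN { printf "fwhm_%.1e\n", frac }'
}

# single_run_dir RESULTS_ROOT BEAM SKY VARIANT FWHM_FRAC RUN_ID
single_run_dir() {
    printf '%s/bayeseor/%s/%s/%s/%s/%s\n' \
        "$1" "$2" "$3" "$4" "$(run_label "$5")" "$6"
}

# sweep_point_dir RESULTS_ROOT BEAM SKY SWEEP_ID VARIANT FWHM_FRAC
sweep_point_dir() {
    printf '%s/bayeseor/%s/%s/_sweeps/%s/%s/%s\n' \
        "$1" "$2" "$3" "$4" "$5" "$(run_label "$6")"
}

single_run_dir /path/to/results achromatic_Gaussian GLEAM \
    validation_achromatic_Gaussian 0.01 test_prepare1
sweep_point_dir /path/to/results achromatic_Gaussian GLEAM \
    sweep_test2 validation_achromatic_Gaussian 0.0
```

The authoritative path is always the one printed by --dry-run or recorded in the manifest; treat this only as a way to reason about the layout.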
Lifecycle diagram
This is the “mental model” for the typical workflow.
+---------------------------+
| Choose beam + sky + data |
| Choose template (optional)|
+-------------+-------------+
|
v
+---------------------------+
| PREPARE (per run) |
| valska-bayeseor-prepare |
| - writes run_dir |
| - config_*.yaml |
| - submit_*.sh |
| - manifest.json |
+-------------+-------------+
|
v
+---------------------------+
| CPU stage (precompute) |
| valska-bayeseor-submit |
| --stage cpu |
| or sbatch submit_cpu*.sh |
| - records CPU job id |
| into jobs.json |
+-------------+-------------+
|
v
+---------------------------+
| GPU stage (run analyses) |
| valska-bayeseor-submit |
| --stage gpu |
| - uses afterok:<CPU_JOBID>|
| - submits signal/no-signal|
| - records GPU job ids |
+---------------------------+
Sweeps are a thin wrapper that repeats PREPARE across multiple FWHM fractions.
Recommended first run:
valska-bayeseor-sweep --submit all (submit CPU+GPU in one go per point)
Other modes for setup checks, recovery, or tighter stage-by-stage control:
valska-bayeseor-sweep --submit none (prepare all points)
valska-bayeseor-sweep --submit cpu (submit CPU across points)
valska-bayeseor-sweep --submit gpu (submit GPU across points; reuses completed CPU outputs or CPU job ids)
Detailed examples
The examples below are intentionally explicit, and many include abridged output snippets.
A) Prepare (dry-run)
valska-bayeseor-prepare \
--beam achromatic_Gaussian \
--sky GLEAM \
--data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
--run-id test_prepare1 \
--fwhm-perturb-frac 0.01 \
--dry-run
Example output (abridged):
[DRY RUN] Prepare would be executed with:
results_root: /share/.../validation_results/UKSRC
beam_model: achromatic_Gaussian
sky_model: GLEAM
run_id: test_prepare1
run_label: fwhm_1.0e-02
template: .../templates/validation_achromatic_Gaussian.yaml
variant: validation_achromatic_Gaussian
data: /share/.../gsm_plus_gleam...uvh5
run_dir (preview): /share/.../bayeseor/achromatic_Gaussian/GLEAM/validation_achromatic_Gaussian/fwhm_1.0e-02/test_prepare1
...
[DRY RUN] No files will be created.
B) Prepare (real)
valska-bayeseor-prepare \
--beam achromatic_Gaussian \
--sky GLEAM \
--data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
--run-id test_prepare1 \
--fwhm-perturb-frac 0.01
Example output (abridged):
Run prepared:
run_dir: /share/.../bayeseor/achromatic_Gaussian/GLEAM/validation_achromatic_Gaussian/fwhm_1.0e-02/test_prepare1
manifest: /share/.../manifest.json
beam_model: achromatic_Gaussian
sky_model: GLEAM
variant: validation_achromatic_Gaussian
run_label: fwhm_1.0e-02
run_id: test_prepare1
Next steps:
Option A) Submit via ValSKA (recommended):
valska-bayeseor-submit /share/.../test_prepare1 --stage cpu
valska-bayeseor-submit /share/.../test_prepare1 --stage gpu
Option B) Manual submission:
sbatch /share/.../submit_cpu_precompute.sh
sbatch /share/.../submit_signal_fit_gpu_run.sh
sbatch /share/.../submit_no_signal_gpu_run.sh
C) Sweep (dry-run with point directories)
valska-bayeseor-sweep \
--beam achromatic_Gaussian \
--sky GLEAM \
--data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
--run-id sweep_test2 \
--fwhm-fracs 0.01 0.0 \
--dry-run
Example output (abridged):
[DRY RUN] Sweep would be executed with:
sweep_dir: /share/.../bayeseor/achromatic_Gaussian/GLEAM/_sweeps/sweep_test2
variant: validation_achromatic_Gaussian
...
[DRY RUN] Points:
+0.010 fwhm_1.0e-02 -> /share/.../_sweeps/sweep_test2/validation_achromatic_Gaussian/fwhm_1.0e-02
+0.000 fwhm_0.0e+00 -> /share/.../_sweeps/sweep_test2/validation_achromatic_Gaussian/fwhm_0.0e+00
D) Sweep (prepare only)
valska-bayeseor-sweep \
--beam achromatic_Gaussian \
--sky GLEAM \
--data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
--run-id sweep_test2 \
--fwhm-fracs 0.01 0.0 \
--submit none
This writes:
sweep manifest: .../_sweeps/<sweep_id>/sweep_manifest.json
per-point run dirs containing manifest.json, configs, and SLURM scripts
E) Submit CPU+GPU together (fresh sweep)
If you want the most reliable first end-to-end run, use:
valska-bayeseor-sweep \
--beam achromatic_Gaussian \
--sky GLEAM \
--data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
--run-id sweep_test2 \
--fwhm-fracs 0.01 0.0 \
--submit all
This is the recommended first-run path because ValSKA submits CPU and GPU together per point and manages the dependency chain in one invocation.
F) Submit CPU across sweep points
valska-bayeseor-sweep \
--beam achromatic_Gaussian \
--sky GLEAM \
--data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
--run-id sweep_test2 \
--fwhm-fracs 0.01 0.0 \
--submit cpu
Typical output includes a “Submission summary” listing the sbatch calls per point.
For a real submission (no --submit-dry-run), job ids should also be recorded in each point’s jobs.json.
To preview the sbatch commands without submitting:
valska-bayeseor-sweep \
--beam achromatic_Gaussian \
--sky GLEAM \
--data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
--run-id sweep_test2 \
--fwhm-fracs 0.01 0.0 \
--submit cpu \
--submit-dry-run
G) Submit GPU across sweep points (after CPU)
GPU-only submission works when each point either:
has completed CPU precompute outputs already present, or
has a dependency job id available (typically from that point’s jobs.json)
valska-bayeseor-sweep \
--beam achromatic_Gaussian \
--sky GLEAM \
--data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
--run-id sweep_test2 \
--fwhm-fracs 0.01 0.0 \
--submit gpu
If you attempt GPU submission before CPU job ids exist, ValSKA should report an error explaining you must either:
submit CPU in the same invocation (--submit all), or
pass --depend-afterok <JOBID> (advanced), or
ensure jobs.json exists with a recorded CPU job id, or
wait for CPU precompute to finish so the required matrix stack exists under the run directory
Dry-run GPU submission (show commands, no jobs submitted):
valska-bayeseor-sweep \
--beam achromatic_Gaussian \
--sky GLEAM \
--data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
--run-id sweep_test2 \
--fwhm-fracs 0.01 0.0 \
--submit gpu \
--submit-dry-run
Example output (abridged; dependency read from jobs.json):
sbatch --dependency=afterok:<CPU_JOBID> .../submit_signal_fit_gpu_run.sh
sbatch --dependency=afterok:<CPU_JOBID> .../submit_no_signal_gpu_run.sh
H) Advanced: per-point submission with valska-bayeseor-submit
Sometimes you only want to submit a subset of points or a single point, especially when testing.
Example: submit GPU for just one perturbation fraction (assuming CPU already submitted and recorded):
valska-bayeseor-sweep \
--beam achromatic_Gaussian \
--sky GLEAM \
--data gsm_plus_gleam-158.30-167.10-MHz-nf-38-fov-19.4deg-circ-field-1_quentin.uvh5 \
--run-id sweep_test2 \
--fwhm-fracs 0.01 \
--submit gpu
Or, if you know the run_dir explicitly:
valska-bayeseor-submit /share/.../_sweeps/sweep_test2/validation_achromatic_Gaussian/fwhm_1.0e-02 --stage gpu
I) Monitoring jobs
Common SLURM checks:
squeue -u $USER
sacct -j <JOBID> --format=JobID,JobName,State,Elapsed,ExitCode
tail -n 200 /path/to/run_dir/slurm-<JOBID>.out
ValSKA also records submission information into:
per-point jobs.json
sweep-level sweep_manifest.json (including submit results)
J) Post-processing reports (tables + plots)
After sweep jobs complete (or partially complete), generate report artefacts with:
valska-bayeseor-report /path/to/_sweeps/<run_id>
To include extended outputs (plot_analysis_results and run_complete_bayeseor_analysis table/json):
valska-bayeseor-report /path/to/_sweeps/<run_id> \
--include-plot-analysis-results \
--include-complete-analysis-table
Wrapper equivalent (extended outputs enabled by default):
bash_scripts/valska-bayeseor-report-sweep.sh --sweep-dir /path/to/_sweeps/<run_id>
Airy helper convenience (prepare/submit sweep and auto-run reporting at the end):
bash_scripts/valska-bayeseor-sweep-airy_diam14m-GSM_plus_GLEAM.sh --submit all --report
Skip plot generation when auto-reporting:
bash_scripts/valska-bayeseor-sweep-airy_diam14m-GSM_plus_GLEAM.sh --submit all --report-no-plots
For full reporting options and failure-handling behavior, see the reporting documentation; valska-bayeseor-report --help lists the available options.
K) Sweep health/status checks
Inspect a sweep and summarize point completeness:
valska-bayeseor-sweep-status /path/to/_sweeps/SWEEP_ID
JSON mode (scripting/automation):
valska-bayeseor-sweep-status /path/to/_sweeps/SWEEP_ID --json
Validate and fail non-zero for incomplete sweeps:
valska-bayeseor-validate-sweep /path/to/_sweeps/SWEEP_ID
If partial completion is acceptable:
valska-bayeseor-validate-sweep /path/to/_sweeps/SWEEP_ID --allow-partial
If you also require jobs.json per point:
valska-bayeseor-validate-sweep /path/to/_sweeps/SWEEP_ID --require-jobs-json
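Because validation signals failure through its exit status, it can gate later steps in a script. The following sketch only generates a report when validation passes. It uses the two commands exactly as shown above; the function name report_if_valid and the error message are illustrative.

```shell
# Hypothetical wrapper (not part of ValSKA): run the report only if the
# sweep validates, and propagate the validator's exit status otherwise.
report_if_valid() {
    sweep_dir=$1
    if valska-bayeseor-validate-sweep "$sweep_dir"; then
        valska-bayeseor-report "$sweep_dir"
    else
        status=$?     # exit status of the failed validation
        echo "validation failed (exit $status): $sweep_dir" >&2
        return "$status"
    fi
}

# Usage:
#   report_if_valid /path/to/_sweeps/SWEEP_ID
```

Add --allow-partial or --require-jobs-json to the validation call inside the function if your campaign needs those semantics.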
L) Aggregate sweep audit
Run one command that discovers sweeps and evaluates status + validation:
valska-bayeseor-sweep-audit
Apply filters and output JSON:
valska-bayeseor-sweep-audit --beam airy --sky GSM_plus_GLEAM --json
Use non-zero exit if any audited sweep is invalid:
valska-bayeseor-sweep-audit --fail-on-invalid
M) Backwards compatibility: deprecated --scenario
Older scripts used --scenario as a single label that mixed multiple concepts.
ValSKA now prefers --beam and --sky explicitly.
--scenario is deprecated. If you must use it, the value must be unambiguous and take one of these forms:
--scenario <beam>/<sky>
--scenario <beam>__<sky>
Example:
valska-bayeseor-sweep \
--scenario achromatic_Gaussian/GLEAM \
--data ...uvh5 \
--run-id sweep_oldstyle \
--fwhm-fracs 0.01 0.0 \
--submit none
Ambiguous older patterns like GLEAM_beam are rejected to prevent silent misrouting.
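When migrating older scripts, it can help to check a legacy label against the two accepted forms before passing it on. The sketch below mirrors the rule described above (<beam>/<sky> or <beam>__<sky>, anything else rejected). It is an illustration only, not ValSKA's parser, and split_scenario is a hypothetical name.

```shell
# Hypothetical helper (not part of ValSKA): split a legacy scenario label
# into "<beam> <sky>", or fail if it matches neither accepted form.
split_scenario() {
    case $1 in
        ?*/?*)  beam=${1%%/*};  sky=${1#*/}  ;;   # <beam>/<sky>
        ?*__?*) beam=${1%%__*}; sky=${1#*__} ;;   # <beam>__<sky>
        *) echo "ambiguous --scenario value: $1" >&2; return 1 ;;
    esac
    printf '%s %s\n' "$beam" "$sky"
}

split_scenario achromatic_Gaussian/GLEAM    # → achromatic_Gaussian GLEAM
split_scenario airy__GSM_plus_GLEAM         # → airy GSM_plus_GLEAM
# split_scenario GLEAM_beam                 # rejected: returns non-zero
```

The printed pair maps directly onto the preferred --beam and --sky options.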
Notes for UKSRC users
Keep beam/sky labels stable across campaigns. Your analysis notebooks, plots, and archiving will thank you.
Prefer --submit-dry-run before real submissions when testing new templates or SLURM settings.
For large sweeps, consider committing a standard bayeseor.sweep.fwhm_fracs set in runtime_paths.yaml and only override it with --fwhm-fracs for special experiments.
If a job hits its walltime limit, MultiNest is typically resumable; ValSKA supports resubmission patterns (see --resubmit in your CLI help).