valska.evidence

Evidence evaluation module for the ValSKA project.

This module provides functions to calculate and interpret Bayes factors between models in the BayesEoR analysis of HERA beam perturbations.

Typical usage examples

>>> from pathlib import Path
>>> from valska.evidence import (
...     calculate_bayes_factor,
...     find_chain_pairs,
...     analyze_chain_pair,
...     run_complete_bayeseor_analysis,
... )
>>> base = Path("/path/to/BayesEoR/chains")
>>> v7_base = base / "v7d0"
>>> pairs = find_chain_pairs(v7_base)
>>> cp = pairs["1.0e00pp"]
>>> bf_result = calculate_bayes_factor(
...     cp.fgeor_root / "data-",
...     cp.fgonly_root / "data-",
... )
>>> bf_result["log_bayes_factor"]
-3.2

>>> summary = run_complete_bayeseor_analysis(
...     chain_pairs=pairs,
...     create_plots=False,
...     verbose=False,
... )
>>> summary["summary"]["pass"]
5

Functions

`analyze_chain_pair`(pair[, dir_prefix, ...])	Analyze a single FgEoR/FgOnly chain pair using BaNTER-style validation.
`calculate_bayes_factor`(chain_path_1, ...[, ...])	Calculate Bayes factor between two models given their nested-sampling chains.
`find_chain_pairs`(base_dir[, fgeor_prefix, ...])	Discover matched FgEoR / FgOnly chain pairs under a base directory.
`interpret_bayes_factor`(log_bf)	Interpret the strength of evidence given a log Bayes factor.
`run_complete_bayeseor_analysis`(chain_pairs)	Run a complete BayesEoR perturbation analysis over multiple chain pairs.

Classes

ChainPair(perturbation, fgeor_root, fgonly_root)

Container for a matched FgEoR / FgOnly chain pair.

class valska.evidence.ChainPair(perturbation: str, fgeor_root: Path, fgonly_root: Path)

Container for a matched FgEoR / FgOnly chain pair.

fgeor_root: Path

fgonly_root: Path

perturbation: str

valska.evidence._find_single_mn_subdir(root: Path) → Path

Find the MN-* (or similar) subdirectory under a given root.

Assumes that root contains exactly one subdirectory and raises an error if it contains none or more than one. This keeps the logic explicit and surfaces layout issues early.
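A minimal sketch of this check, assuming the rule is simply "exactly one child directory". The name find_single_subdir and the choice of RuntimeError are illustrative assumptions, not the module's actual implementation:

```python
from pathlib import Path


def find_single_subdir(root: Path) -> Path:
    """Return the only subdirectory of `root` (hypothetical helper).

    Raises RuntimeError when `root` holds zero or several subdirectories;
    the exception type is an assumption made for this sketch.
    """
    # Plain files are ignored; only directories count towards the check.
    subdirs = sorted(p for p in root.iterdir() if p.is_dir())
    if len(subdirs) != 1:
        raise RuntimeError(
            f"Expected exactly one subdirectory under {root}, "
            f"found {len(subdirs)}"
        )
    return subdirs[0]
```

Failing loudly here means a half-copied or duplicated run directory is reported at discovery time rather than as a confusing error while reading chains.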

valska.evidence._normalize_perturbation_key(raw_suffix: str) → str

Normalize a perturbation suffix into a stable key.

Currently this is a passthrough. Keeping the normalization in one place makes it possible to convert between formats later (e.g. ‘+1e0pp’ vs ‘1.0e00pp’) if needed.
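If such a conversion were ever needed, it might look like the sketch below. It assumes every suffix is a number followed by pp and that the target form has a one-decimal mantissa and a two-digit exponent, as in 1.0e00pp. The function name and both assumptions are hypothetical, and the one-decimal formatting rounds longer mantissas.

```python
def to_canonical_key(raw_suffix: str) -> str:
    """Convert e.g. '+1e0pp' to '1.0e00pp' (hypothetical target format)."""
    if not raw_suffix.endswith("pp"):
        raise ValueError(f"Unrecognized perturbation suffix: {raw_suffix!r}")
    # float() accepts an optional leading '+' or '-'.
    value = float(raw_suffix[:-2])
    # '.1e' formatting yields e.g. '1.0e+00'; split into mantissa and exponent.
    mantissa, exponent = f"{value:.1e}".split("e")
    exp = int(exponent)
    # Drop the '+' of non-negative exponents, keep the '-' of negative ones.
    sign = "-" if exp < 0 else ""
    return f"{mantissa}e{sign}{abs(exp):02d}pp"
```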

valska.evidence.analyze_chain_pair(pair: ChainPair, dir_prefix: Path | None = None, expected_ps: float = 214777.66068216303, create_plots: bool = True, verbose: bool = True) → dict[str, Any]

Analyze a single FgEoR/FgOnly chain pair using BaNTER-style validation.

Parameters

pair :: ChainPair describing the perturbation and root directories.
dir_prefix :: Optional prefix to strip off when handing paths to DataContainer. If None, the common ancestor of both roots is used.
expected_ps :: Expected power spectrum value passed through to DataContainer.
create_plots :: If True, generate and show posterior / power spectrum plots.
verbose :: If True, print detailed log output to stdout.

Returns

dict

Result dictionary with keys:

'perturbation'
'plot_success'
'bayes_factor_result'
'validation' ('PASS', 'FAIL' or 'ERROR')
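The default for dir_prefix can be pictured with the following sketch, which computes the common ancestor of two roots with the standard library. The helper name common_ancestor and the example paths are hypothetical; how the module derives the prefix internally is not shown here.

```python
import os
from pathlib import Path


def common_ancestor(first: Path, second: Path) -> Path:
    """Return the deepest directory that contains both paths."""
    return Path(os.path.commonpath([first, second]))


# Hypothetical roots following the GL_FgEoR_* / GL_FgOnly_* layout.
fgeor_root = Path("/chains/v7d0/GL_FgEoR_1.0e00pp/MN-example")
fgonly_root = Path("/chains/v7d0/GL_FgOnly_1.0e00pp/MN-example")

prefix = common_ancestor(fgeor_root, fgonly_root)
# Path of the FgEoR root relative to the stripped prefix.
relative_fgeor = fgeor_root.relative_to(prefix)
```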

valska.evidence.calculate_bayes_factor(chain_path_1: str | Path, chain_path_2: str | Path, model_name_1: str = 'Model 1', model_name_2: str = 'Model 2', verbose: bool = True) → dict[str, Any]

Calculate Bayes factor between two models given their nested-sampling chains.

The function assumes that chain_path_1 and chain_path_2 are readable by anesthetic.read_chains() and that the returned objects provide a logZ() method.

Parameters

chain_path_1 :: Path to the first model’s chain directory (numerator in Bayes factor).
chain_path_2 :: Path to the second model’s chain directory (denominator in Bayes factor).
model_name_1 :: Name of the first model for display and reporting.
model_name_2 :: Name of the second model for display and reporting.
verbose :: If True, print intermediate information (loaded evidences and resulting Bayes factor) to stdout.

Returns

dict

Dictionary containing results with keys:

'model_1': str, the name of model 1.
'model_2': str, the name of model 2.
'log_evidence_1': float or None, log-evidence of model 1.
'log_evidence_2': float or None, log-evidence of model 2.
'log_bayes_factor': float or None, ln(Z1/Z2).
'interpretation': str, human-readable interpretation.
'success': bool, True if computation succeeded.
'error': str or None, error message if failed.
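The relation between the two log-evidences and 'log_bayes_factor' is a simple difference, since ln(Z1/Z2) = ln Z1 - ln Z2. The sketch below illustrates only that arithmetic. The function name and the numeric inputs are made up for illustration, and reading the chains with anesthetic is left out.

```python
import math


def log_bayes_factor(log_evidence_1: float, log_evidence_2: float) -> float:
    """Return ln(Z1 / Z2) from the two log-evidences."""
    # Working in log space avoids overflow or underflow in Z itself.
    return log_evidence_1 - log_evidence_2


# Illustrative log-evidences only; not taken from any real chain.
log_bf = log_bayes_factor(-100.0, -96.8)
# Ratio Z1 / Z2; below 1 when model 2 has the larger evidence.
bayes_factor = math.exp(log_bf)
```

A negative value means the second model (the denominator) has the larger evidence.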

valska.evidence.find_chain_pairs(base_dir: Path, fgeor_prefix: str = 'GL_FgEoR_', fgonly_prefix: str = 'GL_FgOnly_', debug: bool = False) → dict[str, ChainPair]

Discover matched FgEoR / FgOnly chain pairs under a base directory.

This is meant to work with layouts like:

``base_dir / "GL_FgEoR_1.0e00pp"/MN-23-23-38-2-ffm-.../data-``
``base_dir / "GL_FgOnly_1.0e00pp"/MN-23-23-38-2-ffm-.../data-``

or v5-style directories such as:

``base_dir / "GSM_FgEoR_-5e0pp"/MN-23-23-38-2-.../data-``
``base_dir / "GSM_FgOnly_-5e0pp"/MN-23-23-38-2-.../data-``

by adjusting fgeor_prefix and fgonly_prefix.

Parameters

base_dir :: Directory containing GL_FgEoR_* and GL_FgOnly_* subdirectories (e.g. paths.chains_dir / 'v7d0'), or GSM_FgEoR_* and GSM_FgOnly_* subdirectories for v5-style layouts.
fgeor_prefix :: Prefix for FgEoR directories.
fgonly_prefix :: Prefix for FgOnly directories.
debug :: If True, print information about discovered entries and matches.

Returns

dict: Mapping from a normalized perturbation key to a ChainPair. The fgeor_root and fgonly_root paths are the MN-* level directories that directly contain the data- files.
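The matching step can be sketched as follows, assuming pairs are formed purely by comparing the suffix left after stripping each prefix. The helper name match_prefixed_dirs is hypothetical, and unlike find_chain_pairs this sketch stops at the prefixed directories rather than descending to the MN-* level.

```python
from pathlib import Path


def match_prefixed_dirs(
    base_dir: Path,
    fgeor_prefix: str = "GL_FgEoR_",
    fgonly_prefix: str = "GL_FgOnly_",
) -> dict[str, tuple[Path, Path]]:
    """Map each shared suffix to its (FgEoR dir, FgOnly dir) pair."""
    fgeor: dict[str, Path] = {}
    fgonly: dict[str, Path] = {}
    for entry in base_dir.iterdir():
        if not entry.is_dir():
            continue
        if entry.name.startswith(fgeor_prefix):
            fgeor[entry.name[len(fgeor_prefix):]] = entry
        elif entry.name.startswith(fgonly_prefix):
            fgonly[entry.name[len(fgonly_prefix):]] = entry
    # Keep only suffixes present on both sides; unmatched entries are dropped.
    shared = sorted(fgeor.keys() & fgonly.keys())
    return {key: (fgeor[key], fgonly[key]) for key in shared}
```

For v5-style directories the same sketch would be called with fgeor_prefix="GSM_FgEoR_" and fgonly_prefix="GSM_FgOnly_".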

valska.evidence.interpret_bayes_factor(log_bf: float) → str

Interpret the strength of evidence given a log Bayes factor.

Parameters

log_bf :: Natural logarithm of the Bayes factor, ln(Z1 / Z2).

Returns

str: Human-readable description of evidence strength, based on commonly used (Jeffreys-like) thresholds.
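A threshold-based interpretation of this kind might look like the sketch below. The cut points 1.0, 2.5 and 5.0 on |ln B| follow one commonly quoted variant of the Jeffreys scale. They, the wording of the labels, and the function name are assumptions, and the thresholds actually used by interpret_bayes_factor may differ.

```python
def interpret_log_bf(log_bf: float) -> str:
    """Describe evidence strength from ln(Z1 / Z2) (illustrative thresholds)."""
    strength = abs(log_bf)
    if strength < 1.0:
        return "Inconclusive evidence"
    if strength < 2.5:
        label = "Weak"
    elif strength < 5.0:
        label = "Moderate"
    else:
        label = "Strong"
    # Positive ln B favours the numerator (model 1), negative the denominator.
    favoured = "model 1" if log_bf > 0 else "model 2"
    return f"{label} evidence for {favoured}"
```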

valska.evidence.run_complete_bayeseor_analysis(chain_pairs: dict[str, ChainPair], perturbation_levels: Iterable[str] | None = None, dir_prefix: Path | None = None, expected_ps: float = 214777.66068216303, create_plots: bool = False, show_detailed_results: bool = False, verbose: bool = True, show_progress: bool = True) → dict[str, Any]

Run a complete BayesEoR perturbation analysis over multiple chain pairs.

Parameters

chain_pairs :: Mapping of perturbation keys to ChainPair objects, typically created by find_chain_pairs().
perturbation_levels :: Optional iterable of perturbation keys to analyze. If None, all keys in chain_pairs are used.
dir_prefix :: Optional directory prefix for analyze_chain_pair(). If None, the common ancestor of each pair is used individually.
expected_ps :: Expected power spectrum value passed on to analyze_chain_pair().
create_plots :: If True, generate plots for each perturbation level.
show_detailed_results :: If True, print detailed numerical results per successful perturbation.
verbose :: If True, print readable progress and summary messages.
show_progress :: If True and multiple perturbation levels are analyzed, display a tqdm progress bar (if available).

Returns

dict: Contains 'results', 'summary', and 'successful_results'.
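One way to picture the 'summary' entry is as a tally of the per-pair 'validation' outcomes returned by analyze_chain_pair(). The sketch below assumes that structure. Only the 'pass' key appears in the usage example above; the helper name, the other keys, and the sample results are hypothetical.

```python
from collections import Counter
from typing import Any


def tally_validation(results: dict[str, dict[str, Any]]) -> dict[str, int]:
    """Count 'PASS' / 'FAIL' / 'ERROR' outcomes across perturbation keys."""
    counts = Counter(result["validation"] for result in results.values())
    return {
        "total": len(results),
        "pass": counts["PASS"],
        "fail": counts["FAIL"],
        "error": counts["ERROR"],
    }


# Made-up per-pair results, keyed by perturbation level.
example_results = {
    "1.0e00pp": {"perturbation": "1.0e00pp", "validation": "PASS"},
    "2.0e00pp": {"perturbation": "2.0e00pp", "validation": "FAIL"},
    "5.0e00pp": {"perturbation": "5.0e00pp", "validation": "PASS"},
}
example_summary = tally_validation(example_results)
```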