valska.evidence

Evidence evaluation module for the ValSKA project.

This module provides functions to calculate and interpret Bayes factors between models in the BayesEoR analysis of HERA beam perturbations.

Typical usage examples

>>> from pathlib import Path
>>> from valska.evidence import (
...     calculate_bayes_factor,
...     find_chain_pairs,
...     analyze_chain_pair,
...     run_complete_bayeseor_analysis,
... )
...
>>> base = Path("/path/to/BayesEoR/chains")
>>> v7_base = base / "v7d0"
>>> pairs = find_chain_pairs(v7_base)
>>> cp = pairs["1.0e00pp"]
>>> bf_result = calculate_bayes_factor(
...     cp.fgeor_root / "data-",
...     cp.fgonly_root / "data-",
... )
>>> bf_result["log_bayes_factor"]
-3.2
>>> summary = run_complete_bayeseor_analysis(
...     chain_pairs=pairs,
...     create_plots=False,
...     verbose=False,
... )
>>> summary["summary"]["pass"]
5

Functions

analyze_chain_pair(pair[, dir_prefix, ...])

Analyze a single FgEoR/FgOnly chain pair using BaNTER-style validation.

calculate_bayes_factor(chain_path_1, ...[, ...])

Calculate Bayes factor between two models given their nested-sampling chains.

find_chain_pairs(base_dir[, fgeor_prefix, ...])

Discover matched FgEoR / FgOnly chain pairs under a base directory.

interpret_bayes_factor(log_bf)

Interpret the strength of evidence given a log Bayes factor.

run_complete_bayeseor_analysis(chain_pairs)

Run a complete BayesEoR perturbation analysis over multiple chain pairs.

Classes

ChainPair(perturbation, fgeor_root, fgonly_root)

Container for a matched FgEoR / FgOnly chain pair.

class valska.evidence.ChainPair(perturbation: str, fgeor_root: Path, fgonly_root: Path)

Container for a matched FgEoR / FgOnly chain pair.

fgeor_root: Path
fgonly_root: Path
perturbation: str
valska.evidence._find_single_mn_subdir(root: Path) → Path

Find the MN-* (or similar) subdirectory under a given root.

Assumes there is exactly one subdirectory and raises an error if there are none or more than one. This keeps the logic explicit and surfaces layout issues early.
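The "exactly one subdirectory" check can be illustrated with a minimal sketch. The helper name find_single_subdir and the choice of ValueError are assumptions made for illustration; the source does not state which exception the private helper raises.

```python
import tempfile
from pathlib import Path


def find_single_subdir(root: Path) -> Path:
    """Return the only subdirectory of root (hypothetical stand-in helper)."""
    subdirs = sorted(p for p in root.iterdir() if p.is_dir())
    if len(subdirs) != 1:
        # Assumption: ValueError; the real helper may raise something else.
        raise ValueError(
            f"Expected exactly one subdirectory under {root}, found {len(subdirs)}"
        )
    return subdirs[0]


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "MN-example").mkdir()
    found_name = find_single_subdir(root).name

    # A second subdirectory makes the layout ambiguous, so the check fails.
    (root / "MN-second").mkdir()
    try:
        find_single_subdir(root)
        ambiguous_raised = False
    except ValueError:
        ambiguous_raised = True
```

Failing loudly here means a stray or missing run directory is reported at discovery time rather than as a confusing error deeper in the analysis.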

valska.evidence._normalize_perturbation_key(raw_suffix: str) → str

Normalize a perturbation suffix into a stable key.

Currently a passthrough. Keeping the normalization in one place makes it possible to convert between formats later (e.g. ‘+1e0pp’ vs ‘1.0e00pp’) if needed.
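As a sketch of what a future format conversion might look like, the hypothetical normalize_key below maps both spellings mentioned above onto the ‘1.0e00pp’ form. It is not part of valska; the regular expression and the one-decimal mantissa are assumptions.

```python
import re

# Hypothetical pattern: optional sign, mantissa, exponent, then the "pp" unit.
_SUFFIX_RE = re.compile(r"([+-]?\d+(?:\.\d+)?[eE][+-]?\d+)pp")


def normalize_key(raw_suffix: str) -> str:
    """Map e.g. '+1e0pp' onto '1.0e00pp'; pass unknown formats through."""
    match = _SUFFIX_RE.fullmatch(raw_suffix)
    if match is None:
        return raw_suffix  # same passthrough behaviour as the current helper
    value = float(match.group(1))
    # Note: ".1e" keeps one decimal of the mantissa, so finer values are rounded.
    mantissa, exponent = f"{value:.1e}".split("e")
    exp = int(exponent)
    exp_sign = "-" if exp < 0 else ""
    return f"{mantissa}e{exp_sign}{abs(exp):02d}pp"


same_key = normalize_key("+1e0pp") == normalize_key("1.0e00pp")
negative_key = normalize_key("-5e0pp")
unknown_key = normalize_key("baseline")
```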

valska.evidence.analyze_chain_pair(pair: ChainPair, dir_prefix: Path | None = None, expected_ps: float = 214777.66068216303, create_plots: bool = True, verbose: bool = True) → dict[str, Any]

Analyze a single FgEoR/FgOnly chain pair using BaNTER-style validation.

Parameters

pair :

ChainPair describing the perturbation and root directories.

dir_prefix :

Optional prefix to strip off when handing paths to DataContainer. If None, the common ancestor of both roots is used.

expected_ps :

Expected power spectrum value passed through to DataContainer.

create_plots :

If True, generate and show posterior / power spectrum plots.

verbose :

If True, print detailed log output to stdout.

Returns

dict

Result dictionary with keys:

  • 'perturbation'

  • 'plot_success'

  • 'bayes_factor_result'

  • 'validation' ('PASS', 'FAIL' or 'ERROR')
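The default for dir_prefix (the common ancestor of both roots) can be pictured with the sketch below. The paths are made up, and whether valska computes the ancestor this way internally is an assumption; posixpath.commonpath is used so the example behaves the same on any platform.

```python
import posixpath
from pathlib import PurePosixPath

# Hypothetical MN-* level roots for one perturbation.
fgeor_root = PurePosixPath("/data/chains/v7d0/GL_FgEoR_1.0e00pp/MN-run")
fgonly_root = PurePosixPath("/data/chains/v7d0/GL_FgOnly_1.0e00pp/MN-run")

# Longest shared leading path of the two roots.
common = PurePosixPath(
    posixpath.commonpath([str(fgeor_root), str(fgonly_root)])
)

# Stripping the prefix leaves the per-model relative paths.
rel_fgeor = fgeor_root.relative_to(common)
rel_fgonly = fgonly_root.relative_to(common)
```

Here common is /data/chains/v7d0, and the relative paths start at the GL_FgEoR_* and GL_FgOnly_* directories.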

valska.evidence.calculate_bayes_factor(chain_path_1: str | Path, chain_path_2: str | Path, model_name_1: str = 'Model 1', model_name_2: str = 'Model 2', verbose: bool = True) → dict[str, Any]

Calculate Bayes factor between two models given their nested-sampling chains.

The function assumes that the directories at chain_path_1 and chain_path_2 are readable by anesthetic.read_chains() and that the returned objects implement a logZ() method (as in anesthetic).

Parameters

chain_path_1 :

Path to the first model’s chain directory (numerator in Bayes factor).

chain_path_2 :

Path to the second model’s chain directory (denominator in Bayes factor).

model_name_1 :

Name of the first model for display and reporting.

model_name_2 :

Name of the second model for display and reporting.

verbose :

If True, print intermediate information (loaded evidences and resulting Bayes factor) to stdout.

Returns

dict

Dictionary containing results with keys:

  • 'model_1': str, the name of model 1.

  • 'model_2': str, the name of model 2.

  • 'log_evidence_1': float or None, log-evidence of model 1.

  • 'log_evidence_2': float or None, log-evidence of model 2.

  • 'log_bayes_factor': float or None, ln(Z1/Z2).

  • 'interpretation': str, readable interpretation.

  • 'success': bool, True if computation succeeded.

  • 'error': str or None, error message if failed.
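The relation between the two log-evidences and 'log_bayes_factor' is a plain difference, ln(Z1/Z2) = ln Z1 − ln Z2. The sketch below works through it with made-up evidence values; the function name is hypothetical and no chains are read.

```python
import math


def log_bayes_factor(log_z1: float, log_z2: float) -> float:
    """ln(Z1/Z2) from the two log-evidences."""
    return log_z1 - log_z2


# Made-up log-evidences, for illustration only.
log_z1, log_z2 = -1050.4, -1047.2

log_bf = log_bayes_factor(log_z1, log_z2)  # approximately -3.2
bayes_factor = math.exp(log_bf)            # Z1/Z2, below 1 here
odds_for_model_2 = 1.0 / bayes_factor      # Z2/Z1
```

A negative value means the data favour model 2 (the denominator); with these numbers the odds are roughly 25 to 1 in its favour.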

valska.evidence.find_chain_pairs(base_dir: Path, fgeor_prefix: str = 'GL_FgEoR_', fgonly_prefix: str = 'GL_FgOnly_', debug: bool = False) → dict[str, ChainPair]

Discover matched FgEoR / FgOnly chain pairs under a base directory.

This is meant to work with layouts like:

``base_dir / "GL_FgEoR_1.0e00pp"/MN-23-23-38-2-ffm-.../data-``
``base_dir / "GL_FgOnly_1.0e00pp"/MN-23-23-38-2-ffm-.../data-``

or v5-style directories such as:

``base_dir / "GSM_FgEoR_-5e0pp"/MN-23-23-38-2-.../data-``
``base_dir / "GSM_FgOnly_-5e0pp"/MN-23-23-38-2-.../data-``

by adjusting fgeor_prefix and fgonly_prefix.

Parameters

base_dir :

Directory containing GL_FgEoR_* and GL_FgOnly_* subdirectories (e.g. paths.chains_dir / 'v7d0'), or GSM_FgEoR_* and GSM_FgOnly_* subdirectories for v5-style layouts.

fgeor_prefix :

Prefix for FgEoR directories.

fgonly_prefix :

Prefix for FgOnly directories.

debug :

If True, print information about discovered entries and matches.

Returns

dict

Mapping from a normalized perturbation key to a ChainPair. The fgeor_root and fgonly_root paths are the MN-* level directories that directly contain the data- files.
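The prefix-matching idea can be sketched as follows. The helper match_prefixed_dirs is hypothetical and stops at the prefixed directories; unlike find_chain_pairs, it does not descend to the MN-* level or build ChainPair objects.

```python
import tempfile
from pathlib import Path


def match_prefixed_dirs(
    base_dir: Path, prefix_a: str, prefix_b: str
) -> dict[str, tuple[Path, Path]]:
    """Pair subdirectories that share the same suffix after each prefix."""

    def by_suffix(prefix: str) -> dict[str, Path]:
        return {
            p.name[len(prefix):]: p
            for p in base_dir.iterdir()
            if p.is_dir() and p.name.startswith(prefix)
        }

    first, second = by_suffix(prefix_a), by_suffix(prefix_b)
    # Only suffixes present under both prefixes form a pair.
    shared = sorted(first.keys() & second.keys())
    return {key: (first[key], second[key]) for key in shared}


# Demonstration: one matched pair plus one unmatched FgEoR directory.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    for name in ("GL_FgEoR_1.0e00pp", "GL_FgOnly_1.0e00pp", "GL_FgEoR_2.0e00pp"):
        (base / name).mkdir()
    matched = match_prefixed_dirs(base, "GL_FgEoR_", "GL_FgOnly_")
    matched_keys = list(matched)
    matched_names = [p.name for p in matched["1.0e00pp"]]
```

Passing "GSM_FgEoR_" and "GSM_FgOnly_" as the prefixes would apply the same matching to a v5-style layout.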

valska.evidence.interpret_bayes_factor(log_bf: float) → str

Interpret the strength of evidence given a log Bayes factor.

Parameters

log_bf :

Natural logarithm of the Bayes factor, ln(Z1 / Z2).

Returns

str

Readable description of evidence strength, based on commonly used (Jeffreys-like) thresholds.
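The exact thresholds and wording used by interpret_bayes_factor are not listed here. The sketch below assumes one commonly quoted Jeffreys-like scale on |ln B| (cut points at 1, 2.5 and 5); both the cut points and the returned labels are assumptions, not the module's actual output.

```python
def interpret(log_bf: float) -> str:
    """Label evidence strength from ln(Z1/Z2) using assumed thresholds."""
    strength = abs(log_bf)
    if strength < 1.0:
        return "inconclusive"
    if strength < 2.5:
        label = "weak"
    elif strength < 5.0:
        label = "moderate"
    else:
        label = "strong"
    # Positive ln B favours model 1 (numerator), negative favours model 2.
    favoured = "model 1" if log_bf > 0 else "model 2"
    return f"{label} evidence for {favoured}"


example_negative = interpret(-3.2)
example_small = interpret(0.3)
example_large = interpret(6.0)
```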

valska.evidence.run_complete_bayeseor_analysis(chain_pairs: dict[str, ChainPair], perturbation_levels: Iterable[str] | None = None, dir_prefix: Path | None = None, expected_ps: float = 214777.66068216303, create_plots: bool = False, show_detailed_results: bool = False, verbose: bool = True, show_progress: bool = True) → dict[str, Any]

Run a complete BayesEoR perturbation analysis over multiple chain pairs.

Parameters

chain_pairs :

Mapping of perturbation keys to ChainPair objects, typically created by find_chain_pairs().

perturbation_levels :

Optional iterable of perturbation keys to analyze. If None, all keys in chain_pairs are used.

dir_prefix :

Optional directory prefix for analyze_chain_pair(). If None, the common ancestor of each pair is used individually.

expected_ps :

Expected power spectrum value passed on to analyze_chain_pair().

create_plots :

If True, generate plots for each perturbation level.

show_detailed_results :

If True, print detailed numerical results per successful perturbation.

verbose :

If True, print readable progress and summary messages.

show_progress :

If True and multiple perturbation levels are analyzed, display a tqdm progress bar (if available).

Returns

dict

Dictionary with keys 'results', 'summary', and 'successful_results'.
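How per-perturbation outcomes might roll up into summary counts can be sketched as below. The per-pair dictionaries are made-up data, and the 'fail' and 'error' count keys are assumptions; only a 'pass' count appears in the usage example for this module.

```python
from collections import Counter

# Made-up per-perturbation results, keyed like the output of find_chain_pairs().
results = {
    "1.0e00pp": {"validation": "PASS"},
    "2.0e00pp": {"validation": "FAIL"},
    "5.0e00pp": {"validation": "ERROR"},
    "1.0e01pp": {"validation": "PASS"},
}

# Tally the validation status of each pair.
counts = Counter(r["validation"].lower() for r in results.values())
summary = {status: counts.get(status, 0) for status in ("pass", "fail", "error")}

# Keep only the pairs whose validation passed.
passed = {key: r for key, r in results.items() if r["validation"] == "PASS"}
```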