statspai.crossval

crossval

Cross-engine validation for StatsPAI.

sp.cross_validate runs one estimand through several independent engines (StatsPAI native, pyfixest, linearmodels, DoubleML, R's fixest via Rscript, Stata via a batch do-file) and reports whether they agree. This turns the cross-package reproducibility discipline summed up in Scott Cunningham's "estimate it two ways and check they match" into a single callable for humans and agents.

Public surface
  • cross_validate (function): the dispatcher.
  • CrossValidationResult (class): the verdict plus the per-engine table.
  • EstimandSpec, EngineEstimate, TolerancePolicy (classes): building blocks, exposed for advanced use and testing.

EngineEstimate dataclass

One engine's answer for one focal coefficient.

Every backend adapter normalises its native result into this shape so the reconciliation logic never has to special-case a library. status keeps unavailable / failed engines in the data flow rather than dropping them.

Parameters (name, type, default, then description):

  • engine (str, required): Backend label, e.g. "statspai", "pyfixest", "linearmodels", "R::fixest", "Stata::reghdfe".
  • estimand (str, required): The estimand key that was requested ("ols", "iv", "did" …).
  • term (str, default None): Name of the focal coefficient this estimate refers to.
  • coef, se (float, default None): Point estimate and standard error of term. None when the engine did not produce one (status != ok).
  • tstat (float, default None)
  • pvalue (float, default None)
  • ci_lower (float, default None)
  • ci_upper (float, default None)
  • nobs (int, default None)
  • vcov (str, default None): Variance estimator flavour actually used ("iid", "HC1", "cluster" …) so a SE mismatch can be attributed to a vcov difference rather than a genuine disagreement.
  • status (str, default STATUS_OK): One of ok / unavailable / error / skipped.
  • message (str, default None): Human-readable note (why it was unavailable, the exception text …).
  • elapsed_s (float, default None)
  • extra (dict, default dict()): Free-form backend extras (first-stage F, n_folds, full coef table …).
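The normalisation idea can be sketched as follows. This is a minimal illustration, not StatsPAI's code: EngineEstimateSketch, normalise, the shape of the raw payload, and the string values of the status constants are all assumptions made for the example, and only a subset of the documented fields is reproduced.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# Assumed string values; the page only names the statuses, not the constants.
STATUS_OK = "ok"
STATUS_UNAVAILABLE = "unavailable"


@dataclass
class EngineEstimateSketch:
    """Illustrative stand-in with a subset of the documented fields."""

    engine: str
    estimand: str
    term: Optional[str] = None
    coef: Optional[float] = None
    se: Optional[float] = None
    vcov: Optional[str] = None
    status: str = STATUS_OK
    message: Optional[str] = None
    extra: Dict[str, Any] = field(default_factory=dict)


def normalise(engine: str, estimand: str, term: str,
              raw: Optional[Dict[str, Any]]) -> EngineEstimateSketch:
    """Map a hypothetical backend payload (dict or None) to the shared shape."""
    if raw is None:
        # Keep the engine in the data flow instead of dropping it.
        return EngineEstimateSketch(
            engine, estimand, term,
            status=STATUS_UNAVAILABLE,
            message=f"{engine} is not installed",
        )
    return EngineEstimateSketch(
        engine, estimand, term,
        coef=float(raw["coef"][term]),
        se=float(raw["se"][term]),
        vcov=raw.get("vcov"),
    )


ok = normalise("pyfixest", "ols", "x",
               {"coef": {"x": 1.0}, "se": {"x": 0.5}, "vcov": "iid"})
missing = normalise("R::fixest", "ols", "x", None)
```

Because both calls return the same type, downstream reconciliation code can filter on status rather than branching on which library produced the record.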

TolerancePolicy dataclass

How close two engines should be on a focal coefficient, and why.

Attributes (name, type, then description):

  • mode (str): "exact" (judge coefficients on a relative-difference scale) or "statistical" (judge on a standard-error scale).
  • coef_rtol, coef_atol (float): Relative / absolute tolerance on the point estimate (exact mode).
  • se_rtol (float): Relative tolerance on the standard error (exact mode). SEs are looser than coefficients because dof corrections and default vcov flavours legitimately differ across libraries.
  • se_band (float): statistical mode: two estimates agree if |Δcoef| <= se_band * max(se).
  • rationale (str): Plain-language justification, surfaced in the report so a reader can see why a given tolerance was applied.
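A small sketch of how the two modes could be applied to a pair of estimates. The function name estimates_agree and every default tolerance value are hypothetical. The way relative and absolute tolerances combine here (via math.isclose) is an assumption, since this page does not say how StatsPAI combines them. Only the statistical-mode rule is taken directly from the description above.

```python
import math


def estimates_agree(coef_a, se_a, coef_b, se_b, *, mode="exact",
                    coef_rtol=1e-6, coef_atol=1e-8,
                    se_rtol=1e-3, se_band=1.0):
    """Return True when two (coef, se) pairs agree under the chosen mode.

    Hypothetical helper; default values are placeholders, not StatsPAI's.
    """
    if mode == "statistical":
        # Documented rule: |delta coef| <= se_band * max(se).
        return abs(coef_a - coef_b) <= se_band * max(se_a, se_b)
    if mode == "exact":
        # Assumed combination rule: tight check on the coefficient,
        # looser relative check on the standard error.
        coef_ok = math.isclose(coef_a, coef_b,
                               rel_tol=coef_rtol, abs_tol=coef_atol)
        se_ok = math.isclose(se_a, se_b, rel_tol=se_rtol)
        return coef_ok and se_ok
    raise ValueError(f"unknown mode: {mode!r}")


# Two deterministic engines that differ only by floating-point noise.
exact_match = estimates_agree(0.5000001, 0.10, 0.5, 0.10005)
# Same pair of point estimates judged on the two different scales.
loose_match = estimates_agree(0.50, 0.10, 0.58, 0.12, mode="statistical")
strict_match = estimates_agree(0.50, 0.10, 0.58, 0.12, mode="exact")
```

The last two calls show why the mode matters: a gap of 0.08 is well inside one standard error, so it passes in statistical mode, yet it is far too large to pass in exact mode.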

CrossValidationResult

Outcome of cross-validating one estimand across several engines.

Attributes (name, type, then description):

  • estimand (str)
  • term (str): Focal coefficient that was reconciled.
  • estimates (list of EngineEstimate): Every engine that was requested (including unavailable / errored ones).
  • agreement (AgreementReport): Verdict + spread diagnostics.
  • spec (dict): Serialised EstimandSpec (what was fit).
  • provenance (dict): Engine versions / environment captured for reproducibility.
  • degradations (list of dict): Structured records for every engine that could not contribute, mirrored from statspai.workflow.record_degradation.

Examples:

>>> import pandas as pd
>>> import statspai as sp
>>> df = pd.DataFrame(
...     {"y": [1.0, 2.0, 3.0, 4.0], "x": [0.0, 1.0, 0.0, 1.0]}
... )
>>> cv = sp.cross_validate(
...     df, "ols", formula="y ~ x", treatment="x", engines=["statspai"]
... )
>>> isinstance(cv, sp.CrossValidationResult)
True
>>> cv.term
'x'

estimates_table property

estimates_table: DataFrame

One row per requested engine (ok and not-ok alike).

engine_status_counts property

engine_status_counts: Dict[str, int]

Count engines by status, including unavailable/error entries.
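As a rough illustration of the kind of tally this property describes, the sketch below counts statuses over plain dictionaries. The helper name status_counts and the dictionary records are stand-ins for the example; the actual property operates on the result's own estimates.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def status_counts(estimates: Iterable[Mapping[str, str]]) -> Dict[str, int]:
    """Tally records by their 'status' key (hypothetical helper)."""
    return dict(Counter(est["status"] for est in estimates))


# Stand-in records; failed and unavailable engines are counted too.
records = [
    {"engine": "statspai", "status": "ok"},
    {"engine": "pyfixest", "status": "ok"},
    {"engine": "R::fixest", "status": "unavailable"},
    {"engine": "Stata::reghdfe", "status": "error"},
]
counts = status_counts(records)
```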

can_claim_cross_engine_agreement property

can_claim_cross_engine_agreement: bool

Whether it is honest to report cross-engine agreement.

ok_table

ok_table() -> DataFrame

Only the engines that produced an estimate.

plot

plot(ax: Any = None, **kwargs: Any) -> Any

Forest plot of the engines' estimates with shared-range shading.

EstimandSpec dataclass

Engine-neutral description of one estimand.

Either formula (fixest one- to three-part syntax) or the structured fields (y + treatment + covariates …) must pin down the model. The constructor helpers from_kwargs and from_result fill both representations so downstream adapters can pick whichever they prefer.

Parameters (name, type, default, then description):

  • estimand (str, required): Canonical estimand key ("ols", "feols", "iv", "did", "poisson", "dml" …).
  • data (DataFrame, required)
  • formula (str, default None): fixest-style: "y ~ x1 + x2" (OLS), "y ~ x | fe1 + fe2" (FE), "y ~ x | fe | endog ~ z1 + z2" (IV with FE).
  • y, treatment (str, default None): Outcome and focal regressor (the coefficient cross-validation reconciles by default).
  • covariates (list of str, default list())
  • fixed_effects (list of str, default list())
  • endog (list of str, default list()): Endogenous regressors (IV).
  • instruments (list of str, default list())
  • cluster (list of str, default list())
  • weights (str, default None)
  • vcov (str, default None): Requested variance estimator ("iid", "HC1", "cluster" …).
  • term (str, default None): Focal coefficient to reconcile. Defaults to treatment (or the first endogenous regressor for IV).
  • extra (dict, default dict()): Estimand-specific extras forwarded verbatim (e.g. DiD time / unit / gname columns).
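To make the one- to three-part formula layout concrete, here is a minimal sketch that splits a fixest-style string on its "|" separators. split_fixest_formula is a hypothetical helper written for this page, not part of StatsPAI, and it ignores richer fixest features such as interacted fixed effects.

```python
from typing import Dict, List, Optional, Union

Parts = Dict[str, Union[str, List[str], Optional[str]]]


def split_fixest_formula(formula: str) -> Parts:
    """Split 'main | fixed effects | iv' into labelled pieces (sketch)."""
    parts = [p.strip() for p in formula.split("|")]
    if not 1 <= len(parts) <= 3 or "~" not in parts[0]:
        raise ValueError(f"not a 1-3 part fixest formula: {formula!r}")
    out: Parts = {"main": parts[0], "fixed_effects": [], "iv": None}
    for part in parts[1:]:
        if "~" in part:
            # A tilde marks the IV part: endogenous ~ instruments.
            out["iv"] = part
        else:
            out["fixed_effects"] = [t.strip() for t in part.split("+")]
    return out


ols = split_fixest_formula("y ~ x1 + x2")
fe = split_fixest_formula("y ~ x | fe1 + fe2")
iv = split_fixest_formula("y ~ x | fe | endog ~ z1 + z2")
```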

from_result classmethod

from_result(result: Any) -> 'EstimandSpec'

Best-effort recovery of a spec from a fitted StatsPAI result.

Reads the metadata StatsPAI result objects commonly carry (estimand / formula / data / treatment column). Raises a clear error when the result does not expose enough to re-run it elsewhere — rather than guessing and silently cross-validating the wrong model.

focal_term

focal_term() -> str

Name of the coefficient cross-validation reconciles.
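The default described for term (treatment, or the first endogenous regressor for IV) could be resolved along these lines. resolve_focal_term is a hypothetical function, and the precedence order shown (explicit term, then treatment, then first endogenous regressor) is one reading of that description rather than confirmed behaviour.

```python
from typing import Optional, Sequence


def resolve_focal_term(term: Optional[str] = None,
                       treatment: Optional[str] = None,
                       endog: Sequence[str] = ()) -> str:
    """Pick the coefficient to reconcile (assumed precedence order)."""
    if term:
        return term          # an explicit choice always wins
    if treatment:
        return treatment     # documented default
    if endog:
        return endog[0]      # documented fallback for IV
    raise ValueError("cannot infer a focal term: set term or treatment")


explicit = resolve_focal_term(term="x2", treatment="x1")
default = resolve_focal_term(treatment="x1")
iv_default = resolve_focal_term(endog=["price", "quantity"])
```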

build_formula

build_formula() -> str

Construct a fixest-style formula from the structured fields.
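A minimal sketch of how such a formula might be assembled from the structured fields. build_fixest_formula is a hypothetical free function, not the method itself. The handling of IV without fixed effects (a two-part "y ~ x | endog ~ z" string) and the use of "1" for an empty right-hand side are assumptions based on fixest conventions rather than on this page.

```python
from typing import Optional, Sequence


def build_fixest_formula(y: str,
                         treatment: Optional[str] = None,
                         covariates: Sequence[str] = (),
                         fixed_effects: Sequence[str] = (),
                         endog: Sequence[str] = (),
                         instruments: Sequence[str] = ()) -> str:
    """Assemble a 1-3 part fixest-style formula (illustrative only)."""
    candidates = ([treatment] if treatment else []) + list(covariates)
    # Endogenous regressors belong in the IV part, not the main part.
    exog = [v for v in candidates if v not in endog]
    parts = [f"{y} ~ {' + '.join(exog) if exog else '1'}"]
    if fixed_effects:
        parts.append(" + ".join(fixed_effects))
    if endog:
        if not instruments:
            raise ValueError("endogenous regressors need instruments")
        parts.append(f"{' + '.join(endog)} ~ {' + '.join(instruments)}")
    return " | ".join(parts)


ols = build_fixest_formula("y", "x1", covariates=["x2"])
fe = build_fixest_formula("y", "x", fixed_effects=["fe1", "fe2"])
iv = build_fixest_formula("y", covariates=["x"], fixed_effects=["fe"],
                          endog=["endog"], instruments=["z1", "z2"])
```

The three calls reproduce the OLS, FE, and IV-with-FE strings listed for the formula parameter.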