statspai.crossval
Cross-engine validation for StatsPAI.
`sp.cross_validate` runs one estimand through several independent engines
(StatsPAI native, pyfixest, linearmodels, DoubleML, R's fixest via Rscript,
Stata via batch do) and reports whether they agree. It turns Scott Cunningham's
cross-package reproducibility discipline, "estimate it two ways and check they
match", into a single callable for humans and agents.
Public surface
- `cross_validate`: the dispatcher.
- `CrossValidationResult`: the verdict + per-engine table.
- `EstimandSpec`, `EngineEstimate`, `TolerancePolicy`: building blocks, exposed for advanced use and testing.
EngineEstimate (dataclass)
One engine's answer for one focal coefficient.
Every backend adapter normalises its native result into this shape so the
reconciliation logic never has to special-case a library. status keeps
unavailable / failed engines in the data flow rather than dropping them.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `engine` | `str` | Backend label, e.g. … | required |
| `estimand` | `str` | The estimand key that was requested (…) | required |
| `term` | `str` | Name of the focal coefficient this estimate refers to. | `None` |
| `coef` | `float` | Point estimate and standard error of … | `None` |
| `se` | `float` | Point estimate and standard error of … | `None` |
| `tstat` | `float` | | `None` |
| `pvalue` | `float` | | `None` |
| `ci_lower` | `float` | | `None` |
| `ci_upper` | `float` | | `None` |
| `nobs` | `int` | | `None` |
| `vcov` | `str` | Variance estimator flavour actually used (…) | `None` |
| `status` | `str` | One of … | `STATUS_OK` |
| `message` | `str` | Human-readable note (why it was unavailable, the exception text …). | `None` |
| `elapsed_s` | `float` | | `None` |
| `extra` | `dict` | Free-form backend extras (first-stage F, n_folds, full coef table …). | `dict()` |
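The normalisation described above can be sketched roughly as follows. This is an illustration only, not StatsPAI's implementation: `EstimateSketch`, `run_engine`, the engine names and the status strings `"ok"`, `"unavailable"` and `"error"` are all hypothetical stand-ins. The class mirrors only a subset of the documented fields.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, Tuple

# Hypothetical status labels; the library's actual constants may differ.
STATUS_OK = "ok"
STATUS_UNAVAILABLE = "unavailable"
STATUS_ERROR = "error"


@dataclass
class EstimateSketch:
    """Illustrative stand-in for a subset of the documented fields."""
    engine: str
    estimand: str
    term: Optional[str] = None
    coef: Optional[float] = None
    se: Optional[float] = None
    status: str = STATUS_OK
    message: Optional[str] = None
    extra: dict = field(default_factory=dict)


def run_engine(engine: str, estimand: str, term: str,
               fit: Callable[[], Tuple[float, float]]) -> EstimateSketch:
    """Call a backend and always return a record instead of raising."""
    try:
        coef, se = fit()
    except ImportError as exc:  # backend package is not installed
        return EstimateSketch(engine, estimand, term,
                              status=STATUS_UNAVAILABLE, message=str(exc))
    except Exception as exc:  # backend ran but failed
        return EstimateSketch(engine, estimand, term,
                              status=STATUS_ERROR, message=str(exc))
    return EstimateSketch(engine, estimand, term, coef=coef, se=se)


def missing_backend() -> Tuple[float, float]:
    raise ImportError("No module named 'some_backend'")


records = [
    run_engine("engine_a", "ols", "x", lambda: (0.50, 0.10)),
    run_engine("engine_b", "ols", "x", missing_backend),
]
```

Both engines end up in `records`; the second carries a non-ok status and a message rather than disappearing, which is the behaviour the `status` field is meant to preserve.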
TolerancePolicy (dataclass)
How close two engines should be on a focal coefficient, and why.
Attributes:
| Name | Type | Description |
|---|---|---|
| `mode` | `str` | |
| `coef_rtol`, `coef_atol` | `float` | Relative / absolute tolerance on the point estimate (…) |
| `se_rtol` | `float` | Relative tolerance on the standard error (…) |
| `se_band` | `float` | |
| `rationale` | `str` | Plain-language justification, surfaced in the report so a reader can see why a given tolerance was applied. |
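As a rough illustration of how a relative and an absolute tolerance are commonly combined, the sketch below uses the same asymmetric form as `numpy.isclose`. The exact comparison StatsPAI applies is not stated here, so `within_tolerance` and the numeric tolerances are assumptions chosen for the example.

```python
def within_tolerance(value: float, reference: float,
                     rtol: float, atol: float = 0.0) -> bool:
    """True when |value - reference| <= atol + rtol * |reference|."""
    return abs(value - reference) <= atol + rtol * abs(reference)


# Point estimates 0.5003 vs 0.5000 with a 0.1% relative tolerance.
coef_ok = within_tolerance(0.5003, 0.5000, rtol=1e-3)

# Standard errors 0.112 vs 0.100 with a 5% relative tolerance.
se_ok = within_tolerance(0.112, 0.100, rtol=0.05)
```

Here the coefficients fall inside the band while the standard errors do not, a pattern that can arise when two engines agree on the point estimate but apply different small-sample corrections to the variance.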
CrossValidationResult
Outcome of cross-validating one estimand across several engines.
Attributes:
| Name | Type | Description |
|---|---|---|
| `estimand` | `str` | |
| `term` | `str` | Focal coefficient that was reconciled. |
| `estimates` | list of `EngineEstimate` | Every engine that was requested (including unavailable / errored ones). |
| `agreement` | `AgreementReport` | Verdict + spread diagnostics. |
| `spec` | `dict` | Serialised :class: … |
| `provenance` | `dict` | Engine versions / environment captured for reproducibility. |
| `degradations` | list of `dict` | Structured records for every engine that could not contribute, mirrored from :func: … |
Examples:
```python
>>> import pandas as pd
>>> import statspai as sp
>>> df = pd.DataFrame(
...     {"y": [1.0, 2.0, 3.0, 4.0], "x": [0.0, 1.0, 0.0, 1.0]}
... )
>>> cv = sp.cross_validate(
...     df, "ols", formula="y ~ x", treatment="x", engines=["statspai"]
... )
>>> isinstance(cv, sp.CrossValidationResult)
True
>>> cv.term
'x'
```
estimates_table (property)
One row per requested engine (ok and not-ok alike).
engine_status_counts (property)
Count engines by status, including unavailable/error entries.
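A count of this kind can be sketched with `collections.Counter`. The rows, engine names and status strings below are made up for illustration and do not reflect the library's internal representation.

```python
from collections import Counter

# Hypothetical per-engine rows; non-ok engines stay in the list.
estimates = [
    {"engine": "engine_a", "status": "ok", "coef": 0.50},
    {"engine": "engine_b", "status": "ok", "coef": 0.50},
    {"engine": "engine_c", "status": "unavailable", "coef": None},
]

# Tally every requested engine by its status.
status_counts = Counter(row["status"] for row in estimates)
```

Because unavailable engines are counted rather than filtered out, the tally always sums to the number of engines requested.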
can_claim_cross_engine_agreement (property)
Whether it is honest to report cross-engine agreement.
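The rule behind this property is not spelled out here. One plausible reading, shown purely as an assumption, is that agreement can only be claimed when at least two engines actually produced an estimate and the verdict on those estimates is positive. The function name, the `"ok"` status string and the `min_ok` threshold are hypothetical.

```python
from typing import Sequence


def can_claim_agreement(statuses: Sequence[str], verdict_agrees: bool,
                        min_ok: int = 2) -> bool:
    """Assumed rule: need a positive verdict AND enough engines that ran."""
    n_ok = sum(1 for status in statuses if status == "ok")
    return verdict_agrees and n_ok >= min_ok


one_engine = can_claim_agreement(["ok", "unavailable"], verdict_agrees=True)
two_engines = can_claim_agreement(["ok", "ok"], verdict_agrees=True)
```

Under this assumed rule a run in which only one engine succeeded cannot claim cross-engine agreement, since there is nothing independent to compare against.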
plot
Forest plot of the engines' estimates with shared-range shading.
EstimandSpec (dataclass)
Engine-neutral description of one estimand.
Either `formula` (fixest 1-3 part syntax) or the structured fields
(`y` + `treatment` + `covariates` …) must pin down the model. The
constructor helpers `from_kwargs` and `from_result` fill both
representations so downstream adapters can pick whichever they prefer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `estimand` | `str` | Canonical estimand key (…) | required |
| `data` | `DataFrame` | | required |
| `formula` | `str` | fixest-style: … | `None` |
| `y` | `str` | Outcome and focal regressor (the coefficient cross-validation reconciles by default). | `None` |
| `treatment` | `str` | Outcome and focal regressor (the coefficient cross-validation reconciles by default). | `None` |
| `covariates` | list of `str` | | `list()` |
| `fixed_effects` | list of `str` | | `list()` |
| `endog` | list of `str` | Endogenous regressors (IV). | `list()` |
| `instruments` | list of `str` | | `list()` |
| `cluster` | list of `str` | | `list()` |
| `weights` | `str` | | `None` |
| `vcov` | `str` | Requested variance estimator (…) | `None` |
| `term` | `str` | Focal coefficient to reconcile. Defaults to … | `None` |
| `extra` | `dict` | Estimand-specific extras forwarded verbatim (e.g. DiD …) | `dict()` |
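To show how a fixest-style formula relates to the structured fields, here is a minimal sketch that splits a one-to-three-part formula on `|`. It is not StatsPAI's parser: `split_formula` is a hypothetical helper, the variable names are made up, and it handles only plain `+`-separated terms (no interactions or function calls).

```python
def split_formula(formula: str) -> dict:
    """Split 'y ~ x | fe | endog ~ inst' into its structured pieces."""
    parts = [part.strip() for part in formula.split("|")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"expected 1 to 3 parts, got {len(parts)}")
    if "~" not in parts[0]:
        raise ValueError("first part must look like 'y ~ x'")

    lhs, rhs = parts[0].split("~", 1)
    out = {
        "y": lhs.strip(),
        "rhs": [term.strip() for term in rhs.split("+")],
        "fixed_effects": [],
        "endog": [],
        "instruments": [],
    }
    for part in parts[1:]:
        if "~" in part:  # an IV part: endogenous ~ instruments
            endog, inst = part.split("~", 1)
            out["endog"] = [term.strip() for term in endog.split("+")]
            out["instruments"] = [term.strip() for term in inst.split("+")]
        else:  # a fixed-effects part
            out["fixed_effects"] = [term.strip() for term in part.split("+")]
    return out


parsed = split_formula("y ~ x + w | firm + year | d ~ z")
```

In this example `parsed` holds the outcome `y`, regressors `x` and `w`, fixed effects `firm` and `year`, the endogenous regressor `d` and the instrument `z`, which correspond to the separate fields in the table above.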
from_result (classmethod)
Best-effort recovery of a spec from a fitted StatsPAI result.
Reads the metadata StatsPAI result objects commonly carry
(estimand / formula / data / treatment column). Raises a
clear error when the result does not expose enough to re-run it
elsewhere — rather than guessing and silently cross-validating the
wrong model.
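The fail-loudly behaviour described above might look something like the following sketch. The attribute names in `REQUIRED` and the `recover_spec` helper are assumptions for illustration; the real method may read different metadata and raise a different exception type.

```python
from types import SimpleNamespace

# Hypothetical attribute names a fitted result might expose.
REQUIRED = ("estimand", "formula", "data", "treatment")


def recover_spec(result) -> dict:
    """Collect the metadata needed to re-run a model, or refuse clearly."""
    found = {name: getattr(result, name, None) for name in REQUIRED}
    missing = [name for name, value in found.items() if value is None]
    if missing:
        raise ValueError(
            "cannot rebuild a spec; result lacks: " + ", ".join(missing)
        )
    return found


# A result carrying everything needed.
complete = SimpleNamespace(
    estimand="ols", formula="y ~ x", data=[(1.0, 0.0)], treatment="x"
)
spec = recover_spec(complete)

# A result that only kept its formula is rejected rather than guessed at.
try:
    recover_spec(SimpleNamespace(formula="y ~ x"))
except ValueError as exc:
    error_text = str(exc)
```

Listing the missing pieces in the error message tells the caller exactly what to supply by hand, which is preferable to cross-validating a model that was reconstructed from defaults.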