StatsPAI¶
Validation-tiered Python workflows for causal inference and applied
econometrics. One import statspai as sp exposes 1,000+ registered
functions across 81 submodules (live count: python
scripts/registry_stats.py) spanning classical regression, staggered
DiD, regression discontinuity,
synthetic control, decomposition, stochastic frontier, multilevel /
mixed-effects, modern ML causal inference, the full three-school
(Econometrics / Epidemiology / ML) toolkit, 2025-2026 research-frontier
modules (bridging theorems, fairness, surrogates, PCMCI, TMLE survival,
etc.), and manuscript/reporting output in Word / Excel / LaTeX / HTML.
Current release: v1.16.0 (2026-05-29) — correctness fixes (
sp.qreg√n standard-error fix and ansp.xtabondArellano–Bond GMM rebuild, both ⚠️ correctness — re-run affected analyses) plus a Track A cross-language parity expansion from 36 to 51 R-aligned modules (Stata reference for 43 of the R-joined rows, plus one Py-Stata-onlyxtabondmigration check). See the changelog for detail.
import statspai as sp
# One-call DiD pipeline with sensitivity + export
rpt = sp.cs_report(data, y='y', g='g', t='t', i='id',
n_boot=500, random_state=0,
save_to='~/study/cs_v1')
What's inside¶
Release highlights (v0.9.17 → v1.5.0)¶
| Release | Focus | Headline |
|---|---|---|
| v1.5.0 | Interference / Conformal / Mendelian family consolidation | Three family guides (interference_family, conformal_family, mendelian_family) covering all 36 functions; three unified dispatchers sp.mr(method=...) / sp.conformal(kind=...) / sp.interference(design=...) with 91 aliases in total; two silent-wrong-numbers fixes — mr_egger slope t(n−2) parity with mr_pleiotropy_egger (anti-conservative CIs at small n_snps before fix) and mr_presso MC p-value floor at 1/(B+1) (no more p = 0). Breaking: sp.mr is now a dispatcher function, not a module alias. Registry coverage fixes for 5 previously-unregistered family functions. |
| v1.4.2 | Correctness patches + Proximal / QTE / Causal-RL family guides | sp.dml_model_averaging √n SE scaling bug (CIs were √n × too wide) + sp.gardner_did event-study reference-category contamination (pre-trend bias ~0.3). Three family guides. No breaking changes. |
| v1.4.1 | v3-frontier Sprint 3 (AKM SE + Claude thinking + test suites + docs) | sp.shift_share_political_panel(cluster='shock') — panel-extended Adão-Kolesár-Morales (2019) shock-cluster variance (Park-Xu 2026 §4.2); sp.causal_llm.anthropic_client(thinking_budget=N) — Claude 4.5 / Opus 4.7 extended-thinking API; 10-check assimilation parity suite + 11-test MAS integration suite with 3 Claude thinking block-splitter tests; two new MkDocs guides (shift_share_political_panel, causal_mas). Strictly additive over v1.4.0. |
| v1.4.0 | v3-frontier Sprint 2 (extensions + LLM SDK + docs) | shift_share_political_panel (Park-Xu 2026 multi-period); real LLM adapters openai_client / anthropic_client / echo_client for Causal MAS; particle_filter backend for assimilative_causal (non-Gaussian / nonlinear); three new MkDocs guides (synth_experimental, harvest_did, assimilative_ci); 20 unused-import cleanups; CausalForest parity-test de-flake. |
| v1.3.0 | v3-frontier sprint (Sprint 1 of 知识地图 v3) | 11 frontier methods: Abadie-Zhao inverse synthetic experimental design, CJM RBC bootstrap for rdrobust, Kwak-Pleasants evidence-without-injustice fairness test, JAMA/BMJ TARGET manuscript renderer, Borusyak harvest-DiD, Zorzetto ordinal / factor-exposure BCF, multi-agent causal discovery (causal_mas), Park-Xu political shift-share IV, state-space causal_kalman. 35 new tests, 869 registered functions, zero regressions. tabulate promoted to core dep. |
| v1.0.1 | Post-review correctness + NEEDS_VERIFICATION closeout | All Critical / High / Medium findings from the independent code-review-expert pass on v1.0 frontier modules fixed and pinned by regression tests. Abadie κ-weighted complier QTE now implemented for beyond_average_late; bridge.surrogate_pci path B now a genuine dual-path (arm-specific counterfactual bridge), not OLS-tautology. 2 706+ tests passing. |
| v1.0.0 | Research-frontier capstone | sp.bridge (6 bridging theorems), sp.fairness, sp.surrogate, mr_multivariable/mr_mediation/mr_bma, PCMCI/LPCMCI/DYNOTEARS, conformal frontiers (debiased/density/fair/multi-DP), proximal frontiers, sequential SDID, BCF longitudinal, LTMLE survival, ML bounds, JAMA/BMJ 2025 TARGET Statement 21-item reporting checklist. |
| v0.9.17 | Three-school completion | sp.epi (OR/RR/MH/standardization/Bradford-Hill/ROC/kappa), sp.longitudinal (MSM/g-formula/IPW unified + safe regime DSL), sp.question (estimand-first DSL), full MR diagnostic suite, DAG recommend_estimator(), unified result.sensitivity(), preregister() + load_preregister(). |
| v0.9.3 | Frontier + Multilevel + GLMM + Trinity | sp.frontier / sp.xtfrontier full Stata/R parity; sp.zisf, sp.lcsf, sp.malmquist; sp.mixed lme4-grade; GLMMs with AGHQ; sp.dml(model='pliv'), sp.mixlogit, sp.ivqreg; sp.verify posterior verification. |
| v0.9.2 | Decomposition | 18 methods under sp.decompose(method=...). |
| v0.9.1 | Regression discontinuity | 18+ estimators across 14 modules. |
| v0.9.0 | Synthetic control | 20 SCM estimators + 6 inference strategies. |
Methodological coverage¶
Regression & panel. OLS / IV / panel / GLM; fixed-effect high-
dimensional estimation; GMM; quantile regression; instrumental-variable
quantile regression (sp.ivqreg); mixed logit (sp.mixlogit).
Difference-in-differences (10+ variants).
sp.callaway_santanna (DR/IPW/REG), sp.aggte with Mammen uniform
bands, sp.sun_abraham, sp.bjs (Borusyak-Jaravel-Spiess imputation),
sp.dcdh (de Chaisemartin-D'Haultfoeuille), sp.etwfe,
sp.goodman_bacon; sensitivity via sp.honest_did, sp.breakdown_m;
one-call sp.cs_report with Markdown / LaTeX / Excel export.
Regression discontinuity (18+ estimators).
sp.rdrobust (CCT sharp/fuzzy/kink with bias-corrected robust CI),
sp.rd2d (2D/boundary), sp.rkd, sp.rdit, multi-cutoff and
multi-score designs, sp.rdhonest (Armstrong-Kolesar), local
randomization (sp.rdrandinf, sp.rdwinselect, sp.rdsensitivity),
sp.cjm_density, ML-based CATE (sp.rd_forest, sp.rd_boost,
sp.rd_lasso), Angrist-Rokkanen extrapolation, sp.rdpower,
sp.rdsampsi, bundled sp.rdsummary dashboard.
Synthetic control (20 estimators).
Classical SCM, SDID, Augmented SCM (ASCM), Bayesian SCM (MCMC), BSTS
and CausalImpact (Kalman smoother), Penalized SCM (Abadie-L'Hour),
Forward-DID, cluster SCM, sparse (LASSO) SCM, kernel and kernel-ridge
SCM, staggered synthetic control, multi-outcome SCM; research workflow:
sp.synth_compare, sp.synth_recommend, sp.synth_power,
sp.synth_mde, sp.synth_sensitivity, sp.synth_report.
Decomposition analysis (18 methods).
Mean: sp.oaxaca (5 reference coefficients), sp.gelbach,
sp.fairlie, sp.bauer_sinning, sp.yun_nonlinear.
Distributional: sp.rifreg, sp.ffl_decompose, sp.dfl_decompose,
sp.machado_mata, sp.melly_decompose, sp.cfm_decompose.
Inequality: sp.subgroup_decompose, sp.shapley_inequality,
sp.source_decompose.
Demographic: sp.kitagawa_decompose, sp.das_gupta.
Causal: sp.gap_closing, sp.mediation_decompose,
sp.disparity_decompose. Unified entry: sp.decompose(method=…).
Stochastic frontier (v0.9.3).
sp.frontier cross-sectional with half-normal / exponential /
truncated-normal, heteroskedastic usigma / vsigma, Battese-Coelli
(1995) determinants emean, Battese-Coelli (1988) TE and JLMS,
Kodde-Palm LR mixed-\(\bar\chi^2\) test, bootstrap unit-efficiency CI.
sp.xtfrontier panel with Pitt-Lee (1981), BC92 time-decay, BC95, Greene
(2005) TFE/TRE with Dhaene-Jochmans (2015) jackknife. sp.zisf,
sp.lcsf, sp.malmquist (M = EC × TC), sp.translog_design.
Multilevel / mixed-effects (v0.9.3).
sp.mixed linear mixed models with unstructured G default, three-level
nested, BLUP posterior SEs, Nakagawa-Schielzeth \(R^2\). GLMMs
(sp.melogit, sp.mepoisson, sp.meglm, sp.megamma, sp.menbreg,
sp.meologit) via Laplace or adaptive Gauss-Hermite quadrature
(nAGQ>1 matches Stata intpoints() and R lme4::glmer).
sp.icc with delta-method CI; sp.lrtest with Self-Liang boundary
correction.
Modern ML causal.
Double/debiased ML (sp.dml with PLR / IRM / PLIV); causal forests;
meta-learners (S / T / X / R / DR); TMLE and Super Learner; neural
causal (TARNet, CFRNet, DragonNet); causal discovery (NOTEARS, PC,
LiNGAM, GES); policy trees; Bayesian causal forests; matrix
completion; conformal causal inference; dose-response; dynamic-
treatment regimes; interference / spillover.
Spatial, time-series, survival, survey, bunching, Mendelian. Spatial econometrics (weights, ESDA, ML/GMM, GWR/MGWR, spatial panel); time-series (ARIMA, VAR, BVAR, GARCH, cointegration, local projections, structural break); survival (Cox, AFT, frailty); survey calibration and complex-survey regression; bunching; Mendelian randomization.
Sensitivity analysis.
Oster bounds; sensemakr; E-values; Rosenbaum bounds; Manski bounds;
sp.spec_curve() specification curve analysis;
sp.robustness_report() one-call battery.
Smart Workflow¶
# Recommend estimators + run posterior verification
rec = sp.recommend(df, outcome='y', treatment='d', verify=True)
rec.summary() # ranked estimators with verify_score
rec.plot('verify_radar') # visual stability check
Agent-facing API¶
Registered functions are discoverable programmatically:
sp.list_functions(category='did') # enumerate methods
sp.describe_function('rdrobust') # natural-language description
sp.function_schema('dml') # JSON schema: args, types, returns
Installation¶
pip install statspai # core
pip install 'statspai[plotting]' # matplotlib + seaborn
pip install 'statspai[fixest]' # pyfixest HDFE
pip install 'statspai[deepiv]' # PyTorch (Deep IV, TARNet)
pip install 'statspai[text]' # sentence-transformers for sbert
Citation¶
If you use StatsPAI in research, please cite the underlying papers
implemented by each estimator — sp.citation() returns the package entry, and
many result objects expose a .cite() method for the estimator-level
reference — together with this package:
@software{wang2026statspai,
author = {Wang, Biaoyue and Rozelle, Scott},
title = {StatsPAI: Validation-Tiered Causal Inference and
Econometrics Workflows for Python},
year = {2026},
version = {1.16.0},
url = {https://github.com/brycewang-stanford/StatsPAI}
}
Further reading¶
- Changelog — release history, including the critical frontier correctness fix in v0.9.3.
- Choosing a DID estimator — how to pick between TWFE / CS / Sun-Abraham / BJS / multiple-groups DID.
- Callaway–Sant'Anna staggered DID —
end-to-end tutorial with
cs_report()and honest sensitivity. - Synth guide — synthetic control with inference and research workflow.
- GitHub — source, issues, and API reference.