statspai.question¶
question ¶
Estimand-first causal question DSL (sp.causal_question).
The article emphasizes "causal question precedes statistical model" as the common foundation of all three causal-inference schools: econometrics' identification, epidemiology's target trial protocol, and ML's estimand-aware learning.
This module lets a user declare a causal question in one place, then automatically:
- Identify the appropriate research design (IV / DiD / RD / backdoor).
- Suggest the right StatsPAI estimator.
- Run the analysis and attach diagnostics + sensitivity.
- Produce a reproducible Methods paragraph.
>>> import statspai as sp
>>> q = sp.causal_question(
...     treatment="minimum_wage_hike",
...     outcome="employment",
...     estimand="ATT",
...     design="policy_shock",
...     data=df,
...     time_structure="panel",
...     covariates=["industry", "skill"],
... )
>>> q.identify()
>>> r = q.estimate()
>>> q.report()
CausalQuestion dataclass ¶
Pre-registered causal question declaration.
Fields map directly onto the Target Trial Protocol (Hernán 2016) and the "PICOTS + identification" rubric the article describes.
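To make the field-to-protocol mapping concrete, here is a minimal sketch. `QuestionSketch` and its `protocol()` method are hypothetical stand-ins, not the library class; the field names and defaults are copied from the `causal_question` signature documented on this page, and the grouping into protocol headings is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QuestionSketch:
    """Hypothetical stand-in for CausalQuestion; not the library class."""

    treatment: str
    outcome: str
    population: str = ""
    estimand: str = "ATE"
    design: str = "auto"
    time_structure: str = "cross_section"
    covariates: List[str] = field(default_factory=list)
    notes: str = ""

    def protocol(self) -> dict:
        # Assumed grouping of the declared fields under protocol-style headings.
        return {
            "population": self.population,
            "intervention": self.treatment,
            "outcome": self.outcome,
            "estimand": self.estimand,
            "identification": self.design,
        }


q = QuestionSketch(
    treatment="minimum_wage_hike",
    outcome="employment",
    estimand="ATT",
    design="did",
    time_structure="panel",
    covariates=["industry", "skill"],
)
print(q.protocol())
```

Declaring everything in one frozen-in-advance record is what makes the question usable as a pre-registration artifact: the estimand and design are fixed before any estimate is seen.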
save ¶
Save the question to a pre-registration file.
See :func:`statspai.question.preregister.preregister` for details.
load classmethod ¶
Load a CausalQuestion from a preregistration file.
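The save/load pair amounts to a round trip through a file. The sketch below shows that idea with the standard library only. `PreregSketch` is a hypothetical class, and JSON is an assumed on-disk format chosen for the illustration; this page does not state which format the real pre-registration file uses.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class PreregSketch:
    """Hypothetical record; the real file format is not documented here."""

    treatment: str
    outcome: str
    estimand: str = "ATE"
    design: str = "auto"
    notes: str = ""

    def save(self, filename) -> Path:
        # Serialize every declared field so the file fully describes the question.
        path = Path(filename)
        path.write_text(json.dumps(asdict(self), indent=2, sort_keys=True))
        return path

    @classmethod
    def load(cls, filename) -> "PreregSketch":
        # Rebuild the record from the stored fields.
        return cls(**json.loads(Path(filename).read_text()))


original = PreregSketch("minimum_wage_hike", "employment", estimand="ATT", design="did")
with tempfile.TemporaryDirectory() as tmp:
    path = original.save(Path(tmp) / "prereg.json")
    restored = PreregSketch.load(path)

print(restored == original)
```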
identify ¶
identify() -> IdentificationPlan
Choose an estimator based on the declared design / estimand.
estimate ¶
estimate(**kwargs) -> EstimationResult
Execute the identification plan against self.data.
paper ¶
paper(*, fmt: str = 'markdown', output_path: Optional[str] = None, dag: Any = None, include_robustness: bool = True, cite: bool = True, reviewer_mode: bool = False)
Build a full :class:`PaperDraft` from this declared question.
Convenience wrapper around :func:`statspai.paper_from_question`.
Calls identify() and estimate() on demand, then assembles
a Question / Data / Identification / Estimator / Results /
Robustness / References draft. Renders to Markdown by default;
pass fmt='qmd' for a Quarto document with statspai
provenance and an auto-appended Reproducibility appendix.
Examples:
IdentificationPlan dataclass ¶
Output of :meth:`CausalQuestion.identify`.
Describes which estimator is planned, why it identifies the declared estimand, and which assumptions the user must defend.
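A plan of this kind can be pictured as a small record with three parts: the estimator, the reason it identifies the effect, and the assumptions left for the analyst to defend. The sketch below is only an illustration. `PlanSketch` and its field names (`estimator`, `rationale`, `assumptions`) are hypothetical and are not taken from the real `IdentificationPlan`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanSketch:
    """Hypothetical plan record; field names are assumptions."""

    estimator: str
    rationale: str
    assumptions: List[str] = field(default_factory=list)

    def summary(self) -> str:
        # One line per component, with assumptions listed as bullets.
        lines = [
            f"Estimator: {self.estimator}",
            f"Why: {self.rationale}",
            "Assumptions to defend:",
        ]
        lines += [f"  - {a}" for a in self.assumptions]
        return "\n".join(lines)


plan = PlanSketch(
    estimator="difference-in-differences",
    rationale="treated and comparison units are observed before and after the policy change",
    assumptions=["parallel trends", "no anticipation"],
)
print(plan.summary())
```

Keeping the assumptions as explicit data, rather than prose, lets a report or reviewer checklist enumerate them mechanically.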
EstimationResult dataclass ¶
Unified view of a causal-question estimate.
Thin wrapper that preserves the underlying estimator's full result
object while exposing a canonical estimate / se / ci interface.
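The thin-wrapper pattern described above can be sketched as follows. `ResultSketch` is hypothetical; the attribute name `raw`, the choice to expose `ci` as a method, and the normal-approximation interval are assumptions, and the numbers in `raw_fit` are made up for the illustration.

```python
from dataclasses import dataclass
from statistics import NormalDist
from typing import Any, Tuple


@dataclass
class ResultSketch:
    """Hypothetical wrapper exposing estimate / se / ci over any raw result."""

    estimate: float
    se: float
    raw: Any = None  # the underlying estimator's result object, kept untouched

    def ci(self, level: float = 0.95) -> Tuple[float, float]:
        # Normal-approximation interval: estimate +/- z * se.
        z = NormalDist().inv_cdf(0.5 + level / 2.0)
        return (self.estimate - z * self.se, self.estimate + z * self.se)


# A made-up "raw" result standing in for whatever an estimator returns.
raw_fit = {"coef": {"treated:post": -0.8}, "bse": {"treated:post": 0.25}}

res = ResultSketch(
    estimate=raw_fit["coef"]["treated:post"],
    se=raw_fit["bse"]["treated:post"],
    raw=raw_fit,
)
lo, hi = res.ci()
print(f"{res.estimate:.2f} [{lo:.2f}, {hi:.2f}]")
```

Because the raw object is stored by reference, estimator-specific diagnostics remain reachable while downstream code can rely on the three canonical fields.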
causal_question ¶
causal_question(treatment: str, outcome: str, *, data: Optional[DataFrame] = None, population: str = '', estimand: str = 'ATE', design: str = 'auto', time_structure: str = 'cross_section', time: Optional[str] = None, id: Optional[str] = None, covariates: Optional[Sequence[str]] = None, instruments: Optional[Sequence[str]] = None, running_variable: Optional[str] = None, cutoff: Optional[float] = None, cohort: Optional[str] = None, notes: str = '') -> CausalQuestion
Declare a causal question (see :class:`CausalQuestion`).
Supported design values
Classical / quasi-experimental:
- 'rct' — randomised assignment; OLS ATE.
- 'iv' / 'natural_experiment' — 2SLS / LATE.
- 'regression_discontinuity' — local polynomial RD.
- 'did' / 'event_study' — difference-in-differences.
- 'synthetic_control' — convex-hull weighting.
- 'longitudinal_observational' — MSM / g-formula.
- 'selection_on_observables' — AIPW (default).
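The classical designs listed above can be read as a lookup table from design to estimator. The sketch below encodes exactly that list. `DESIGN_TO_ESTIMATOR` and `planned_estimator` are hypothetical names, and the sketch does not attempt to model how `design='auto'` is resolved, which this page does not describe.

```python
# Design -> estimator pairs transcribed from the list above.
DESIGN_TO_ESTIMATOR = {
    "rct": "OLS ATE",
    "iv": "2SLS / LATE",
    "natural_experiment": "2SLS / LATE",
    "regression_discontinuity": "local polynomial RD",
    "did": "difference-in-differences",
    "event_study": "difference-in-differences",
    "synthetic_control": "convex-hull weighting",
    "longitudinal_observational": "MSM / g-formula",
    "selection_on_observables": "AIPW",
}


def planned_estimator(design: str) -> str:
    """Return the estimator paired with a design, or fail loudly."""
    try:
        return DESIGN_TO_ESTIMATOR[design]
    except KeyError:
        known = ", ".join(sorted(DESIGN_TO_ESTIMATOR))
        raise ValueError(
            f"unknown design {design!r}; expected one of: {known}"
        ) from None


print(planned_estimator("did"))
```

Raising on an unrecognized design, rather than falling back silently, is a deliberate choice in this sketch: a misspelled design should not quietly change the estimator.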
ML-based selection-on-observables (v1.13+):
- 'dml' — Double/debiased ML for ATE / LATE
[chernozhukov2018double].
- 'tmle' — Targeted Maximum Likelihood with Super Learner
[vanderlaan2006targeted].
- 'metalearner' — S/T/X/R/DR-Learner for tau(x);
population ATE summary via AIPW influence function
[kunzel2019metalearners; nie2021quasi].
- 'causal_forest' — honest random forest for tau(x)
[athey2019generalized; wager2018estimation]; population
ATE inference uses cross-fit AIPW
[vanderlaan2003unified; chernozhukov2018double].
All bib keys above resolve in paper.bib.
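Several of the ML-based designs summarize the population ATE through AIPW scores. As a rough illustration of that step, the sketch below computes the per-unit score mu1 - mu0 + t(y - mu1)/e - (1 - t)(y - mu0)/(1 - e), averages it for the ATE, and uses the sample standard deviation of the scores over sqrt(n) as the standard error. The function name `aipw_ate` is hypothetical, the nuisance predictions are supplied by hand rather than cross-fit, and the toy numbers are invented.

```python
from math import sqrt
from statistics import mean, stdev


def aipw_ate(y, t, mu0, mu1, e):
    """Return (ate, se) from per-unit AIPW scores.

    y: outcomes, t: 0/1 treatment, mu0/mu1: predicted outcomes under
    control/treatment, e: predicted propensity scores in (0, 1).
    """
    scores = [
        m1 - m0 + ti * (yi - m1) / ei - (1 - ti) * (yi - m0) / (1 - ei)
        for yi, ti, m0, m1, ei in zip(y, t, mu0, mu1, e)
    ]
    return mean(scores), stdev(scores) / sqrt(len(scores))


# Invented toy data: four units, known propensity 0.5.
y = [3.5, 1.0, 4.0, 1.5]
t = [1, 0, 1, 0]
mu0 = [1.0, 1.0, 2.0, 2.0]
mu1 = [3.0, 3.0, 4.0, 4.0]
e = [0.5, 0.5, 0.5, 0.5]

ate, se = aipw_ate(y, t, mu0, mu1, e)
print(f"ATE = {ate:.3f}, SE = {se:.3f}")
```

In a cross-fit version, each unit's `mu0`, `mu1`, and `e` would come from models trained on folds that exclude that unit; the averaging step stays the same.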
load_preregister ¶
load_preregister(filename: Union[str, Path]) -> CausalQuestion
Load a pre-registration file back into a :class:`CausalQuestion`.
Deviations and metadata are preserved on the returned object via
the .notes field (concatenated).
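One way to picture the concatenation is sketched below. The record keys (`notes`, `deviations`, `metadata`), the line prefixes, and the newline separator are all assumptions made for the illustration; the actual file layout and the exact text that ends up in `.notes` are not specified on this page.

```python
def merge_notes(record: dict) -> str:
    """Fold free-text notes, deviations, and metadata into one string."""
    parts = []
    if record.get("notes"):
        parts.append(record["notes"])
    # Each recorded deviation becomes its own labelled line.
    for deviation in record.get("deviations", []):
        parts.append(f"Deviation: {deviation}")
    # Metadata is appended in sorted key order so the result is deterministic.
    for key, value in sorted(record.get("metadata", {}).items()):
        parts.append(f"{key}: {value}")
    return "\n".join(parts)


# Invented example record.
record = {
    "notes": "Primary analysis uses a state-quarter panel.",
    "deviations": ["Dropped one quarter because of a missing survey wave."],
    "metadata": {"registered_by": "analyst"},
}
print(merge_notes(record))
```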