Replication Workflow: Question → Estimate → Paper → Archive

One call to bundle data, code, environment, paper, citations, and per-number provenance into a submission-oriented replication archive. Built for the AEA / AEJ data-editor checklist out of the box.

This guide ties together the v1.7.2 export trinity:

sp.paper(...) / q.paper() — data → draft pipeline (Markdown, LaTeX, Quarto .qmd, Word).
sp.replication_pack(...) — draft → submission-oriented zip with manifest, hashes, environment lock, and lineage.
sp.Provenance / sp.attach_provenance() — per-number traceability back to the call that produced it.

Plus the surrounding glue:

sp.gt(result) — great_tables adapter for formatted HTML / LaTeX tables.
sp.csl_url(...) / sp.write_bib(...) — CSL hub + paper.bib writer for Quarto citation rendering.
sp.paper(..., llm='auto') — auto-propose a Causal DAG via LLM (see LLM-DAG setup guide).

When to use

You're submitting to AER / AEJ / Econometrica / QJE / REStat / REStud / JF / JPE and the data editor wants a self-contained replication archive.
You're an agent that needs to produce audit-grade empirical reports — every number traceable to a function call + parameter set + input data hash.
You want one source that compiles to PDF / HTML / DOCX / Beamer via Quarto, with auto-generated citations and an embedded Reproducibility appendix.

Quickstart: full pipeline in 4 lines

import statspai as sp
import pandas as pd

df = pd.read_csv("training_panel.csv")

# 1. Question → estimate → draft
draft = sp.paper(df, "effect of trained on wage",
                 treatment="trained", y="wage",
                 fmt="qmd")  # Quarto-native output

# 2. Draft → submission-oriented replication archive
sp.replication_pack(draft, "submission.zip",
                    code="analysis.py")

Open submission.zip and you'll find:

submission.zip
├── MANIFEST.json          versions, timestamp, git SHA, per-file SHA-256
├── README.md              replication instructions
├── data/
│   ├── dataset.csv        the analysis frame
│   └── manifest.json      shape + dtypes + SHA-256
├── code/
│   └── script.py          your analysis script
├── env/
│   └── requirements.txt   from pip freeze (or importlib.metadata fallback)
├── paper/
│   ├── paper.qmd          Quarto source — `quarto render paper.qmd`
│   └── paper.bib          auto-emitted from estimator citations
└── lineage.json           per-result Provenance (function + params + data hash)

Hand the zip to a co-author or upload it to the journal's data repository. Running quarto render paper/paper.qmd reproduces your draft verbatim, and MANIFEST.json lets anyone verify that the data is byte-identical to what you analyzed.
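The byte-identity check can be sketched with the standard library alone. This is a minimal illustration, not StatsPAI's own verifier: it assumes MANIFEST.json stores a `files` mapping from member path to SHA-256 hex digest, which is a hypothetical layout, so inspect your own archive for the real key names. The demo archive is built in memory purely for the example.

```python
import hashlib
import io
import json
import zipfile


def sha256_hex(payload: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(payload).hexdigest()


def verify_archive(zip_bytes: bytes) -> list[str]:
    """Return the archive members whose hash differs from MANIFEST.json.

    ASSUMPTION: the manifest looks like {"files": {"<member>": "<sha256 hex>"}}.
    The real key names may differ.
    """
    mismatches = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        manifest = json.loads(zf.read("MANIFEST.json"))
        for member, expected in manifest["files"].items():
            if sha256_hex(zf.read(member)) != expected:
                mismatches.append(member)
    return mismatches


def build_demo_archive(data: bytes, recorded: bytes) -> bytes:
    """Build a tiny in-memory archive; `recorded` is what the manifest hashes."""
    manifest = {"files": {"data/dataset.csv": sha256_hex(recorded)}}
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("MANIFEST.json", json.dumps(manifest))
        zf.writestr("data/dataset.csv", data)
    return buf.getvalue()


csv_bytes = b"worker_id,trained,wage\n1,1,21.5\n2,0,18.0\n"
intact = build_demo_archive(csv_bytes, recorded=csv_bytes)
tampered = build_demo_archive(csv_bytes + b"3,1,99.0\n", recorded=csv_bytes)

print(verify_archive(intact))    # empty list: every hash matches
print(verify_archive(tampered))  # the modified data file is flagged
```

To check a real archive, read the zip from disk with `Path("submission.zip").read_bytes()` and adapt the manifest keys.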

Two entry points

A. Natural-language path

When you want StatsPAI to infer the design from prose:

draft = sp.paper(df,
    "effect of training on wages, controlling for education",
    treatment="trained", y="wage",
    covariates=["edu", "experience"],
    fmt="qmd",
)

The question parser fills in any column hints you didn't pass explicitly (treatment / y / design); explicit kwargs always win.

B. Estimand-first (`sp.causal_question`) path

When you've pre-registered the analysis (Target Trial Protocol / PICOTS rubric) and want the paper to match the declaration verbatim:

q = sp.causal_question(
    treatment="trained",
    outcome="wage",
    data=df,
    population="manufacturing workers, 2018-2019",
    estimand="ATT",
    design="did",
    time="year", id="worker_id",
    covariates=["edu"],
    notes="Pre-registered 2026-04-15.",
)

# Method-style:
draft = q.paper(fmt="qmd")

# Or function-style dispatch:
draft = sp.paper(q, fmt="qmd")

The Question / Identification / Estimator / Results sections come straight from your declaration + q.identify() + q.estimate(), not from natural-language inference. Use this path when you want the draft's identification claims to match what was pre-registered with your IRB / journal preregistration.

Output formats

PaperDraft exposes four renderers (route via .write() extension or explicit method):

Format	Method	When
Markdown	`to_markdown()` / `.md`	Quick review, GitHub gist
Quarto	`to_qmd()` / `.qmd`	Formatted pipeline (recommended)
LaTeX	`to_tex()` / `.tex`	Direct Overleaf submission
Word	`to_docx(path)` / `.docx`	Co-authors who only edit in Word

draft.write("paper.qmd")    # → quarto render paper.qmd
draft.write("paper.tex")    # → pdflatex paper.tex
draft.write("paper.docx")   # → opens in Word
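Routing by file extension can be pictured as a small dispatch table. The sketch below is an illustration of that idea only: `MiniDraft` and `render` are hypothetical stand-ins, not the PaperDraft implementation, and the Word renderer is left out because it writes a binary file rather than returning text.

```python
from pathlib import Path


class MiniDraft:
    """Hypothetical stand-in for PaperDraft; only the routing idea is shown."""

    def __init__(self, body: str) -> None:
        self.body = body

    def to_markdown(self) -> str:
        return f"# Draft\n\n{self.body}\n"

    def to_qmd(self) -> str:
        return f'---\ntitle: "Draft"\n---\n\n{self.body}\n'

    def to_tex(self) -> str:
        return (
            "\\documentclass{article}\n\\begin{document}\n"
            f"{self.body}\n\\end{{document}}\n"
        )

    def render(self, path: str) -> str:
        """Pick a text renderer from the file extension, or raise if unknown."""
        renderers = {
            ".md": self.to_markdown,
            ".qmd": self.to_qmd,
            ".tex": self.to_tex,
        }
        suffix = Path(path).suffix.lower()
        if suffix not in renderers:
            raise ValueError(f"unsupported extension: {suffix!r}")
        return renderers[suffix]()


draft_demo = MiniDraft("Training raises wages.")
print(draft_demo.render("paper.qmd"))
```

An unknown extension raises instead of silently falling back to Markdown, so a typo in the file name surfaces immediately.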

The Quarto path is the strongest — one source compiles to PDF / HTML / DOCX / Beamer with cross-refs, citations, and a machine-readable provenance block in the YAML header.

Quarto integration

draft.to_qmd() emits:

---
title: "Causal Analysis Draft"
date: "2026-04-27"
subtitle: "effect of trained on wage"
format:
  pdf: default
  html: default
  docx: default
bibliography: "paper.bib"
csl: "american-economic-association.csl"
statspai:
  version: "1.7.2"
  run_id: "9c3aa1bf"
  data_hash: "5c64c6e6b67c"
---

## Question
...

Notable bits:

format: block lists every Quarto output you want (pdf, html, docx, beamer, ...). Override via draft.to_qmd(formats=["pdf", "beamer"]).
bibliography: auto-emits when draft.citations is non-empty; replication_pack writes the actual paper.bib next to the qmd.
csl: accepts short names — csl='aer' resolves to american-economic-association.csl. See the CSL section below.
statspai: block carries version / run_id / data_hash so any reader can audit "is this paper running on the same code + data I have?".
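A reader who wants to run that audit needs the `statspai:` block out of the front matter. The sketch below does it with plain string handling so it has no YAML dependency; `read_statspai_block` is a hypothetical helper, and it assumes the block is indented by two spaces exactly as in the header shown above. A real YAML parser is the safer choice for anything beyond this shape.

```python
QMD_TEXT = '''---
title: "Causal Analysis Draft"
bibliography: "paper.bib"
csl: "american-economic-association.csl"
statspai:
  version: "1.7.2"
  run_id: "9c3aa1bf"
  data_hash: "5c64c6e6b67c"
---

## Question
'''


def read_statspai_block(qmd_text: str) -> dict[str, str]:
    """Collect the indented key/value pairs under `statspai:` in the front matter."""
    front_matter = qmd_text.split("---", 2)[1]
    block: dict[str, str] = {}
    inside = False
    for line in front_matter.splitlines():
        if line.startswith("statspai:"):
            inside = True
        elif inside and line.startswith("  "):
            key, _, value = line.strip().partition(":")
            block[key] = value.strip().strip('"')
        elif inside:
            break  # first non-indented line ends the block
    return block


meta = read_statspai_block(QMD_TEXT)
expected_hash = "5c64c6e6b67c"  # the hash you recorded for your own copy of the data
print(meta)
print("same data" if meta["data_hash"] == expected_hash else "data differs")
```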

When the underlying result carries a _provenance (true for any instrumented estimator; see the Provenance scorecard), the qmd auto-appends:

## Reproducibility {.appendix}

```text
Provenance
  function   : sp.did.callaway_santanna
  run_id     : 9c3aa1bf
  ...
  data       : SHA256:5c64c6e6b67c 1200×7
  params     :
    - y = 'wage'
    - g = 'first_treat'
    ...
```

Causal DAG appendix

Pass a DAG and the draft gains a Causal DAG section with edges, adjustment sets, back-door paths, and bad controls — rendered as text-art for markdown / LaTeX, mermaid for Quarto:

from statspai.dag.graph import DAG

g = DAG("trained -> wage; edu -> wage; edu -> trained")
draft = sp.paper(df, "effect of trained on wage",
                 treatment="trained", y="wage",
                 dag=g, fmt="qmd")

The qmd renders the DAG as a Quarto-native mermaid block:

## Causal DAG

```{mermaid}
%%| fig-cap: Declared causal DAG
graph LR
  trained --> wage
  edu --> wage
  edu --> trained
```

**Adjustment sets** (back-door criterion for `trained` → `wage`):
- {`edu`}

**Back-door paths** from `trained` to `wage`:
- `trained` — `edu` — `wage`
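To see where the mermaid lines and the back-door path come from, here is a minimal sketch in plain Python. It is not the `statspai.dag` implementation: `parse_edges`, `to_mermaid`, and `backdoor_paths` are hypothetical helpers, and the path search simply lists simple paths that start with an arrow into the treatment, without checking whether a collider blocks them.

```python
def parse_edges(spec: str) -> list[tuple[str, str]]:
    """Split 'a -> b; c -> d' into [('a', 'b'), ('c', 'd')]."""
    edges = []
    for part in spec.split(";"):
        if part.strip():
            src, dst = (name.strip() for name in part.split("->"))
            edges.append((src, dst))
    return edges


def to_mermaid(edges: list[tuple[str, str]]) -> str:
    """Render the edges as a left-to-right mermaid graph."""
    return "\n".join(["graph LR"] + [f"  {src} --> {dst}" for src, dst in edges])


def backdoor_paths(edges, treatment: str, outcome: str) -> list[list[str]]:
    """List simple paths treatment ... outcome whose first edge points INTO treatment."""
    neighbours: dict[str, set[str]] = {}
    for src, dst in edges:
        neighbours.setdefault(src, set()).add(dst)
        neighbours.setdefault(dst, set()).add(src)
    parents = sorted(src for src, dst in edges if dst == treatment)

    paths: list[list[str]] = []

    def walk(node: str, path: list[str]) -> None:
        if node == outcome:
            paths.append(path)
            return
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in path:  # keep paths simple (no repeated nodes)
                walk(nxt, path + [nxt])

    for parent in parents:
        walk(parent, [treatment, parent])
    return paths


edges = parse_edges("trained -> wage; edu -> wage; edu -> trained")
print(to_mermaid(edges))
print(backdoor_paths(edges, "trained", "wage"))  # → [['trained', 'edu', 'wage']]
```

The single path through `edu` is why `{edu}` appears as the adjustment set in the rendered appendix.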

LLM-DAG auto-propose

When you don't have a hand-built DAG, ask an LLM to propose one:

# Set ANTHROPIC_API_KEY or OPENAI_API_KEY in your environment, then:
draft = sp.paper(df, "effect of trained on wage",
                 treatment="trained", y="wage",
                 llm="auto",                # opt-in
                 llm_domain="labor economics, training programmes",
                 fmt="qmd")

llm="auto" resolves a credential via the layered fallback (env var → explicit param → config file → terminal prompt → fail with concrete remediation), calls llm_dag_propose, and attaches the resulting DAG. Failures (no key, network error, malformed JSON) silently fall back to a no-DAG paper — auto-DAG never breaks the pipeline.

See the LLM-DAG setup guide for credential setup, provider choice, and configure_llm() persistence.
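The layered fallback and the "never break the pipeline" rule can be sketched as follows. This is an illustration under assumptions, not StatsPAI's code: `resolve_credential`, `MissingCredential`, `propose_dag_or_none`, and the `api_key` config key are all hypothetical, and the layers are tried in the order the text lists them.

```python
from typing import Callable, Mapping, Optional


class MissingCredential(RuntimeError):
    """Raised when every layer of the fallback comes up empty."""


def resolve_credential(
    env: Mapping[str, str],
    explicit: Optional[str] = None,
    config: Optional[Mapping[str, str]] = None,
    prompt: Optional[Callable[[], str]] = None,
) -> str:
    """Return the first non-empty key, trying the layers in the documented order."""
    layers = [
        lambda: env.get("ANTHROPIC_API_KEY") or env.get("OPENAI_API_KEY"),
        lambda: explicit,
        lambda: (config or {}).get("api_key"),  # hypothetical config key
        lambda: prompt() if prompt else None,   # e.g. getpass in a terminal
    ]
    for layer in layers:
        key = layer()
        if key:
            return key
    raise MissingCredential(
        "No LLM credential found; set ANTHROPIC_API_KEY or OPENAI_API_KEY."
    )


def propose_dag_or_none(env: Mapping[str, str]) -> Optional[str]:
    """Swallow the credential failure and return None, i.e. a no-DAG paper."""
    try:
        key = resolve_credential(env)
    except MissingCredential:
        return None
    return f"DAG proposed with key ending in {key[-4:]}"


print(resolve_credential({"OPENAI_API_KEY": "sk-env"}, explicit="sk-explicit"))
print(propose_dag_or_none({}))  # → None
```

Passing the environment as a mapping (for example `os.environ`) keeps the function easy to test without touching real credentials.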

To pin the offline heuristic backend (no API call):

draft = sp.paper(..., llm="heuristic")

Cite style (CSL) and bibliography

StatsPAI auto-emits paper/paper.bib from estimator cite() strings inside replication_pack. To pick a journal style, pass csl= to to_qmd():

draft = q.paper(fmt="qmd")
qmd = draft.to_qmd(csl="aer")  # → american-economic-association.csl

Short names supported: aer, aeja, aejmac, aejmicro, aejpol, qje, econometrica, restat, restud, jpe, jf, chicago-author-date, apa. See sp.list_csl_styles() for the full list.

.csl files themselves are not bundled with StatsPAI (Zotero styles are CC-BY-SA-3.0, incompatible with our MIT license). Download once at project setup:

curl -O $(python -c "import statspai as sp; print(sp.csl_url('aer'))")
# → american-economic-association.csl in the current directory

Quarto resolves csl: "american-economic-association.csl" against that local copy, so keep the file in the same directory as the .qmd you render.

For finer control, build the bib yourself:

sp.write_bib([
    "Callaway B, Sant'Anna PHC. (2021). DiD with multiple time periods. JoE.",
    "Imbens GW (2004). Nonparametric estimation of ATEs.",
], "paper.bib")

Numerical lineage / Provenance

Every result from an instrumented estimator carries a _provenance dataclass:

r = sp.callaway_santanna(df, y="y", g="g", t="t", i="i")
prov = sp.get_provenance(r)
print(prov.short())
# → sp.did.callaway_santanna · data:48b58dd2b436 · run:c8bdcc04
print(prov.params)
# → {'y': 'y', 'g': 'g', 't': 't', 'i': 'i', 'estimator': 'dr',
#    'control_group': 'nevertreated', 'base_period': 'universal', ...}
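Conceptually, a provenance record is a function name, the parameters of the call, and a hash of the input data, attached to the result at call time. The sketch below shows that idea with a decorator. Everything in it is hypothetical (`MiniProvenance`, `hash_rows`, `with_provenance`, the toy estimator), and hashing canonical JSON truncated to 12 hex characters is an assumption chosen to resemble the hashes shown above, not StatsPAI's actual scheme.

```python
import functools
import hashlib
import json
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MiniProvenance:
    """Hypothetical record: which call, with which params, on which data."""
    function: str
    params: dict
    data_hash: str
    run_id: str = field(default_factory=lambda: uuid.uuid4().hex[:8])

    def short(self) -> str:
        return f"{self.function} · data:{self.data_hash} · run:{self.run_id}"


def hash_rows(rows: list[dict]) -> str:
    """Hash rows via canonical JSON (ASSUMPTION: truncated to 12 hex chars)."""
    canonical = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()[:12]


def with_provenance(func):
    """Attach a MiniProvenance to the dict returned by `func`."""
    @functools.wraps(func)
    def wrapper(rows, **params):
        result = func(rows, **params)
        result["_provenance"] = MiniProvenance(
            function=func.__qualname__,
            params=dict(params),
            data_hash=hash_rows(rows),
        )
        return result
    return wrapper


@with_provenance
def mean_difference(rows, *, y, treatment):
    """Toy estimator: treated mean minus control mean."""
    treated = [r[y] for r in rows if r[treatment] == 1]
    control = [r[y] for r in rows if r[treatment] == 0]
    return {"estimate": sum(treated) / len(treated) - sum(control) / len(control)}


rows = [
    {"wage": 20.0, "trained": 1},
    {"wage": 24.0, "trained": 1},
    {"wage": 18.0, "trained": 0},
]
result = mean_difference(rows, y="wage", treatment="trained")
print(result["estimate"])             # → 4.0
print(result["_provenance"].short())  # run_id is random, so it differs per call
```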

Provenance flows into replication_pack automatically:

synth_r = sp.synth(df, ...)
rd_r = sp.rdrobust(df_rd, y="y", x="x", c=0)

rp = sp.replication_pack([synth_r, rd_r], "out.zip",
                         data=df, code="analysis.py")
# lineage.json inside the archive at rp.output_path now contains both runs.

lineage.json shape:

{
  "n_runs": 2,
  "runs": {
    "9c3aa1bf...": {
      "function": "sp.synth",
      "params": {"outcome": "gdp", "method": "augmented", ...},
      "data_hash": "5c64c6e6b67c",
      "run_id": "9c3aa1bf...",
      "statspai_version": "1.7.2",
      "python_version": "3.11.5",
      "timestamp": "2026-04-27T15:34:55"
    },
    "1874e42d...": {...}
  },
  "data_inputs": [
    {"hash": "5c64c6e6b67c",
     "consumers": [{"function": "sp.synth", "run_id": "9c3aa1bf..."}]}
  ],
  "statspai_version": "1.7.2",
  "python_version": "3.11.5"
}
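Given that shape, a reviewer can group runs by input data and sanity-check the cross-references with a few lines of Python. The sketch below works on an in-memory dict that mirrors the structure above; the second run's function name and data hash are made-up placeholders for the elided entry, and `runs_by_data_hash` and `check_consistency` are hypothetical helpers.

```python
import copy
from collections import defaultdict

# Mirrors the lineage.json shape; values for the second run are placeholders.
lineage = {
    "n_runs": 2,
    "runs": {
        "9c3aa1bf": {"function": "sp.synth", "params": {"outcome": "gdp"},
                     "data_hash": "5c64c6e6b67c", "run_id": "9c3aa1bf"},
        "1874e42d": {"function": "sp.rd.rdrobust", "params": {"y": "y", "x": "x", "c": 0},
                     "data_hash": "0f0f0f0f0f0f", "run_id": "1874e42d"},
    },
    "data_inputs": [
        {"hash": "5c64c6e6b67c",
         "consumers": [{"function": "sp.synth", "run_id": "9c3aa1bf"}]},
        {"hash": "0f0f0f0f0f0f",
         "consumers": [{"function": "sp.rd.rdrobust", "run_id": "1874e42d"}]},
    ],
}


def runs_by_data_hash(doc: dict) -> dict[str, list[str]]:
    """Group 'function (run_id)' labels by the data hash each run consumed."""
    grouped = defaultdict(list)
    for run_id, run in doc["runs"].items():
        grouped[run["data_hash"]].append(f'{run["function"]} ({run_id})')
    return dict(grouped)


def check_consistency(doc: dict) -> list[str]:
    """Return human-readable problems; an empty list means the file is coherent."""
    problems = []
    if doc["n_runs"] != len(doc["runs"]):
        problems.append("n_runs does not match the number of run entries")
    for entry in doc["data_inputs"]:
        for consumer in entry["consumers"]:
            run = doc["runs"].get(consumer["run_id"])
            if run is None or run["data_hash"] != entry["hash"]:
                problems.append(f'dangling consumer {consumer["run_id"]}')
    return problems


print(runs_by_data_hash(lineage))
print(check_consistency(lineage))  # → []
```

With a real archive, replace the dict with `json.load` on the lineage.json extracted from the zip.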

Aggregation chain (DiD `aggte`)

sp.did.aggte() is chain-aware — its Provenance.params records both the aggregation choice (type='simple' / 'dynamic' / ...) and the upstream Callaway-Sant'Anna run that produced its input ATTs:

cs = sp.callaway_santanna(df, y="y", g="g", t="t", i="i")
agg = sp.did.aggte(cs, type="dynamic")
prov = sp.get_provenance(agg)
print(prov.params["upstream_run_id"])     # → '9c3aa1bf'
print(prov.params["upstream_function"])   # → 'sp.did.callaway_santanna'

So lineage.json traces the full chain: aggregate → producing CS run → input data hash.
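Walking that chain is a matter of following `upstream_run_id` links through the `runs` mapping. The sketch below assumes the link value matches a key in `runs` exactly (real run ids may be longer, in which case you would match on a prefix), and the run ids and hashes in the demo data are illustrative. `trace_chain` is a hypothetical helper.

```python
def trace_chain(runs: dict, run_id: str) -> list[str]:
    """Follow params['upstream_run_id'] links until a run has no upstream."""
    chain = []
    seen = set()
    current = run_id
    while current is not None and current not in seen:
        seen.add(current)  # guards against accidental cycles
        run = runs.get(current)
        if run is None:
            chain.append(f"<missing run {current}>")
            break
        chain.append(f'{run["function"]} [data:{run["data_hash"]}]')
        current = run["params"].get("upstream_run_id")
    return chain


# Illustrative runs: an aggregate that points at the CS run that fed it.
runs = {
    "c8bdcc04": {
        "function": "sp.did.aggte",
        "params": {"type": "dynamic", "upstream_run_id": "9c3aa1bf",
                   "upstream_function": "sp.did.callaway_santanna"},
        "data_hash": "48b58dd2b436",
    },
    "9c3aa1bf": {
        "function": "sp.did.callaway_santanna",
        "params": {"y": "y", "g": "g", "t": "t", "i": "i"},
        "data_hash": "48b58dd2b436",
    },
}

for step in trace_chain(runs, "c8bdcc04"):
    print(step)
```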

Provenance scorecard

As of v1.7.2, 142 estimators are instrumented (more than 15× the original baseline of 9):

Estimator	Phase
`sp.regress`	P3
`sp.callaway_santanna`	P3
`sp.did_2x2`	P3
`statspai.regression.iv.iv`	P3
`sp.synth` (13-method dispatcher)	P4
`sp.did.did_imputation`	P4
`sp.did.aggte` (chain-aware)	P4
`sp.did.did_multiplegt`	P4
`sp.rd.rdrobust`	P4
`sp.cic` (Athey-Imbens 2006)	P7
`sp.cohort_anchored_event_study` (arXiv:2509.01829)	P7
`sp.design_robust_event_study` (Wright 2026, 2601.18801)	P7
`sp.gardner_did` / `sp.did_2stage`	P7
`sp.harvest_did` (Borusyak et al. 2025)	P7
`sp.did_misclassified` (arXiv:2507.20415)	P7
`sp.stacked_did` (Cengiz et al. 2019)	P7
`sp.wooldridge_did` (Wooldridge 2021 ETWFE)	P7
`sp.etwfe` (4-branch dispatcher, wrap pattern)	P7
`sp.drdid` (Sant'Anna-Zhao 2020 DR)	P7
`sp.rd_honest` (Armstrong-Kolesar 2018, 2020)	P7
`sp.rkd` (Card et al. 2015 Regression Kink)	P7
`sp.liml` (LIML / Fuller)	P8
`sp.jive` (legacy single-method JIVE)	P8
`sp.lasso_iv` (Belloni-Chen-Chernozhukov-Hansen 2012)	P8
`sp.iv.bayesian_iv` (Chernozhukov-Hong 2003 AR)	P8
`sp.iv.jive1` (Angrist-Imbens-Krueger 1999)	P8
`sp.iv.ujive` (Kolesar 2013)	P8
`sp.iv.ijive` (Ackerberg-Devereux 2009)	P8
`sp.iv.rjive` (Hansen-Kozbur 2014 ridge-JIVE)	P8
`sp.iv.mte` (Brinch-Mogstad-Wiswall 2017)	P8
`sp.match` (matching dispatcher)	P8
`sp.optimal_match` (Hungarian 1:1)	P8
`sp.cardinality_match` (Zubizarreta 2014 LP)	P8
`sp.genmatch` (Diamond-Sekhon 2013 genetic)	P8
`sp.sbw` (Zubizarreta 2015 Stable Balancing Weights)	P8
`sp.dml` (Chernozhukov et al. 2018 DML dispatcher)	P8
`sp.tmle` (van der Laan-Rose Targeted MLE)	P9
`sp.tmle.ltmle` (Longitudinal TMLE)	P9
`sp.tmle.hal_tmle` (TMLE with HAL nuisance)	P9
`sp.causal_forest` (GRF causal forest)	P9
`sp.multi_arm_forest` (Athey-Tibshirani-Wager)	P9
`sp.iv_forest` (IV causal forest)	P9
`sp.metalearner` (S/T/X/R/DR dispatcher)	P9
`sp.bcf` (Hahn-Murray-Carvalho Bayesian Causal Forest)	P9
`sp.aipw` (Augmented IPW, doubly robust)	P9
`sp.ipw` (Inverse Probability Weighting)	P9
`sp.g_computation` (parametric g-formula)	P9
`sp.front_door` (Pearl front-door adjustment)	P9
`sp.panel` (multi-method panel dispatcher, wrap pattern)	P10
`sp.causal_impact` (Brodersen et al. 2015 BSTS)	P10
`sp.mediate` (Imai-Keele-Tingley)	P10
`sp.mediate_interventional` (VanderWeele 2014)	P10
`sp.bartik` (Goldsmith-Pinkham-Sorkin-Swift 2020)	P10
`sp.decompose` (Oaxaca / FFL / DFL / RIF dispatcher)	P10
`sp.spatial.spatial_did` (spatial-lag DiD + spillover)	P11
`sp.spatial.spatial_iv` (spatial 2SLS)	P11
`sp.qte.dist_iv` (distributional IV / quantile LATE)	P11
`sp.qte.beyond_average_late` (quantile LATE, fuzzy)	P11
`sp.qte.qte_hd_panel` (HD panel QTE via LASSO)	P11
`sp.bootstrap` (general-purpose bootstrap)	P11
`sp.conformal_cate` (conformal CATE intervals)	P11
`sp.balke_pearl` (Balke-Pearl ATE bounds)	P12
`sp.lee_bounds` (Lee 2009 trimming bounds)	P12
`sp.manski_bounds` (Manski 1990 worst-case)	P12
`sp.fisher_exact` (Fisher randomization test)	P12
`sp.imputation.mice` (Multiple Imputation Chained Eq.)	P12
`sp.kaplan_meier` (KM survival)	P13
`sp.cox` (Cox proportional hazards)	P13
`sp.survival.aft` (Accelerated Failure Time)	P13
`sp.survival.cox_frailty` (Cox + gamma frailty)	P13
`sp.survival.causal_survival_forest`	P13
`sp.iv.kernel_iv` (Singh-Sahani-Gretton kernel IV)	P13
`sp.iv.npiv` (sieve nonparametric IV)	P13
`sp.iv.many_weak_jive` (Phillips-Hale 2018 JIVE)	P13
`sp.iv.many_weak_ar` (Mikusheva-Sun 2024 AR-CS)	P13
`sp.iv.continuous_iv_late` (quantile-bin Wald)	P13
`sp.timeseries.arima` (ARIMA / SARIMAX)	P14
`sp.timeseries.garch` (GARCH(p,q) MLE)	P14
`sp.timeseries.its` (interrupted time series)	P14
`sp.timeseries.local_projections` (Jordà 2005 IRF)	P14
`sp.mccrary_test` (RD density manipulation)	P14
`sp.rddensity` (CJM 2020 density test)	P14
`sp.pate` (population ATE; Hartman-Hidalgo 2018)	P15
`sp.jackknife_se` (cluster jackknife variance)	P15
`sp.cr2_se` (Bell-McCaffrey 2002 CR2)	P15
`sp.proximal.proximal` (linear 2SLS PCI)	P16
`sp.proximal.bidirectional_pci`	P16
`sp.proximal.pci_mtp` (modified treatment policy)	P16
`sp.gformula.ice` (parametric g-formula)	P16
`sp.gformula.gformula_mc` (Monte-Carlo g-formula)	P16
`sp.msm` (Marginal Structural Model, IPTW)	P16
`sp.conformal_causal.conformal_debiased_ml`	P17
`sp.conformal_causal.conformal_density_ite`	P17
`sp.conformal_causal.conformal_fair_ite`	P17
`sp.conformal_causal.conformal_continuous`	P17
`sp.transport.transport_weights`	P17
`sp.target_trial.emulate`	P17
`sp.target_trial.clone_censor_weight`	P17
`sp.dose_response.vcnet` (Varying-coefficient DR)	P17
`sp.mendelian.mr_mode` (Mendelian Randomization mode)	P18
`sp.bunching.kink_unified` (RDD+RKD+bunching)	P18
`sp.censoring.ipcw` (IPCW weights)	P18
`sp.surrogate.surrogate_index` (Athey-Chetty-Imbens)	P18
`sp.panel.panel_fgls` (FGLS panel)	P19
`sp.timeseries.bvar` (Minnesota-prior Bayesian VAR)	P19
`sp.causal_discovery.fci` (Fast Causal Inference)	P19
`sp.causal_discovery.ges` (Greedy Equivalence Search)	P19
`sp.causal_discovery.lingam` (LiNGAM)	P19
`sp.causal_discovery.dynotears` (dynamic NOTEARS)	P19
`sp.causal_text.text_treatment_effect` (Veitch-Wang-Blei)	P20
`sp.neural_causal.gnn_causal` (GCN-AIPW under network)	P20
`sp.fairness.demographic_parity`	P20
`sp.epi.bradford_hill` (Bradford-Hill viewpoints)	P21
`sp.epi.odds_ratio` (2×2 OR with Woolf/MH/Fisher)	P21
`sp.bridge.did_sc_bridge` (DiD vs SC bridge)	P21
`sp.interference.network_exposure` (Aronow-Samii)	P21
`sp.interference.peer_effects` (linear-in-means 2SLS)	P21
`sp.bridge.dr_calib_bridge` (DR-calibration bridge)	P22
`sp.bridge.cb_ipw_bridge` (IPW vs entropy-balancing)	P22
`sp.causal_rl.causal_dqn` (confounding-robust Q-learning)	P22
`sp.causal_rl.causal_bandit` (Bareinboim-Pearl bandit)	P22
`sp.matrix_completion.mc_panel` (Athey et al. 2021)	P22
`sp.sun_abraham` (Sun-Abraham 2021 ES)	P23
`sp.did.ddd` (Triple Differences)	P23
`sp.did.did_bcf` (Forests for Differences DiD)	P23
`sp.did.event_study` (TWFE event study)	P23
`sp.mediation.four_way_decomposition`	P23
`sp.mediation.mediate_sensitivity`	P23
`sp.principal_strat.survivor_average_causal_effect`	P23
`sp.spatial.sar` (Spatial Autoregressive)	P24
`sp.spatial.sem` (Spatial Error Model)	P24
`sp.spatial.sdm` (Spatial Durbin Model)	P24
`sp.bunching.general_bunching` (high-order bunching)	P24
`sp.selection.stepwise` (stepwise variable selection)	P24
`sp.selection.lasso_select` (LASSO variable selection)	P24
`sp.timeseries.engle_granger` (cointegration test)	P25
`sp.timeseries.johansen` (cointegration rank)	P25
`sp.mendelian.mr_heterogeneity` (Cochran Q / Rücker Q')	P25
`sp.ope.sharp_ope_unobserved` (Kallus-Mao-Uehara 2025)	P26
`sp.ope.direct_method` (DM plug-in OPE)	P26
`sp.conformal_causal.conformal_counterfactual`	P26
`sp.conformal_causal.conformal_ite_interval`	P26

The remaining ~783 estimators are scheduled for v1.7.3+ rollouts. To check whether a specific estimator is instrumented:

r = sp.your_estimator(df, ...)
print(sp.get_provenance(r))   # None if not yet instrumented

Tables: `sp.gt(result)` great_tables adapter

For formatted HTML / LaTeX tables, pipe a RegtableResult through Posit's great_tables:

import statspai as sp

m = sp.feols("wage ~ trained + edu | year + worker_id", df)
rt = sp.regtable(m, template="aer", title="Returns to Training")

g = sp.gt(rt)            # great_tables.GT instance
g.as_raw_html()          # → embed in Quarto / HTML
g.as_latex()             # → \begin{table}...\end{table}

sp.gt() accepts:

RegtableResult — full-fidelity (title / notes / journal preset → gt theme).
PaperTables — multi-panel with row groups.
MeanComparisonResult — flattens via to_dataframe().
DataFrame — wraps verbatim with optional rowname_col=.
Any object with to_dataframe() — duck-typed.

great_tables is an optional dependency. Install with pip install great_tables — the wider StatsPAI stack imports cleanly without it; only sp.gt(...) requires it at call time.
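The "required only at call time" behaviour is a common pattern for optional dependencies. A minimal sketch of it, using only the standard library, might look like the following; `require_optional` is a hypothetical helper and says nothing about how `sp.gt` is actually implemented.

```python
import importlib
import importlib.util


def require_optional(module_name: str, pip_name: str):
    """Import an optional top-level module, or fail with an install hint."""
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"{module_name} is required for this call; "
            f"install it with `pip install {pip_name}`."
        )
    return importlib.import_module(module_name)


def make_table(rows):
    """Only this function needs the optional package; importing the module does not."""
    gt_module = require_optional("great_tables", "great_tables")
    return gt_module, rows


json_module = require_optional("json", "json")  # stdlib module, always present
print(json_module.dumps({"ok": True}))
```

Because the check runs inside the function body, the rest of the package imports cleanly on machines without the optional dependency.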

Recipes

AEA submission

import statspai as sp
import pandas as pd

df = pd.read_stata("nlsw88.dta")

q = sp.causal_question(
    treatment="union", outcome="wage",
    data=df, design="did",
    time="year", id="idcode",
    covariates=["age", "edu"],
    estimand="ATT",
    notes="Pre-registered for AER replication review.",
)
draft = q.paper(fmt="qmd")

# Use the AER CSL style:
draft.to_qmd(csl="aer")  # already wired by replication_pack below

sp.replication_pack(
    draft,
    "aer-submission.zip",
    code="analysis.py",
    title="Returns to Union Membership",
    paper_format="qmd",
)

Then locally:

unzip aer-submission.zip -d aer-submission/
cd aer-submission/paper    # the .csl file must sit next to paper.qmd
curl -O $(python -c "import statspai as sp; print(sp.csl_url('aer'))")
quarto render paper.qmd

AEJ: Applied submission with DAG

Same as AER but with an explicit DAG and AEJ CSL:

from statspai.dag.graph import DAG

g = DAG("union -> wage; age -> wage; age -> union; edu -> wage")
draft = q.paper(fmt="qmd", dag=g)
draft.to_qmd(csl="aeja")  # AEJ uses the AER style file

sp.replication_pack(draft, "aeja-submission.zip",
                    code="analysis.py")

Auditable agent run

For an autonomous-agent context where every number must be traceable:

# Agent: 50-line script.
draft = sp.paper(df, query, treatment=t, y=y, fmt="qmd")
rp = sp.replication_pack(
    draft, f"runs/{run_id}.zip",
    code=__file__,           # capture this script verbatim
    title=f"Run {run_id}",
)
print(rp.summary())
# ReplicationPack
# ===============
#   Path     : /runs/abc123.zip
#   Files    : 8
#   StatsPAI : v1.7.2
#   Created  : 2026-04-27T15:34:55

Each lineage.json then ties any reported number back to the exact function / params / data hash that produced it — auditable months later, by a different reviewer, with no shared session state.

What `sp.paper()` does NOT do

It does not run a hyperparameter sweep — it picks the recommendation from sp.recommend(...). Use sp.spec_curve(...) for multiverse analysis and pass the resulting summary into extra_files= on replication_pack.
It does not call any LLM by default. Pass llm="auto" to opt in; without it, no network call ever fires.
It does not verify your CSL file exists — Quarto reports the error at render time. Run quarto render once locally before shipping.
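If you want to catch a missing style file before Quarto does, a small pre-flight check is enough. The sketch below is a hypothetical helper, not part of StatsPAI: it reads the `csl:` key with a regular expression and assumes the path is resolved relative to the directory that holds the .qmd.

```python
import re
import tempfile
from pathlib import Path
from typing import Optional


def missing_csl(qmd_path: Path) -> Optional[Path]:
    """Return the CSL path named in the qmd if that file is absent, else None."""
    text = qmd_path.read_text(encoding="utf-8")
    match = re.search(r'^csl:\s*"?([^"\n]+)"?\s*$', text, flags=re.MULTILINE)
    if match is None:
        return None  # no csl: key, nothing to check
    # ASSUMPTION: the style file is looked up next to the .qmd.
    csl_path = qmd_path.parent / match.group(1).strip()
    return None if csl_path.exists() else csl_path


with tempfile.TemporaryDirectory() as tmp:
    qmd = Path(tmp) / "paper.qmd"
    qmd.write_text(
        '---\ncsl: "american-economic-association.csl"\n---\n', encoding="utf-8"
    )
    before = missing_csl(qmd)  # style file not downloaded yet
    (Path(tmp) / "american-economic-association.csl").write_text(
        "<style/>", encoding="utf-8"
    )
    after = missing_csl(qmd)   # style file now present

print(before.name if before else "ok")
print(after.name if after else "ok")
```

Running a check like this in CI gives a clear message about which file to download, rather than a citation-processing error at render time.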