statspai.synth

synth

Synthetic Control module for StatsPAI.

Unified entry point: synth(method=...) dispatches to all variants.

Variants (20 methods)
  • classic — Abadie, Diamond & Hainmueller (2010)
  • penalized / ridge — Ridge-penalised SCM
  • demeaned / detrended — Ferman & Pinto (2021)
  • unconstrained / elastic_net — Doudchenko & Imbens (2016)
  • augmented / ascm — Ben-Michael, Feller & Rothstein (2021)
  • sdid — Arkhangelsky, Athey, Hirshberg, Imbens & Wager (2021)
  • factor / gsynth — Xu (2017)
  • staggered — Ben-Michael, Feller & Rothstein (2022)
  • mc / matrix_completion — Athey, Bayati et al. (2021)
  • discos / distributional — Gunsilius (2023)
  • multi_outcome — Sun (2023)
  • scpi / prediction_interval — Cattaneo, Feng & Titiunik (2021)
  • bayesian — Bayesian SCM with MCMC posterior (Vives & Martinez 2024)
  • bsts / causal_impact — Bayesian Structural Time Series (Brodersen et al. 2015)
  • penscm / abadie_lhour — Penalized SCM with pairwise discrepancy (Abadie & L'Hour 2021)
  • fdid / forward_did — Forward DID with optimal donor selection (Li 2024)
  • cluster — Cluster SCM with donor grouping (Rho et al. 2025, arXiv:2503.21629) [@rho2025clustersc]
  • sparse / lasso — Sparse SCM with L1 penalties (Amjad, Shah & Shen 2018)
  • kernel / kernel_ridge — Kernel-based nonlinear SCM
Inference
  • placebo — in-space permutation (default)
  • conformal — Chernozhukov, Wüthrich & Zhu (2021)
  • bootstrap / jackknife — for SDID
  • prediction intervals — Cattaneo et al. (2021)
  • bayesian posterior — full posterior credible intervals (Bayesian SCM)
  • bsts posterior — Bayesian structural time series uncertainty
Diagnostics
  • synth_sensitivity() — comprehensive robustness suite
  • synth_loo() — leave-one-out donor analysis
  • synth_time_placebo() — backdating tests
  • synth_donor_sensitivity() — donor pool variation
  • synth_rmspe_filter() — pre-RMSPE robustness

SyntheticControl

Canonical Synthetic Control estimator (Abadie, Diamond & Hainmueller 2010) with nested V-W optimization.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel (unit, time, outcome, ...).

required
outcome str

Name of the outcome column.

required
unit str

Name of the unit identifier column.

required
time str

Name of the time period column.

required
treated_unit any

Identifier of the treated unit.

required
treatment_time any

First treatment period (inclusive).

required
covariates list of str

Column names whose pre-treatment means are used as predictors for the V-weighted matching problem.

None
special_predictors list of tuple

R/Stata Synth-style predictor specifications. Each entry is (column, period_spec, op) where period_spec is a scalar year, a list of years, or a slice(start, stop) (inclusive), and op is 'mean' or 'sum'. When omitted together with covariates, the pre-treatment outcome vector itself is used as the predictor (V has no identifying power and is fixed to the identity, following Kaul et al. 2015).

None
v_method (auto, nested, equal)

  • 'auto' : nested V-W optimisation when covariates or special predictors are supplied, equal V otherwise.
  • 'nested' : forces the outer V optimisation even when only outcome lags are used (the outer problem is then under-identified, per Kaul et al. 2015).
  • 'equal' : equal V, which reduces to the outcome-only simplex least-squares estimator.

'auto'
standardize_predictors bool

Rescale predictors to unit range before the V optimization.

True
n_random_starts int

Additional random Dirichlet starts for the outer V optimiser.

4
penalization float

Ridge penalty on donor weights.

0.0
alpha float

Significance level for confidence intervals.

0.05
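The three period_spec forms accepted by special_predictors (scalar year, list of years, inclusive slice) can be pictured with a small stand-alone helper. This is only a sketch of the documented semantics: expand_period_spec and aggregate_predictor are hypothetical names, not StatsPAI functions, and the data are made up.

```python
from statistics import mean

def expand_period_spec(period_spec):
    """Return the list of years a period_spec covers (hypothetical helper)."""
    if isinstance(period_spec, slice):
        # The parameter description states that slice(start, stop) includes stop.
        return list(range(period_spec.start, period_spec.stop + 1))
    if isinstance(period_spec, (list, tuple)):
        return list(period_spec)
    return [period_spec]  # scalar year

def aggregate_predictor(values_by_year, period_spec, op):
    """Aggregate one unit's column over the selected years with 'mean' or 'sum'."""
    selected = [values_by_year[y] for y in expand_period_spec(period_spec)]
    if op == "mean":
        return mean(selected)
    if op == "sum":
        return sum(selected)
    raise ValueError(f"unsupported op: {op!r}")

# Toy series for one unit (made-up numbers).
cigsale = {1975: 127.1, 1980: 120.2, 1985: 102.8, 1986: 99.7, 1987: 97.5, 1988: 90.1}

special_predictors = [
    ("cigsale", 1975, "mean"),
    ("cigsale", [1980, 1988], "mean"),
    ("cigsale", slice(1985, 1988), "sum"),
]
for column, period_spec, op in special_predictors:
    value = aggregate_predictor(cigsale, period_spec, op)
    print(column, expand_period_spec(period_spec), op, round(value, 2))
```

Each tuple yields one predictor row per unit; the V-weighted matching problem then compares those rows between the treated unit and the donors.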

fit

fit(placebo: bool = True) -> CausalResult

Fit the Synthetic Control model.

Parameters:

Name Type Description Default
placebo bool

Run in-space placebo tests across all donor units.

True

Returns:

Type Description
CausalResult
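In-space placebo inference can be illustrated without the library. The sketch below ranks units by the post/pre RMSPE ratio of Abadie et al. (2010); whether fit() ranks on this ratio or on placebo ATTs is not stated here, so treat the statistic as an assumption. All gap series are made up.

```python
from math import sqrt

def rmspe(gaps):
    """Root mean squared prediction error of a gap series."""
    return sqrt(sum(g * g for g in gaps) / len(gaps))

def placebo_pvalue(treated_gaps, donor_gaps, n_pre):
    """Share of units (treated included) whose post/pre RMSPE ratio
    is at least as large as the treated unit's ratio."""
    def ratio(gaps):
        return rmspe(gaps[n_pre:]) / rmspe(gaps[:n_pre])

    treated_ratio = ratio(treated_gaps)
    ratios = [treated_ratio] + [ratio(g) for g in donor_gaps]
    return sum(r >= treated_ratio for r in ratios) / len(ratios)

# Gaps (observed minus synthetic): 4 pre-treatment periods, 2 post-treatment periods.
treated_gaps = [0.1, -0.1, 0.1, -0.1, 2.0, 2.5]
donor_gaps = [  # each donor re-fitted as if it were the treated unit
    [0.2, -0.2, 0.2, -0.2, 0.3, -0.1],
    [0.1, 0.1, -0.1, -0.1, 0.2, 0.1],
    [0.3, -0.3, 0.3, -0.3, -0.4, 0.2],
]
p_value = placebo_pvalue(treated_gaps, donor_gaps, n_pre=4)
print(p_value)
```

With J donors the smallest attainable p-value is 1 / (J + 1), so a small donor pool limits how strong the evidence can be.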

SynthComparison

Structured container for multi-method SCM comparison results.

Attributes:

Name Type Description
results dict

Mapping of method name to CausalResult.

comparison_table DataFrame

Side-by-side metrics for every successful method, sorted by pre_rmspe ascending.

recommended str

Name of the recommended method.

recommendation_reason str

Human-readable justification.

summary

summary() -> str

Return a formatted multi-line summary string.

Returns:

Type Description
str

plot

plot(**kwargs) -> Any

Overlay all results using synthplot(type='compare').

Parameters:

Name Type Description Default
**kwargs

Forwarded to synthplot.

{}

Returns:

Type Description
matplotlib Figure or Axes

to_latex

to_latex(**kwargs) -> str

Render the comparison as a LaTeX table.

Forwards to statspai.synth.exports.synth_to_latex() with the side-by-side multi-method layout.

Parameters:

Name Type Description Default
**kwargs

See synth_to_latex() (e.g. caption, label, show_weights, digits).

{}

Returns:

Type Description
str

to_markdown

to_markdown(**kwargs) -> str

Render the comparison as a Markdown table.

Forwards to statspai.synth.exports.synth_to_markdown().

to_excel

to_excel(path: str, **kwargs) -> str

Write a multi-sheet Excel workbook covering all methods.

Forwards to statspai.synth.exports.synth_to_excel(). Returns the absolute path of the file written.

SequentialSDIDResult dataclass

Per-cohort and aggregated output of sequential_sdid().

SyntheticSurvivalResult dataclass

Output of synth_survival().

SynthExperimentalDesignResult dataclass

Structured output of synth_experimental_design().

Attributes:

Name Type Description
selected list of unit ids

The k units recommended for treatment.

ranking DataFrame

All candidates with columns [unit, pre_mspe, pre_rmse, effective_donors, risk_score, selected] sorted by risk_score ascending (best first).

weights dict[unit_id, ndarray]

Leave-one-out SC weight vectors (aligned to donor_units) — useful for the post-experiment analysis and for diagnostics.

donor_units list

The donor pool that each candidate was matched against (candidates excluded from each other's donor pool by default).

expected_variance float

Sum of pre-period MSPEs over selected — proxy for the post-experiment ATT-variance under Abadie-Zhao 2025/2026 Eq. (3).

baseline_variance float

Same quantity for a random-k assignment (average over n_random draws); the gain is baseline_variance - expected_variance.

method str

Always 'abadie_zhao_2025'.

diagnostics dict

Extra metadata (n_units, pre_periods, solver, etc.).
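The relationship between expected_variance, baseline_variance and the gain can be sketched with plain Python. The example ranks candidates on pre_mspe alone, which is an assumption: the documented ranking uses risk_score, whose formula is not given here. Unit labels and MSPE values are made up.

```python
import random
from statistics import mean

# Hypothetical pre-period MSPE for each candidate unit.
pre_mspe = {"A": 0.8, "B": 0.2, "C": 1.5, "D": 0.4, "E": 0.9}
k = 2

# Assumption: pick the k candidates with the lowest pre-period MSPE.
selected = sorted(pre_mspe, key=pre_mspe.get)[:k]
expected_variance = sum(pre_mspe[u] for u in selected)

# Baseline: average the same sum over random size-k assignments.
rng = random.Random(0)
n_random = 2000
baseline_variance = mean(
    sum(pre_mspe[u] for u in rng.sample(sorted(pre_mspe), k))
    for _ in range(n_random)
)

gain = baseline_variance - expected_variance
print(selected, round(expected_variance, 3), round(baseline_variance, 3), round(gain, 3))
```

A positive gain indicates that the selected units are expected to yield a less noisy ATT estimate than a random draw of the same size.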

synth

synth(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any = None, treatment_time: Any = None, method: str = 'classic', covariates: Optional[List[str]] = None, penalization: float = 0.0, placebo: bool = True, alpha: float = 0.05, inference: Optional[str] = None, treatment: Optional[str] = None, **kwargs) -> CausalResult

Public sp.synth entry point — see _dispatch_synth_impl for the full docstring on methods and parameters.

Thin wrapper around the multi-branch dispatcher that attaches a Provenance record to the returned result, so that downstream consumers (replication_pack, the Quarto appendix, table footers) can pick up the call (function name, args, data hash) without each SCM backend having to opt in. The 20-method dispatcher itself lives in _dispatch_synth_impl.

References

Abadie, A., Diamond, A. and Hainmueller, J. (2010). Synthetic control methods for comparative case studies. Journal of the American Statistical Association. [@abadie2010synthetic]

synthplot

synthplot(result: Union[CausalResult, List[CausalResult]], type: str = 'trajectory', ax=None, figsize: Optional[tuple] = None, title: Optional[str] = None, top_n: int = 15, labels: Optional[List[str]] = None, **kwargs)

Unified plot function for all Synthetic Control variants.

Automatically detects the SCM variant and renders the appropriate visualisation. Works with results from synth(method=...), sdid(), augsynth(), gsynth(), staggered_synth(), conformal_synth(), and all other variants.

Parameters:

Name Type Description Default
result CausalResult or list of CausalResult

Output of any synth() variant. Pass a list for type='compare'.

required
type str

Plot type:

  • 'trajectory' — treated vs synthetic over time.
  • 'gap' — effect (gap) over time.
  • 'both' — two-panel: trajectory + gap.
  • 'weights' — donor weight bar chart.
  • 'placebo' — placebo ATT distribution.
  • 'placebo_gap' — placebo gap spaghetti plot (Abadie et al. 2010).
  • 'rmspe' — post/pre RMSPE ratio histogram (Abadie et al. 2010).
  • 'conformal' — period-level effects + conformal CIs.
  • 'staggered' — cohort-level ATT comparison.
  • 'factors' — latent factor loadings (gsynth only).
  • 'compare' — overlay multiple results.
'trajectory'
ax matplotlib Axes

Pre-existing axes for single-panel plots.

None
figsize tuple

Figure size. Auto-selected if None.

None
title str

Override the auto-generated title.

None
top_n int

Number of donors to show in weight plots.

15
labels list of str

Labels for type='compare'.

None
**kwargs

Additional arguments passed to individual plotters. Notable:

  • pre_band=True — for type='trajectory' / 'gap' / 'both': overlay a ±1.96 × pre-RMSPE noise envelope.
  • pi_band=True — for type='trajectory': overlay the prediction-interval / conformal CI ribbon around the synthetic counterfactual when the result carries one (sp.scpi / sp.conformal_synth).
{}

Returns:

Type Description
(fig, ax) or (fig, axes)

Examples:

>>> result = sp.synth(df, ..., method='demeaned')
>>> sp.synthplot(result)                    # trajectory
>>> sp.synthplot(result, type='gap')        # gap plot
>>> sp.synthplot(result, type='both')       # two-panel
>>> sp.synthplot(result, type='weights')    # donor weights
>>> sp.synthplot(result, type='placebo')    # placebo distribution

Compare methods:

>>> r1 = sp.synth(df, ..., method='classic')
>>> r2 = sp.synth(df, ..., method='demeaned')
>>> r3 = sp.synth(df, ..., method='sdid')
>>> sp.synthplot([r1, r2, r3], type='compare',
...             labels=['Classic', 'De-meaned', 'SDID'])

demeaned_synth

demeaned_synth(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any, treatment_time: Any, covariates: Optional[List[str]] = None, variant: Literal['demeaned', 'detrended'] = 'demeaned', penalization: float = 0.0, placebo: bool = True, alpha: float = 0.05) -> CausalResult

De-meaned / De-trended Synthetic Control Method.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel data.

required
outcome str

Outcome variable name.

required
unit str

Unit identifier column.

required
time str

Time period column.

required
treated_unit any

Identifier of the treated unit.

required
treatment_time any

First treatment period (inclusive).

required
covariates list of str

Additional covariates to match on.

None
variant ('demeaned', 'detrended')
  • 'demeaned' — subtract unit-level pre-treatment means.
  • 'detrended' — subtract unit-level linear time trends.
'demeaned'
penalization float

Ridge penalty on weights.

0.0
placebo bool

Run in-space placebo inference.

True
alpha float

Significance level.

0.05

Returns:

Type Description
CausalResult

Examples:

>>> result = sp.demeaned_synth(df, outcome='gdp', unit='state',
...     time='year', treated_unit='California', treatment_time=1989)
>>> print(result.summary())
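The two variant transforms can be sketched on a single unit's series. This is an illustration under stated assumptions, not the library's code: the helper names are hypothetical, and extending the pre-period trend into the post-period is an assumption about how 'detrended' is applied.

```python
from statistics import mean

def demean(series, n_pre):
    """Subtract the pre-treatment mean from every period of one unit's series."""
    mu = mean(series[:n_pre])
    return [y - mu for y in series]

def detrend(series, n_pre):
    """Subtract a linear trend fitted by OLS on the pre-treatment periods only."""
    t_pre = list(range(n_pre))
    t_bar, y_bar = mean(t_pre), mean(series[:n_pre])
    slope = (
        sum((t - t_bar) * (y - y_bar) for t, y in zip(t_pre, series))
        / sum((t - t_bar) ** 2 for t in t_pre)
    )
    intercept = y_bar - slope * t_bar
    # Assumption: the fitted pre-period trend is extrapolated to later periods.
    return [y - (intercept + slope * t) for t, y in enumerate(series)]

# Toy series: 4 pre-treatment periods on an exact linear trend, then a jump.
series = [10.0, 12.0, 14.0, 16.0, 25.0, 27.0]
print(demean(series, n_pre=4))
print(detrend(series, n_pre=4))
```

Applying the same transform to the treated unit and to every donor removes level (or trend) differences before the weights are estimated.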

robust_synth

robust_synth(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any, treatment_time: Any, covariates: Optional[List[str]] = None, variant: Literal['unconstrained', 'elastic_net', 'penalized'] = 'unconstrained', l1_penalty: float = 0.0, l2_penalty: float = 0.01, intercept: bool = True, placebo: bool = True, alpha: float = 0.05) -> CausalResult

Robust / unconstrained Synthetic Control.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel data.

required
outcome str

Outcome variable name.

required
unit str

Unit identifier column.

required
time str

Time period column.

required
treated_unit any

Identifier of the treated unit.

required
treatment_time any

First treatment period.

required
covariates list of str

Additional covariates to match on.

None
variant ('unconstrained', 'elastic_net', 'penalized')
  • 'unconstrained' — no sign / sum constraints; optional intercept.
  • 'elastic_net' — L1 + L2 penalty, no sign constraints.
  • 'penalized' — classic SCM constraints + elastic-net penalty.
'unconstrained'
l1_penalty float

Lasso (L1) penalty strength.

0.0
l2_penalty float

Ridge (L2) penalty strength.

0.01
intercept bool

Fit an intercept (level shift). Only for unconstrained / elastic_net.

True
placebo bool

Run in-space placebo inference.

True
alpha float

Significance level.

0.05

Returns:

Type Description
CausalResult

Examples:

>>> result = sp.robust_synth(df, outcome='gdp', unit='state',
...     time='year', treated_unit='California', treatment_time=1989,
...     variant='unconstrained')

staggered_synth

staggered_synth(data: DataFrame, outcome: str, unit: str, time: str, treatment: str, method: Literal['separate', 'pooled'] = 'separate', penalization: float = 0.0, placebo: bool = True, alpha: float = 0.05) -> CausalResult

Staggered Adoption Synthetic Control.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel data.

required
outcome str

Outcome variable name.

required
unit str

Unit identifier column.

required
time str

Time period column.

required
treatment str

Binary treatment indicator (0/1). Units transition from 0 to 1 at their respective adoption times.

required
method ('separate', 'pooled')
  • 'separate' — fit a separate SCM for each treated unit.
  • 'pooled' — partially pool weights across cohorts with the same adoption time.
'separate'
penalization float

Ridge penalty on donor weights.

0.0
placebo bool

Run placebo inference.

True
alpha float

Significance level.

0.05

Returns:

Type Description
CausalResult

With model_info containing per-unit and per-cohort effects.

Examples:

>>> result = sp.staggered_synth(df, outcome='gdp', unit='state',
...     time='year', treatment='treated')
>>> print(result.summary())
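The binary treatment column implies one adoption time per treated unit, and units sharing an adoption time form a cohort. The sketch below derives both from toy rows in plain Python; it only illustrates the data layout the function expects and is not the library's internal logic.

```python
from collections import defaultdict

rows = [  # (unit, time, treated) with made-up values
    ("A", 2000, 0), ("A", 2001, 1), ("A", 2002, 1),
    ("B", 2000, 0), ("B", 2001, 0), ("B", 2002, 1),
    ("C", 2000, 0), ("C", 2001, 0), ("C", 2002, 0),
]

# First period in which each unit's indicator equals 1.
adoption = {}
for unit, time, treated in rows:
    if treated == 1:
        adoption[unit] = min(time, adoption.get(unit, time))

# Units that adopt in the same period form one cohort.
cohorts = defaultdict(list)
for unit, first_treated in adoption.items():
    cohorts[first_treated].append(unit)

# Units whose indicator is always 0 are the natural donor pool.
never_treated = sorted({unit for unit, _, _ in rows} - set(adoption))

print(adoption, dict(cohorts), never_treated)
```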

conformal_synth

conformal_synth(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any, treatment_time: Any, scm_method: str = 'classic', grid_size: int = 101, grid_range: Optional[Tuple[float, float]] = None, alpha: float = 0.05, penalization: float = 0.0) -> CausalResult

Conformal inference for synthetic control.

Constructs valid confidence intervals by inverting a sequence of conformal tests, one for each hypothesised treatment effect.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel data.

required
outcome str

Outcome variable name.

required
unit str

Unit identifier column.

required
time str

Time period column.

required
treated_unit any

Identifier of the treated unit.

required
treatment_time any

First treatment period.

required
scm_method str

Which SCM variant to use for weight estimation. Currently supports 'classic' (constrained) and 'ridge'.

'classic'
grid_size int

Number of points in the hypothesis grid for CI inversion.

101
grid_range tuple of (float, float)

(min, max) of the hypothesis grid. If None, auto-determined from pre-treatment residual scale.

None
alpha float

Significance level.

0.05
penalization float

Ridge penalty (used when scm_method='ridge').

0.0

Returns:

Type Description
CausalResult

With model_info containing per-period p-values, conformal confidence sets, and the full test inversion grid.

Examples:

>>> result = sp.conformal_synth(df, outcome='gdp', unit='state',
...     time='year', treated_unit='California', treatment_time=1989)
>>> print(result.summary())
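Test inversion over a hypothesis grid can be sketched for a single post-treatment period. This is a deliberately simplified illustration: the residuals are held fixed instead of refitting the weights under each null (which the full procedure of Chernozhukov et al. would do), the function name is hypothetical, and all numbers are made up.

```python
def conformal_interval(pre_residuals, post_gap, grid, alpha=0.05):
    """Collect the hypothesised effects that a permutation-style test does not reject.

    Simplifying assumption: pre-period residuals are held fixed rather than
    refitting the donor weights under each null hypothesis.
    """
    pvalues = {}
    for theta in grid:
        u_post = abs(post_gap - theta)  # post-period residual if the effect were theta
        pool = [abs(u) for u in pre_residuals] + [u_post]
        pvalues[theta] = sum(u >= u_post for u in pool) / len(pool)
    accepted = [theta for theta, p in pvalues.items() if p > alpha]
    interval = (min(accepted), max(accepted)) if accepted else None
    return interval, pvalues

# Toy inputs: 19 pre-period residuals and one post-period gap.
pre_residuals = [0.1 * (1 + i % 5) * (-1) ** i for i in range(19)]
post_gap = 2.0
grid_size = 101
grid = [-1.0 + i * 6.0 / (grid_size - 1) for i in range(grid_size)]

interval, pvalues = conformal_interval(pre_residuals, post_gap, grid, alpha=0.1)
print(interval)
```

The resolution of the interval endpoints is limited by the grid spacing, which is why grid_size and grid_range matter in practice.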

scest

scest(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any, treatment_time: Any, w_constr: str = 'simplex', lasso_lambda: float = 1.0, ridge_lambda: float = 1.0) -> Dict[str, Any]

Estimate synthetic control weights.

Solves the constrained optimisation problem to find donor weights that best reproduce the treated unit's pre-treatment outcomes. Mirrors scest() in the R package scpi.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel data.

required
outcome str

Outcome variable column name.

required
unit str

Unit identifier column name.

required
time str

Time period column name.

required
treated_unit scalar

Identifier of the treated unit.

required
treatment_time scalar

First treatment period.

required
w_constr str

Weight constraint:

  • 'simplex' : w >= 0, sum(w) = 1
  • 'lasso' : L1-penalised (weights may be negative and need not sum to one)
  • 'ridge' : L2-penalised
  • 'ols' : ordinary least squares (unconstrained)
  • 'ls' : least squares (same as 'ols')
'simplex'
lasso_lambda float

L1 penalty (used when w_constr='lasso').

1.0
ridge_lambda float

L2 penalty (used when w_constr='ridge').

1.0

Returns:

Type Description
dict

Keys:

  • weights : np.ndarray (J,) of estimated donor weights
  • w_constr : echo of constraint type
  • Y_synth_pre : synthetic unit pre-treatment outcomes
  • Y_synth_post : synthetic unit post-treatment outcomes
  • residuals_pre : pre-treatment fit residuals
  • effects : post-treatment gaps (treated - synthetic)
  • pre_rmspe : root mean squared prediction error (pre)
  • donor_names : donor labels
  • sc_data : the prepared data dict from scdata

Examples:

>>> est = sp.scest(df, outcome='gdp', unit='state', time='year',
...     treated_unit='California', treatment_time=1989)
>>> est['pre_rmspe']
0.0213
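The 'simplex' constraint (w >= 0, sum(w) = 1) can be illustrated with a projected-gradient solver in NumPy. This is one possible way to solve the problem, offered as a sketch; the solver StatsPAI actually uses is not stated here, and both helper names are hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def simplex_weights(Y_donors_pre, Y_pre, n_iter=5000):
    """Minimise 0.5 * ||Y_pre - Y_donors_pre @ w||^2 over the simplex."""
    n_donors = Y_donors_pre.shape[1]
    w = np.full(n_donors, 1.0 / n_donors)
    # Step size 1 / L, where L is the squared spectral norm of the donor matrix.
    step = 1.0 / np.linalg.norm(Y_donors_pre, 2) ** 2
    for _ in range(n_iter):
        grad = Y_donors_pre.T @ (Y_donors_pre @ w - Y_pre)
        w = project_to_simplex(w - step * grad)
    return w

rng = np.random.default_rng(0)
Y_donors_pre = rng.normal(size=(30, 4))   # toy donor matrix (T0 = 30, J = 4)
true_w = np.array([0.6, 0.4, 0.0, 0.0])
Y_pre = Y_donors_pre @ true_w             # treated unit is an exact convex combination
w = simplex_weights(Y_donors_pre, Y_pre)
print(np.round(w, 3))
```

In this noiseless toy case the recovered weights should match true_w closely; with real data the simplex constraint typically produces sparse weights concentrated on a few donors.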

scdata

scdata(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any, treatment_time: Any) -> Dict[str, Any]

Prepare data matrices for synthetic control estimation.

Reshapes a long-format panel into the matrices needed by scest and scpi. Mirrors scdata() in the R package scpi.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel data.

required
outcome str

Outcome variable column name.

required
unit str

Unit identifier column name.

required
time str

Time period column name.

required
treated_unit scalar

Identifier of the treated unit.

required
treatment_time scalar

First treatment period.

required

Returns:

Type Description
dict

Keys:

  • Y_pre : treated unit pre-treatment outcomes (T0,)
  • Y_post : treated unit post-treatment outcomes (T1,)
  • Y_donors_pre : donor pre-treatment matrix (T0, J)
  • Y_donors_post : donor post-treatment matrix (T1, J)
  • donor_names : list of donor unit labels
  • pre_times : array of pre-treatment time values
  • post_times : array of post-treatment time values
  • times : full array of time values
  • treated_unit : echo of the treated unit label
  • treatment_time: echo of the first treatment period

Examples:

>>> prepared = sp.scdata(df, outcome='gdp', unit='state',
...     time='year', treated_unit='California', treatment_time=1989)
>>> prepared['Y_pre'].shape
(19,)
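The reshaping step can be approximated with a pandas pivot. The sketch below builds a subset of the documented keys from a toy panel; prepare_matrices is a hypothetical stand-in rather than the library function, and it assumes a balanced panel with no missing cells.

```python
import pandas as pd

def prepare_matrices(df, outcome, unit, time, treated_unit, treatment_time):
    """Split a long panel into treated / donor, pre / post arrays."""
    wide = df.pivot(index=time, columns=unit, values=outcome).sort_index()
    pre = wide.index < treatment_time          # treatment_time is the first treated period
    donors = [u for u in wide.columns if u != treated_unit]
    return {
        "Y_pre": wide.loc[pre, treated_unit].to_numpy(),
        "Y_post": wide.loc[~pre, treated_unit].to_numpy(),
        "Y_donors_pre": wide.loc[pre, donors].to_numpy(),
        "Y_donors_post": wide.loc[~pre, donors].to_numpy(),
        "donor_names": donors,
    }

# Toy balanced panel: 3 states, 5 years, made-up outcome values.
df = pd.DataFrame({
    "state": [s for s in ("CA", "NV", "OR") for _ in range(5)],
    "year": list(range(2000, 2005)) * 3,
    "gdp": [float(v) for v in range(15)],
})
m = prepare_matrices(df, "gdp", "state", "year", treated_unit="CA", treatment_time=2003)
print(m["Y_pre"].shape, m["Y_donors_pre"].shape, m["donor_names"])
```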

mc_synth

mc_synth(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any, treatment_time: Any, covariates: Optional[List[str]] = None, lambda_reg: Optional[float] = None, max_iter: int = 500, tol: float = 1e-06, cv_folds: int = 5, alpha: float = 0.05, placebo: bool = True, seed: Optional[int] = None) -> CausalResult

Matrix Completion Synthetic Control Method.

Imputes the treated unit's post-treatment counterfactual by solving a nuclear-norm-penalised matrix completion problem on the full panel, following Athey et al. (2021).

Parameters:

Name Type Description Default
data DataFrame

Long-format panel data.

required
outcome str

Outcome variable name.

required
unit str

Unit identifier column.

required
time str

Time period column.

required
treated_unit any

Identifier of the treated unit.

required
treatment_time any

First treatment period (inclusive).

required
covariates list of str

Time-varying covariates to partial out before matrix completion.

None
lambda_reg float

Nuclear norm penalty. If None (default), selected automatically via cross-validation on observed entries.

None
max_iter int

Maximum Soft-Impute iterations.

500
tol float

Convergence tolerance (relative change in Frobenius norm).

1e-6
cv_folds int

Number of CV folds for automatic lambda selection.

5
alpha float

Significance level for confidence intervals.

0.05
placebo bool

Run placebo (permutation) inference by treating each control unit as if it were treated.

True
seed int

Random seed for reproducibility.

None

Returns:

Type Description
CausalResult

With .estimate equal to the average post-treatment effect (ATT), period-level effects in detail, and full diagnostics in model_info.

Notes

The algorithm uses the Soft-Impute / Singular Value Thresholding (SVT) procedure. At each iteration the current completion is projected onto observed entries, combined with the previous imputation at missing entries, then rank-reduced by soft-thresholding the singular values.
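The iteration described above can be sketched in a few lines of NumPy. This is a generic Soft-Impute loop under simplifying assumptions (no covariates, no fixed effects, a fixed penalty instead of cross-validation); soft_impute is a hypothetical name and the panel is synthetic.

```python
import numpy as np

def soft_impute(Y, observed, lam, max_iter=500, tol=1e-6):
    """Nuclear-norm-regularised completion of Y; observed is a boolean mask."""
    Z = np.zeros_like(Y, dtype=float)
    for _ in range(max_iter):
        # Observed cells come from the data, missing cells from the current completion.
        filled = np.where(observed, Y, Z)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        Z_new = (U * np.maximum(s - lam, 0.0)) @ Vt   # soft-threshold the singular values
        change = np.linalg.norm(Z_new - Z) / max(np.linalg.norm(Z), 1e-12)
        Z = Z_new
        if change < tol:
            break
    return Z

a = np.linspace(1.0, 2.0, 8)              # unit loadings
b = np.linspace(1.0, 3.0, 10)             # time factor
L = np.outer(a, b)                        # rank-one toy panel: 8 units x 10 periods
observed = np.ones_like(L, dtype=bool)
observed[0, 7:] = False                   # treated unit's post-treatment cells
Y = np.where(observed, L, np.nan)

Z = soft_impute(Y, observed, lam=1.0)
print(np.round(Z[0, 7:], 2), np.round(L[0, 7:], 2))
```

The imputed cells play the role of the counterfactual; the penalty shrinks them slightly toward zero, which is one reason the choice of lambda_reg matters.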

Examples:

>>> import statspai as sp
>>> result = sp.mc_synth(df, outcome='gdp', unit='state', time='year',
...                      treated_unit='California',
...                      treatment_time=1989)
>>> print(result.summary())

multi_outcome_synth

multi_outcome_synth(data: DataFrame, outcomes: List[str], unit: str, time: str, treated_unit: Any, treatment_time: Any, method: str = 'concatenated', standardize: bool = True, penalization: float = 0.0, placebo: bool = True, alpha: float = 0.05) -> CausalResult

Multiple Outcomes Synthetic Control Method (Sun 2023).

Finds a single set of donor weights that simultaneously matches the treated unit across all K outcomes in the pre-treatment period.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel data containing all outcome columns.

required
outcomes list of str

Column names for the K outcome variables.

required
unit str

Unit identifier column.

required
time str

Time period column.

required
treated_unit any

Value identifying the treated unit.

required
treatment_time any

First treatment period (inclusive).

required
method ('concatenated', 'averaged')

Weight-estimation strategy.

  • 'concatenated' -- stack all K standardised outcome panels vertically and solve one quadratic programme.
  • 'averaged' -- standardise each outcome, average across K, then solve SCM on the mean series.
'concatenated'
standardize bool

Standardise each outcome to zero mean / unit variance before stacking or averaging (strongly recommended when outcome scales differ).

True
penalization float

Ridge-type penalty added to the diagonal of the donor cross-product matrix (penalization * I). Helps when donors are collinear.

0.0
placebo bool

Run in-space placebo permutations for inference (each donor is pretended to be treated in turn).

True
alpha float

Significance level for confidence intervals and joint test.

0.05

Returns:

Type Description
CausalResult

Unified result object with:

  • estimate : average treatment effect across outcomes (mean of per-outcome ATTs).
  • model_info['per_outcome_effects'] : DataFrame with columns outcome, att, se, pvalue.
  • model_info['weights'] : dict mapping donor names to shared SCM weights.
  • model_info['gap_tables'] : dict of DataFrames (one per outcome) with time-level gaps.
  • model_info['joint_pvalue'] : joint p-value across all K outcomes (Fisher combination of placebo p-values).
  • model_info['Y_synth'] : dict mapping outcome name to full synthetic series.
  • model_info['Y_treated'] : dict mapping outcome name to observed treated series.
  • model_info['times'] : sorted list of all time periods.
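The Fisher combination behind joint_pvalue can be written with the standard library alone. The sketch uses the textbook chi-square reference distribution, which assumes independent p-values; placebo p-values that share one set of weights are generally correlated, so how the library calibrates the statistic is not something this example can confirm.

```python
from math import exp, log

def fisher_combination(pvalues):
    """Fisher's method: -2 * sum(log p) is chi-square with 2K df under independence."""
    k = len(pvalues)
    half_stat = -sum(log(p) for p in pvalues)   # statistic divided by 2
    # Closed-form chi-square survival function for even degrees of freedom 2K:
    # exp(-x/2) * sum_{i=0}^{K-1} (x/2)^i / i!
    term, tail = 1.0, 0.0
    for i in range(k):
        tail += term
        term *= half_stat / (i + 1)
    return 2.0 * half_stat, exp(-half_stat) * tail

# Made-up per-outcome placebo p-values for K = 3 outcomes.
statistic, joint_pvalue = fisher_combination([0.04, 0.20, 0.01])
print(round(statistic, 3), round(joint_pvalue, 4))
```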

Examples:

>>> result = sp.multi_outcome_synth(
...     df,
...     outcomes=['gdp', 'employment', 'wages'],
...     unit='state', time='year',
...     treated_unit='California', treatment_time=1989,
... )
>>> print(result.summary())
>>> result.model_info['per_outcome_effects']
Notes

Sun (2023) shows that under a low-rank factor model the bias of the concatenated estimator shrinks as O(1/sqrt(K)), where K is the number of outcomes. The key requirement is that the outcomes share a common latent-factor structure.

qqsynth

qqsynth(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any, treatment_time: Any, n_quantiles: int = 100, placebo: bool = True, alpha: float = 0.05, seed: Optional[int] = None) -> CausalResult

Quantile Synthetic Control (alias for DiSCo with method='quantile').

Applies quantile-on-quantile regression to match quantile functions without the convexity constraints of the mixture approach.

Parameters:

Name Type Description Default
data DataFrame

Panel data in long format.

required
outcome str

Outcome variable column.

required
unit str

Unit identifier column.

required
time str

Time period column.

required
treated_unit any

Identifier of the treated unit.

required
treatment_time any

First treatment period (inclusive).

required
n_quantiles int

Number of quantile grid points.

100
placebo bool

Run placebo permutation inference.

True
alpha float

Significance level.

0.05
seed int

Random seed.

None

Returns:

Type Description
CausalResult

Examples:

>>> result = sp.qqsynth(df, outcome='gdp', unit='state', time='year',
...                     treated_unit='California', treatment_time=1989)
>>> print(result.summary())
See Also

discos : Full distributional synthetic controls with method selection.

discos_test

discos_test(result: CausalResult, test: str = 'ks') -> Dict[str, Any]

Test for distributional treatment effects.

Parameters:

Name Type Description Default
result CausalResult

Output from discos() or qqsynth().

required
test ('ks', 'cvm', 'stochastic_dominance')

  • 'ks' : two-sample Kolmogorov-Smirnov test comparing treated and counterfactual quantile functions.
  • 'cvm' : Cramér-von Mises test statistic (permutation-based).
  • 'stochastic_dominance' : first-order stochastic dominance test.

'ks'

Returns:

Type Description
dict

Keys: 'test', 'statistic', 'pvalue', 'reject', 'alpha', and test-specific fields.

Examples:

>>> result = sp.discos(df, outcome='gdp', unit='state', time='year',
...                    treated_unit='California', treatment_time=1989)
>>> sp.discos_test(result, test='ks')
{'test': 'Kolmogorov-Smirnov', 'statistic': 0.32, 'pvalue': 0.014, ...}

discos_plot

discos_plot(result: CausalResult, type: str = 'quantile_effect', ax=None, figsize: Tuple[int, int] = (10, 6), color: str = '#2C3E50', ci_alpha: float = 0.2, title: Optional[str] = None)

Visualise distributional synthetic control results.

Parameters:

Name Type Description Default
result CausalResult

Output from discos() or qqsynth().

required
type ('quantile_effect', 'quantile_comparison', 'gap', 'weights')

  • 'quantile_effect' : treatment effect Δ(τ) across quantiles with CIs.
  • 'quantile_comparison' : overlay of treated vs. counterfactual quantile functions.
  • 'gap' : gap plot (treated − synthetic) over time.
  • 'weights' : horizontal bar chart of donor weights.

'quantile_effect'
ax Axes

Pre-existing axes for the plot.

None
figsize tuple

Figure size.

(10, 6)
color str

Primary plot colour.

'#2C3E50'
ci_alpha float

Transparency for CI band.

0.2
title str

Plot title override.

None

Returns:

Type Description
(fig, ax)

Examples:

>>> result = sp.discos(df, outcome='gdp', unit='state', time='year',
...                    treated_unit='California', treatment_time=1989)
>>> sp.discos_plot(result, type='quantile_effect')
>>> sp.discos_plot(result, type='quantile_comparison')

stochastic_dominance

stochastic_dominance(result: CausalResult, order: int = 1) -> Dict[str, Any]

Test for stochastic dominance of the treated distribution over the counterfactual distribution.

Parameters:

Name Type Description Default
result CausalResult

Output from discos() or qqsynth().

required
order (1, 2)

Order of stochastic dominance. 1 = first-order (CDF dominance). 2 = second-order (integrated CDF dominance).

1

Returns:

Type Description
dict

Keys: 'order', 'dominates' (bool), 'min_gap', 'max_gap', 'fraction_positive', 'statistic', 'pvalue'.

Examples:

>>> result = sp.discos(df, outcome='gdp', unit='state', time='year',
...                    treated_unit='California', treatment_time=1989)
>>> sp.stochastic_dominance(result, order=1)
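The dominance check can be illustrated on two quantile functions evaluated on a common, evenly spaced grid. First-order dominance is equivalent to the treated quantile function lying on or above the counterfactual one everywhere; second-order dominance to the same ordering of their running integrals. The sketch fills only the descriptive keys; it does not reproduce the library's 'statistic' or 'pvalue', and dominance_summary is a hypothetical name.

```python
def dominance_summary(q_treated, q_counterfactual, order=1):
    """Compare two quantile functions evaluated on a common, evenly spaced grid."""
    gaps = [qt - qc for qt, qc in zip(q_treated, q_counterfactual)]
    if order == 2:
        # Second order: compare running integrals of the quantile functions.
        running, integrated = 0.0, []
        for g in gaps:
            running += g / len(gaps)
            integrated.append(running)
        gaps = integrated
    return {
        "order": order,
        "dominates": min(gaps) >= 0,
        "min_gap": min(gaps),
        "max_gap": max(gaps),
        "fraction_positive": sum(g > 0 for g in gaps) / len(gaps),
    }

# Made-up quantile functions on a 4-point grid.
q_treated = [2.0, 2.0, 3.0, 3.0]
q_counterfactual = [1.0, 2.5, 2.5, 4.0]

first = dominance_summary(q_treated, q_counterfactual, order=1)
second = dominance_summary(q_treated, q_counterfactual, order=2)
print(first["dominates"], second["dominates"])
```

In this toy case the quantile functions cross, so first-order dominance fails, while the running integral of the gap never drops below zero.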

bayesian_synth

bayesian_synth(data: DataFrame, outcome: str, unit: str, time: str, treated_unit, treatment_time, covariates: Optional[List[str]] = None, n_iter: int = 2000, n_warmup: int = 1000, n_chains: int = 2, dirichlet_alpha: float = 1.0, seed: Optional[int] = None, alpha: float = 0.05) -> CausalResult

Bayesian Synthetic Control Method.

Estimates the ATT by placing a Dirichlet prior on donor weights and sampling from the posterior via Metropolis-Hastings MCMC. Returns full posterior credible intervals for the treatment effect.

Parameters:

Name Type Description Default
data DataFrame

Panel data in long format with columns for unit, time, and outcome.

required
outcome str

Name of the outcome variable column.

required
unit str

Name of the unit identifier column.

required
time str

Name of the time period column.

required
treated_unit scalar

Value in unit that identifies the treated unit.

required
treatment_time scalar

First period of treatment (inclusive).

required
covariates list of str

Additional pre-treatment predictors to include in the matching objective. Covariates are appended to the pre-treatment outcome series for each unit before fitting.

None
n_iter int

Total MCMC iterations per chain (including warmup).

2000
n_warmup int

Number of warmup (burn-in) iterations for adaptation. Must be strictly less than n_iter.

1000
n_chains int

Number of independent MCMC chains. Multiple chains enable the R-hat convergence diagnostic.

2
dirichlet_alpha float

Concentration parameter for the symmetric Dirichlet prior on donor weights. dirichlet_alpha = 1 gives a uniform prior on the simplex; values < 1 encourage sparsity; values > 1 encourage more uniform weights.

1.0
seed int

Random seed for reproducibility.

None
alpha float

Significance level for credible intervals.

0.05

Returns:

Type Description
CausalResult

With .estimate equal to the posterior mean ATT averaged over all post-treatment periods, .ci giving the equal-tailed credible interval, and rich diagnostics in model_info.

Raises:

Type Description
ValueError

If the panel has fewer than 2 pre-treatment periods, no post-treatment periods, or no valid donor units.

Examples:

>>> result = sp.bayesian_synth(
...     df, outcome='gdp', unit='state', time='year',
...     treated_unit='California', treatment_time=1989,
...     n_iter=4000, n_warmup=2000, n_chains=4, seed=42,
... )
>>> print(result.summary())
Notes

The sampler uses a Dirichlet proposal on the simplex (re-normalised perturbation) with adaptive step-size tuning during warmup targeting an acceptance rate of ~0.35. Samples are thinned by a factor of 2 to reduce autocorrelation.
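The R-hat diagnostic mentioned under n_chains compares between-chain and within-chain variance. Below is a minimal sketch of the classic Gelman-Rubin formula on simulated draws; which R-hat variant the library reports (for example split or rank-normalised) is not stated here, so this is an illustration only.

```python
import random
from statistics import mean, variance

def r_hat(chains):
    """Gelman-Rubin potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])    # W: average within-chain variance
    between = n * variance(chain_means)             # B: scaled variance of chain means
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5

rng = random.Random(1)
# Four simulated chains of ATT draws that explore the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(4)]
# Same chains, but the last one is stuck around a different value.
stuck = mixed[:3] + [[x + 5.0 for x in mixed[3]]]

print(round(r_hat(mixed), 3), round(r_hat(stuck), 3))
```

Values close to 1 suggest the chains agree; clearly larger values suggest increasing n_iter or n_warmup before trusting the credible interval.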

References

Vives, J. and Martinez, A. (2024). "Bayesian Synthetic Control Methods." Journal of Computational and Graphical Statistics.

causal_impact

causal_impact(data: DataFrame, pre_period: Tuple[Any, Any], post_period: Tuple[Any, Any], outcome: Optional[str] = None, covariates: Optional[List[str]] = None, model: str = 'local_level', n_simulations: int = 1000, alpha: float = 0.05, seed: Optional[int] = None) -> CausalResult

Google CausalImpact-style causal inference for time series.

Fits a Bayesian structural time series model on the pre-intervention period and produces a counterfactual prediction for the post-period. The treatment effect is the difference between observed and counterfactual, with full posterior uncertainty.

Parameters:

Name Type Description Default
data DataFrame

Time-indexed DataFrame. If outcome is None, the first column is the outcome and all remaining columns are controls.

required
pre_period tuple of (start, end)

Pre-intervention period boundaries (inclusive). Values are matched against the DataFrame index.

required
post_period tuple of (start, end)

Post-intervention period boundaries (inclusive).

required
outcome str

Column name of the outcome variable. If None, uses the first column.

None
covariates list of str

Column names to use as controls. If None, all columns except the outcome are used.

None
model 'local_level' or 'local_linear_trend'

State-space model type. 'local_level' uses a random-walk latent state; 'local_linear_trend' adds a stochastic slope.

'local_level'
n_simulations int

Number of posterior draws for uncertainty quantification.

1000
alpha float

Significance level for confidence/credible intervals.

0.05
seed int

Random seed for reproducibility.

None

Returns:

Type Description
CausalResult

Unified result object with:

  • estimate — average treatment effect on the treated (ATT)
  • detail — per-period effects DataFrame
  • model_info — counterfactual trajectories, cumulative effects, regression coefficients, posterior draws, and diagnostics

Notes

The implementation follows Brodersen et al. (2015). Regression coefficients are estimated via ridge regression with GCV-selected penalty (an empirical-Bayes analogue of the spike-and-slab prior). Hyperparameters (observation/level/slope noise) are estimated by maximum likelihood through the Kalman filter.
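
A minimal sketch of the local-level idea, assuming fixed noise variances and no regressors. The library estimates the variances by maximum likelihood and adds a ridge-regression component; neither is reproduced here. The function local_level_filter and the data are hypothetical.

```python
def local_level_filter(y, obs_var, level_var, init_mean=0.0, init_var=1e6):
    """Kalman filter for y_t = mu_t + eps_t, mu_{t+1} = mu_t + eta_t."""
    mean, var = init_mean, init_var
    filtered = []
    for obs in y:
        gain = var / (var + obs_var)           # Kalman gain
        mean = mean + gain * (obs - mean)      # update with the observation
        var = (1.0 - gain) * var
        filtered.append(mean)
        var = var + level_var                  # predict: random-walk step
    return filtered, mean, var

pre = [10.0, 10.4, 9.8, 10.1, 10.3, 9.9]       # hypothetical pre-period outcomes
post = [12.0, 12.5, 12.2]                      # hypothetical post-period outcomes

_, last_level, _ = local_level_filter(pre, obs_var=0.1, level_var=0.01)
# A local-level model forecasts a flat line at the last filtered level.
counterfactual = [last_level] * len(post)
pointwise_effect = [obs - cf for obs, cf in zip(post, counterfactual)]
average_effect = sum(pointwise_effect) / len(pointwise_effect)
```

The per-period differences correspond to the kind of values reported in detail, and their mean to estimate; posterior uncertainty would come from simulating many such counterfactual paths.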

Examples:

>>> import pandas as pd
>>> import statspai as sp
>>> # Wide-format: columns = [outcome, control1, control2, ...]
>>> result = sp.synth.causal_impact(
...     data, pre_period=(1, 70), post_period=(71, 100)
... )
>>> print(result.summary())
References

Brodersen, K.H., Gallusser, F., Koehler, J., Remy, N. and Scott, S.L. (2015). "Inferring causal impact using Bayesian structural time series models." Annals of Applied Statistics, 9(1), 247-274. [@brodersen2015inferring]

bsts_synth

bsts_synth(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any, treatment_time: Any, covariates: Optional[List[str]] = None, model: str = 'local_level', n_simulations: int = 1000, alpha: float = 0.05, seed: Optional[int] = None) -> CausalResult

BSTS synthetic control with a panel-data interface.

Converts long-format panel data into the wide format expected by causal_impact(), using the control units' outcome series as regressors. This provides a CausalImpact-style analysis through the same panel interface as StatsPAI's other synthetic-control methods.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel data with columns for unit, time, and outcome.

required
outcome str

Outcome variable column name.

required
unit str

Unit identifier column name.

required
time str

Time period column name.

required
treated_unit any

Identifier of the treated unit.

required
treatment_time any

First treatment period (inclusive).

required
covariates list of str

Additional time-varying covariates to include alongside control unit series. Each covariate is averaged across control units per time period and appended as an extra regressor.

None
model 'local_level' or 'local_linear_trend'

State-space model type.

'local_level'
n_simulations int

Number of posterior draws.

1000
alpha float

Significance level.

0.05
seed int

Random seed.

None

Returns:

Type Description
CausalResult

Unified result object. model_info contains additional keys treated_unit, treatment_time, donor_units.

Examples:

>>> import statspai as sp
>>> result = sp.synth.bsts_synth(
...     data, outcome='gdp', unit='country', time='year',
...     treated_unit='West Germany', treatment_time=1990,
... )
>>> print(result.summary())
See Also

causal_impact : Wide-format CausalImpact interface.

penalized_synth

penalized_synth(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any, treatment_time: Any, covariates: Optional[List[str]] = None, lambda_pen: Optional[float] = None, penalty_type: str = 'pairwise', predictors: Optional[List[str]] = None, placebo: bool = True, alpha: float = 0.05) -> CausalResult

Penalized Synthetic Control estimator (Abadie & L'Hour 2021).

Parameters:

Name Type Description Default
data DataFrame

Long-format panel data with columns for unit, time, outcome, and optionally covariates / predictors.

required
outcome str

Name of the outcome column.

required
unit str

Name of the unit identifier column.

required
time str

Name of the time period column.

required
treated_unit Any

Identifier of the treated unit.

required
treatment_time Any

First treatment period (inclusive).

required
covariates list of str

Covariate columns used only for the pairwise distance penalty. When None the pre-treatment outcome values are used as the covariate vector for distance computation.

None
lambda_pen float

Penalty parameter. None (default) triggers automatic selection via rolling-origin cross-validation on pre-treatment periods.

None
penalty_type ('pairwise', 'max_dev', 'l1_pairwise')

Penalty functional form.

  • 'pairwise': sum_j w_j * ||X1 - Xj||^2 (Abadie & L'Hour original).
  • 'max_dev': max_j { w_j * ||X1 - Xj||^2 }.
  • 'l1_pairwise': sum_j w_j * ||X1 - Xj||_1.
'pairwise'
predictors list of str

Columns whose pre-treatment averages are appended to the covariate vector for distance computation.

None
placebo bool

Run in-space placebo permutation tests.

True
alpha float

Significance level for confidence intervals.

0.05

Returns:

Type Description
CausalResult

With detail set to the effects-by-period DataFrame.
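
The three penalty_type options can be compared on a toy example. The sketch below only evaluates the penalty term for fixed weights; it is not the estimator itself, and the helpers penalty, sq_l2, l1 and the data are hypothetical.

```python
def sq_l2(a, b):
    """Squared Euclidean distance ||a - b||^2."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def l1(a, b):
    """L1 distance ||a - b||_1."""
    return sum(abs(x - y) for x, y in zip(a, b))

def penalty(weights, x_treated, x_donors, penalty_type="pairwise"):
    """Evaluate the discrepancy penalty for donor weights w_j."""
    if penalty_type == "pairwise":
        return sum(w * sq_l2(x_treated, xj) for w, xj in zip(weights, x_donors))
    if penalty_type == "max_dev":
        return max(w * sq_l2(x_treated, xj) for w, xj in zip(weights, x_donors))
    if penalty_type == "l1_pairwise":
        return sum(w * l1(x_treated, xj) for w, xj in zip(weights, x_donors))
    raise ValueError(f"unknown penalty_type: {penalty_type!r}")

x1 = [1.0, 2.0]                                  # treated unit's covariate vector
donors = [[1.0, 2.0], [2.0, 4.0], [4.0, 6.0]]    # hypothetical donor vectors
w = [0.5, 0.25, 0.25]

pairwise = penalty(w, x1, donors, "pairwise")        # 7.5
max_dev = penalty(w, x1, donors, "max_dev")          # 6.25
l1_pairwise = penalty(w, x1, donors, "l1_pairwise")  # 2.5
```

In every form, weight placed on a donor far from the treated unit is charged more, which is how the penalty steers weight toward close matches.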

References

Abadie, A. and L'Hour, J. (2021). "A Penalized Synthetic Control Estimator for Disaggregated Data." Journal of the American Statistical Association, 116(536), 1817-1834. [@abadie2021penalized]

cluster_synth

cluster_synth(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any, treatment_time: Any, n_clusters: Optional[int] = None, cluster_method: str = 'kmeans', augment: bool = False, max_augment: int = 3, covariates: Optional[List[str]] = None, placebo: bool = True, alpha: float = 0.05, seed: Optional[int] = None) -> CausalResult

Cluster Synthetic Control estimator.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel data.

required
outcome str

Name of the outcome column.

required
unit str

Name of the unit-identifier column.

required
time str

Name of the time-period column.

required
treated_unit any

Identifier of the single treated unit.

required
treatment_time any

First treatment period (inclusive).

required
n_clusters int or None

Number of clusters. None selects automatically via silhouette score (k from 2 to min(J-1, 10)).

None
cluster_method ('kmeans', 'spectral', 'hierarchical')

Clustering algorithm.

'kmeans'
augment bool

If True, augment the selected cluster with the closest donors from other clusters.

False
max_augment int

Maximum number of additional donors when augment is True.

3
covariates list of str or None

Additional columns to include in the clustering feature matrix.

None
placebo bool

Run in-space placebo permutation inference.

True
alpha float

Significance level for confidence intervals.

0.05
seed int or None

Random seed for reproducibility.

None

Returns:

Type Description
CausalResult

sparse_synth

sparse_synth(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any, treatment_time: Any, mode: str = 'lasso', lambda_w: Optional[float] = None, lambda_v: Optional[float] = None, covariates: Optional[List[str]] = None, placebo: bool = True, alpha: float = 0.05) -> CausalResult

Sparse Synthetic Control estimator.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel data.

required
outcome str

Outcome variable name.

required
unit str

Unit identifier column.

required
time str

Time period column.

required
treated_unit any

Identifier of the treated unit.

required
treatment_time any

First treatment period (inclusive).

required
mode ('lasso', 'constrained_lasso', 'joint')
  • 'lasso' — L1-penalised weights, no sum-to-one constraint.
  • 'constrained_lasso' — L1 + non-negativity + sum-to-one.
  • 'joint' — Joint V and W optimisation (full SparseSC).
'lasso'
lambda_w float or None

L1 penalty on donor weights. None selects via cross-validation.

None
lambda_v float or None

L1 penalty on feature weights ('joint' mode only). None selects via cross-validation.

None
covariates list of str

Additional covariates to append to the pre-treatment outcome matrix before weight estimation.

None
placebo bool

Run in-space placebo permutation for inference.

True
alpha float

Significance level for confidence interval.

0.05

Returns:

Type Description
CausalResult

Examples:

>>> import statspai as sp
>>> result = sp.sparse_synth(
...     df, outcome='gdp', unit='state', time='year',
...     treated_unit='California', treatment_time=1989,
...     mode='lasso',
... )
>>> result.summary()

kernel_synth

kernel_synth(data: DataFrame, outcome: str, unit: str, time: str, treated_unit, treatment_time, kernel: str = 'rbf', sigma: Optional[float] = None, degree: int = 2, covariates: Optional[List[str]] = None, placebo: bool = True, alpha: float = 0.05) -> CausalResult

Kernel-based Nonlinear Synthetic Control Method.

Standard SCM assumes the counterfactual is a linear combination of donors. This estimator lifts the donor panel into a reproducing kernel Hilbert space (RKHS) and solves for synthetic control weights in that feature space, capturing nonlinear donor relationships.

Parameters:

Name Type Description Default
data DataFrame

Panel data in long format with columns for unit, time, and outcome.

required
outcome str

Name of the outcome variable.

required
unit str

Column identifying panel units.

required
time str

Column identifying time periods.

required
treated_unit

Identifier of the treated unit.

required
treatment_time

First treatment period (inclusive).

required
kernel {'rbf', 'polynomial', 'laplacian'}

Kernel function to use.

'rbf'
sigma float or None

Bandwidth for RBF / Laplacian kernels. If None, the median heuristic is used (recommended).

None
degree int

Degree for the polynomial kernel (ignored otherwise).

2
covariates list of str or None

Additional pre-treatment covariates to include in the feature vector. If provided, each donor row is [outcomes | covariates].

None
placebo bool

Whether to run in-space placebo permutation for inference.

True
alpha float

Significance level for the confidence interval.

0.05

Returns:

Type Description
CausalResult

Unified result with ATT estimate, SE, p-value, CI, and period-level effects in detail.

Notes

The optimisation solved is:

$$
\min_{w \ge 0,\; \sum_j w_j = 1} \bigl[ K(Y_1, Y_1) - 2\, w^\top k(Y_1) + w^\top K\, w \bigr]
$$

where $K_{ij} = k(Y_{0,i}, Y_{0,j})$ is the donor kernel matrix and $k(Y_1)_j = k(Y_1, Y_{0,j})$.
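
To make the objective above concrete, the sketch below evaluates it for two candidate weight vectors with an RBF kernel. It assumes the parameterisation exp(-||a - b||^2 / (2 sigma^2)) and a median heuristic based on pairwise donor distances; the library's exact conventions may differ, and all names and data are hypothetical.

```python
import math
from itertools import combinations
from statistics import median

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rbf(a, b, sigma):
    return math.exp(-dist(a, b) ** 2 / (2.0 * sigma ** 2))

def kernel_objective(w, y1, donors, sigma):
    """K(Y1, Y1) - 2 w'k(Y1) + w'Kw for simplex weights w."""
    k_vec = [rbf(y1, d, sigma) for d in donors]
    K = [[rbf(di, dj, sigma) for dj in donors] for di in donors]
    cross = sum(wj * kj for wj, kj in zip(w, k_vec))
    quad = sum(w[i] * K[i][j] * w[j]
               for i in range(len(w)) for j in range(len(w)))
    return rbf(y1, y1, sigma) - 2.0 * cross + quad

# Hypothetical pre-treatment outcome paths (one list per unit).
y1 = [1.0, 2.0, 3.0]
donors = [[1.0, 2.0, 3.0], [2.0, 3.0, 5.0], [0.0, 1.0, 1.0]]

# Assumed median heuristic: bandwidth = median pairwise donor distance.
sigma = median(dist(a, b) for a, b in combinations(donors, 2))

exact_match = kernel_objective([1.0, 0.0, 0.0], y1, donors, sigma)
uniform = kernel_objective([1 / 3, 1 / 3, 1 / 3], y1, donors, sigma)
```

Because the first donor reproduces the treated path exactly, putting all weight on it drives the objective to zero, while uniform weights leave a positive distance in feature space.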

References

Schölkopf, B. and Smola, A.J. (2002). "Learning with Kernels." MIT Press.

kernel_ridge_synth

kernel_ridge_synth(data: DataFrame, outcome: str, unit: str, time: str, treated_unit, treatment_time, kernel: str = 'rbf', sigma: Optional[float] = None, degree: int = 2, ridge_lambda: float = 0.01, covariates: Optional[List[str]] = None, placebo: bool = True, alpha: float = 0.05) -> CausalResult

Kernel Ridge Regression Synthetic Control.

Instead of constrained simplex weights, this estimator uses kernel ridge regression to learn the mapping from donors to the treated unit. The ridge penalty lambda (ridge_lambda) guards against overfitting when the number of donors is large relative to the number of pre-treatment periods.

Parameters:

Name Type Description Default
data DataFrame

Panel data in long format.

required
outcome str

Outcome variable name.

required
unit str

Unit identifier column.

required
time str

Time period column.

required
treated_unit

Identifier of the treated unit.

required
treatment_time

First treatment period (inclusive).

required
kernel {'rbf', 'polynomial', 'laplacian'}

Kernel function.

'rbf'
sigma float or None

Bandwidth (None = median heuristic).

None
degree int

Polynomial kernel degree.

2
ridge_lambda float

Regularisation parameter. Larger values shrink the coefficient vector toward zero.

0.01
covariates list of str or None

Additional pre-treatment covariates.

None
placebo bool

Run placebo permutation inference.

True
alpha float

Significance level.

0.05

Returns:

Type Description
CausalResult
Notes

The solution is:

$$
\beta = (K + \lambda I)^{-1}\, k(Y_1)
$$

and the counterfactual is $\hat{Y}_{1,\text{post}} = Y_{0,\text{post}}^\top \beta$.

No non-negativity or sum-to-one constraints are imposed, which gives the estimator more flexibility but may produce extrapolation.
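
The closed-form solution above can be traced on a toy panel. This is a sketch under assumptions: an RBF kernel with a fixed bandwidth, a small hand-written linear solver in place of a numerical library, and hypothetical data and function names.

```python
import math

def rbf(a, b, sigma=1.0):
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def kernel_ridge_weights(y1_pre, donors_pre, ridge_lambda=0.01, sigma=1.0):
    """beta = (K + lambda I)^{-1} k(Y1), with K the donor kernel matrix."""
    K = [[rbf(di, dj, sigma) for dj in donors_pre] for di in donors_pre]
    for i in range(len(K)):
        K[i][i] += ridge_lambda               # ridge term on the diagonal
    k_vec = [rbf(y1_pre, d, sigma) for d in donors_pre]
    return solve(K, k_vec)

# Hypothetical data: pre-period paths per unit and donor post-period paths.
y1_pre = [1.0, 2.0]
donors_pre = [[1.0, 2.0], [3.0, 1.0], [0.0, 4.0]]
donors_post = [[3.0, 4.0], [2.0, 2.0], [5.0, 6.0]]

beta = kernel_ridge_weights(y1_pre, donors_pre, ridge_lambda=0.01, sigma=2.0)
# Counterfactual for each post period t: sum_j beta_j * Y0[j][t].
counterfactual = [sum(b * d[t] for b, d in zip(beta, donors_post))
                  for t in range(len(donors_post[0]))]
```

The coefficients are unconstrained, so they need not be non-negative or sum to one; increasing ridge_lambda shrinks them toward zero.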

synth_compare

synth_compare(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any = None, treatment_time: Any = None, methods: Optional[List[str]] = None, placebo: bool = True, alpha: float = 0.05, **kwargs) -> SynthComparison

Run multiple SCM variants and compare them side by side.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel data.

required
outcome str

Outcome variable column name.

required
unit str

Unit identifier column.

required
time str

Time period column.

required
treated_unit any

Identifier of the treated unit.

None
treatment_time any

First treatment period (inclusive).

None
methods list of str

SCM variants to compare. If None (default), all 20 registered methods are attempted, in ascending complexity order: classic, penalized, demeaned, detrended, unconstrained, elastic_net, augmented, sdid, gsynth, mc, discos, scpi, penscm, fdid, sparse, cluster, kernel, kernel_ridge, bayesian, bsts. Pass an explicit subset to reduce runtime.

None
placebo bool

Whether to run placebo inference for each method.

True
alpha float

Significance level for confidence intervals.

0.05
**kwargs

Additional keyword arguments forwarded to synth().

{}

Returns:

Type Description
SynthComparison

Structured comparison object with .results, .comparison_table, .recommended, and .plot().

Examples:

>>> comp = sp.synth_compare(
...     df, outcome='gdp', unit='state', time='year',
...     treated_unit='California', treatment_time=1989,
... )
>>> print(comp.summary())
>>> comp.recommended
'demeaned'
>>> comp.plot()

Compare a subset of methods:

>>> comp = sp.synth_compare(
...     df, outcome='gdp', unit='state', time='year',
...     treated_unit='California', treatment_time=1989,
...     methods=['classic', 'augmented', 'sdid', 'mc'],
... )
See Also

synth_recommend : Quick one-liner returning only the method name.
synth : Unified SCM dispatcher.

synth_recommend

synth_recommend(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any = None, treatment_time: Any = None, **kwargs) -> str

Quickly recommend the best SCM method for the given data.

Runs synth_compare internally with placebo=False for speed, then returns just the recommended method name.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel data.

required
outcome str

Outcome variable column name.

required
unit str

Unit identifier column.

required
time str

Time period column.

required
treated_unit any

Identifier of the treated unit.

None
treatment_time any

First treatment period (inclusive).

None
**kwargs

Additional keyword arguments forwarded to synth_compare().

{}

Returns:

Type Description
str

Name of the recommended SCM method (e.g., 'classic', 'augmented', 'sdid').

Examples:

>>> best = sp.synth_recommend(
...     df, outcome='gdp', unit='state', time='year',
...     treated_unit='California', treatment_time=1989,
... )
>>> best
'demeaned'

Then use it:

>>> result = sp.synth(
...     df, outcome='gdp', unit='state', time='year',
...     treated_unit='California', treatment_time=1989,
...     method=best,
... )
See Also

synth_compare : Full comparison with all metrics and plots.

synth_power

synth_power(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any, treatment_time: Any, effect_sizes: Optional[Sequence[float]] = None, n_simulations: int = 200, alpha: float = 0.05, seed: Optional[int] = None) -> DataFrame

Power analysis for Synthetic Control designs.

Estimates statistical power across a grid of hypothetical effect sizes using placebo-based inference. Identifies the Minimum Detectable Effect (MDE) — the smallest effect where power >= 0.80.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel data.

required
outcome str

Outcome variable name.

required
unit str

Unit identifier column.

required
time str

Time period column.

required
treated_unit any

Identifier of the treated unit.

required
treatment_time any

First treatment period (inclusive).

required
effect_sizes array-like of float

Grid of hypothetical additive effect sizes to evaluate. If None, auto-generates 10 steps from 0 to 3 * pre-treatment SD of the outcome.

None
n_simulations int

Number of Monte-Carlo simulations per effect size.

200
alpha float

Significance level for the placebo test.

0.05
seed int

Random seed for reproducibility.

None

Returns:

Type Description
DataFrame

Columns: effect_size, power, n_rejections, n_simulations, mde_flag.

The mde_flag column is True for the row corresponding to the Minimum Detectable Effect (first row with power >= 0.80).

Notes

The null distribution is the set of RMSPE ratios from in-space placebos (computed once on the original data). For each effect size, the simulation adds delta to the treated unit's post-treatment outcomes and re-computes the RMSPE ratio. A small noise perturbation (10 % of pre-treatment residual SD) is added so that each simulation draw is unique.

This is a novel diagnostic — no existing SCM package provides an equivalent power-planning tool.
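
The logic of the power loop can be sketched without fitting any SCM. The example below assumes a rank-based placebo p-value of the form (n_extreme + 1) / (n_placebos + 1) and replaces the re-fit step with a simple stand-in (effect plus noise, divided by the pre-RMSPE). All names and numbers are hypothetical.

```python
import random

def placebo_pvalue(treated_ratio, placebo_ratios):
    """Share of units (treated included) with a ratio at least as large."""
    n_extreme = sum(r >= treated_ratio for r in placebo_ratios)
    return (n_extreme + 1) / (len(placebo_ratios) + 1)

def power_curve(effect_sizes, placebo_ratios, pre_rmspe, noise_sd,
                n_simulations=200, alpha=0.05, seed=None):
    rng = random.Random(seed)
    rows, mde_found = [], False
    for delta in effect_sizes:
        rejections = 0
        for _ in range(n_simulations):
            # Stand-in for re-fitting: the post-period gap is delta plus noise.
            gap = delta + rng.gauss(0.0, noise_sd)
            ratio = abs(gap) / pre_rmspe
            rejections += placebo_pvalue(ratio, placebo_ratios) <= alpha
        power = rejections / n_simulations
        is_mde = (not mde_found) and power >= 0.80   # first row reaching 0.80
        mde_found = mde_found or is_mde
        rows.append({"effect_size": delta, "power": power, "mde_flag": is_mde})
    return rows

# Hypothetical inputs: 38 placebo ratios and a treated pre-RMSPE of 1.0.
null_rng = random.Random(0)
placebo_ratios = [abs(null_rng.gauss(1.0, 0.5)) for _ in range(38)]
grid = [i * 0.5 for i in range(10)]            # 10 effect sizes from 0 to 4.5
rows = power_curve(grid, placebo_ratios, pre_rmspe=1.0, noise_sd=0.1, seed=42)
mde = next((r["effect_size"] for r in rows if r["mde_flag"]), float("inf"))
```

Note that with 38 placebos the smallest attainable p-value is 1/39, so a test at alpha = 0.05 rejects only when the treated ratio exceeds every placebo ratio; with fewer than 19 placebos it could never reject.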

Examples:

>>> import statspai as sp
>>> power_df = sp.synth_power(
...     df, outcome='gdp', unit='state', time='year',
...     treated_unit='California', treatment_time=1989,
...     n_simulations=500, seed=42,
... )
>>> power_df
   effect_size  power  n_rejections  n_simulations  mde_flag
0     0.000000   0.04            20            500     False
1     1.234567   0.23           115            500     False
...
8     9.876543   0.82           410            500      True
9    11.111111   0.95           475            500     False
>>> mde_row = power_df[power_df['mde_flag']]
>>> print(f"MDE = {mde_row['effect_size'].values[0]:.2f}")
See Also

synth_mde : Quick MDE extraction.
synth_power_plot : Visualise the power curve.

synth_mde

synth_mde(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any, treatment_time: Any, power_target: float = 0.8, alpha: float = 0.05, n_simulations: int = 200, seed: Optional[int] = None) -> float

Minimum Detectable Effect for a Synthetic Control design.

Convenience wrapper around synth_power() that returns only the MDE (the smallest effect size achieving the target power).

Parameters:

Name Type Description Default
data DataFrame

Long-format panel data.

required
outcome str

Outcome variable name.

required
unit str

Unit identifier column.

required
time str

Time period column.

required
treated_unit any

Identifier of the treated unit.

required
treatment_time any

First treatment period.

required
power_target float

Desired power level.

0.80
alpha float

Significance level for the placebo test.

0.05
n_simulations int

Number of simulations per effect size.

200
seed int

Random seed for reproducibility.

None

Returns:

Type Description
float

Minimum detectable effect size. Returns np.inf if no effect size in the default grid achieves the target power.

Examples:

>>> import statspai as sp
>>> mde = sp.synth_mde(
...     df, outcome='gdp', unit='state', time='year',
...     treated_unit='California', treatment_time=1989,
...     seed=42,
... )
>>> print(f"MDE at 80% power: {mde:.2f}")
See Also

synth_power : Full power curve with details.

synth_power_plot

synth_power_plot(power_result: DataFrame, ax: Any = None, figsize: tuple = (9, 6), title: Optional[str] = None) -> Any

Plot the power curve from synth_power().

Displays power (y-axis) against effect size (x-axis) with reference lines at power = 0.80 and the MDE.

Parameters:

Name Type Description Default
power_result DataFrame

Output of synth_power(). Must contain columns effect_size, power, and mde_flag.

required
ax Axes

Axes to plot on. If None, a new figure is created.

None
figsize tuple

Figure size (width, height) in inches.

(9, 6)
title str

Custom plot title. Defaults to "SCM Power Curve — Minimum Detectable Effect".

None

Returns:

Type Description
Axes

Examples:

>>> import statspai as sp
>>> power_df = sp.synth_power(
...     df, outcome='gdp', unit='state', time='year',
...     treated_unit='California', treatment_time=1989, seed=42,
... )
>>> sp.synth_power_plot(power_df)
See Also

synth_power : Compute the power curve.

synth_report

synth_report(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any = None, treatment_time: Any = None, method: str = 'classic', output: str = 'text', sensitivity: bool = True, alpha: float = 0.05, **kwargs) -> str

Generate a comprehensive Synthetic Control analysis report.

Runs synth() for the main estimation and optionally synth_sensitivity() for robustness diagnostics, then formats everything into a structured report.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel data.

required
outcome str

Outcome variable name.

required
unit str

Unit identifier column.

required
time str

Time period column.

required
treated_unit any

Identifier of the treated unit.

None
treatment_time any

First treatment period (inclusive).

None
method str

SCM variant passed to synth().

'classic'
output str

Output format: 'text', 'markdown', or 'latex'.

'text'
sensitivity bool

Whether to include the sensitivity analysis section.

True
alpha float

Significance level for CIs and hypothesis tests.

0.05
**kwargs

Additional keyword arguments forwarded to synth().

{}

Returns:

Type Description
str

Formatted analysis report.

Examples:

>>> import statspai as sp
>>> report = sp.synth_report(
...     df, outcome='cigsale', unit='state', time='year',
...     treated_unit='California', treatment_time=1989,
... )
>>> print(report)
>>> md = sp.synth_report(
...     df, outcome='cigsale', unit='state', time='year',
...     treated_unit='California', treatment_time=1989,
...     output='markdown',
... )

synth_report_to_file

synth_report_to_file(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any = None, treatment_time: Any = None, method: str = 'classic', output: str = 'markdown', sensitivity: bool = True, alpha: float = 0.05, filename: str = 'report.md', **kwargs) -> str

Generate an SCM report and write it directly to a file.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel data.

required
outcome str

Outcome variable name.

required
unit str

Unit identifier column.

required
time str

Time period column.

required
treated_unit any

Identifier of the treated unit.

None
treatment_time any

First treatment period (inclusive).

None
method str

SCM variant passed to synth().

'classic'
output str

Output format: 'text', 'markdown', or 'latex'.

'markdown'
sensitivity bool

Whether to include the sensitivity analysis section.

True
alpha float

Significance level.

0.05
filename str

Output file path.

'report.md'
**kwargs

Additional keyword arguments forwarded to synth().

{}

Returns:

Type Description
str

The generated report string (also written to filename).

Examples:

>>> import statspai as sp
>>> sp.synth_report_to_file(
...     df, outcome='cigsale', unit='state', time='year',
...     treated_unit='California', treatment_time=1989,
...     filename='california_scm.md',
... )

synth_to_latex

synth_to_latex(obj: Union[CausalResult, 'SynthComparison', List[CausalResult]], *, caption: Optional[str] = None, label: Optional[str] = None, booktabs: bool = True, show_ci: bool = True, show_weights: bool = False, top_n_weights: int = 5, digits: int = 4, method_names: Optional[Sequence[str]] = None) -> str

Formatted LaTeX table for synthetic-control results.

Single-result mode produces a vertical table with ATT, SE, confidence interval, pre-RMSPE, fit quality, and (optionally) the top-N donor weights. Comparison mode (SynthComparison or list of results) produces a wide table with one column per method, the standard textbook layout for empirical applied work.

Parameters:

Name Type Description Default
obj CausalResult, SynthComparison, or list of CausalResult

Object to render. SynthComparison and lists trigger the side-by-side multi-method layout.

required
caption str

Table caption. Defaults to a sensible auto-generated string.

None
label str

LaTeX label for cross-referencing. Defaults to "tab:synth" (single) or "tab:synth_compare" (multi).

None
booktabs bool

If True, use \toprule / \midrule / \bottomrule (requires \usepackage{booktabs}). Falls back to \hline if False.

True
show_ci bool

Include the confidence-interval row.

True
show_weights bool

Append a panel listing the top-N donor weights.

False
top_n_weights int

How many donors to show per method when show_weights=True.

5
digits int

Number of decimal places.

4
method_names list of str

Override column labels in comparison mode.

None

Returns:

Type Description
str

LaTeX source ready to drop into a paper. Stars use the standard * p<0.1, ** p<0.05, *** p<0.01 convention.

Examples:

>>> result = sp.synth(df, ..., method='augmented')
>>> print(sp.synth_to_latex(result, show_weights=True))

Multi-method comparison:

>>> comp = sp.synth_compare(df, ..., methods=['classic', 'sdid', 'mc'])
>>> print(sp.synth_to_latex(comp, caption='SCM benchmark'))
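
The star convention stated in the Returns section can be written as a small mapping. This is an illustrative sketch; significance_stars and format_estimate are hypothetical helpers, not functions exported by the library.

```python
def significance_stars(pvalue):
    """Map a p-value to stars: * p<0.1, ** p<0.05, *** p<0.01."""
    if pvalue < 0.01:
        return "***"
    if pvalue < 0.05:
        return "**"
    if pvalue < 0.1:
        return "*"
    return ""

def format_estimate(att, pvalue, digits=4):
    """Render an estimate with `digits` decimals followed by its stars."""
    return f"{att:.{digits}f}{significance_stars(pvalue)}"

cell = format_estimate(-19.5137, 0.026)   # hypothetical ATT and p-value
```

Here cell is "-19.5137**", since 0.026 falls below 0.05 but not below 0.01.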

synth_to_markdown

synth_to_markdown(obj: Union[CausalResult, 'SynthComparison', List[CausalResult]], *, title: Optional[str] = None, show_ci: bool = True, show_weights: bool = False, top_n_weights: int = 5, digits: int = 4, method_names: Optional[Sequence[str]] = None) -> str

GitHub-flavoured Markdown table for synthetic-control results.

Mirrors synth_to_latex() in scope but emits a pipe-delimited Markdown table that renders cleanly on GitHub, in pandoc, and in most static-site generators.

Parameters:

Name Type Description Default
obj CausalResult, SynthComparison, or list of CausalResult

Object to render. See synth_to_latex().

required
title str

Optional table title.

None
show_ci bool

See synth_to_latex().

True
show_weights bool

See synth_to_latex().

False
top_n_weights int

See synth_to_latex().

5
digits int

See synth_to_latex().

4
method_names list of str

See synth_to_latex().

None

Returns:

Type Description
str

Markdown source.

synth_to_excel

synth_to_excel(obj: Union[CausalResult, 'SynthComparison', List[CausalResult]], path: str, *, method_names: Optional[Sequence[str]] = None, digits: int = 6) -> str

Multi-sheet Excel workbook for synthetic-control results.

Sheets
  • "Summary" — one row per method (ATT, SE, CI, pre-RMSPE, fit quality, donor counts).
  • "Weights" — donor weights per method (one column per method; missing donors are NaN).
  • "Gap_<method>" — per-period treated / synthetic / gap for each method.
  • "Diagnostics" — scalar diagnostics (pre-RMSPE, post/pre RMSPE ratio, fit quality, n_donors, etc.).

Requires openpyxl (a soft dependency of pandas Excel I/O). Raises ModuleNotFoundError with an actionable hint if it is not installed.

Parameters:

Name Type Description Default
obj CausalResult, SynthComparison, or list of CausalResult

Object to export.

required
path str

Destination .xlsx file path.

required
method_names list of str

Override sheet / column labels.

None
digits int

Rounding for floating-point values.

6

Returns:

Type Description
str

Absolute path of the file that was written.

synth_loo

synth_loo(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any, treatment_time: Any, penalization: float = 0.0, alpha: float = 0.05) -> DataFrame

Leave-one-out donor sensitivity for Synthetic Control.

Re-fits SCM dropping each donor in turn. Identifies influential donors whose removal shifts the ATT substantially.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel.

required
outcome str

Outcome variable.

required
unit str

Unit identifier column.

required
time str

Time column.

required
treated_unit any

Identifier of the treated unit.

required
treatment_time any

First treatment period.

required
penalization float

Ridge penalty forwarded to SCM.

0.0
alpha float

Significance level for z-based p-values.

0.05

Returns:

Type Description
DataFrame

Columns: dropped_unit, att, se, pvalue, pre_rmse.

Examples:

>>> loo = sp.synth_loo(df, outcome='gdp', unit='state', time='year',
...                    treated_unit='California', treatment_time=1989)
>>> loo.sort_values('att')

synth_time_placebo

synth_time_placebo(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any, treatment_time: Any, penalization: float = 0.0, n_placebo_times: Optional[int] = None, alpha: float = 0.05) -> DataFrame

Time-placebo ("backdating") test for Synthetic Control.

Re-fits SCM using fake treatment times drawn from the pre-treatment period. If the method finds large "effects" where none should exist, the original estimate is suspect.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel.

required
outcome str

Outcome variable.

required
unit str

Unit identifier column.

required
time str

Time column.

required
treated_unit any

Identifier of the treated unit.

required
treatment_time any

Real first treatment period.

required
penalization float

Ridge penalty forwarded to SCM.

0.0
n_placebo_times int

Max number of placebo treatment times to try. Default is all feasible pre-treatment times (leaving >= 2 pre-periods for each placebo fit).

None
alpha float

Significance level.

0.05

Returns:

Type Description
DataFrame

Columns: placebo_time, att, se, pvalue.

Examples:

>>> tp = sp.synth_time_placebo(df, outcome='gdp', unit='state',
...     time='year', treated_unit='California', treatment_time=1989)
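
Which fake treatment dates are feasible follows from the rule that each placebo fit keeps at least two pre-periods. The sketch below enumerates them; feasible_placebo_times is a hypothetical helper, and keeping the latest dates when n_placebo_times is set is an assumption about the selection rule.

```python
def feasible_placebo_times(times, treatment_time, n_placebo_times=None, min_pre=2):
    """Pre-treatment periods usable as fake treatment dates."""
    pre = sorted(t for t in times if t < treatment_time)
    # A placebo date at index i leaves i earlier periods for fitting.
    candidates = pre[min_pre:]
    if n_placebo_times is not None:
        # Assumption: keep the dates closest to the real treatment time.
        start = max(len(candidates) - n_placebo_times, 0)
        candidates = candidates[start:]
    return candidates

years = list(range(1980, 2001))
placebo_years = feasible_placebo_times(years, treatment_time=1989)
last_three = feasible_placebo_times(years, treatment_time=1989, n_placebo_times=3)
```

For a panel starting in 1980 with treatment in 1989, the feasible placebo years are 1982 through 1988; the earliest one leaves 1980 and 1981 as its pre-period.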

synth_donor_sensitivity

synth_donor_sensitivity(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any, treatment_time: Any, k: Optional[int] = None, n_samples: int = 100, penalization: float = 0.0, seed: Optional[int] = None) -> DataFrame

Donor-pool bootstrap sensitivity for Synthetic Control.

Draws n_samples random subsets of size k from the donor pool and re-fits SCM for each, producing a distribution of ATT estimates.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel.

required
outcome str

Outcome variable.

required
unit str

Unit identifier column.

required
time str

Time column.

required
treated_unit any

Identifier of the treated unit.

required
treatment_time any

First treatment period.

required
k int

Donor subset size. Default is floor(J * 0.75) where J is the total number of donors.

None
n_samples int

Number of random donor subsets to draw.

100
penalization float

Ridge penalty forwarded to SCM.

0.0
seed int

Random seed for reproducibility.

None

Returns:

Type Description
DataFrame

Columns: iteration, donors_used, att, pre_rmse.

Examples:

>>> ds = sp.synth_donor_sensitivity(df, outcome='gdp', unit='state',
...     time='year', treated_unit='California', treatment_time=1989,
...     n_samples=200, seed=42)
>>> ds['att'].describe()
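
The subset-drawing step can be sketched on its own, assuming sampling without replacement within each subset and the documented default k = floor(J * 0.75). The function name and donor list are hypothetical; each subset would then be passed to an SCM fit, which is omitted here.

```python
import math
import random

def draw_donor_subsets(donors, k=None, n_samples=100, seed=None):
    """Draw random donor subsets of size k (default floor(0.75 * J))."""
    J = len(donors)
    if k is None:
        k = math.floor(J * 0.75)
    rng = random.Random(seed)
    # Each subset is sampled without replacement from the full donor pool.
    return [sorted(rng.sample(donors, k)) for _ in range(n_samples)]

donors = ["Utah", "Nevada", "Texas", "Ohio", "Maine", "Idaho", "Iowa", "Kansas"]
subsets = draw_donor_subsets(donors, n_samples=5, seed=42)
```

With eight donors the default subset size is six, and fixing seed makes the draws reproducible across runs.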

synth_rmspe_filter

synth_rmspe_filter(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any, treatment_time: Any, thresholds: Optional[List[float]] = None, penalization: float = 0.0) -> DataFrame

Pre-RMSPE-filtered p-value robustness (Abadie et al. 2010).

Runs placebo SCM on every donor unit, computes each unit's pre-treatment RMSPE, then re-calculates the rank-based p-value after dropping placebos whose pre-RMSPE exceeds a multiple of the treated unit's pre-RMSPE.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel.

required
outcome str

Outcome variable.

required
unit str

Unit identifier column.

required
time str

Time column.

required
treated_unit any

Identifier of the treated unit.

required
treatment_time any

First treatment period.

required
thresholds list of float

Multiples of treated-unit pre-RMSPE used as cut-offs. Default [1, 2, 5, 10, 20, np.inf].

None
penalization float

Ridge penalty.

0.0

Returns:

Type Description
DataFrame

Columns: threshold, n_placebos, pvalue, treated_pre_rmspe.

Examples:

>>> rp = sp.synth_rmspe_filter(df, outcome='gdp', unit='state',
...     time='year', treated_unit='California', treatment_time=1989)
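The filtering logic can be illustrated with a minimal sketch. It assumes the test statistic is the post/pre RMSPE ratio and that the p-value counts the treated unit in both numerator and denominator. Both choices are assumptions made for illustration, and `rmspe_filtered_pvalues` is a hypothetical name.

```python
import math


def rmspe_filtered_pvalues(treated_pre, treated_ratio, placebos,
                           thresholds=(1, 2, 5, 10, 20, math.inf)):
    """Hypothetical sketch of pre-RMSPE-filtered rank p-values.

    placebos: list of (pre_rmspe, post_pre_ratio) pairs, one per donor.
    treated_pre is assumed to be strictly positive.
    """
    rows = []
    for c in thresholds:
        # Keep placebos whose pre-period fit is not too poor
        # relative to the treated unit's fit.
        kept = [ratio for pre, ratio in placebos if pre <= c * treated_pre]
        # Rank-based p-value: share of units (treated included) with a
        # ratio at least as large as the treated unit's ratio.
        n_extreme = sum(ratio >= treated_ratio for ratio in kept)
        rows.append({
            "threshold": c,
            "n_placebos": len(kept),
            "pvalue": (1 + n_extreme) / (1 + len(kept)),
        })
    return rows


placebos = [(1.0, 2.0), (1.5, 12.0), (4.0, 3.0), (30.0, 1.0)]
rows = rmspe_filtered_pvalues(treated_pre=1.0, treated_ratio=10.0, placebos=placebos)
for row in rows:
    print(row)
```

A p-value that stays roughly stable across thresholds suggests the inference is not driven by poorly fitted placebos.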

synth_sensitivity

synth_sensitivity(data: DataFrame, outcome: str, unit: str, time: str, treated_unit: Any, treatment_time: Any, penalization: float = 0.0, n_donor_samples: int = 100, seed: Optional[int] = None, alpha: float = 0.05) -> Dict[str, Any]

Run all SCM sensitivity diagnostics in a single call.

Combines leave-one-out, time placebos, donor pool bootstrap, and pre-RMSPE filtering into one bundled report.

Parameters:

Name Type Description Default
data DataFrame

Long-format panel.

required
outcome str

Outcome variable.

required
unit str

Unit identifier column.

required
time str

Time column.

required
treated_unit any

Identifier of the treated unit.

required
treatment_time any

First treatment period.

required
penalization float

Ridge penalty.

0.0
n_donor_samples int

Number of random donor subsets for donor sensitivity.

100
seed int

Random seed.

None
alpha float

Significance level.

0.05

Returns:

Type Description
dict

Keys:

  • 'loo' — leave-one-out DataFrame
  • 'time_placebo' — time placebo DataFrame
  • 'donor_sensitivity' — donor bootstrap DataFrame
  • 'rmspe_filter' — RMSPE-filtered p-values DataFrame
  • 'summary' — formatted string summary

Examples:

>>> sens = sp.synth_sensitivity(df, outcome='gdp', unit='state',
...     time='year', treated_unit='California', treatment_time=1989,
...     n_donor_samples=200, seed=42)
>>> print(sens['summary'])
>>> sens['loo']

synth_sensitivity_plot

synth_sensitivity_plot(sensitivity_result: Dict[str, Any], figsize: Tuple[float, float] = (14, 10), title: Optional[str] = None) -> Any

Multi-panel sensitivity diagnostic plot.

Parameters:

Name Type Description Default
sensitivity_result dict

Output from synth_sensitivity().

required
figsize tuple

Figure size in inches.

(14, 10)
title str

Super-title for the figure.

None

Returns:

Type Description
Figure

Examples:

>>> sens = sp.synth_sensitivity(df, outcome='gdp', unit='state',
...     time='year', treated_unit='California', treatment_time=1989)
>>> fig = sp.synth_sensitivity_plot(sens)
>>> fig.savefig('synth_sensitivity.png', dpi=150)

synth_survival

synth_survival(data: DataFrame, unit: str, time: str, survival: str, treated: str, treat_time: float, alpha: float = 0.05, n_placebos: int = 100, seed: int = 0) -> SyntheticSurvivalResult

Synthetic Survival Control estimator.

Parameters:

Name Type Description Default
data DataFrame

Long panel: one row per (unit, time) with a precomputed Kaplan-Meier survival probability in column survival. Every unit must share the same time grid; ragged grids are not accepted, so pad by forward/back-fill before calling.

required
unit str

Unit (panel-id) column.

required
time str

Time grid column.

required
survival str

Column containing the survival probability S_i(t), a value in (0, 1).

required
treated str

Column containing the name of the single treated unit. Accepts either a boolean column or a dedicated string/int identifier.

required
treat_time float

Time at which treatment starts (times >= treat_time are the post-treatment window).

required
alpha float

Significance level for the placebo-based uniform confidence band (coverage 1 - alpha).

0.05
n_placebos int

Number of placebo permutations used to bootstrap the uniform band.

100
seed int
0

Returns:

Type Description
SyntheticSurvivalResult

Fitted counterfactual survival curve, gap trajectory, donor weights, and a placebo-based uniform confidence band.

Examples:

>>> import statspai as sp
>>> r = sp.synth_survival(
...     df, unit="trial_arm", time="month",
...     survival="km_est", treated="treated_arm", treat_time=6,
... )
>>> r.summary()
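Because ragged time grids are not accepted, it can help to check the panel before calling. The sketch below flags units that are missing any time point observed elsewhere in the panel. `ragged_units` is a hypothetical helper that works on plain (unit, time) pairs and does not reflect how StatsPAI validates its input.

```python
from collections import defaultdict


def ragged_units(rows):
    """Hypothetical check: units whose time grid is incomplete.

    rows: iterable of (unit, time) pairs taken from the long panel.
    """
    grids = defaultdict(set)
    for unit, time in rows:
        grids[unit].add(time)
    # The reference grid is the union of all observed time points.
    full_grid = set().union(*grids.values())
    return sorted(u for u, grid in grids.items() if grid != full_grid)


panel = [("A", 0), ("A", 1), ("A", 2),
         ("B", 0), ("B", 1), ("B", 2),
         ("C", 0), ("C", 2)]  # unit C is missing time 1
print(ragged_units(panel))
```

With a pandas DataFrame, the pairs could be obtained from `df[[unit, time]].itertuples(index=False)`; any unit reported here needs padding before the estimator is called.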

synth_experimental_design

synth_experimental_design(data: DataFrame, *, unit: str, time: str, outcome: str, k: int, candidates: Optional[Sequence[Any]] = None, donors: Optional[Sequence[Any]] = None, pre_period: Optional[Tuple[Any, Any]] = None, risk: str = 'mspe', concentration_weight: float = 0.0, penalization: float = 0.0, n_random: int = 500, random_state: Optional[int] = None) -> SynthExperimentalDesignResult

Select the k units to treat so as to minimize the expected variance of the post-treatment SC ATT estimate.

Parameters:

Name Type Description Default
data DataFrame (long format)

Must contain columns [unit, time, outcome].

required
unit str

Unit identifier column.

required
time str

Time column.

required
outcome str

Outcome column.

required
k int

Number of units to select for treatment. Must satisfy 1 <= k <= len(candidates) - 1.

required
candidates sequence

Units eligible for treatment. Defaults to all units.

None
donors sequence

Units available as donors. Defaults to "all units NOT in candidates"; if candidates covers all units we fall back to a leave-one-out protocol where each candidate's donor pool is every other unit.

None
pre_period (start, end)

Closed interval of pre-treatment periods. Defaults to all timestamps in data.

None
risk ('mspe', 'rmse')

Loss functional for ranking candidates.

'mspe'
concentration_weight float

Penalty on donor-weight concentration (Herfindahl): risk_score = loss + lambda * H(w) where H(w) = sum(w_j^2). Abadie-Zhao show that for a fixed pre-MSPE, less-concentrated donors give tighter post-period confidence intervals.

0.0
penalization float

Ridge penalty passed to the simplex solver (Doudchenko & Imbens 2016 style).

0.0
n_random int

Monte-Carlo draws used to estimate baseline_variance (the expected sum-MSPE under random-k selection).

500
random_state int
None

Returns:

Type Description
SynthExperimentalDesignResult
Notes

The practical recipe (Abadie-Zhao 2025/2026, Section 4) is:

  1. For each candidate unit i, solve the simplex SC problem against the donor pool restricted to non-candidates (to avoid coupling risk scores across candidates).
  2. Record the pre-period MSPE as the plug-in estimate of sigma^2_i.
  3. Pick the k candidates with the smallest risk_score: loss_i + lambda * H(w_i).

The implementation degrades gracefully when candidates covers all units: we then use per-candidate leave-one-out donor pools.
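Step 3 of the recipe can be sketched as follows, taking the per-candidate pre-period loss and donor weights from steps 1 and 2 as given. `herfindahl` and `select_treated` are hypothetical names, and the input values are made up for illustration.

```python
def herfindahl(weights):
    """Concentration of donor weights: H(w) = sum(w_j^2)."""
    return sum(w * w for w in weights)


def select_treated(fits, k, concentration_weight=0.0):
    """Hypothetical sketch: pick the k candidates with the smallest risk score.

    fits: dict mapping candidate -> (pre_period_loss, donor_weights).
    """
    scores = {
        unit: loss + concentration_weight * herfindahl(weights)
        for unit, (loss, weights) in fits.items()
    }
    ranked = sorted(scores, key=scores.get)  # ascending risk score
    return ranked[:k], scores


# Illustrative inputs: B fits best but relies on a single donor.
fits = {
    "A": (0.10, [0.5, 0.5]),
    "B": (0.08, [1.0, 0.0]),
    "C": (0.30, [0.25, 0.25, 0.25, 0.25]),
}
print(select_treated(fits, k=1))                            # loss only
print(select_treated(fits, k=1, concentration_weight=0.1))  # with penalty
```

With concentration_weight=0.0 the ranking depends on the loss alone and B is chosen. A positive penalty can favor a candidate such as A, whose weights are spread over more donors.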

Examples:

>>> import statspai as sp
>>> df = sp.utils.dgp_synth(n_units=40, n_periods=20, seed=0)
>>> res = sp.synth_experimental_design(
...     df, unit='unit', time='time', outcome='y',
...     k=5, pre_period=(0, 19), random_state=0,
... )
>>> res.selected
[12, 7, 23, 4, 30]
>>> print(res.summary())

synthdid_estimate

synthdid_estimate(data, y, unit, time, treat_unit, treat_time, **kw)

R-style alias: synthdid::synthdid_estimate.

sc_estimate

sc_estimate(data, y, unit, time, treat_unit, treat_time, **kw)

R-style alias: synthdid::sc_estimate.

did_estimate

did_estimate(data, y, unit, time, treat_unit, treat_time, **kw)

R-style alias: synthdid::did_estimate.

synthdid_placebo

synthdid_placebo(data: DataFrame, y: str, unit: str, time: str, treat_unit: Any, treat_time: Any, method: Literal['sdid', 'sc', 'did'] = 'sdid', **kw) -> DataFrame

Run placebo estimates assigning treatment to each control unit.

Replicates synthdid::synthdid_placebo.

Accepts the same arguments as sdid(), plus any extra keyword arguments.

Returns:

Type Description
DataFrame

One row per control unit with columns: unit, estimate, se, pvalue.

synthdid_plot

synthdid_plot(result: CausalResult, ax=None, figsize: tuple = (10, 6), treated_color: str = '#2C3E50', synth_color: str = '#E74C3C', ci_alpha: float = 0.15, title: Optional[str] = None)

Plot observed vs synthetic trajectory.

Replicates synthdid::plot.synthdid_estimate.

Parameters:

Name Type Description Default
result CausalResult

Output of sdid().

required
ax matplotlib Axes
None
figsize tuple
(10, 6)
treated_color str
'#2C3E50'
synth_color str
'#E74C3C'
ci_alpha float
0.15
title str
None

Returns:

Type Description
(fig, ax)

synthdid_units_plot

synthdid_units_plot(result: CausalResult, top_n: int = 10, ax=None, figsize: tuple = (8, 5))

Horizontal bar chart of unit weight contributions.

Replicates synthdid::synthdid_units_plot.

Parameters:

Name Type Description Default
result CausalResult
required
top_n int

Show the top-N donors by weight.

10
ax matplotlib Axes
None
figsize tuple
(8, 5)

Returns:

Type Description
(fig, ax)

synthdid_rmse_plot

synthdid_rmse_plot(result: CausalResult, ax=None, figsize: tuple = (8, 5))

Pre-treatment RMSE of treated vs synthetic trajectory.

Returns:

Type Description
(fig, ax)

california_prop99

california_prop99() -> DataFrame

California Proposition 99 tobacco control dataset.

Returns a balanced panel of per-capita cigarette sales for 39 US states, 1970–2000. California implemented Proposition 99 in 1989.

This is the canonical synthdid example dataset.

Returns:

Type Description
DataFrame

Columns: state, year, packspercapita, treated.

Examples:

>>> df = sp.synth.california_prop99()
>>> result = sp.sdid(df, y='packspercapita', unit='state',
...                  time='year', treat_unit='California',
...                  treat_time=1989)

german_reunification

german_reunification() -> DataFrame

German reunification dataset (simulated).

Returns a balanced panel of GDP per capita for 17 OECD countries, 1960–2003. West Germany is the treated unit; treatment begins in 1990 (reunification).

The simulated trajectories reproduce the key stylised facts: Luxembourg has the highest GDP per capita (~40 000), Portugal the lowest (~10 000), and all countries share a common upward growth trend. Post-1990, West Germany exhibits an approximately 1 500 GDP-per-capita decline relative to its synthetic counterfactual.

References

Abadie, A., Diamond, A. & Hainmueller, J. (2015). "Comparative Politics and the Synthetic Control Method." American Journal of Political Science, 59(2), 495–510. [@abadie2015comparative]

Returns:

Type Description
DataFrame

Columns: country, year, gdppc, treated.

Examples:

>>> import statspai as sp
>>> df = sp.synth.german_reunification()
>>> result = sp.synth.synth(df, y='gdppc', unit='country',
...                         time='year', treat_unit='West Germany',
...                         treat_time=1990)

basque_terrorism

basque_terrorism() -> DataFrame

Basque Country terrorism dataset (simulated).

Returns a balanced panel of GDP per capita (thousands of 1986 USD) for 17 Spanish regions, 1955–1997. The Basque Country is the treated unit; treatment begins in 1970 (onset of ETA terrorism).

The simulated data reproduce the gradual widening of an approximately 10 % GDP gap between the Basque Country and its synthetic counterfactual after 1970.

References

Abadie, A. & Gardeazabal, J. (2003). "The Economic Costs of Conflict: A Case Study of the Basque Country." American Economic Review, 93(1), 113–132. [@abadie2003economic]

Returns:

Type Description
DataFrame

Columns: region, year, gdppc, treated.

Examples:

>>> import statspai as sp
>>> df = sp.synth.basque_terrorism()
>>> result = sp.synth.synth(df, y='gdppc', unit='region',
...                         time='year', treat_unit='Basque Country',
...                         treat_time=1970)

california_tobacco

california_tobacco() -> DataFrame

California Proposition 99 tobacco dataset (simulated, extended).

Returns a balanced panel of per-capita cigarette sales and covariates for 39 US states, 1970–2000. California is the treated unit; treatment begins in 1989 (Proposition 99).

This dataset extends the simpler california_prop99() panel with additional covariates (retail price, log income, youth population share, beer consumption), enabling covariate-matching SCM analyses.

References

Abadie, A., Diamond, A. & Hainmueller, J. (2010). "Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California's Tobacco Control Program." Journal of the American Statistical Association, 105(490), 493–505. [@abadie2010synthetic]

Returns:

Type Description
DataFrame

Columns: state, year, cigsale, retprice, lnincome, age15to24, beer, treated.

Notes
  • cigsale : per-capita cigarette sales (packs).
  • retprice : average retail price per pack (cents, real).
  • lnincome : log of real per-capita personal income.
  • age15to24 : share of population aged 15–24 (percent).
  • beer : per-capita beer consumption (gallons).

Examples:

>>> import statspai as sp
>>> df = sp.synth.california_tobacco()
>>> result = sp.synth.synth(df, y='cigsale', unit='state',
...                         time='year', treat_unit='California',
...                         treat_time=1989)