Choosing a matching / weighting estimator¶
When your design relies on selection-on-observables (CIA / unconfoundedness) and you have a binary treatment, you have 7+ estimators in StatsPAI. Here's how to choose.
0. TL;DR flowchart¶
Is your covariate set high-dimensional (p > 20)?
YES -> Double ML (sp.dml), meta-learners (sp.S_Learner, etc.)
NO -> continue
Is your target the ATT (effect on the treated)?
YES -> sp.ebalance (entropy balancing) or sp.match(estimand='ATT')
NO -> continue
Is your target the ATE (population average)?
YES -> sp.cbps(estimand='ATE') or sp.aipw
NO -> continue
Do you need OVERLAP-weighted effect (avoiding extrapolation)?
YES -> sp.overlap_weights (ATO)
NO -> rethink — what estimand do you actually want?
1. Entropy balancing (ebal) — the "just works" default for ATT¶
Hainmueller (2012). Exact covariate balance by reweighting, no propensity-score modelling needed.
r = sp.ebalance(df, y='y', treat='d',
covariates=['X1', 'X2', 'X3'],
moments=1) # balance means; moments=2 adds variances
Pros: no PSM model specification; exact balance by construction; no King-Nielsen issue. Cons: targets ATT only; can be sensitive to extreme weights.
2. Nearest-neighbor matching¶
Beware: King & Nielsen (2019) show that PSM-based nearest-neighbor matching can increase imbalance. Prefer Mahalanobis or coarsened exact matching (CEM):
r = sp.match(df, y='y', treat='d', covariates=[...],
distance='mahalanobis', # NOT 'propensity'
method='nearest', n_matches=3)
3. Covariate Balancing Propensity Score (CBPS)¶
Imai-Ratkovic (2014). Fits the propensity score to balance covariates directly, not to maximise likelihood.
r = sp.cbps(df, y='y', treat='d', covariates=[...],
estimand='ATE', # or 'ATT'
variant='over') # 'over' (overidentified) is preferred
More robust to PS misspecification than IPW.
4. Overlap weights (ATO)¶
Li-Morgan-Zaslavsky (2018). Weights each unit by its propensity of receiving the "other" treatment, yielding effects on the overlap population — the subpopulation where both treatments are plausible.
Avoids extreme weights from near-zero / near-one propensities.
5. Doubly-robust estimators¶
AIPW combines an outcome model and a propensity-score model — correct if either is right.
For high-dimensional covariates, use Double ML (Chernozhukov et al. 2018):
r = sp.dml(df, y='y', treat='d', covariates=[...],
ml_model='lasso', # or 'rf', 'xgb'
cross_fitting_folds=5)
DML is the state of the art for observational ATE with many controls.
6. Meta-learners (for heterogeneous effects)¶
If you want not just the ATE but a CATE function τ(X):
from statspai.metalearners import S_Learner, T_Learner, X_Learner, DR_Learner
dr = DR_Learner(outcome_model='rf', ps_model='lr')
dr.fit(df[cov_cols], df['d'], df['y'])
cate = dr.predict(df_new[cov_cols])
See the meta-learner guide for diagnostics (CATE calibration, policy value).
7. Common mistakes¶
| Mistake | Fix |
|---|---|
| Including post-treatment variables in covariates | Drop them — never condition on consequences |
| Including colliders as covariates | Use a DAG (sp.DAG) to check adjustment sets |
| Reporting results without checking overlap | Always plot PS distributions (sp.psplot) |
| Reporting ATE when you computed ATT | Check estimand in the call / result |
| Using PSM nearest-neighbor (King-Nielsen 2019) | Use distance='mahalanobis' or method='cem' |
| Not trimming extreme weights | Use trim=0.01 or overlap weights |
8. Mandatory diagnostics¶
r = sp.ebalance(df, y='y', treat='d', covariates=[...])
# 1. Balance before/after
sp.love_plot(r) # SMDs before and after weighting
sp.ps_balance(r) # formal balance statistics
# 2. Overlap / common support
sp.overlap_plot(r)
sp.trimming(r, threshold=0.01)
# 3. Sensitivity to unobserved confounding
sp.sensemakr(r, benchmark_covariates=['X1']) # Cinelli-Hazlett
sp.oster_bounds(r) # Oster 2019
sp.evalue(r) # VanderWeele-Ding E-value
9. Reading the output¶
r.estimate # Point estimate (ATT / ATE / ATO)
r.se # Bootstrap or analytical SE
r.ci # CI
r.tidy() # Main row + per-unit weights if detail available
r.glance() # method, nobs, estimand, ESS (effective sample size)
r.detail # If present: balance table with SMDs
10. Estimand cheat sheet¶
| Estimand | What it is | Recommended estimator |
|---|---|---|
| ATT | Average effect on the treated | ebalance, match(ATT) |
| ATE | Average effect on the population | cbps(ATE), aipw, dml |
| ATO | Effect on the overlap population | overlap_weights |
| ATC | Average effect on the controls | match(estimand='ATC') |
| CATE(x) | Conditional on covariates X=x | Meta-learners, causal forest |
| LATE | Effect on compliers | IV (not matching) |
For Agents¶
Pre-conditions - binary treatment 0/1 - covariates are pre-treatment (temporally prior to D) - enough control units for each treated unit under the chosen method (k:1 matching) - covariates numeric; categoricals one-hot or handled by caliper/mahalanobis
Identifying assumptions - Unconfoundedness / CIA: Y(d) ⊥ D | X - Overlap / common support: treated X-values are in the control X-support - SUTVA: no interference between matched units - Covariates are selected before looking at outcomes (no post-treatment conditioning)
Failure modes → recovery
| Symptom | Exception | Remedy | Try next |
|---|---|---|---|
| Covariate imbalance after matching (max |SMD| > 0.1) | statspai.AssumptionViolation |
Re-match with stricter caliper, add interactions, or switch to sp.ebalance (entropy balancing). | sp.ebalance |
| Poor propensity score overlap (density plots, treated mass where controls are sparse) | statspai.AssumptionViolation |
Apply sp.trimming (Crump 2009) or redefine the estimand to the overlap region. | sp.trimming |
| Too few matched controls per treated unit | statspai.DataInsufficient |
Relax caliper, allow with-replacement, or use entropy balancing / overlap weights. | sp.ebalance |
| Results highly sensitive to match specification | statspai.AssumptionWarning |
Report sp.rosenbaum_bounds (sensitivity to unobserved confounding) and compare multiple matching methods. | sp.rosenbaum_bounds |
Alternatives (ranked)
- sp.ebalance
- sp.cbps
- sp.optimal_match
- sp.sbw
- sp.ipw
Typical minimum N: 200