`statspai.dml`¶

dml ¶

Double/Debiased Machine Learning module for StatsPAI.

Implements the Chernozhukov et al. (2018) framework with separate per-model estimator classes (PLR / IRM / PLIV / IIVM) sharing a common cross-fitting infrastructure.

Public entry points:

:func:dml — dispatcher, selects the model via model= string.
:class:DoubleML — legacy façade, delegates to a per-model class.
:class:DoubleMLPLR, :class:DoubleMLIRM, :class:DoubleMLPLIV, :class:DoubleMLIIVM — direct per-model entry points.

References

Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., and Robins, J. (2018). "Double/Debiased Machine Learning for Treatment and Structural Parameters." Econometrics Journal, 21(1), C1-C68. [@chernozhukov2018double]

DoubleML ¶

Legacy façade. Prefer the per-model classes directly for new code: :class:DoubleMLPLR, :class:DoubleMLIRM, :class:DoubleMLPLIV, :class:DoubleMLIIVM. Kept for backward compatibility.

For most workflows call the functional entry point :func:dml instead; this class wraps the same dispatcher behind a .fit() method for code that predates the functional API.

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> rng = np.random.default_rng(42)
>>> n = 400
>>> x1 = rng.normal(size=n)
>>> x2 = rng.normal(size=n)
>>> d = 0.5 * x1 + rng.normal(size=n)
>>> y = 1.0 * d + x1 + 0.5 * x2 + rng.normal(size=n)
>>> df = pd.DataFrame({'y': y, 'd': d, 'x1': x1, 'x2': x2})
>>> est = sp.DoubleML(df, y='y', treat='d', covariates=['x1', 'x2'],
...                   model='plr', ml_g='linear', ml_m='linear',
...                   n_folds=2)
>>> est.model
'plr'
>>> res = est.fit()
>>> isinstance(res, sp.CausalResult)
True

References

[@chernozhukov2018double]

DoubleMLPLR ¶

Bases: _DoubleMLBase

Partially linear regression DML (continuous or binary D, no IV).

Direct entry point for the partially linear model Y = theta * D + g(X) + eps. Usually reached through the dispatcher sp.dml(..., model='plr'); the class is exposed for callers who want to construct the estimator explicitly.

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> rng = np.random.default_rng(0)
>>> n = 300
>>> x1 = rng.normal(size=n)
>>> x2 = rng.normal(size=n)
>>> d = 0.5 * x1 + rng.normal(size=n)
>>> y = 1.5 * d + x1 + 0.5 * x2 + rng.normal(size=n)
>>> df = pd.DataFrame({"y": y, "d": d, "x1": x1, "x2": x2})
>>> est = sp.DoubleMLPLR(
...     df, y="y", treat="d", covariates=["x1", "x2"], n_folds=3,
... )
>>> res = est.fit()
>>> bool(np.isfinite(res.estimate))
True

DoubleMLIRM ¶

Bases: _DoubleMLBase

Interactive regression DML — binary D, ATE via AIPW.

Direct entry point for the interactive regression model with a binary treatment; estimates the ATE via the cross-fitted doubly-robust (AIPW) score. Usually reached through the dispatcher sp.dml(..., model='irm').

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> rng = np.random.default_rng(0)
>>> n = 400
>>> x1 = rng.normal(size=n)
>>> x2 = rng.normal(size=n)
>>> ps = 1 / (1 + np.exp(-(0.5 * x1)))
>>> d = (rng.uniform(size=n) < ps).astype(int)
>>> y = 1.0 * d + x1 + 0.5 * x2 + rng.normal(size=n)
>>> df = pd.DataFrame({"y": y, "d": d, "x1": x1, "x2": x2})
>>> est = sp.DoubleMLIRM(
...     df, y="y", treat="d", covariates=["x1", "x2"], n_folds=3,
... )
>>> res = est.fit()
>>> bool(np.isfinite(res.estimate))
True

DoubleMLPLIV ¶

Bases: _DoubleMLBase

Partially linear IV DML — endogenous D with continuous/binary Z.

Direct entry point for the partially linear IV model Y = theta * D + g(X) + eps with E[eps | Z, X] = 0. Usually reached through the dispatcher sp.dml(..., model='pliv'); an instrument column must be supplied.

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> rng = np.random.default_rng(0)
>>> n = 400
>>> x1 = rng.normal(size=n)
>>> x2 = rng.normal(size=n)
>>> z = rng.normal(size=n)
>>> d = 0.8 * z + 0.5 * x1 + rng.normal(size=n)
>>> y = 1.5 * d + x1 + 0.5 * x2 + rng.normal(size=n)
>>> df = pd.DataFrame({"y": y, "d": d, "z": z, "x1": x1, "x2": x2})
>>> est = sp.DoubleMLPLIV(
...     df, y="y", treat="d", covariates=["x1", "x2"],
...     instrument="z", n_folds=3,
... )
>>> res = est.fit()
>>> bool(np.isfinite(res.estimate))
True

DoubleMLIIVM ¶

Bases: _DoubleMLBase

Interactive IV DML — binary D, binary Z, LATE via Wald.

Direct entry point for the interactive IV model with a binary treatment and binary instrument; estimates the LATE (complier effect) via the Neyman-orthogonal ratio of two doubly-robust scores. Usually reached through sp.dml(..., model='iivm').

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> rng = np.random.default_rng(0)
>>> n = 400
>>> x1 = rng.normal(size=n)
>>> x2 = rng.normal(size=n)
>>> z = rng.integers(0, 2, size=n)
>>> d = ((0.6 * z + 0.3 * x1 + rng.normal(size=n)) > 0.3).astype(int)
>>> y = 1.0 * d + x1 + 0.5 * x2 + rng.normal(size=n)
>>> df = pd.DataFrame({"y": y, "d": d, "z": z, "x1": x1, "x2": x2})
>>> est = sp.DoubleMLIIVM(
...     df, y="y", treat="d", covariates=["x1", "x2"],
...     instrument="z", n_folds=3,
... )
>>> res = est.fit()
>>> bool(np.isfinite(res.estimate))
True

DMLAveragingResult ¶

Bases: CausalResult

CausalResult extended with per-candidate and weight details.

Attributes stored in model_info:

candidates — list of candidate labels.
theta_k — per-candidate :math:\hat\theta (only meaningful for the non-stacking weight rules).
se_k — per-candidate SE.
mse_k — per-candidate nuisance risk (g + m).
weights — averaging or stacking weights.
weights_g / weights_m — separate stacking weights per nuisance under weight_rule="short_stacking".
weight_rule — how the weights were computed.

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> from sklearn.linear_model import LassoCV, RidgeCV
>>> rng = np.random.default_rng(0)
>>> n = 300
>>> X = rng.normal(size=(n, 5))
>>> d = X[:, 0] + rng.normal(size=n)
>>> y = 1.0 * d + X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)
>>> df = pd.DataFrame({"y": y, "d": d,
...                    **{f"x{j}": X[:, j] for j in range(5)}})
>>> candidates = [(LassoCV(), LassoCV(), "lasso"),
...               (RidgeCV(), RidgeCV(), "ridge")]
>>> r = sp.dml_model_averaging(df, y="y", treat="d",
...                            covariates=[f"x{j}" for j in range(5)],
...                            candidates=candidates, n_folds=3)
>>> isinstance(r, sp.DMLAveragingResult)
True
>>> sorted(r.model_info["weights_m"])  # CLS stacking weights
['lasso', 'ridge']

DMLPanelResult `dataclass` ¶

Bases: ResultProtocolMixin

Output of :func:dml_panel.

Attributes:

Name	Type	Description
`estimate`	`float`	Debiased treatment effect β̂.
`se`	`float`	Cluster-robust SE at the unit level.
`ci_lower, ci_upper`	`float`
`p_value`	`float`
`t_stat`	`float`
`n_units`	`int`
`n_obs`	`int`
`n_folds`	`int`
`include_time_fe`	`bool`
`ml_g_name`	`str`	Short name of the outcome nuisance learner.
`ml_m_name`	`str`	Short name of the treatment nuisance learner.
`method`	`str`	Always `"dml_panel"`.
`diagnostics`	`dict`	Populated with `{'y_resid_std', 'd_resid_std', 'corr_yd_resid', 'within_r2', 'omega_cluster'}`.

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> rng = np.random.default_rng(0)
>>> rows = []
>>> for i in range(50):
...     alpha_i = rng.normal()
...     for t in range(5):
...         x1, x2 = rng.normal(), rng.normal()
...         union = 0.4 * x1 + alpha_i + rng.normal()
...         lw = (0.8 * union + x1 + 0.5 * x2 + alpha_i
...               + 0.1 * t + rng.normal())
...         rows.append({'pid': i, 'year': t, 'log_wage': lw,
...                      'union': union, 'exper': x1, 'educ': x2})
>>> df = pd.DataFrame(rows)
>>> res = sp.dml_panel(df, y='log_wage', treat='union',
...                    covariates=['exper', 'educ'], unit='pid',
...                    time='year', include_time_fe=True, n_folds=3)
>>> isinstance(res, sp.DMLPanelResult)
True
>>> res.method
'dml_panel'

DMLSensitivityResult `dataclass` ¶

Bases: ResultProtocolMixin

Output of :func:dml_sensitivity.

Attributes:

Name	Type	Description
`estimate`	`float`	Original DML point estimate.
`se`	`float`	Original DML standard error.
`rv_q`	`float`	Robustness value at the user's q-threshold (default `q=1` ⇒ strength of confounder needed to shrink estimate to zero).
`rv_qa`	`float`	Confounder strength needed to push the (1-α)·100% lower CI across zero. Strictly less than or equal to `rv_q`.
`bias_bound`	`float`	Maximum \|bias\| under the user-specified `cf_y, cf_d`.
`adjusted_estimate_low`	`float`
`adjusted_estimate_high`	`float`	Bias-adjusted estimate range under the (cf_y, cf_d) scenario.
`benchmarks`	`DataFrame`	For each benchmark covariate `X_k`: `cf_y_bench`, `cf_d_bench`, and the implied bias / adjusted estimate when a confounder is assumed to be `k_y, k_d` times as strong as `X_k` in the residualised regression. Empty if `benchmark` not provided.
`s`	`float`	Scaling factor :math:`S = \sigma_{Y\text{resid}} / \sigma_{D\text{resid}}` used in the bias formula.
`q`	`float`
`alpha`	`float`
`method`	`str`	Always `"DML-OVB (Chernozhukov-Cinelli-Newey 2022)"`.

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> rng = np.random.default_rng(42)
>>> n = 400
>>> x1 = rng.normal(size=n)
>>> x2 = rng.normal(size=n)
>>> d = 0.5 * x1 + rng.normal(size=n)
>>> y = 1.0 * d + x1 + 0.5 * x2 + rng.normal(size=n)
>>> df = pd.DataFrame({'y': y, 'd': d, 'x1': x1, 'x2': x2})
>>> fit = sp.dml(df, y='y', treat='d', covariates=['x1', 'x2'],
...              model='plr', ml_g='linear', ml_m='linear', n_folds=2)
>>> sens = sp.dml_sensitivity(fit, q=1.0, cf_y=0.05, cf_d=0.05)
>>> isinstance(sens, sp.DMLSensitivityResult)
True
>>> round(sens.rv_q, 2)  # confounder strength needed to zero out theta
0.49

plot ¶

plot(ax: Any = None, levels: Sequence[float] = (0.0, 0.5, 1.0), figsize: Any = (6.0, 5.0)) -> Any

Plot bias-contour grid for hypothetical (cf_y, cf_d) pairs.

Plots the |bias|/|θ̂| contour as a function of the confounder strength on Y and D. The user can read off the RV directly. Mirrors the sensemakr plotting convention.

DMLDiagnostics `dataclass` ¶

Bundled DML diagnostics returned by :func:dml_diagnostics.

Holds the overlap table, residual-balance table, score-density moments, and the orthogonality test for a fitted DML estimator. Use .summary() for the text report and .plot() for the 2x2 diagnostic panel.

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> rng = np.random.default_rng(42)
>>> n = 400
>>> x1 = rng.normal(size=n)
>>> x2 = rng.normal(size=n)
>>> d = 0.5 * x1 + rng.normal(size=n)
>>> y = 1.0 * d + x1 + 0.5 * x2 + rng.normal(size=n)
>>> df = pd.DataFrame({'y': y, 'd': d, 'x1': x1, 'x2': x2})
>>> fit = sp.dml(df, y='y', treat='d', covariates=['x1', 'x2'],
...              model='plr', ml_g='linear', ml_m='linear',
...              n_folds=2)
>>> diag = sp.dml_diagnostics(fit)
>>> isinstance(diag, sp.DMLDiagnostics)
True
>>> diag.method
'PLR'

plot ¶

plot(figsize: Any = (10.0, 8.0), bins: int = 30) -> Any

Render a 2×2 publication-style diagnostic panel.

Top-left : overlap histogram (propensity for IRM, |D-resid| for PLR). Top-right : score-density histogram with N(0,σ̂²) overlay. Bottom-left: balance bar chart (corr with each residualised covariate). Bottom-right: Q-Q plot of standardised score vs N(0,1).

dml ¶

dml(data: DataFrame, y: str, treat: str, covariates: List[str], model: str = 'plr', instrument: Optional[Union[str, List[str]]] = None, ml_g: Optional[Any] = None, ml_m: Optional[Any] = None, ml_r: Optional[Any] = None, n_folds: int = 5, n_rep: int = 1, alpha: float = 0.05, random_state: int = 42, sample_weight: Optional[Any] = None, fold_indices: Optional[Any] = None, score: Optional[str] = None, normalize_ipw: bool = False, trimming_threshold: float = 0.01) -> CausalResult

Estimate causal effect using Double/Debiased Machine Learning.

Parameters:

Name	Type	Description	Default
`data`	`DataFrame`		required
`y`	`str`	Outcome variable.	required
`treat`	`str`	Treatment variable.	required
`covariates`	`list of str`	High-dimensional controls X.	required
`model`	`str`	DML model: `'plr'` : partially linear (continuous or binary D) `'irm'` : interactive regression (binary D; ATE via AIPW) `'pliv'` : partially linear IV (endogenous D with instrument Z) `'iivm'` : interactive IV (binary D, binary Z; LATE via Wald ratio)	`'plr'`
`instrument`	`str`	Scalar instrument Z. Required for `model in {'pliv', 'iivm'}`.	`None`
`ml_g`	`sklearn estimator or str`	Learners for outcome / treatment / instrument nuisance. Pass any scikit-learn-compatible estimator, or one of the convenience string aliases (case-insensitive): `'rf'` — random forest (200 trees) `'gbm'` — gradient boosting (100 estimators, depth 3) `'lasso'` / `'ridge'` — penalised linear (CV-tuned) `'linear'` / `'ols'` — plain linear regression `'logistic'` — logistic regression (classifier roles only) `'xgb'` / `'lgbm'` — XGBoost / LightGBM (optional install) For binary treatment (`model='irm'`) `ml_m` is auto-coerced to the classifier variant; same for `ml_r` under `'iivm'`. `None` falls back to the per-model gradient-boosting default.	`None`
`ml_m`	`sklearn estimator or str`	Learners for outcome / treatment / instrument nuisance. Pass any scikit-learn-compatible estimator, or one of the convenience string aliases (case-insensitive): `'rf'` — random forest (200 trees) `'gbm'` — gradient boosting (100 estimators, depth 3) `'lasso'` / `'ridge'` — penalised linear (CV-tuned) `'linear'` / `'ols'` — plain linear regression `'logistic'` — logistic regression (classifier roles only) `'xgb'` / `'lgbm'` — XGBoost / LightGBM (optional install) For binary treatment (`model='irm'`) `ml_m` is auto-coerced to the classifier variant; same for `ml_r` under `'iivm'`. `None` falls back to the per-model gradient-boosting default.	`None`
`ml_r`	`sklearn estimator or str`	Learners for outcome / treatment / instrument nuisance. Pass any scikit-learn-compatible estimator, or one of the convenience string aliases (case-insensitive): `'rf'` — random forest (200 trees) `'gbm'` — gradient boosting (100 estimators, depth 3) `'lasso'` / `'ridge'` — penalised linear (CV-tuned) `'linear'` / `'ols'` — plain linear regression `'logistic'` — logistic regression (classifier roles only) `'xgb'` / `'lgbm'` — XGBoost / LightGBM (optional install) For binary treatment (`model='irm'`) `ml_m` is auto-coerced to the classifier variant; same for `ml_r` under `'iivm'`. `None` falls back to the per-model gradient-boosting default.	`None`
`n_folds`	`int`		`5`
`n_rep`	`int`	Repeated cross-fitting splits (median aggregation).	`1`
`alpha`	`float`		`0.05`
`random_state`	`int`	Seed for the cross-fitting fold assignment. Repeat splits use `random_state + rep` so a single `random_state` fully determines the result.	`42`
`sample_weight`	`ndarray \| Series \| str`	Per-observation weights (e.g., survey/probability weights). Pass either a 1-D array of length `len(data)` or a column name present in `data`. Supported for `model in {'plr', 'irm', 'pliv', 'iivm'}`.	`None`
`fold_indices`	`array - like or str`	Explicit cross-fit fold labels for `model='plr'`. Pass either a 1-D array of length `len(data)` or a column name. When provided, `n_rep` must be 1 and the labels must define exactly `n_folds` non-empty folds.	`None`
`score`	`str`	Orthogonal score variant (DoubleML-compatible). Model-specific: `model='plr'` : `'partialling out'` (default) or `'IV-type'` `model='irm'` : `'ATE'` (default) or `'ATTE'` (effect on the treated) `None` selects each model's default; the defaults reproduce the historical StatsPAI output exactly. Invalid for `pliv` / `iivm` beyond their single native score.	`None`
`normalize_ipw`	`bool`	Self-normalize the inverse-propensity weights (Hájek-style) for the IPW-based models `irm` / `iivm`. Matches DoubleML's `normalize_ipw`. Ignored (and rejected) for `plr` / `pliv`.	`False`
`trimming_threshold`	`float`	Symmetric propensity-score trimming for `irm` / `iivm`: the estimated propensity is clipped to `[t, 1 - t]`. The default `0.01` reproduces the historical clip and matches DoubleML's `trimming_threshold` with `trimming_rule='truncate'`.	`0.01`

Returns:

Type	Description
`CausalResult`

Notes

Numerical parity: all four models are pinned against the upstream doubleml-for-py reference implementation in tests/external_parity/test_dml_python_parity.py — the partialling-out models (plr, pliv) agree to machine precision under shared learners and folds. See the guide "sp.dml and the DoubleML reference implementation" in the docs for the full parity table and methodology.

Declared scope boundaries (each fails loudly with a workaround in the error message — see the guide's "Scope and known limitations"):

pliv / iivm accept a single scalar instrument; for multiple excluded instruments build a first-stage index with sp.scalar_iv_projection first.
One treatment column per call (no multi-treatment joint inference; combine per-treatment calls with sp.romano_wolf for simultaneous inference).
DML2 (pooled-moment) procedure only; PLR exposes the partialling-out score only.
fold_indices is supported for model='plr' only.
Cluster-robust DML inference lives in sp.dml_panel (unit-clustered panel PLR), not in this cross-sectional entry point.

Examples:

>>> # Partially Linear Regression
>>> result = sp.dml(df, y='wage', treat='training',
...                 covariates=['age', 'edu', 'exp'])

>>> # Interactive Regression (binary treatment, ATE)
>>> result = sp.dml(df, y='wage', treat='D', covariates=X_cols,
...                 model='irm')

>>> # Partially Linear IV — endogenous D, instrument Z
>>> result = sp.dml(df, y='earnings', treat='schooling',
...                 covariates=['age', 'father_edu'],
...                 model='pliv', instrument='quarter_of_birth')

>>> # Interactive IV — binary D, binary Z (LATE)
>>> result = sp.dml(df, y='earnings', treat='college',
...                 covariates=['age', 'ability'],
...                 model='iivm', instrument='lottery_win')

dml_model_averaging ¶

dml_model_averaging(data: DataFrame, y: str, treat: str, covariates: Sequence[str], candidates: Optional[List[Tuple[Any, Any, str]]] = None, n_folds: int = 5, seed: int = 0, weight_rule: str = 'short_stacking', alpha: float = 0.05, sample_weight: Optional[Any] = None) -> DMLAveragingResult

Model-averaging / stacking DML-PLR estimator.

Parameters:

Name	Type	Description	Default
`data`	`DataFrame`		required
`y`	`str`	Outcome column.	required
`treat`	`str`	Continuous-or-binary treatment column.	required
`covariates`	`list of str`	Covariate columns `X`.	required
`candidates`	`list of (ml_g, ml_m, label)`	Candidate nuisance learners. `ml_g` regresses `y` on `X`; `ml_m` regresses `treat` on `X`. Defaults to a Lasso/Ridge/ RandomForest/GradientBoosting roster.	`None`
`n_folds`	`int`	Cross-fitting folds per candidate.	`5`
`seed`	`int`		`0`
`weight_rule`	`('short_stacking', 'single_best', 'inverse_risk', 'equal')`	How to combine candidate nuisance predictions or estimates. `"short_stacking"` (default; Ahrens et al. 2025 eq. 7) — solve constrained least squares on cross-fitted predictions for each nuisance separately (`ŷ` and `D̂`), produce stacked nuisances, plug into the PLR moment equation. `"single_best"` — Ahrens et al. (2025, fn. 8): pick the candidate with lowest joint nuisance MSE. `"inverse_risk"` — :math:`w_k \propto 1/(\text{MSE}_g + \text{MSE}_m)`. Convenience baseline; not in the paper. `"equal"` — :math:`w_k = 1/K`. Convenience baseline; not in the paper. For the non-stacking rules (`inverse_risk` / `equal` / `single_best`) the function computes per-candidate :math:`\hat\theta_k` and reports the weighted average with a between-candidate-covariance-corrected SE; for `"short_stacking"` it reports the standard PLR sandwich SE on the stacked-nuisance score (Neyman orthogonality is preserved).	`"short_stacking"`
`alpha`	`float`	Two-sided CI level.	`0.05`
`sample_weight`	`ndarray \| Series \| str`	Per-observation weights. If supplied, every nuisance fit uses `sample_weight=` (with a graceful fallback warning if the learner does not accept it), the CLS stacking objective becomes weighted least squares, and the PLR moment + sandwich variance use weighted sums. The MSE used for `inverse_risk` / `single_best` weighting is also the weighted MSE.	`None`

Returns:

Type	Description
`DMLAveragingResult`	With the weighted :math:`\hat\theta`, SE, CI, and per-candidate diagnostics under `result.model_info`.

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> from sklearn.linear_model import LassoCV, RidgeCV
>>> rng = np.random.default_rng(0)
>>> n = 300
>>> X = rng.normal(size=(n, 5))
>>> d = X[:, 0] + rng.normal(size=n)
>>> y = 1.0 * d + X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)
>>> df = pd.DataFrame({"y": y, "d": d,
...                    **{f"x{j}": X[:, j] for j in range(5)}})
>>> candidates = [(LassoCV(), LassoCV(), "lasso"),
...               (RidgeCV(), RidgeCV(), "ridge")]
>>> r = sp.dml_model_averaging(df, y="y", treat="d",
...                            covariates=[f"x{j}" for j in range(5)],
...                            candidates=candidates, n_folds=3)
>>> type(r).__name__
'DMLAveragingResult'
>>> sorted(r.model_info["weights_g"])  # CLS stacking weights for E[Y|X]
['lasso', 'ridge']
>>> sorted(r.model_info["weights_m"])  # CLS stacking weights for E[D|X]
['lasso', 'ridge']
>>> print(r.summary())

dml_panel ¶

dml_panel(data: DataFrame, y: str, treat: str, covariates: List[str], *, unit: str, time: Optional[str] = None, ml_g: Optional[Any] = None, ml_m: Optional[Any] = None, n_folds: int = 5, alpha: float = 0.05, include_time_fe: bool = False, binary_treatment: bool = False, seed: int = 0, sample_weight: Optional[Any] = None) -> DMLPanelResult

Long-panel Double/Debiased ML with unit FE and cluster-robust SE.

Estimates β in

.. math::

Y_{it} = \alpha_i + \lambda_t + \beta D_{it} + g(X_{it}) + \varepsilon_{it}

by (1) within-transforming y, treat and covariates to absorb unit (and optionally time) fixed effects; (2) running cross-fit PLR on the demeaned data with folds that split units; (3) computing the Neyman-orthogonal score and cluster-robust SE at the unit level.

Parameters:

Name	Type	Description	Default
`data`	`DataFrame`	Long-format panel; must contain `y`, `treat`, all `covariates`, `unit`, and `time` (if given).	required
`y`	`str`	Column names.	required
`treat`	`str`	Column names.	required
`covariates`	`list of str`	High-dimensional controls X_it. Pass `[]` to fit a pure FE model with ML-free residuals.	required
`unit`	`str`	Unit identifier column. Used both for FE absorption and cross-fit fold assignment.	required
`time`	`str`	Time identifier column; required when `include_time_fe=True`.	`None`
`ml_g`	`sklearn-style estimators`	Nuisance learners. Default: GradientBoosting with 200 trees, depth 3, lr 0.05 — same convention as :func:`sp.dml` PLR.	`None`
`ml_m`	`sklearn-style estimators`	Nuisance learners. Default: GradientBoosting with 200 trees, depth 3, lr 0.05 — same convention as :func:`sp.dml` PLR.	`None`
`n_folds`	`int`	Cross-fit folds over units. Must be >= 2 and <= n_units.	`5`
`alpha`	`float`		`0.05`
`include_time_fe`	`bool`	If True, also subtract time means (two-way within transform).	`False`
`binary_treatment`	`bool`	Deprecated. Previously routed binary D through a classifier path that mixed within-demeaned features with raw {0,1} labels — the resulting propensity had no clean interpretation as :math:`E[\tilde D_{it} \mid \tilde X_{it}]`. The flag is now ignored: :func:`dml_panel` always residualises the within-transformed D̃ with a regressor (PLR is agnostic to D's type, and within-transformed binary D is no longer binary). For DR-style ATE on binary D in panels, use :func:`sp.dml(..., model='irm')` with unit dummies in the covariate set, :func:`sp.etwfe`, or :func:`sp.callaway_santanna`. A :class:`DeprecationWarning` is emitted when this argument is passed; passing `D ∈ {0,1}` while `binary_treatment=True` is validated to catch incidental misuse.	`False`
`seed`	`int`	RNG seed for fold assignment.	`0`
`sample_weight`	`ndarray \| Series \| str`	Per-observation weights (e.g., survey/probability weights). Pass either a 1-D array of length `len(data)` or a column name in `data`. The within transform becomes a weighted within transform (subtract weighted unit / time means), and the PLR moment + cluster-robust SE use weighted sums. Required if your survey design carries informative sampling probabilities.	`None`

Returns:

Type	Description
class:`DMLPanelResult`

Notes

The cluster-robust SE follows Liang-Zeger 1986 at the unit level (the coarser of the two dimensions): stacked scores are summed within unit before squaring. This is the appropriate clustering level when shocks at higher frequencies than the unit are plausibly correlated (cf. Cameron-Miller 2015 §3.2).

Identification requires no unobserved time-varying confounders — only time-invariant unit heterogeneity + high-dim observed X_it. Violations of strict exogeneity of D (e.g. dynamic-feedback) are not handled here; use :func:sp.msm or :func:sp.gformula_ice.

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> rng = np.random.default_rng(0)
>>> rows = []
>>> for i in range(50):
...     alpha_i = rng.normal()  # time-invariant unit heterogeneity
...     for t in range(5):
...         x1, x2 = rng.normal(), rng.normal()
...         union = 0.4 * x1 + alpha_i + rng.normal()
...         lw = (0.8 * union + x1 + 0.5 * x2 + alpha_i
...               + 0.1 * t + rng.normal())
...         rows.append({'pid': i, 'year': t, 'log_wage': lw,
...                      'union': union, 'exper': x1, 'educ': x2})
>>> df = pd.DataFrame(rows)
>>> res = sp.dml_panel(
...     df, y='log_wage', treat='union',
...     covariates=['exper', 'educ'],
...     unit='pid', time='year', include_time_fe=True, n_folds=3,
... )
>>> res.method
'dml_panel'
>>> print(res.summary())

dml_sensitivity ¶

dml_sensitivity(result: Any, q: float = 1.0, cf_y: Optional[float] = None, cf_d: Optional[float] = None, benchmark_covariates: Optional[Sequence[str]] = None, k_y: float = 1.0, k_d: float = 1.0) -> DMLSensitivityResult

Compute DML-OVB sensitivity for a fitted DML CausalResult.

Parameters:

Name	Type	Description	Default
`result`	`CausalResult`	From :func:`statspai.dml.dml` (PLR or IRM). Must carry the post-fit residuals via `model_info['_y_resid']`, `model_info['_d_resid']`, and the design matrix for benchmarks.	required
`q`	`float`	Bias threshold as a fraction of \|θ̂\|. `q=1` ⇒ confounder needed to shrink estimate to zero; `q=0.5` ⇒ half the estimate.	`1.0`
`cf_y`	`float`	Hypothesized partial-R² of an unobserved confounder with the residualised outcome and treatment. If both are given, the report includes a bias bound and adjusted-estimate range.	`None`
`cf_d`	`float`	Hypothesized partial-R² of an unobserved confounder with the residualised outcome and treatment. If both are given, the report includes a bias bound and adjusted-estimate range.	`None`
`benchmark_covariates`	`list of str`	Subset of the original covariates to benchmark against. For each `X_k`, the benchmark sets `cf_y_bench, cf_d_bench` to the partial R² that `X_k` itself contributes (multiplied by `k_y, k_d` to express "what if a confounder were k× as strong as `X_k`?").	`None`
`k_y`	`float`	Multipliers for the benchmark strengths.	`1.0`
`k_d`	`float`	Multipliers for the benchmark strengths.	`1.0`

Returns:

Type	Description
`DMLSensitivityResult`

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> rng = np.random.default_rng(42)
>>> n = 400
>>> x1 = rng.normal(size=n)
>>> x2 = rng.normal(size=n)
>>> d = 0.5 * x1 + rng.normal(size=n)
>>> y = 1.0 * d + x1 + 0.5 * x2 + rng.normal(size=n)
>>> df = pd.DataFrame({'y': y, 'd': d, 'x1': x1, 'x2': x2})
>>> fit = sp.dml(df, y='y', treat='d', covariates=['x1', 'x2'],
...              model='plr', ml_g='linear', ml_m='linear',
...              n_folds=2)
>>> sens = sp.dml_sensitivity(fit, q=1.0, cf_y=0.05, cf_d=0.05)
>>> round(sens.estimate, 2)
0.98
>>> round(sens.rv_q, 2)  # confounder strength to zero out theta
0.49
>>> round(sens.bias_bound, 2)  # |bias| under cf_y = cf_d = 0.05
0.07

dml_diagnostics ¶

dml_diagnostics(result: Any, clip: float = 0.02) -> DMLDiagnostics

Build a :class:DMLDiagnostics report from a DML CausalResult.

Parameters:

Name	Type	Description	Default
`result`	`CausalResult`	Result returned by :func:`statspai.dml.dml`. Must include the post-fit residuals (`model_info['_y_resid']`, `model_info['_d_resid']`); for IRM, additionally the propensity `model_info['diagnostics']['pscore_min']` etc. are surfaced.	required
`clip`	`float`	For IRM-style overlap: count units with propensity within `[0, clip] ∪ [1-clip, 1]` as overlap-violating.	`0.02`

Returns:

Type	Description
`DMLDiagnostics`

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> rng = np.random.default_rng(42)
>>> n = 400
>>> x1 = rng.normal(size=n)
>>> x2 = rng.normal(size=n)
>>> d = 0.5 * x1 + rng.normal(size=n)
>>> y = 1.0 * d + x1 + 0.5 * x2 + rng.normal(size=n)
>>> df = pd.DataFrame({'y': y, 'd': d, 'x1': x1, 'x2': x2})
>>> fit = sp.dml(df, y='y', treat='d', covariates=['x1', 'x2'],
...              model='plr', ml_g='linear', ml_m='linear',
...              n_folds=2)
>>> diag = sp.dml_diagnostics(fit)
>>> diag.n_obs
400
>>> diag.method
'PLR'
>>> round(diag.orth_pvalue, 2)  # E[psi] = 0 not rejected
1.0
>>> text = diag.summary()  # overlap / score / balance report

statspai.dml¶

dml ¶

DoubleML ¶

DoubleMLPLR ¶

DoubleMLIRM ¶

DoubleMLPLIV ¶

DoubleMLIIVM ¶

DMLAveragingResult ¶

DMLPanelResult dataclass ¶

DMLSensitivityResult dataclass ¶

plot ¶

DMLDiagnostics dataclass ¶

plot ¶

dml ¶

dml_model_averaging ¶

dml_panel ¶

dml_sensitivity ¶

dml_diagnostics ¶

`statspai.dml`¶

DMLPanelResult `dataclass` ¶

DMLSensitivityResult `dataclass` ¶

DMLDiagnostics `dataclass` ¶