`statspai.mediation`¶

mediation ¶

Causal Mediation Analysis module for StatsPAI.

Implements modern causal mediation analysis following Imai, Keele, and Tingley (2010), decomposing total treatment effects into:

- Average Causal Mediation Effect (ACME): indirect effect through mediator
- Average Direct Effect (ADE): direct effect not through mediator

Also supports the classical Baron-Kenny approach for comparison.
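
As a rough illustration of the classical product-of-coefficients idea, the sketch below fits the mediator and outcome regressions with plain NumPy least squares. It is not StatsPAI's implementation; the helper `ols` and the simulated data are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 300
treat = rng.binomial(1, 0.5, n)
x = rng.normal(0, 1, n)
m = 0.4 * treat + 0.3 * x + rng.normal(0, 1, n)
y = 0.5 * treat + 0.6 * m + 0.2 * x + rng.normal(0, 1, n)


def ols(design, target):
    """Least-squares coefficients for target ~ design (hypothetical helper)."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


ones = np.ones(n)
# Mediator model M ~ 1 + T + X: the coefficient on T is the "a" path.
a = ols(np.column_stack([ones, treat, x]), m)[1]
# Outcome model Y ~ 1 + T + M + X: T gives the direct effect, M the "b" path.
coef_y = ols(np.column_stack([ones, treat, m, x]), y)
direct, b = coef_y[1], coef_y[2]
# Total-effect model Y ~ 1 + T + X.
total = ols(np.column_stack([ones, treat, x]), y)[1]

indirect = a * b
print(f"indirect={indirect:.3f} direct={direct:.3f} total={total:.3f}")
```

With OLS and the same covariates in every model, the total effect equals the direct effect plus the indirect effect exactly, which is a convenient sanity check.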

References

Imai, K., Keele, L., and Tingley, D. (2010). "A General Approach to Causal Mediation Analysis." Psychological Methods, 15(4), 309-334. [@imai2010general]

Baron, R.M. and Kenny, D.A. (1986). "The Moderator-Mediator Variable Distinction in Social Psychological Research." Journal of Personality and Social Psychology, 51(6), 1173-1182. [@baron1986moderator]

MediationAnalysis ¶

Causal Mediation Analysis estimator.

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> rng = np.random.default_rng(42)
>>> n = 300
>>> treat = rng.binomial(1, 0.5, n)
>>> x = rng.normal(0, 1, n)
>>> m = 0.4 * treat + 0.3 * x + rng.normal(0, 1, n)
>>> y = 0.5 * treat + 0.6 * m + 0.2 * x + rng.normal(0, 1, n)
>>> df = pd.DataFrame({"treat": treat, "x": x, "m": m, "y": y})
>>> ma = sp.MediationAnalysis(df, y="y", treat="treat", mediator="m",
...                           covariates=["x"], n_boot=100, seed=0)
>>> res = ma.fit()
>>> res.estimand
'ACME'
>>> res.method
'Causal Mediation Analysis'

fit ¶

fit() -> CausalResult

Estimate mediation effects with bootstrap inference.
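
The sketch below shows one common form of bootstrap inference for an indirect effect: a nonparametric percentile bootstrap. Whether `fit()` uses exactly this scheme is not stated here, so treat it as an assumption; `indirect_effect` and `percentile_bootstrap` are hypothetical helper names.

```python
import numpy as np


def indirect_effect(treat, m, y):
    """Product-of-coefficients indirect effect from two OLS fits."""
    ones = np.ones(len(y))
    a = np.linalg.lstsq(np.column_stack([ones, treat]), m, rcond=None)[0][1]
    b = np.linalg.lstsq(np.column_stack([ones, treat, m]), y, rcond=None)[0][2]
    return a * b


def percentile_bootstrap(treat, m, y, n_boot=200, alpha=0.05, seed=0):
    """Resample rows with replacement; return (estimate, lower, upper)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)  # row indices drawn with replacement
        draws[i] = indirect_effect(treat[idx], m[idx], y[idx])
    lower, upper = np.quantile(draws, [alpha / 2, 1 - alpha / 2])
    return indirect_effect(treat, m, y), lower, upper


rng = np.random.default_rng(42)
n = 300
treat = rng.binomial(1, 0.5, n)
m = 0.4 * treat + rng.normal(0, 1, n)
y = 0.5 * treat + 0.6 * m + rng.normal(0, 1, n)
est, lower, upper = percentile_bootstrap(treat, m, y)
print(f"ACME-style estimate {est:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```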

MediateSensitivityResult `dataclass` ¶

Bases: ResultProtocolMixin

plot ¶

plot(ax: Any = None, *, fill: bool = True, annotate: bool = True, figsize: Any = (7.0, 4.5), **kwargs: Any) -> Any

Publication-style sensitivity plot.

Shows ACME(ρ) vs the mediator-outcome confounder strength ρ, with a coloured fill for the region of nullability (any ρ in [ρ_at_zero, 1] flips the ACME sign), the ρ at which the ACME crosses zero (annotated), and reference lines at ρ=0 (i.e. sequential-ignorability) and ACME=0.

Parameters:

Name	Type	Description	Default
`ax`	`matplotlib Axes`		`None`
`fill`	`bool`	Fill the {ACME(ρ) > 0} region in light blue and the {ACME(ρ) < 0} region in light red, à la sensemakr.	`True`
`annotate`	`bool`	Annotate ρ_at_zero, baseline ACME, and an interpretive note.	`True`

FourWayResult `dataclass` ¶

Bases: ResultProtocolMixin

Result container for `four_way_decomposition`.

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> treat = np.tile([0.0, 1.0], 50)
>>> mediator = 0.5 + 0.4 * treat + np.repeat(np.linspace(-1, 1, 50), 2)
>>> y = 1 + 2 * treat + 3 * mediator + 4 * treat * mediator
>>> df = pd.DataFrame({"y": y, "a": treat, "m": mediator})
>>> res = sp.four_way_decomposition(df, y="y", treat="a", mediator="m")
>>> type(res).__name__
'FourWayResult'
>>> round(float(res.total_effect), 1)
6.8

to_dict ¶

to_dict() -> dict

JSON-safe dict of every field (agent-native serialization).
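
A minimal sketch of what "JSON-safe" can mean for a dataclass holding NumPy values is shown below. `ToyResult` and `to_json_safe` are hypothetical stand-ins written for this example, not the actual `FourWayResult` code.

```python
import json
from dataclasses import dataclass, fields

import numpy as np


def to_json_safe(value):
    """Recursively convert NumPy scalars and arrays to plain Python objects."""
    if isinstance(value, np.ndarray):
        return value.tolist()
    if isinstance(value, np.generic):
        return value.item()
    if isinstance(value, dict):
        return {str(k): to_json_safe(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_json_safe(v) for v in value]
    return value


@dataclass
class ToyResult:  # hypothetical stand-in, not the real FourWayResult
    cde: float
    pie: float
    grid: np.ndarray

    def to_dict(self) -> dict:
        # One entry per dataclass field, converted to JSON-friendly types.
        return {f.name: to_json_safe(getattr(self, f.name)) for f in fields(self)}


payload = ToyResult(cde=np.float64(2.0), pie=1.2, grid=np.array([0.0, 1.0])).to_dict()
encoded = json.dumps(payload)
print(encoded)
```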

mediate_interventional ¶

mediate_interventional(data: DataFrame, y: str, treat: str, mediator: str, covariates: Optional[List[str]] = None, tv_confounders: Optional[List[str]] = None, n_mc: int = 500, n_boot: int = 500, alpha: float = 0.05, pvalue_method: str = 'bootstrap_sign', seed: int = 42) -> CausalResult

Interventional (in)direct effects (VanderWeele, Vansteelandt, Robins 2014).

Decomposes the total effect into:

IIE (Interventional Indirect Effect): E[Y(1, G_{M|1})] - E[Y(1, G_{M|0})], i.e. the effect of shifting M's post-treatment distribution from its D=0 draw to its D=1 draw, while holding D fixed at 1.
IDE (Interventional Direct Effect): E[Y(1, G_{M|0})] - E[Y(0, G_{M|0})].
Total = IIE + IDE = E[Y(1, G_{M|1})] - E[Y(0, G_{M|0})].

Here G_{M|d} is a random draw from the marginal distribution of M under treatment D = d (integrated over covariates).

Interventional effects are identified under the standard mediation assumptions minus the cross-world independence requirement — which makes them valid even when there are treatment-induced mediator-outcome confounders (tv_confounders). Natural effects are not generally identified in that case.
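
The definitions above can be sketched with a brute-force Monte Carlo over mediator draws. This is an illustration under stated assumptions (linear OLS models, no `tv_confounders`, mediator draws built from fitted means plus resampled residuals); `draw_mediator` and `mean_outcome` are hypothetical helpers, not StatsPAI functions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.normal(0, 1, n)
d = rng.binomial(1, 0.5, n)
m = 0.4 * d + 0.3 * x + rng.normal(0, 1, n)
y = 0.5 * d + 0.6 * m + 0.2 * x + rng.normal(0, 1, n)

ones = np.ones(n)
# Mediator model M ~ 1 + D + X and outcome model Y ~ 1 + D + M + X (both OLS).
med_design = np.column_stack([ones, d, x])
alpha_hat, *_ = np.linalg.lstsq(med_design, m, rcond=None)
med_resid = m - med_design @ alpha_hat
beta_hat, *_ = np.linalg.lstsq(np.column_stack([ones, d, m, x]), y, rcond=None)


def draw_mediator(d_level, size):
    """Draw from G_{M|d}: mediator under D = d_level, marginal over X."""
    x_draw = rng.choice(x, size=size)          # covariates from the sample
    noise = rng.choice(med_resid, size=size)   # resampled mediator residuals
    return alpha_hat[0] + alpha_hat[1] * d_level + alpha_hat[2] * x_draw + noise


def mean_outcome(d_level, m_from, n_mc=500):
    """Monte Carlo estimate of E[Y(d_level, G_{M|m_from})]."""
    m_draw = draw_mediator(m_from, n_mc)
    # Predict for every (unit, draw) pair, then average over both.
    base = beta_hat[0] + beta_hat[1] * d_level + beta_hat[3] * x
    return float((base[:, None] + beta_hat[2] * m_draw[None, :]).mean())


e11 = mean_outcome(1, 1)
e10 = mean_outcome(1, 0)
e00 = mean_outcome(0, 0)
iie, ide, total = e11 - e10, e10 - e00, e11 - e00
print(f"IIE={iie:.3f} IDE={ide:.3f} Total={total:.3f}")
```

Because the same `e10` term appears in both differences, IIE and IDE sum to the total by construction.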

Parameters:

Name	Type	Description	Default
`data`	`DataFrame`		required
`y`	`str`	Outcome variable.	required
`treat`	`str`	Binary treatment (0/1).	required
`mediator`	`str`	Mediator variable.	required
`covariates`	`list of str`	Baseline (pre-treatment) covariates.	`None`
`tv_confounders`	`list of str`	Treatment-induced mediator-outcome confounders (variables affected by D that confound the M-Y relationship). These enter the outcome model but not the M-marginalization.	`None`
`n_mc`	`int`	Monte Carlo draws of M for the stochastic intervention.	`500`
`n_boot`	`int`	Nonparametric bootstrap replications.	`500`
`alpha`	`float`	Significance level.	`0.05`
`pvalue_method`	`{'bootstrap_sign', 'wald'}`	How the per-effect p-value is computed. `'bootstrap_sign'` (default, matches `sp.mediate`): the bootstrap-CI-inversion p-value, i.e. twice the fraction of bootstrap replications on the opposite side of zero from the point estimate; it respects skewed bootstrap distributions. `'wald'`: the conventional 2*(1-Φ(|θ̂/ŝe|)) p-value, using the bootstrap SE; it is consistent with the Wald p-value convention used across the rest of StatsPAI's causal-inference surface (e.g. `sp.aipw`, `sp.dml`).	`'bootstrap_sign'`
`seed`	`int`	Random seed.	`42`
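
The two `pvalue_method` options can be illustrated on a toy set of bootstrap replications. The functions below are a sketch of the definitions given in the table, not the library's code; in particular, counting draws exactly equal to zero as "opposite side" is an assumption.

```python
import math

import numpy as np


def pvalue_bootstrap_sign(estimate, boot):
    """Twice the share of bootstrap draws on the opposite side of zero."""
    boot = np.asarray(boot, dtype=float)
    opposite = np.mean(boot <= 0) if estimate >= 0 else np.mean(boot >= 0)
    return float(min(1.0, 2.0 * opposite))


def pvalue_wald(estimate, boot):
    """2 * (1 - Phi(|estimate / se|)) with the bootstrap standard error."""
    se = float(np.std(boot, ddof=1))
    z = abs(estimate) / se
    return math.erfc(z / math.sqrt(2.0))  # identical to 2 * (1 - Phi(z))


boot = np.arange(-10, 91) / 100.0  # 101 toy replications from -0.10 to 0.90
estimate = 0.4
p_sign = pvalue_bootstrap_sign(estimate, boot)  # 11 of 101 draws are <= 0
p_wald = pvalue_wald(estimate, boot)
print(f"bootstrap_sign p={p_sign:.4f}, wald p={p_wald:.4f}")
```

The sign-based p-value depends only on where the replications fall relative to zero, so it follows a skewed bootstrap distribution, while the Wald version summarises the replications by their standard deviation alone.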

Returns:

Type	Description
`CausalResult`	`estimate` is IIE; full decomposition lives in `detail`.

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> rng = np.random.default_rng(42)
>>> n = 300
>>> treat = rng.binomial(1, 0.5, n)
>>> x = rng.normal(0, 1, n)
>>> m = 0.4 * treat + 0.3 * x + rng.normal(0, 1, n)
>>> y = 0.5 * treat + 0.6 * m + 0.2 * x + rng.normal(0, 1, n)
>>> df = pd.DataFrame({"treat": treat, "x": x, "m": m, "y": y})
>>> res = sp.mediate_interventional(
...     df, y="y", treat="treat", mediator="m", covariates=["x"],
...     n_mc=100, n_boot=100, seed=0,
... )
>>> res.estimand
'IIE'
>>> res.method
'Interventional Mediation Analysis'
>>> len(res.detail["effect"])     # IIE, IDE, Total
3

Notes

Linear outcome model assumption. The current implementation hard-codes an OLS outcome regression Y ~ D + M + X_base + X_tv. This permits the Monte-Carlo integration to collapse analytically via the linearity of OLS predictions in the treatment-induced confounder block (X_tv), giving an O(n_mc + n) cost instead of O(n × n_mc). Non-linear outcome models (gradient boosting, neural nets, etc.) would break this vectorisation and are not currently supported — passing a custom learner is not exposed via the API.
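
The cost argument can be checked on a toy linear predictor: averaging predictions over every (unit, draw) pair gives the same number as predicting once at the averaged inputs. The coefficients and draws below are made up for illustration and do not come from StatsPAI.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_mc = 200, 500
beta = np.array([1.0, 0.5, 0.6, 0.2])      # toy coefficients: intercept, D, M, X
x = rng.normal(size=n)
m_draws = rng.normal(loc=0.4, size=n_mc)   # stand-in Monte Carlo mediator draws

# Brute force: predict for every (unit, draw) pair, then average: O(n * n_mc).
grid = beta[0] + beta[1] * 1.0 + beta[2] * m_draws[None, :] + beta[3] * x[:, None]
brute = grid.mean()

# Collapsed: linearity lets the means pass through the prediction: O(n + n_mc).
collapsed = beta[0] + beta[1] * 1.0 + beta[2] * m_draws.mean() + beta[3] * x.mean()
print(brute, collapsed)
```

A non-linear learner has no such shortcut, because the mean of its predictions generally differs from its prediction at the mean input.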

References

VanderWeele, T.J., Vansteelandt, S. and Robins, J.M. (2014). "Effect decomposition in the presence of an exposure-induced mediator-outcome confounder." Epidemiology, 25(2), 300-306. [@vanderweele2014effect]

mediate_sensitivity ¶

mediate_sensitivity(data: DataFrame, y: str, treat: str, mediator: str, covariates: Optional[list] = None, rho_range: tuple = (-0.9, 0.9), n_grid: int = 41) -> MediateSensitivityResult

Sensitivity analysis for causal mediation.

For each candidate ρ (correlation between mediator and outcome errors), compute a bias-adjusted ACME. The method follows Imai, Keele & Yamamoto (2010):

Fit the mediator model: M = α₀ + α₁ T + α₂ X + ε_M.
Fit the outcome model: Y = β₀ + β₁ T + β₂ M + β₃ X + ε_Y.
For each ρ, the bias in the ACME estimate is approximately ρ · σ_M · σ_Y / σ²_M (from the omitted-variable formula). Subtract this bias from the naïve ACME.
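
For the two linear models above, an exact closed form for ACME(ρ) can be derived by writing the reduced-form outcome error as β₂ ε_M + ε_Y. The sketch below uses that closed form rather than the approximation in the last step, and it is not taken from StatsPAI's source; `residuals`, `acme_at`, and the simulated data are assumptions of the example.

```python
import numpy as np


def residuals(design, target):
    """Return (OLS residuals, coefficients) for target ~ design."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef, coef


rng = np.random.default_rng(1)
n = 500
t = rng.binomial(1, 0.5, n).astype(float)
x = rng.normal(size=n)
m = 0.4 * t + 0.3 * x + rng.normal(size=n)
y = 0.5 * t + 0.6 * m + 0.2 * x + rng.normal(size=n)

base = np.column_stack([np.ones(n), t, x])
e_m, alpha = residuals(base, m)   # mediator-model residuals and coefficients
e_r, _ = residuals(base, y)       # reduced-form outcome residuals (M left out)
sigma_m, sigma_r = e_m.std(), e_r.std()
rho_tilde = np.corrcoef(e_m, e_r)[0, 1]


def acme_at(rho):
    """ACME implied by a mediator/outcome error correlation of rho."""
    scale = np.sqrt((1 - rho_tilde**2) / (1 - rho**2))
    return alpha[1] * (sigma_r / sigma_m) * (rho_tilde - rho * scale)


rho_grid = np.linspace(-0.9, 0.9, 41)
acme_grid = acme_at(rho_grid)

# At rho = 0 the curve reproduces the product-of-coefficients ACME.
_, beta = residuals(np.column_stack([np.ones(n), t, m, x]), y)
naive_acme = alpha[1] * beta[2]
print(f"ACME(0)={acme_at(0.0):.4f}, naive={naive_acme:.4f}")
```

In this closed form the ACME crosses zero at ρ equal to the correlation between the mediator residuals and the reduced-form outcome residuals (`rho_tilde` here).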

Parameters:

Name	Type	Description	Default
`rho_range`	`(lo, hi)`		`(-0.9, 0.9)`
`n_grid`	`int`		`41`

Examples:

>>> import statspai as sp
>>> df = sp.cps_wage()
>>> s = sp.mediate_sensitivity(df, y='log_wage', treat='union',
...                            mediator='tenure',
...                            covariates=['education', 'experience'],
...                            n_grid=11)
>>> s.rho_grid.size
11
>>> bool(s.rho_grid.min() == -0.9 and s.rho_grid.max() == 0.9)
True
>>> import numpy as np
>>> bool(np.isfinite(s.acme_at_zero))  # baseline ACME at rho=0
True

References

[@imai2010general]

four_way_decomposition ¶

four_way_decomposition(data: DataFrame, y: str, treat: str, mediator: str, covariates: Optional[Sequence[str]] = None, a0: float = 0.0, a1: float = 1.0, m0: float = 0.0) -> FourWayResult

Parametric four-way decomposition of TE = CDE + INT_ref + INT_med + PIE.

Parameters:

Name	Type	Description	Default
`data`	`DataFrame`		required
`y`	`str`		required
`treat`	`str`		required
`mediator`	`str`		required
`covariates`	`sequence of str`		`None`
`a0`	`float`	Reference level of the treatment.	`0.0`
`a1`	`float`	Comparison level of the treatment.	`1.0`
`m0`	`float`	Mediator reference level at which CDE is evaluated.	`0.0`

Returns:

Type	Description
`FourWayResult`

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> treat = np.tile([0.0, 1.0], 50)
>>> mediator = 0.5 + 0.4 * treat + np.repeat(np.linspace(-1, 1, 50), 2)
>>> y = 1 + 2 * treat + 3 * mediator + 4 * treat * mediator
>>> df = pd.DataFrame({"y": y, "a": treat, "m": mediator})
>>> res = sp.four_way_decomposition(df, y="y", treat="a", mediator="m")
>>> round(float(res.cde), 1)
2.0
>>> round(float(res.pie), 1)
1.2
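
The decomposition TE = CDE + INT_ref + INT_med + PIE can be reproduced by hand on the example data. The sketch assumes the standard linear-model formulas for the four-way decomposition (outcome model with a treatment-mediator interaction, linear mediator model, no covariates); the library's internals may differ.

```python
import numpy as np

treat = np.tile([0.0, 1.0], 50)
mediator = 0.5 + 0.4 * treat + np.repeat(np.linspace(-1, 1, 50), 2)
y = 1 + 2 * treat + 3 * mediator + 4 * treat * mediator
ones = np.ones_like(y)

# Outcome model Y = t0 + t1*A + t2*M + t3*A*M; mediator model M = b0 + b1*A.
t0, t1, t2, t3 = np.linalg.lstsq(
    np.column_stack([ones, treat, mediator, treat * mediator]), y, rcond=None
)[0]
b0, b1 = np.linalg.lstsq(np.column_stack([ones, treat]), mediator, rcond=None)[0]

a0, a1, m0 = 0.0, 1.0, 0.0   # same defaults as the function signature
step = a1 - a0
cde = (t1 + t3 * m0) * step                   # controlled direct effect at M = m0
int_ref = t3 * (b0 + b1 * a0 - m0) * step     # reference interaction
int_med = t3 * b1 * step**2                   # mediated interaction
pie = (t2 * b1 + t3 * b1 * a0) * step         # pure indirect effect
total = cde + int_ref + int_med + pie
print(round(cde, 1), round(int_ref, 1), round(int_med, 1), round(pie, 1), round(total, 1))
```

On this noiseless data the hand computation agrees with the documented outputs: a CDE of 2.0, a PIE of 1.2, and a total effect of 6.8.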
