statspai.robustness

robustness

Robustness analysis tools.

  • spec_curve: Specification Curve Analysis (Simonsohn et al. 2020)
  • robustness_report: Automated battery of robustness checks
  • subgroup_analysis: Subgroup heterogeneity analysis with forest plot

SpecCurveResult dataclass

Holds all specification curve outputs.

results_df instance-attribute

results_df: DataFrame

One row per specification with columns: spec_id, estimate, se, ci_lower, ci_upper, pvalue, significant, plus one column per choice dimension.

summary

summary(alpha: float = 0.05) -> str

Return a formatted text summary.

to_latex

to_latex(caption: str = 'Specification Curve Summary') -> str

Export summary to a LaTeX table.

to_dataframe

to_dataframe() -> DataFrame

Return the full results DataFrame.

plot

plot(alpha: float = 0.05, color_sig: str = '#2C3E50', color_nonsig: str = '#BDC3C7', figsize: Optional[Tuple[float, float]] = None, title: Optional[str] = None, sort_by: str = 'estimate')

Draw the canonical two-panel specification curve plot.

Top panel: sorted point estimates with 95% CIs, coloured by significance. Bottom panel: indicator matrix showing which analytical choices produced each specification.

Parameters:

Name Type Description Default
alpha float

Significance threshold for colouring.

0.05
color_sig str

Colour for significant estimates.

'#2C3E50'
color_nonsig str

Colour for non-significant estimates.

'#BDC3C7'
figsize tuple

Figure size (width, height). Auto-sized if None.

None
title str

Title for the top panel.

None
sort_by str

Column to sort specifications by. Default 'estimate'.

'estimate'

Returns:

Type Description
fig, axes : matplotlib Figure and array of Axes

RobustnessResult dataclass

Container for robustness report results.

results_df instance-attribute

results_df: DataFrame

One row per robustness check.

summary

summary() -> str

Formatted text summary.

to_latex

to_latex(caption: str = 'Robustness Checks') -> str

Export to LaTeX table.

plot

plot(figsize: Optional[Tuple[float, float]] = None, title: Optional[str] = None, color: str = '#2C3E50', baseline_color: str = '#E74C3C')

Forest-plot style visualization of robustness checks.

Returns:

Type Description
fig, ax : matplotlib Figure and Axes

SubgroupResult dataclass

Container for subgroup heterogeneity analysis.

results_df instance-attribute

results_df: DataFrame

Columns: group_var, group_val, estimate, se, ci_lower, ci_upper, pvalue, nobs, label.

het_tests instance-attribute

het_tests: Dict[str, Dict[str, float]]

Per group_var: chi2, pvalue, df.
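
As a rough illustration of the nested shape described above, the sketch below builds a dictionary by hand and picks out the grouping variables whose heterogeneity test rejects at a chosen level. The numbers are made up, the helper `heterogeneous_vars` is hypothetical and not part of statspai, and the use of display names as outer keys is an assumption.

```python
from typing import Dict, List

# Hand-written stand-in for SubgroupResult.het_tests (illustrative values only).
# Outer keys: one per grouping variable (assumed here to be the display names).
# Inner keys: chi2, pvalue, df, as listed in the attribute description.
het_tests: Dict[str, Dict[str, float]] = {
    "Gender": {"chi2": 6.1, "pvalue": 0.014, "df": 1.0},
    "Region": {"chi2": 2.3, "pvalue": 0.513, "df": 3.0},
}


def heterogeneous_vars(tests: Dict[str, Dict[str, float]], alpha: float = 0.05) -> List[str]:
    """Return grouping variables whose heterogeneity p-value is below alpha."""
    return [name for name, stats in tests.items() if stats["pvalue"] < alpha]


flagged = heterogeneous_vars(het_tests)
for name in flagged:
    stats = het_tests[name]
    print(f"{name}: chi2={stats['chi2']:.2f}, df={stats['df']:.0f}, p={stats['pvalue']:.3f}")
```

With a real result, the same loop could be run on `result.het_tests` directly.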

to_latex

to_latex(caption: str = 'Subgroup Heterogeneity Analysis') -> str

Export to LaTeX.

plot

plot(figsize: Optional[Tuple[float, float]] = None, title: Optional[str] = None, color: str = '#2C3E50', overall_color: str = '#E74C3C')

Forest plot of subgroup estimates.

Returns:

Type Description
(fig, ax)

SensitivityDashboard dataclass

Result of a unified sensitivity analysis.

Always contains an e_value entry; other entries are optional depending on what the estimator provides.

FrontierSensitivityResult dataclass

Container for frontier sensitivity analysis.

subgroup_analysis

subgroup_analysis(data: DataFrame, formula: str, x: str, by: Dict[str, str], robust: str = 'hc1') -> SubgroupResult

Run subgroup heterogeneity analysis with forest plot.

Estimate the effect of x on the outcome y (the left-hand side of formula) within each subgroup defined by the variables in by, and test for heterogeneity using interaction-based Wald tests.

Parameters:

Name Type Description Default
data DataFrame

Analysis dataset.

required
formula str

Regression formula, e.g. "wage ~ education + experience".

required
x str

Key explanatory variable.

required
by dict[str, str]

Mapping of display name → column name for grouping. Example: {'Gender': 'female', 'Region': 'region'}.

required
robust str

Standard error type for subgroup regressions.

'hc1'

Returns:

Type Description
SubgroupResult

Container with .plot(), .summary(), .to_latex(), .results_df.

Examples:

>>> import statspai as sp
>>> result = sp.subgroup_analysis(
...     data=df,
...     formula="wage ~ education + experience",
...     x='education',
...     by={'Gender': 'female', 'Region': 'region'},
... )
>>> result.plot()
>>> print(result.summary())

copula_sensitivity

copula_sensitivity(estimate: float, se: float, *, sigma_u: float = 1.0, sigma_y: float = 1.0, rho_grid: Optional[Sequence[float]] = None, alpha: float = 0.05) -> FrontierSensitivityResult

Gaussian-copula sensitivity to unobserved confounding.

Under a Gaussian copula with correlation rho between U (one latent unit-level confounder) and Y (outcome), the bias in an OLS / DML point estimate scales linearly with rho:

bias(rho) = rho * sigma_u * sigma_y / sigma_D²  ≈ rho * sigma_u * sigma_y

under the normalisation sigma_D = 1. The adjusted estimate is estimate - bias(rho); we sweep rho on a grid to find the breakpoint rho* that zeros the effect.
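
The bias formula above can be worked through by hand. The following is a minimal sketch of that arithmetic only, not the statspai implementation: the function names are hypothetical, and the breakpoint is obtained in closed form by solving estimate - bias(rho) = 0 rather than by sweeping a grid as the library is described as doing.

```python
from typing import List, Sequence


def copula_bias(rho: float, sigma_u: float = 1.0, sigma_y: float = 1.0,
                sigma_d: float = 1.0) -> float:
    """bias(rho) = rho * sigma_u * sigma_y / sigma_d**2 (sigma_d = 1 by normalisation)."""
    return rho * sigma_u * sigma_y / sigma_d ** 2


def adjusted_estimates(estimate: float, rho_grid: Sequence[float],
                       sigma_u: float = 1.0, sigma_y: float = 1.0) -> List[float]:
    """Adjusted estimate, estimate - bias(rho), at every rho on the grid."""
    return [estimate - copula_bias(rho, sigma_u, sigma_y) for rho in rho_grid]


def breakpoint_rho(estimate: float, sigma_u: float = 1.0, sigma_y: float = 1.0) -> float:
    """Closed-form rho* that zeros the adjusted estimate when sigma_d = 1."""
    return estimate / (sigma_u * sigma_y)


# 21 evenly spaced points on [-0.5, 0.5], mirroring the documented default grid.
rho_grid = [-0.5 + 0.05 * i for i in range(21)]

estimate = 0.3  # illustrative point estimate, not taken from any dataset
adjusted = adjusted_estimates(estimate, rho_grid)
rho_star = breakpoint_rho(estimate)

print(f"rho* = {rho_star:.2f}")
print(f"adjusted estimate at rho = {rho_grid[0]:.2f}: {adjusted[0]:.2f}")
print(f"adjusted estimate at rho = {rho_grid[-1]:.2f}: {adjusted[-1]:.2f}")
```

If rho* falls outside the grid, no sign change would be visible in the swept values, so a wider rho_grid may be needed to locate the breakpoint numerically.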

Parameters:

Name Type Description Default
estimate float
required
se float
required
sigma_u float

Standard deviation of the latent confounder U. With both sigma_u and sigma_y at their default values the bias coefficient is numerically equal to rho, matching Chernozhukov-Cinelli-Hazlett's "percentile scaling."

1.0
sigma_y float

Standard deviation of the outcome Y. See sigma_u for the effect of the default values.

1.0
rho_grid sequence of float

Correlation grid. Defaults to np.linspace(-0.5, 0.5, 21).

None
alpha float
0.05

Returns:

Type Description
FrontierSensitivityResult
References

Balgi, Braun, Peña & Daoud (arXiv:2508.08752, 2025). [@balgi2025sensitivity]

survival_sensitivity

survival_sensitivity(log_hr: float, se_log_hr: float, *, gamma_grid: Optional[Sequence[float]] = None, baseline_survival_t: float = 0.5, alpha: float = 0.05) -> FrontierSensitivityResult

Nonparametric sensitivity for survival / hazard-ratio outcomes.

Extends Rosenbaum's Gamma bounds to hazard ratios and converts them into shifted survival differences at a chosen time t.

Given an observed log hazard ratio log_hr with SE se_log_hr, bound the worst-case log-HR at sensitivity parameter Γ:

log_hr_worst(Γ) = log_hr − log(Γ),  log_hr_best(Γ) = log_hr + log(Γ)

and translate the worst case into a survival shift at time t using the proportional-hazards identity S_1(t) = S_0(t) ^ exp(log_hr).
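
The two formulas above can be checked with a few lines of arithmetic. This is a minimal sketch under stated assumptions, not the statspai implementation: the helper names are hypothetical, the labels "worst" and "best" simply follow the formulas as written, and the survival shift is assumed to mean S_1(t) - S_0(t).

```python
import math
from typing import Tuple


def log_hr_bounds(log_hr: float, gamma: float) -> Tuple[float, float]:
    """Return (log_hr_worst, log_hr_best) = (log_hr - log(gamma), log_hr + log(gamma))."""
    if gamma < 1.0:
        raise ValueError("gamma must be >= 1")
    shift = math.log(gamma)
    return log_hr - shift, log_hr + shift


def treated_survival(baseline_survival_t: float, log_hr: float) -> float:
    """Proportional-hazards identity: S_1(t) = S_0(t) ** exp(log_hr)."""
    return baseline_survival_t ** math.exp(log_hr)


def survival_shift(baseline_survival_t: float, log_hr: float) -> float:
    """Assumed definition of the survival shift: S_1(t) - S_0(t)."""
    return treated_survival(baseline_survival_t, log_hr) - baseline_survival_t


# Illustrative inputs: observed hazard ratio of 2, baseline survival 0.5, Gamma = 2.
log_hr = math.log(2.0)
s0 = 0.5
worst, best = log_hr_bounds(log_hr, gamma=2.0)

for label, value in [("observed", log_hr), ("worst", worst), ("best", best)]:
    print(f"{label}: log-HR = {value:.3f}, survival shift = {survival_shift(s0, value):.4f}")
```

At Gamma = 1 both bounds collapse to the observed log hazard ratio, which is a quick sanity check on any grid that starts at 1.0.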

Parameters:

Name Type Description Default
log_hr float
required
se_log_hr float
required
gamma_grid sequence of float

Gamma (≥ 1) values. Defaults to np.linspace(1.0, 3.0, 21).

None
baseline_survival_t float

Baseline S_0(t) used to report Δ survival at time t.

0.5
alpha float
0.05
References

Hu & Westling (arXiv:2511.01412, 2025). [@hu2025nonparametric]

calibrate_confounding_strength

calibrate_confounding_strength(estimate: float, se: float, *, observed_r2_outcome: float, observed_r2_treatment: float, alpha: float = 0.05, target_estimate: float = 0.0) -> FrontierSensitivityResult

Calibrate the strength an unobserved confounder would need in order to move the observed effect to a target value.

Follows the Cinelli-Hazlett (2020) and Zhang et al. (2025) "ml-calibrated E-value" generalisation: given the observed covariates' partial-R² with the outcome and with the treatment, it reports the amount of residual variation an unobserved U would need to share with Y and D to shift the effect to target_estimate.

Parameters:

Name Type Description Default
estimate float
required
se float
required
observed_r2_outcome float in (0, 1)

Partial-R² of the observed covariate(s) with the outcome Y. Used to benchmark "1x as confounding as observed" / "2x" etc.

required
observed_r2_treatment float in (0, 1)

Partial-R² of the observed covariate(s) with the treatment D. Used to benchmark "1x as confounding as observed" / "2x" etc.

required
alpha float
0.05
target_estimate float

Effect value to explain away.

0.0
References

Baitairian et al. (arXiv:2510.16560, 2025). [@baitairian2025calibrating]

Cinelli & Hazlett (JRSS-B 2020).