`statspai.power`¶

power ¶

Power and sample size calculations for causal inference and epidemiological designs.

Supports RCT, DID, RD, IV, cluster RCT, and OLS designs, plus epidemiological study designs (two-proportion, log-rank/survival, case-control), with power curves, minimum detectable effect (MDE), and sample-size solving.

PowerResult ¶

Bases: ResultProtocolMixin

Container for power analysis results.

Attributes:

Name	Type	Description
`power`	`float or ndarray`	Computed power value(s).
`n`	`int, float, or np.ndarray`	Sample size(s) used.
`effect_size`	`float or ndarray`	Effect size(s) used.
`design`	`str`	Name of the research design.
`params`	`dict`	All parameters passed to the power function.

Examples:

>>> import statspai as sp
>>> res = sp.power_rct(n=500, effect_size=0.3)
>>> isinstance(res, sp.PowerResult)
True
>>> res.design
'rct'
>>> round(float(res.power), 4)
0.9184

summary ¶

summary() -> str

Return a formatted summary string.

plot ¶

plot(ax: Any = None, figsize: tuple[float, float] = (8, 5), **kwargs: Any) -> Any

Plot the power curve.

Works when `n` or `effect_size` was supplied as an array or range.

Parameters:

Name	Type	Description	Default
`ax`	`matplotlib Axes`		`None`
`figsize`	`tuple`		`(8, 5)`
`**kwargs`	`Any`	Passed to `ax.plot`.	`{}`

Returns:

Type	Description
`matplotlib Axes`

power_rct ¶

power_rct(n: Any, effect_size: Any, alpha: float = 0.05, ratio: float = 1.0, sigma: float = 1.0) -> PowerResult

Power for a two-arm Randomised Controlled Trial.

Parameters:

Name	Type	Description	Default
`n`	`int or array-like`	Total sample size (treatment + control).	required
`effect_size`	`float or array-like`	Standardised effect size (delta / sigma).	required
`alpha`	`float`	Significance level (two-sided).	`0.05`
`ratio`	`float`	Treatment / control allocation ratio (1 = equal allocation).	`1.0`
`sigma`	`float`	Outcome standard deviation (default 1 for standardised effect).	`1.0`

Returns:

Type	Description
`PowerResult`

Examples:

>>> import statspai as sp
>>> res = sp.power_rct(n=500, effect_size=0.3)
>>> round(float(res.power), 4)
0.9184
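
The calculation behind this example is not spelled out on this page. As a rough cross-check, the following sketch uses the textbook two-sided normal approximation for a difference in means. The function name `rct_power_normal` is hypothetical, the outcome standard deviation is fixed at 1, and nothing here is taken from the library's source.

```python
from math import sqrt
from statistics import NormalDist


def rct_power_normal(n, effect_size, alpha=0.05, ratio=1.0):
    """Two-sided normal-approximation power for a two-arm difference in means.

    Assumes a standardised effect (outcome SD of 1) and that `ratio` is the
    treatment / control allocation ratio, as in the parameter table above.
    """
    std_normal = NormalDist()
    n_control = n / (1.0 + ratio)
    n_treated = n - n_control
    # Standard error of the difference in means with unit outcome variance.
    se = sqrt(1.0 / n_treated + 1.0 / n_control)
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(effect_size) / se
    # Probability of rejecting in either tail of the two-sided test.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


print(round(rct_power_normal(500, 0.3), 4))
```

With `n=500` and `effect_size=0.3` this approximation lands at about 0.918, in line with the documented value, which suggests (but does not confirm) that `power_rct` uses a similar normal approximation.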

power_did ¶

power_did(n: Any, effect_size: Any, n_periods: int, n_treated_periods: int, rho: float = 0.5, alpha: float = 0.05, sigma: float = 1.0) -> PowerResult

Power for Difference-in-Differences.

Follows Burlig, Preonas & Woerman (2020), accounting for serial correlation and the number of pre- and post-treatment periods.

Parameters:

Name	Type	Description	Default
`n`	`int or array-like`	Total number of units (treated + control).	required
`effect_size`	`float or array-like`	Standardised effect size.	required
`n_periods`	`int`	Total number of time periods.	required
`n_treated_periods`	`int`	Number of post-treatment periods.	required
`rho`	`float`	First-order autocorrelation of errors (0–1).	`0.5`
`alpha`	`float`	Significance level (two-sided).	`0.05`
`sigma`	`float`	Error standard deviation.	`1.0`

Returns:

Type	Description
`PowerResult`

Examples:

>>> import statspai as sp
>>> res = sp.power_did(n=1000, effect_size=0.1, n_periods=10,
...                    n_treated_periods=5)
>>> round(float(res.power), 4)
0.5683

power_rd ¶

power_rd(n: Any, effect_size: Any, bandwidth: Optional[float] = None, kernel: str = 'triangular', density_at_cutoff: float = 1.0, alpha: float = 0.05, sigma: float = 1.0) -> PowerResult

Power for Regression Discontinuity designs.

Follows Cattaneo, Titiunik & Vazquez-Bare (2019).

Parameters:

Name	Type	Description	Default
`n`	`int or array-like`	Total sample size in the data.	required
`effect_size`	`float or array-like`	Standardised effect size at the cutoff.	required
`bandwidth`	`float or None`	Bandwidth around the cutoff. If None, defaults to 0.5 (half the running-variable range on each side).	`None`
`kernel`	`('triangular', 'uniform', 'epanechnikov')`	Kernel used for local weighting.	`'triangular'`
`density_at_cutoff`	`float`	Estimated density of the running variable at the cutoff (default 1.0 for a uniform running variable on [0,1]).	`1.0`
`alpha`	`float`	Significance level.	`0.05`
`sigma`	`float`	Conditional outcome std dev near the cutoff.	`1.0`

Returns:

Type	Description
`PowerResult`

References

cattaneo2019power

Examples:

>>> import statspai as sp
>>> res = sp.power_rd(n=2000, effect_size=0.25)
>>> round(float(res.power), 4)
0.9283

power_iv ¶

power_iv(n: Any, effect_size: Any, first_stage_f: Optional[float] = None, r2_z: Optional[float] = None, alpha: float = 0.05, sigma: float = 1.0) -> PowerResult

Power for Instrumental Variables / 2SLS estimation.

Accounts for the power penalty from a weak first stage.

Parameters:

Name	Type	Description	Default
`n`	`int or array-like`	Sample size.	required
`effect_size`	`float or array-like`	Standardised effect size of the endogenous variable.	required
`first_stage_f`	`float or None`	First-stage F-statistic. If provided, power is adjusted for instrument weakness: effective_power ~ power_ols * F / (F + 1).	`None`
`r2_z`	`float or None`	R-squared of the first-stage regression. Alternative to first_stage_f; if both are given, first_stage_f takes precedence.	`None`
`alpha`	`float`	Significance level.	`0.05`
`sigma`	`float`	Error standard deviation.	`1.0`

Returns:

Type	Description
`PowerResult`

References

stock2005testing

Examples:

>>> import statspai as sp
>>> res = sp.power_iv(n=1000, effect_size=0.2, first_stage_f=20)
>>> round(float(res.power), 4)
0.9524
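
The weak-instrument adjustment quoted in the `first_stage_f` description, `power_ols * F / (F + 1)`, can be illustrated with a short sketch. The helper names are hypothetical, and the baseline OLS power is approximated here with a simple normal formula (standard error `1 / sqrt(n)`), which is an assumption rather than the library's documented method.

```python
from math import sqrt
from statistics import NormalDist


def ols_power_normal(n, effect_size, alpha=0.05):
    """Assumed baseline: two-sided normal-approximation power, se = 1 / sqrt(n)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(effect_size) * sqrt(n)
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


def iv_power_weak_first_stage(n, effect_size, first_stage_f, alpha=0.05):
    """Scale the baseline power by F / (F + 1), the penalty stated above."""
    penalty = first_stage_f / (first_stage_f + 1.0)
    return ols_power_normal(n, effect_size, alpha) * penalty


# A stronger first stage leaves more of the baseline power intact.
for f_stat in (5, 10, 20, 100):
    print(f_stat, round(iv_power_weak_first_stage(1000, 0.2, f_stat), 4))
```

At `n=1000` and `effect_size=0.2` the baseline power is essentially 1, so the result is dominated by the penalty `20 / 21`, which is about 0.952 and agrees with the example above.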

power_cluster_rct ¶

power_cluster_rct(n_clusters: Any, cluster_size: float, effect_size: Any, icc: float, alpha: float = 0.05, sigma: float = 1.0) -> PowerResult

Power for a Cluster-Randomised Controlled Trial.

Parameters:

Name	Type	Description	Default
`n_clusters`	`int or array-like`	Total number of clusters (treatment + control).	required
`cluster_size`	`int or float`	Average number of individuals per cluster.	required
`effect_size`	`float or array-like`	Standardised effect size.	required
`icc`	`float`	Intra-cluster correlation coefficient.	required
`alpha`	`float`	Significance level.	`0.05`
`sigma`	`float`	Individual-level outcome standard deviation.	`1.0`

Returns:

Type	Description
`PowerResult`

Examples:

>>> import statspai as sp
>>> res = sp.power_cluster_rct(n_clusters=40, cluster_size=30,
...                            effect_size=0.3, icc=0.05)
>>> round(float(res.power), 4)
0.913
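
This page does not state how the intra-cluster correlation enters the calculation. A common textbook approach, sketched below under that assumption, deflates the total sample size by the design effect `1 + (cluster_size - 1) * icc` and then applies a two-arm normal approximation with equal allocation. The function name `cluster_rct_power_normal` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def cluster_rct_power_normal(n_clusters, cluster_size, effect_size, icc, alpha=0.05):
    """Design-effect sketch: equal allocation, unit outcome SD (assumptions)."""
    std_normal = NormalDist()
    # Variance inflation caused by within-cluster correlation.
    design_effect = 1.0 + (cluster_size - 1.0) * icc
    # Effective number of independent observations across both arms.
    n_effective = n_clusters * cluster_size / design_effect
    # Two equal arms of n_effective / 2 each gives se = sqrt(4 / n_effective).
    se = sqrt(4.0 / n_effective)
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(effect_size) / se
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


print(round(cluster_rct_power_normal(40, 30, 0.3, 0.05), 4))
```

For the inputs in the example above the design effect is 2.45, so 1200 individuals carry the information of roughly 490 independent ones, and the approximation gives a power of about 0.913.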

power_ols ¶

power_ols(n: Any, effect_size: Any, n_covariates: int = 0, r2_other: float = 0.0, alpha: float = 0.05, sigma: float = 1.0) -> PowerResult

Power for OLS regression (single coefficient of interest).

Parameters:

Name	Type	Description	Default
`n`	`int or array-like`	Sample size.	required
`effect_size`	`float or array-like`	Standardised effect of the variable of interest.	required
`n_covariates`	`int`	Number of other covariates in the model.	`0`
`r2_other`	`float`	R-squared attributable to other covariates (reduces residual variance and thus improves power).	`0.0`
`alpha`	`float`	Significance level.	`0.05`
`sigma`	`float`	Outcome standard deviation.	`1.0`

Returns:

Type	Description
`PowerResult`

Examples:

>>> import statspai as sp
>>> res = sp.power_ols(n=500, effect_size=0.2)
>>> round(float(res.power), 4)
0.9939

mde ¶

mde(design: str, *, n: Any = None, power_target: float = 0.8, **kwargs: Any) -> PowerResult

Compute the Minimum Detectable Effect (MDE) for a given design.

Inverts the power function to find the smallest effect size that achieves power_target at the given sample size.
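
The inversion described above can be sketched as a bisection search over the effect size. This is an illustration only: the root-finding method used by `mde` is not documented here, and `power_two_arm` and `mde_bisect` are hypothetical helpers built on a one-tail normal approximation.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_two_arm(effect_size, n, alpha=0.05):
    """Assumed power function: equal allocation, unit SD, dominant tail only."""
    se = sqrt(4.0 / n)
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return STD_NORMAL.cdf(effect_size / se - z_crit)


def mde_bisect(power_fn, power_target=0.8, lo=0.0, hi=5.0, tol=1e-10):
    """Smallest effect in [lo, hi] with power_fn(effect) >= power_target.

    Relies on power_fn being increasing in the effect size.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if power_fn(mid) < power_target:
            lo = mid  # too little power: the MDE is larger
        else:
            hi = mid  # enough power: the MDE is at most mid
    return hi


n = 500
found = mde_bisect(lambda d: power_two_arm(d, n))
# Closed form for this particular power function: (z_alpha/2 + z_power) * se.
closed = (STD_NORMAL.inv_cdf(0.975) + STD_NORMAL.inv_cdf(0.8)) * sqrt(4.0 / n)
print(round(found, 4), round(closed, 4))
```

Bisection only needs the power function to be monotone in the effect size, so the same search works for designs without a closed-form MDE.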

Parameters:

Name	Type	Description	Default
`design`	`str`	Research design (see `power`).	required
`n`	`int`	Sample size. For `'cluster_rct'` this is n_clusters.	`None`
`power_target`	`float`	Desired power (default 0.80).	`0.8`
`**kwargs`	`Any`	Design-specific parameters.	`{}`

Returns:

Type	Description
`PowerResult`	With `.effect_size` set to the MDE.

Examples:

>>> import statspai as sp
>>> sp.mde("did", n=1000, n_periods=10, n_treated_periods=5)

power_two_proportions ¶

power_two_proportions(n: Optional[ArrayLike] = None, p1: float = 0.5, p2: float = 0.5, *, ratio: float = 1.0, alpha: float = 0.05, alternative: str = 'two-sided', power_target: Optional[float] = None) -> PowerResult

Power (or sample size) to detect a difference between two proportions.

Parameters:

Name	Type	Description	Default
`n`	`int, array-like, or None`	Total sample size (both groups). Pass `None` together with `power_target` to solve for the smallest `n` achieving that power.	`None`
`p1`	`float`	Outcome probability in group 1 (reference).	`0.5`
`p2`	`float`	Outcome probability in group 2.	`0.5`
`ratio`	`float`	Allocation ratio `n2 / n1` (1.0 = equal allocation).	`1.0`
`alpha`	`float`	Significance level.	`0.05`
`alternative`	`('two-sided', 'one-sided')`	Test sidedness.	`"two-sided"`
`power_target`	`float`	Desired power; when supplied with `n=None` the function returns the required total sample size.	`None`

Returns:

Type	Description
`PowerResult`

Notes

Uses the unpooled-variance normal approximation to a two-sample test of proportions: power = Phi(|p1 - p2| / se - z_alpha) with se = sqrt(p1(1-p1)/n1 + p2(1-p2)/n2).
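
The formula in this note can be written out directly. The sketch below is an independent re-implementation, not the library's code: the function names are hypothetical, only the two-sided case is handled (so `z_alpha` is taken as the `1 - alpha/2` quantile), and rounding the total sample size up to the next integer is an assumption.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def two_prop_power(n, p1, p2, ratio=1.0, alpha=0.05):
    """power = Phi(|p1 - p2| / se - z_alpha) with the unpooled standard error."""
    n1 = n / (1.0 + ratio)  # ratio = n2 / n1
    n2 = n - n1
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return STD_NORMAL.cdf(abs(p1 - p2) / se - z_alpha)


def two_prop_total_n(p1, p2, power_target=0.8, ratio=1.0, alpha=0.05):
    """Solve the same formula for the total n and round up (assumed rounding)."""
    z_sum = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0) + STD_NORMAL.inv_cdf(power_target)
    variance_term = (1.0 + ratio) * (p1 * (1 - p1) + p2 * (1 - p2) / ratio)
    return ceil(z_sum ** 2 * variance_term / (p1 - p2) ** 2)


print(round(two_prop_power(400, 0.5, 0.65), 4))
print(two_prop_total_n(0.5, 0.65, power_target=0.8))
```

Both calls use the same inputs as the examples below, so their output can be compared with the documented results.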

Examples:

>>> import statspai as sp
>>> res = sp.power_two_proportions(n=400, p1=0.5, p2=0.65)
>>> isinstance(res, sp.PowerResult)
True
>>> res.design
'two_proportions'
>>> round(float(res.power), 4)
0.8665
>>> # Solve for the total sample size achieving 80% power.
>>> int(sp.power_two_proportions(p1=0.5, p2=0.65, power_target=0.8).n)
334

power_logrank ¶

power_logrank(n: Optional[ArrayLike] = None, hazard_ratio: float = 0.5, prob_event: float = 1.0, *, ratio: float = 1.0, alpha: float = 0.05, alternative: str = 'two-sided', power_target: Optional[float] = None) -> PowerResult

Power (or sample size) for a two-arm log-rank / survival comparison.

Implements the Schoenfeld (1983) formula: power depends on the number of observed events, D = n * prob_event, and the log hazard ratio.

Parameters:

Name	Type	Description	Default
`n`	`int, array-like, or None`	Total sample size. `None` + `power_target` solves for `n`.	`None`
`hazard_ratio`	`float`	Hazard ratio between the two arms (must be > 0, != 1).	`0.5`
`prob_event`	`float`	Probability that a randomly chosen subject is observed to have the event during follow-up (the overall event rate). `D = n*prob_event`.	`1.0`
`ratio`	`float`	Allocation ratio `n2 / n1`.	`1.0`
`alpha`	`float`	Significance level.	`0.05`
`alternative`	`('two-sided', 'one-sided')`	Test sidedness.	`"two-sided"`
`power_target`	`float`	Desired power; solve for `n` when supplied with `n=None`.	`None`

Returns:

Type	Description
`PowerResult`

Notes

With allocation share p = ratio/(1+ratio), the required number of events is D = (z_alpha + z_beta)^2 / (p(1-p) (ln HR)^2) and the power for a given D is Phi(sqrt(D p(1-p)) |ln HR| - z_alpha).
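
As a worked illustration of these two expressions, the following sketch evaluates them with the standard library. The function names are hypothetical, only the two-sided case is covered (`z_alpha` taken as the `1 - alpha/2` quantile), and converting events to subjects by dividing by `prob_event` and rounding up is an assumption about the final step.

```python
from math import ceil, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def logrank_power(n, hazard_ratio, prob_event=1.0, ratio=1.0, alpha=0.05):
    """Phi(sqrt(D p (1 - p)) |ln HR| - z_alpha) with D = n * prob_event."""
    events = n * prob_event
    p = ratio / (1.0 + ratio)  # allocation share
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return STD_NORMAL.cdf(sqrt(events * p * (1 - p)) * abs(log(hazard_ratio)) - z_alpha)


def logrank_total_n(hazard_ratio, prob_event=1.0, power_target=0.8, ratio=1.0, alpha=0.05):
    """Required events D = (z_alpha + z_beta)^2 / (p (1 - p) (ln HR)^2), then n = D / prob_event."""
    p = ratio / (1.0 + ratio)
    z_sum = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0) + STD_NORMAL.inv_cdf(power_target)
    events = z_sum ** 2 / (p * (1 - p) * log(hazard_ratio) ** 2)
    return ceil(events / prob_event)


print(round(logrank_power(300, 0.6, prob_event=0.7), 4))
print(logrank_total_n(0.6, prob_event=0.7, power_target=0.8))
```

With `hazard_ratio=0.6` and equal allocation, roughly 120 events are needed for 80% power, and dividing by the 0.7 event probability gives the total sample size.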

Examples:

>>> import statspai as sp
>>> res = sp.power_logrank(n=300, hazard_ratio=0.6, prob_event=0.7)
>>> res.design
'logrank'
>>> round(float(res.power), 4)
0.9592
>>> # Sample size for 80% power at a hazard ratio of 0.6.
>>> int(sp.power_logrank(hazard_ratio=0.6, prob_event=0.7,
...                      power_target=0.8).n)
172

power_case_control ¶

power_case_control(n_cases: Optional[ArrayLike] = None, odds_ratio: float = 2.0, exposure_prevalence: float = 0.3, *, ratio: float = 1.0, alpha: float = 0.05, alternative: str = 'two-sided', power_target: Optional[float] = None) -> PowerResult

Power (or number of cases) for an unmatched case-control study.

Parameters:

Name	Type	Description	Default
`n_cases`	`int, array-like, or None`	Number of cases. `None` + `power_target` solves for the number of cases.	`None`
`odds_ratio`	`float`	Exposure odds ratio to detect (must be > 0, != 1).	`2.0`
`exposure_prevalence`	`float`	Exposure prevalence among controls (the source-population exposure probability), in (0, 1).	`0.3`
`ratio`	`float`	Number of controls per case.	`1.0`
`alpha`	`float`	Significance level.	`0.05`
`alternative`	`('two-sided', 'one-sided')`	Test sidedness.	`"two-sided"`
`power_target`	`float`	Desired power; solve for the number of cases when `n_cases=None`.	`None`

Returns:

Type	Description
`PowerResult`

Notes

The control exposure prevalence p0 and the odds ratio imply a case exposure prevalence p1 = (OR p0) / (1 + p0 (OR - 1)). Power is then a two-proportion comparison between cases (n_cases) and controls (ratio * n_cases).
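
The two steps in this note, deriving the case exposure prevalence and then comparing two proportions, can be sketched as follows. The function names are hypothetical. The sketch assumes the unpooled-variance normal approximation and a two-sided test, and rounds the number of cases up to the next integer.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def case_exposure_prevalence(odds_ratio, p0):
    """p1 = (OR * p0) / (1 + p0 * (OR - 1))."""
    return odds_ratio * p0 / (1.0 + p0 * (odds_ratio - 1.0))


def case_control_power(n_cases, odds_ratio, p0, ratio=1.0, alpha=0.05):
    """Two-proportion power with n_cases cases and ratio * n_cases controls."""
    p1 = case_exposure_prevalence(odds_ratio, p0)
    se = sqrt(p1 * (1 - p1) / n_cases + p0 * (1 - p0) / (ratio * n_cases))
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return STD_NORMAL.cdf(abs(p1 - p0) / se - z_alpha)


def case_control_n_cases(odds_ratio, p0, power_target=0.8, ratio=1.0, alpha=0.05):
    """Invert the power formula for the number of cases (assumed to round up)."""
    p1 = case_exposure_prevalence(odds_ratio, p0)
    z_sum = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0) + STD_NORMAL.inv_cdf(power_target)
    variance_term = p1 * (1 - p1) + p0 * (1 - p0) / ratio
    return ceil(z_sum ** 2 * variance_term / (p1 - p0) ** 2)


print(round(case_exposure_prevalence(2.0, 0.3), 4))
print(round(case_control_power(200, 2.0, 0.3), 4))
print(case_control_n_cases(2.0, 0.3, power_target=0.8))
```

An odds ratio of 2.0 with 30% exposure among controls implies about 46% exposure among cases, and that gap drives the power calculation.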

Examples:

>>> import statspai as sp
>>> res = sp.power_case_control(n_cases=200, odds_ratio=2.0,
...                             exposure_prevalence=0.3)
>>> res.design
'case_control'
>>> round(float(res.power), 4)
0.9213
>>> # Number of cases for 80% power at an odds ratio of 2.0.
>>> int(sp.power_case_control(odds_ratio=2.0, exposure_prevalence=0.3,
...                           power_target=0.8).n)
138
