statspai.power
power
Power and sample size calculations for causal inference and epidemiological designs.
Supports RCT, DID, RD, IV, cluster RCT, and OLS — plus epidemiological study designs (two-proportion, log-rank/survival, case-control) — with power curves, minimum detectable effect (MDE), and sample-size solving.
PowerResult
Container for power analysis results.
Attributes:
| Name | Type | Description |
|---|---|---|
| `power` | float or ndarray | Computed power value(s). |
| `n` | int, float, or np.ndarray | Sample size(s) used. |
| `effect_size` | float or ndarray | Effect size(s) used. |
| `design` | str | Name of the research design. |
| `params` | dict | All parameters passed to the power function. |
plot
Plot the power curve.
Works when n or effect_size was supplied as an array / range.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `ax` | matplotlib Axes | | `None` |
| `figsize` | tuple | | `(8, 5)` |
| `**kwargs` | | Passed to `ax.plot`. | `{}` |
Returns:
| Type | Description |
|---|---|
| matplotlib Axes | |
power_rct
Power for a two-arm Randomised Controlled Trial.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `n` | int or array-like | Total sample size (treatment + control). | required |
| `effect_size` | float or array-like | Standardised effect size (delta / sigma). | required |
| `alpha` | float | Significance level (two-sided). | `0.05` |
| `ratio` | float | Treatment / control allocation ratio (1 = equal allocation). | `1.0` |
| `sigma` | float | Outcome standard deviation (default 1 for standardised effect). | `1.0` |
Returns:
| Type | Description |
|---|---|
| PowerResult | |
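As an illustration of what a two-arm power calculation involves, here is a minimal sketch using the textbook normal approximation for a difference in means. It is not taken from statspai's source: the function name `rct_power_sketch` is hypothetical, and it assumes `effect_size` is expressed in the same units as `sigma` (so it is standardised when `sigma=1`) and that `ratio` means treated / control.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def rct_power_sketch(n, effect_size, alpha=0.05, ratio=1.0, sigma=1.0):
    """Approximate two-sided power for a two-arm difference in means.

    Hypothetical helper: a normal-approximation sketch, not statspai's code.
    """
    # Split the total sample between arms; ratio = treated / control (assumption).
    n_control = n / (1.0 + ratio)
    n_treated = n - n_control
    # Standard error of the difference between the two arm means.
    se = sigma * sqrt(1.0 / n_treated + 1.0 / n_control)
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    # Probability of rejecting in the direction of the effect;
    # the negligible opposite-tail rejection probability is ignored.
    return _STD_NORMAL.cdf(abs(effect_size) / se - z_crit)


# 64 subjects per arm and an effect of half a standard deviation.
print(rct_power_sketch(n=128, effect_size=0.5))
```

Under these assumptions, unequal allocation at a fixed total `n` inflates the standard error, so power is highest at `ratio=1`.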
power_did
Power for Difference-in-Differences.
Follows Burlig, Preonas & Woerman (2020) accounting for serial correlation and the number of pre/post periods.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `n` | int or array-like | Total number of units (treated + control). | required |
| `effect_size` | float or array-like | Standardised effect size. | required |
| `n_periods` | int | Total number of time periods. | required |
| `n_treated_periods` | int | Number of post-treatment periods. | required |
| `rho` | float | First-order autocorrelation of errors (0–1). | `0.5` |
| `alpha` | float | Significance level (two-sided). | `0.05` |
| `sigma` | float | Error standard deviation. | `1.0` |
Returns:
| Type | Description |
|---|---|
| PowerResult | |
power_rd
power_rd(n, effect_size, bandwidth=None, kernel='triangular', density_at_cutoff=1.0, alpha=0.05, sigma=1.0)
Power for Regression Discontinuity designs.
Following Cattaneo, Titiunik & Vazquez-Bare (2019).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `n` | int or array-like | Total sample size in the data. | required |
| `effect_size` | float or array-like | Standardised effect size at the cutoff. | required |
| `bandwidth` | float or None | Bandwidth around the cutoff. If None, defaults to 0.5 (half the running-variable range on each side). | `None` |
| `kernel` | (triangular, uniform, epanechnikov) | Kernel used for local weighting. | `'triangular'` |
| `density_at_cutoff` | float | Estimated density of the running variable at the cutoff (default 1.0 for a uniform running variable on [0,1]). | `1.0` |
| `alpha` | float | Significance level. | `0.05` |
| `sigma` | float | Conditional outcome std dev near the cutoff. | `1.0` |
Returns:
| Type | Description |
|---|---|
| PowerResult | |
power_iv
Power for Instrumental Variables / 2SLS estimation.
Accounts for the power penalty from a weak first stage.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `n` | int or array-like | Sample size. | required |
| `effect_size` | float or array-like | Standardised effect size of the endogenous variable. | required |
| `first_stage_f` | float or None | First-stage F-statistic. If provided, power is adjusted for instrument weakness: `effective_power ~ power_ols * F / (F + 1)`. | `None` |
| `r2_z` | float or None | R-squared of the first-stage regression. Alternative to first_stage_f; if both are given, first_stage_f takes precedence. | `None` |
| `alpha` | float | Significance level. | `0.05` |
| `sigma` | float | Error standard deviation. | `1.0` |
Returns:
| Type | Description |
|---|---|
| PowerResult | |
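The weak-instrument adjustment quoted for `first_stage_f`, `effective_power ~ power_ols * F / (F + 1)`, can be sketched as follows. The baseline power used here is an assumption: a normal approximation with standard error `sigma / sqrt(n)`, which corresponds to a unit-variance regressor. The function names are hypothetical, and the handling of `r2_z` is left out because the page does not say how it is converted.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def baseline_power_sketch(n, effect_size, alpha=0.05, sigma=1.0):
    """Normal-approximation power for one coefficient (hypothetical helper).

    Assumes a unit-variance regressor, so the standard error is sigma / sqrt(n).
    """
    se = sigma / sqrt(n)
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return _STD_NORMAL.cdf(abs(effect_size) / se - z_crit)


def iv_power_sketch(n, effect_size, first_stage_f=None, alpha=0.05, sigma=1.0):
    """Apply the documented F / (F + 1) shrinkage to the baseline power."""
    base = baseline_power_sketch(n, effect_size, alpha=alpha, sigma=sigma)
    if first_stage_f is None:
        return base  # no first-stage information: no adjustment applied
    return base * first_stage_f / (first_stage_f + 1.0)


# A first-stage F of 10 shrinks the baseline power by a factor of 10/11.
print(iv_power_sketch(n=100, effect_size=0.3, first_stage_f=10.0))
```

The shrinkage factor approaches 1 as F grows, so a strong first stage leaves the baseline power almost unchanged.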
power_cluster_rct
Power for a Cluster-Randomised Controlled Trial.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `n_clusters` | int or array-like | Total number of clusters (treatment + control). | required |
| `cluster_size` | int or float | Average number of individuals per cluster. | required |
| `effect_size` | float or array-like | Standardised effect size. | required |
| `icc` | float | Intra-cluster correlation coefficient. | required |
| `alpha` | float | Significance level. | `0.05` |
| `sigma` | float | Individual-level outcome standard deviation. | `1.0` |
Returns:
| Type | Description |
|---|---|
| PowerResult | |
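To show how the intra-cluster correlation enters a calculation of this kind, the sketch below uses the standard design effect, 1 + (m - 1) * ICC, to deflate the total sample to an effective sample size. This is a textbook approximation and may differ from what `power_cluster_rct` does internally; the function name is hypothetical, and equal allocation of clusters to the two arms is assumed.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def cluster_rct_power_sketch(n_clusters, cluster_size, effect_size, icc,
                             alpha=0.05, sigma=1.0):
    """Approximate power for a cluster RCT via the design effect.

    Hypothetical helper; assumes equal allocation of clusters to arms.
    """
    # Variance inflation caused by correlated outcomes within a cluster.
    design_effect = 1.0 + (cluster_size - 1.0) * icc
    # Number of independent observations the clustered sample is "worth".
    n_effective = n_clusters * cluster_size / design_effect
    # Two equal arms of n_effective / 2 each: Var = sigma^2 * 4 / n_effective.
    se = sigma * sqrt(4.0 / n_effective)
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return _STD_NORMAL.cdf(abs(effect_size) / se - z_crit)


# Same 128 individuals, with and without within-cluster correlation.
print(cluster_rct_power_sketch(16, 8, 0.5, icc=0.0))
print(cluster_rct_power_sketch(16, 8, 0.5, icc=0.05))
```

With `icc=0` the design effect is 1 and the calculation reduces to an individually randomised trial of the same total size.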
power_ols
Power for OLS regression (single coefficient of interest).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `n` | int or array-like | Sample size. | required |
| `effect_size` | float or array-like | Standardised effect of the variable of interest. | required |
| `n_covariates` | int | Number of other covariates in the model. | `0` |
| `r2_other` | float | R-squared attributable to other covariates (reduces residual variance and thus improves power). | `0.0` |
| `alpha` | float | Significance level. | `0.05` |
| `sigma` | float | Outcome standard deviation. | `1.0` |
Returns:
| Type | Description |
|---|---|
| PowerResult | |
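The note on `r2_other` (covariates reduce residual variance and thus improve power) can be made concrete with a small sketch. It rests on assumptions that are not stated on this page: the regressor of interest has unit variance and is uncorrelated with the other covariates, and a normal approximation is used, so the degrees-of-freedom cost of `n_covariates` is ignored. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def ols_power_sketch(n, effect_size, r2_other=0.0, alpha=0.05, sigma=1.0):
    """Approximate power for one OLS coefficient (hypothetical helper)."""
    # Covariates that explain a share r2_other of the outcome variance
    # shrink the residual standard deviation by sqrt(1 - r2_other).
    residual_sd = sigma * sqrt(1.0 - r2_other)
    # Unit-variance regressor (assumption): se = residual_sd / sqrt(n).
    se = residual_sd / sqrt(n)
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return _STD_NORMAL.cdf(abs(effect_size) / se - z_crit)


# Explaining half of the outcome variance raises power at the same n.
print(ols_power_sketch(n=100, effect_size=0.2, r2_other=0.0))
print(ols_power_sketch(n=100, effect_size=0.2, r2_other=0.5))
```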
mde
Compute the Minimum Detectable Effect (MDE) for a given design.
Inverts the power function to find the smallest effect size that achieves power_target at the given sample size.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `design` | str | Research design (see :func: | required |
| `n` | int | Sample size. For | `None` |
| `power_target` | float | Desired power (default 0.80). | `0.8` |
| `**kwargs` | | Design-specific parameters. | `{}` |
Returns:
| Type | Description |
|---|---|
| PowerResult | With |
Examples:
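The inversion idea described above can be sketched with a simple bisection search. This is an illustration only, not statspai's implementation: both function names are hypothetical, and the power function being inverted is a textbook normal approximation for an equally allocated two-arm trial rather than any of the library's design functions.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def two_arm_power(n, effect_size, alpha=0.05):
    """Normal-approximation power, equal allocation, unit outcome variance."""
    se = sqrt(4.0 / n)
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return _STD_NORMAL.cdf(abs(effect_size) / se - z_crit)


def mde_sketch(power_fn, n, power_target=0.80, lo=0.0, hi=10.0, tol=1e-8):
    """Smallest effect in [lo, hi] whose power reaches power_target.

    Relies on power_fn being non-decreasing in the effect size on [lo, hi].
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if power_fn(n, mid) < power_target:
            lo = mid   # effect too small: not enough power yet
        else:
            hi = mid   # target reached: try a smaller effect
    return hi


# Minimum detectable effect with 128 subjects at 80% power.
print(mde_sketch(two_arm_power, n=128))
```

For this particular power function the answer has a closed form, (z_{1-alpha/2} + z_{power}) * se, which is a convenient check on the search.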
power_two_proportions
power_two_proportions(n: Optional[ArrayLike] = None, p1: float = 0.5, p2: float = 0.5, *, ratio: float = 1.0, alpha: float = 0.05, alternative: str = 'two-sided', power_target: Optional[float] = None) -> PowerResult
Power (or sample size) to detect a difference between two proportions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `n` | int, array-like, or None | Total sample size (both groups). Pass | `None` |
| `p1` | float | Outcome probabilities in group 1 (reference) and group 2. | `0.5` |
| `p2` | float | Outcome probabilities in group 1 (reference) and group 2. | `0.5` |
| `ratio` | float | Allocation ratio | `1.0` |
| `alpha` | float | Significance level. | `0.05` |
| `alternative` | ('two-sided', 'one-sided') | Test sidedness. | `"two-sided"` |
| `power_target` | float | Desired power; when supplied with | `None` |
Returns:
| Type | Description |
|---|---|
| PowerResult | |
Notes
Uses the unpooled-variance normal approximation to a two-sample test of
proportions: power = Phi(|p1 - p2| / se - z_alpha) with
se = sqrt(p1(1-p1)/n1 + p2(1-p2)/n2).
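The formula in the notes above translates directly into a few lines of Python. The sketch below is a transcription of that formula, not the library's code. The function name is hypothetical, and because the description of `ratio` is cut off on this page, the sketch assumes `ratio = n2 / n1`.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def two_proportions_power_sketch(n, p1, p2, ratio=1.0, alpha=0.05,
                                 alternative="two-sided"):
    """power = Phi(|p1 - p2| / se - z_alpha), unpooled variance."""
    # Split the total sample; ratio = n2 / n1 is an assumption of this sketch.
    n1 = n / (1.0 + ratio)
    n2 = n - n1
    # Unpooled standard error of the difference in sample proportions.
    se = sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)
    # Critical value: alpha/2 in each tail for two-sided, alpha for one-sided.
    tail = alpha / 2.0 if alternative == "two-sided" else alpha
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - tail)
    return _STD_NORMAL.cdf(abs(p1 - p2) / se - z_alpha)


# 200 subjects per group, 50% versus 65% outcome probability.
print(two_proportions_power_sketch(n=400, p1=0.5, p2=0.65))
```

A one-sided test uses a smaller critical value, so it yields higher power than the two-sided test for the same inputs.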
power_logrank
power_logrank(n: Optional[ArrayLike] = None, hazard_ratio: float = 0.5, prob_event: float = 1.0, *, ratio: float = 1.0, alpha: float = 0.05, alternative: str = 'two-sided', power_target: Optional[float] = None) -> PowerResult
Power (or sample size) for a two-arm log-rank / survival comparison.
Implements the Schoenfeld (1983) formula: power depends on the number of observed events, D = n * prob_event, and the log hazard ratio.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `n` | int, array-like, or None | Total sample size. | `None` |
| `hazard_ratio` | float | Hazard ratio between the two arms (must be > 0, != 1). | `0.5` |
| `prob_event` | float | Probability that a randomly chosen subject is observed to have the event during follow-up (the overall event rate). | `1.0` |
| `ratio` | float | Allocation ratio | `1.0` |
| `alpha` | float | Significance level. | `0.05` |
| `alternative` | ('two-sided', 'one-sided') | | `"two-sided"` |
| `power_target` | float | Desired power; solve for | `None` |
Returns:
| Type | Description |
|---|---|
| PowerResult | |
Notes
With allocation share p = ratio/(1+ratio), the required number of
events is D = (z_alpha + z_beta)^2 / (p(1-p) (ln HR)^2) and the power
for a given D is Phi(sqrt(D p(1-p)) |ln HR| - z_alpha).
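Both expressions in the notes above can be written out directly. The sketch below transcribes them for the two-sided case; it is not the library's code, and the function names are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def logrank_events_sketch(hazard_ratio, power_target=0.80, ratio=1.0, alpha=0.05):
    """Required events: D = (z_alpha + z_beta)^2 / (p (1 - p) (ln HR)^2)."""
    p = ratio / (1.0 + ratio)                      # allocation share
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    z_beta = _STD_NORMAL.inv_cdf(power_target)
    return (z_alpha + z_beta) ** 2 / (p * (1.0 - p) * log(hazard_ratio) ** 2)


def logrank_power_sketch(n, hazard_ratio, prob_event=1.0, ratio=1.0, alpha=0.05):
    """Power: Phi(sqrt(D p (1 - p)) |ln HR| - z_alpha), with D = n * prob_event."""
    p = ratio / (1.0 + ratio)
    d = n * prob_event                             # expected number of events
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return _STD_NORMAL.cdf(sqrt(d * p * (1.0 - p)) * abs(log(hazard_ratio)) - z_alpha)


# Events needed to detect HR = 0.5 with 80% power, then the implied sample
# size if only 60% of subjects are expected to have the event.
events = logrank_events_sketch(hazard_ratio=0.5)
print(events, events / 0.6)
```

Because power depends on `n` only through the number of events, a lower `prob_event` scales the required sample size up by `1 / prob_event`.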
power_case_control
power_case_control(n_cases: Optional[ArrayLike] = None, odds_ratio: float = 2.0, exposure_prevalence: float = 0.3, *, ratio: float = 1.0, alpha: float = 0.05, alternative: str = 'two-sided', power_target: Optional[float] = None) -> PowerResult
Power (or number of cases) for an unmatched case-control study.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `n_cases` | int, array-like, or None | Number of cases. | `None` |
| `odds_ratio` | float | Exposure odds ratio to detect (must be > 0, != 1). | `2.0` |
| `exposure_prevalence` | float | Exposure prevalence among controls (the source-population exposure probability), in (0, 1). | `0.3` |
| `ratio` | float | Number of controls per case. | `1.0` |
| `alpha` | float | Significance level. | `0.05` |
| `alternative` | ('two-sided', 'one-sided') | | `"two-sided"` |
| `power_target` | float | Desired power; solve for the number of cases when | `None` |
Returns:
| Type | Description |
|---|---|
| PowerResult | |
Notes
The control exposure prevalence p0 and the odds ratio imply a case
exposure prevalence p1 = (OR p0) / (1 + p0 (OR - 1)). Power is then a
two-proportion comparison between cases (n_cases) and controls
(ratio * n_cases).
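The two steps in the notes above (derive the case exposure prevalence, then compare two proportions) can be sketched as follows. The function names are hypothetical, only the two-sided test is covered, and the use of the unpooled standard error is an assumption carried over from the two-proportion notes rather than something this section states.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def case_exposure_prevalence(odds_ratio, p0):
    """p1 = (OR p0) / (1 + p0 (OR - 1)): exposure prevalence among cases."""
    return odds_ratio * p0 / (1.0 + p0 * (odds_ratio - 1.0))


def case_control_power_sketch(n_cases, odds_ratio, exposure_prevalence,
                              ratio=1.0, alpha=0.05):
    """Two-sided power for an unmatched case-control comparison."""
    p0 = exposure_prevalence
    p1 = case_exposure_prevalence(odds_ratio, p0)
    n_controls = ratio * n_cases                   # ratio = controls per case
    # Unpooled standard error (assumption, as in the two-proportion notes).
    se = sqrt(p1 * (1.0 - p1) / n_cases + p0 * (1.0 - p0) / n_controls)
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return _STD_NORMAL.cdf(abs(p1 - p0) / se - z_alpha)


# Defaults from the signature above: OR = 2, 30% exposure among controls.
print(case_exposure_prevalence(2.0, 0.3))
print(case_control_power_sketch(n_cases=150, odds_ratio=2.0, exposure_prevalence=0.3))
```

Adding controls per case raises power, but with diminishing returns, because the case term of the standard error is unaffected by `ratio`.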