statspai.survival¶
survival ¶
Survival and duration analysis models.
CoxResult ¶
Bases: EconometricResults
Result from Cox proportional hazards estimation.
Extends EconometricResults with survival-specific methods:
.plot(), .ph_test(), .baseline_hazard(), .concordance.
baseline_hazard ¶
Baseline cumulative hazard (Breslow estimator).
Returns:
| Type | Description |
|---|---|
| DataFrame | Columns |
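A minimal sketch of the Breslow estimator named above, in plain Python for a single covariate. The helper name `breslow_cumulative_hazard` and the toy data are illustrative assumptions; this is not statspai's implementation, and the coefficient `beta` is supplied by hand rather than estimated.

```python
import math

def breslow_cumulative_hazard(durations, events, x, beta):
    """Hypothetical helper: Breslow baseline cumulative hazard, one covariate.

    H0(t) = sum over event times t_j <= t of
            d_j / sum_{i at risk at t_j} exp(beta * x_i)
    """
    rel_risk = [math.exp(beta * xi) for xi in x]
    event_times = sorted({t for t, e in zip(durations, events) if e == 1})
    table, cum = [], 0.0
    for t in event_times:
        # Number of events at t and total relative risk of subjects still at risk.
        d = sum(1 for u, e in zip(durations, events) if u == t and e == 1)
        at_risk = sum(r for u, r in zip(durations, rel_risk) if u >= t)
        cum += d / at_risk
        table.append((t, cum))
    return table

# Toy data (assumed for illustration only).
durations = [1, 2, 3]
events = [1, 1, 1]
x = [1, 0, 0]

null_model = breslow_cumulative_hazard(durations, events, x, beta=0.0)
fitted = breslow_cumulative_hazard(durations, events, x, beta=math.log(2))
```

With `beta = 0` every subject has relative risk 1, so the estimator reduces to the Nelson-Aalen cumulative hazard, which is a convenient sanity check.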
ph_test ¶
Test the proportional hazards assumption via Schoenfeld residuals.
Computes the correlation of scaled Schoenfeld residuals with time for each covariate and reports a chi-squared test.
Returns:
| Type | Description |
|---|---|
| DataFrame | Columns: |
plot ¶
Plot survival-related curves.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| kind | str | | 'survival' |
| ax | Axes | | None |
KMResult ¶
Kaplan-Meier survival analysis result.
Attributes:
| Name | Type | Description |
|---|---|---|
| survival_table | DataFrame | Life table with |
| median_survival | float or dict | Median survival time (per group if groups present). |
plot ¶
Plot Kaplan-Meier survival curves with confidence bands.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ax | Axes | | None |
CumIncResult dataclass ¶
Cumulative incidence functions for competing risks.
Attributes:
| Name | Type | Description |
|---|---|---|
| cif_table | DataFrame | Long table with columns |
| causes | list | The competing-cause labels (excluding the |
| gray_test | dict or None | |
| alpha | float | Significance level used for the confidence bands. |
FineGrayResult dataclass ¶
Fine-Gray proportional subdistribution hazards model result.
Attributes:
| Name | Type | Description |
|---|---|---|
| params | ndarray | Estimated coefficients (log subdistribution hazard ratios). |
| bse | ndarray | Standard errors (model-based, from the inverse information). |
| covariates | list of str | Covariate names aligned with |
| cause | int | The cause of interest whose subdistribution was modelled. |
| n_obs, n_events | int | Sample size and number of cause-of-interest events. |
cox ¶
cox(formula: str = None, data: DataFrame = None, duration: str = None, event: str = None, x: list = None, ties: str = 'efron', strata: str = None, robust: str = 'nonrobust', cluster: str = None, hazard_ratio: bool = True, alpha: float = 0.05) -> CoxResult
Cox Proportional Hazards model via partial likelihood.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| formula | str | Formula of the form | None |
| data | DataFrame | Input data. | None |
| duration | str | Column name for follow-up time (overrides formula LHS). | None |
| event | str | Column name for event indicator (1 = event, 0 = censored). | None |
| x | list of str | Covariate column names (overrides formula RHS). | None |
| ties | str | Tie-handling method: | 'efron' |
| strata | str | Column name for stratification variable. | None |
| robust | str | | 'nonrobust' |
| cluster | str | Column name for cluster-robust SE. | None |
| hazard_ratio | bool | If True, report hazard ratios in the summary alongside coefficients. | True |
| alpha | float | Significance level for confidence intervals. | 0.05 |
Returns:
| Type | Description |
|---|---|
| CoxResult | Result object extending |
Examples:
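A minimal sketch of the Cox log partial likelihood for one covariate, showing how the Efron and Breslow tie approximations differ. The function `cox_loglik` and the toy data are assumptions made for illustration; this is not statspai's code, and it only evaluates the likelihood at a given `beta` rather than maximising it.

```python
import math

def cox_loglik(beta, durations, events, x, ties="efron"):
    """Hypothetical helper: log partial likelihood for a single covariate."""
    ll = 0.0
    for t in sorted({t for t, e in zip(durations, events) if e == 1}):
        # Relative risks of everyone still at risk at t.
        risk_sum = sum(math.exp(beta * xi)
                       for u, xi in zip(durations, x) if u >= t)
        # Covariate values of subjects who fail exactly at t.
        dead = [xi for u, e, xi in zip(durations, events, x)
                if u == t and e == 1]
        d = len(dead)
        dead_sum = sum(math.exp(beta * xi) for xi in dead)
        ll += beta * sum(dead)
        for l in range(d):
            # Efron removes a fraction l/d of the tied failures from the
            # denominator; Breslow keeps the full risk set for every tie.
            frac = l / d if ties == "efron" else 0.0
            ll -= math.log(risk_sum - frac * dead_sum)
    return ll

# Toy data with two tied failures at time 2 (assumed for illustration).
durations = [1, 2, 2, 3]
events = [1, 1, 1, 0]
x = [1, 0, 1, 0]

ll_efron = cox_loglik(0.0, durations, events, x, ties="efron")
ll_breslow = cox_loglik(0.0, durations, events, x, ties="breslow")
```

Without tied event times the two approximations coincide. Going by the signature above, a call to the library itself would pass `data`, `duration`, `event`, and `x` (or a formula) along with `ties`.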
kaplan_meier ¶
kaplan_meier(data: DataFrame, duration: str, event: str, group: str = None, alpha: float = 0.05) -> KMResult
Kaplan-Meier non-parametric survival function estimator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | Input data. | required |
| duration | str | Column name for duration / follow-up time. | required |
| event | str | Column name for event indicator (1 = event, 0 = censored). | required |
| group | str | Column name for group variable (stratification). | None |
| alpha | float | Significance level for confidence intervals (Greenwood formula). | 0.05 |
Returns:
| Type | Description |
|---|---|
| KMResult | Object with |
Examples:
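A minimal sketch of the Kaplan-Meier product-limit estimator with Greenwood standard errors, written in plain Python. The helper `km_curve` and the toy data are assumptions for illustration and do not reproduce the layout of `KMResult.survival_table`.

```python
import math

def km_curve(durations, events):
    """Hypothetical helper: [(time, survival, greenwood_se)] per event time."""
    rows = []
    surv = 1.0
    var_sum = 0.0  # running Greenwood sum of d / (n * (n - d))
    for t in sorted({t for t, e in zip(durations, events) if e == 1}):
        n_at_risk = sum(1 for u in durations if u >= t)
        d = sum(1 for u, e in zip(durations, events) if u == t and e == 1)
        surv *= 1.0 - d / n_at_risk
        if n_at_risk > d:  # the term is undefined once everyone at risk fails
            var_sum += d / (n_at_risk * (n_at_risk - d))
        rows.append((t, surv, surv * math.sqrt(var_sum)))
    return rows

# Toy data: 1 = event, 0 = censored (assumed for illustration).
durations = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]

curve = km_curve(durations, events)
# Median survival: first event time at which the curve drops to 0.5 or below.
median = next((t for t, s, _ in curve if s <= 0.5), None)
```

On this data the curve steps through 5/6, 2/3, 4/9, and 0, so the median survival time is 3.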
survreg ¶
survreg(formula: str = None, data: DataFrame = None, duration: str = None, event: str = None, x: list = None, dist: str = 'weibull', robust: str = 'nonrobust', cluster: str = None, alpha: float = 0.05) -> EconometricResults
Parametric survival model (AFT parameterization).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| formula | str | Formula | None |
| data | DataFrame | | None |
| duration | str | Follow-up time column (or formula LHS). | None |
| event | str | Event indicator column. | None |
| x | list of str | Covariate columns (or formula RHS). | None |
| dist | str | Distribution: | 'weibull' |
| robust | str | | 'nonrobust' |
| cluster | str | | None |
| alpha | float | | 0.05 |
Returns:
| Type | Description |
|---|---|
| EconometricResults | Fitted parametric survival model. Parameters include covariates and |
Examples:
logrank_test ¶
Log-rank test for equality of survival distributions across groups.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | | required |
| duration | str | Column names. | required |
| event | str | Column names. | required |
| group | str | Column names. | required |
Returns:
| Type | Description |
|---|---|
| dict | Keys: |
Examples:
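A minimal sketch of the two-group log-rank statistic in plain Python. The helper `logrank_two_groups` and the toy data are assumptions for illustration; the sketch returns a plain tuple and makes no claim about the keys of the dict that `logrank_test` returns.

```python
import math

def logrank_two_groups(durations, events, groups):
    """Hypothetical helper: (chi-squared statistic, p-value) for two groups."""
    labels = sorted(set(groups))
    if len(labels) != 2:
        raise ValueError("this sketch handles exactly two groups")
    first = labels[0]
    observed = expected = variance = 0.0
    for t in sorted({t for t, e in zip(durations, events) if e == 1}):
        n = sum(1 for u in durations if u >= t)
        n1 = sum(1 for u, g in zip(durations, groups) if u >= t and g == first)
        d = sum(1 for u, e in zip(durations, events) if u == t and e == 1)
        d1 = sum(1 for u, e, g in zip(durations, events, groups)
                 if u == t and e == 1 and g == first)
        observed += d1
        expected += d * n1 / n
        if n > 1:  # hypergeometric variance is zero with one subject at risk
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = (observed - expected) ** 2 / variance
    # Survival function of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Toy data: group "a" fails early, group "b" fails late (assumed).
durations = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 1, 1]
groups = ["a", "a", "a", "b", "b", "b"]

chi2, p_value = logrank_two_groups(durations, events, groups)
```

The statistic compares the observed number of events in the first group with the number expected if both groups shared one survival distribution.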
cox_frailty ¶
cox_frailty(formula: str, data: DataFrame, cluster: str, alpha: float = 0.05, maxiter: int = 50, tol: float = 1e-06) -> FrailtyResult
Cox proportional hazards with shared gamma frailty.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| formula | str | | required |
| data | DataFrame | | required |
| cluster | str | Column identifying clusters (e.g. hospital, site). | required |
causal_survival_forest ¶
causal_survival_forest(data: DataFrame, time: str, event: str, treat: str, covariates: Sequence[str], horizon: Optional[float] = None, n_trees: int = 200, min_leaf: int = 5, max_depth: Optional[int] = None, propensity_bounds: tuple = (0.05, 0.95), random_state: int = 42, alpha: float = 0.05) -> CausalSurvivalForestResult
Fit a causal survival forest and return the RMST ATE plus CATE.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | | required |
| time | str | Observed time-to-event column (min of true event time and censoring time). | required |
| event | str | Event indicator (1 = event observed, 0 = censored). | required |
| treat | str | Binary treatment indicator. | required |
| covariates | sequence of str | | required |
| horizon | float | RMST horizon tau. Defaults to the 80th percentile of observed times. | None |
| n_trees | int | Number of trees in the forest. | 200 |
| min_leaf | int | Minimum samples per leaf. | 5 |
| max_depth | int | Maximum tree depth. | None |
| propensity_bounds | tuple | Clip estimated propensity for stability. | (0.05, 0.95) |
| random_state | int | | 42 |
| alpha | float | | 0.05 |
Returns:
| Type | Description |
|---|---|
| CausalSurvivalForestResult | |
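To make the target quantity concrete, here is a minimal sketch of the restricted mean survival time (RMST) up to a horizon tau, computed as the area under a Kaplan-Meier curve. The helper `rmst` and the toy data are assumptions for illustration. The unadjusted difference between arms shown here is not what the forest computes; it only illustrates what an RMST contrast measures.

```python
def rmst(durations, events, tau):
    """Hypothetical helper: area under the Kaplan-Meier curve on [0, tau]."""
    area, surv, last_t = 0.0, 1.0, 0.0
    for t in sorted({t for t, e in zip(durations, events) if e == 1}):
        if t >= tau:
            break
        # Add the rectangle under the current step before the curve drops.
        area += surv * (t - last_t)
        n = sum(1 for u in durations if u >= t)
        d = sum(1 for u, e in zip(durations, events) if u == t and e == 1)
        surv *= 1.0 - d / n
        last_t = t
    # Final rectangle from the last drop up to the horizon.
    area += surv * (tau - last_t)
    return area

# Toy arms (assumed for illustration): 1 = event, 0 = censored.
treated_t, treated_e = [1, 2, 2, 3, 4, 5], [1, 1, 0, 1, 0, 1]
control_t, control_e = [1, 2, 3], [1, 1, 1]

tau = 4
rmst_treated = rmst(treated_t, treated_e, tau)
rmst_control = rmst(control_t, control_e, tau)
naive_difference = rmst_treated - rmst_control
```

The horizon is fixed by hand here; the function itself defaults `horizon` to the 80th percentile of observed times.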
cuminc ¶
cuminc(data: DataFrame, duration: str, event: str, group: Optional[str] = None, alpha: float = 0.05) -> CumIncResult
Cumulative incidence functions for competing risks (Aalen-Johansen).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | Input data. | required |
| duration | str | Column name for the follow-up time. | required |
| event | str | Column name for the event indicator. | required |
| group | str | Column name for a grouping variable. When supplied, CIFs are estimated per group and Gray's K-sample test is reported per cause. | None |
| alpha | float | Significance level for the confidence bands. | 0.05 |
Returns:
| Type | Description |
|---|---|
| CumIncResult | With |
Notes
The cumulative incidence for a single cause is not 1 - KM applied to
that cause; treating competing events as censoring over-states the risk.
The Aalen-Johansen estimator weights each cause-specific increment by the
overall (all-cause) survival probability, so the CIFs of all causes plus
the overall survival sum to one at every time.
Examples:
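A minimal sketch of the Aalen-Johansen estimator described in the Notes, in plain Python. The helpers `aalen_johansen` and `one_minus_km`, the event coding (0 = censored, k >= 1 = cause k), and the toy data are assumptions for illustration, not statspai's implementation.

```python
def aalen_johansen(durations, events):
    """Hypothetical helper: [(time, overall_survival, {cause: CIF})]."""
    causes = sorted({e for e in events if e != 0})
    times = sorted({t for t, e in zip(durations, events) if e != 0})
    surv_prev = 1.0  # all-cause survival just before the current time
    cif = {k: 0.0 for k in causes}
    out = []
    for t in times:
        n = sum(1 for u in durations if u >= t)
        d_all = 0
        for k in causes:
            d_k = sum(1 for u, e in zip(durations, events) if u == t and e == k)
            # Cause-specific increment weighted by all-cause survival.
            cif[k] += surv_prev * d_k / n
            d_all += d_k
        surv_prev *= 1.0 - d_all / n
        out.append((t, surv_prev, dict(cif)))
    return out

def one_minus_km(durations, events, cause):
    """Naive estimate that treats competing events as censoring."""
    surv = 1.0
    for t in sorted({t for t, e in zip(durations, events) if e == cause}):
        n = sum(1 for u in durations if u >= t)
        d = sum(1 for u, e in zip(durations, events) if u == t and e == cause)
        surv *= 1.0 - d / n
    return 1.0 - surv

# Toy data with two causes and one censored subject (assumed).
durations = [1, 2, 3, 4, 5, 6]
events = [1, 2, 0, 1, 2, 1]

rows = aalen_johansen(durations, events)
final_cif = rows[-1][2]
naive = one_minus_km(durations, events, cause=1)
```

On this data the cause-1 CIF ends at 11/18 while the naive 1 - KM estimate reaches 1.0, which illustrates the over-statement described in the Notes.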
finegray ¶
finegray(data: DataFrame, duration: str, event: str, x: Sequence[str], cause: int = 1, alpha: float = 0.05, max_iter: int = 50, tol: float = 1e-07) -> FineGrayResult
Fine & Gray (1999) proportional subdistribution hazards model.
Models the effect of covariates on the cumulative incidence of the cause
of interest through its subdistribution hazard. Coefficients exponentiate
to subdistribution hazard ratios, which map monotonically to the CIF
(unlike cause-specific Cox coefficients).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | Input data. | required |
| duration | str | Follow-up-time column. | required |
| event | str | Event indicator: | required |
| x | sequence of str | Covariate column names. | required |
| cause | int | Cause of interest (default | 1 |
| alpha | float | Significance level for confidence intervals. | 0.05 |
| max_iter | int | Maximum number of Newton-Raphson iterations. | 50 |
| tol | float | Convergence tolerance for Newton-Raphson. | 1e-07 |
Returns:
| Type | Description |
|---|---|
| FineGrayResult | With |
Notes
Subjects who fail from a competing cause at time T_i are retained in the
risk set afterwards with time-decaying inverse-probability-of-censoring
weights w_i(t) = Ĝ(t) / Ĝ(T_i), where Ĝ is the Kaplan-Meier estimate of the
censoring survival function.
The weighted partial likelihood is maximised by Newton-Raphson with the
Breslow tie approximation. Standard errors are model-based (inverse
information); a fully robust sandwich variance that accounts for
estimating Ĝ is not yet implemented.
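A minimal sketch of the weights described in the Notes, in plain Python. The helpers `censoring_km` and `fg_weight`, the event coding (0 = censored, 1 = cause of interest, 2 = competing cause), and the toy data are assumptions for illustration. Ĝ is evaluated as a right-continuous step function, following the formula as written above; the library's exact tie and left-limit conventions are not documented here.

```python
def censoring_km(durations, events):
    """Hypothetical helper: KM estimate G of the censoring survival.

    Censoring (event == 0) plays the role of the 'event' in this curve.
    """
    steps = []  # (time, value of G from that time onward)
    g = 1.0
    for t in sorted({t for t, e in zip(durations, events) if e == 0}):
        n = sum(1 for u in durations if u >= t)
        c = sum(1 for u, e in zip(durations, events) if u == t and e == 0)
        g *= 1.0 - c / n
        steps.append((t, g))

    def G(t):
        value = 1.0
        for s, g_s in steps:
            if s <= t:
                value = g_s
        return value

    return G

def fg_weight(t, T_i, event_i, cause, G):
    """Weight of subject i in the Fine-Gray risk set at time t."""
    if T_i >= t:
        return 1.0  # still under observation: full weight
    if event_i not in (0, cause):
        return G(t) / G(T_i)  # competing event before t: decaying weight
    return 0.0  # censored or failed from the cause of interest before t

# Toy data (assumed): censoring at times 3 and 5.
durations = [1, 2, 3, 4, 5, 6]
events = [2, 1, 0, 1, 0, 1]

G = censoring_km(durations, events)
w_at_4 = fg_weight(4, T_i=1, event_i=2, cause=1, G=G)
w_at_6 = fg_weight(6, T_i=1, event_i=2, cause=1, G=G)
```

The subject who failed from the competing cause at time 1 stays in the risk set with weight 0.75 at time 4 and 0.375 at time 6, shrinking as censoring accumulates.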