statspai.proximal
proximal
Proximal Causal Inference (Tchetgen Tchetgen et al. 2020).
Identifies the ATE in the presence of an unmeasured confounder U, using two proxies of U:
- Z: "treatment-inducing confounding proxy" (independent of Y | D, U)
- W: "outcome-inducing confounding proxy" (independent of D | U)

plus measured covariates X.
ProximalCausalInference
Class wrapper for proximal().
NegativeControlResult (dataclass)
Unified result for negative-control procedures.
ProxyScoreResult (dataclass)
Per-candidate proxy score for PCI.
proximal
proximal(data: DataFrame, y: str, treat: str, proxy_z: List[str], proxy_w: List[str], covariates: Optional[List[str]] = None, bridge: str = 'linear', n_boot: int = 0, alpha: float = 0.05, seed: Optional[int] = None) -> CausalResult
Proximal causal inference via linear 2SLS on the outcome bridge.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | | required |
| y | str | Outcome variable. | required |
| treat | str | Treatment variable (binary or continuous). | required |
| proxy_z | list of str | Treatment-inducing confounding proxy variable(s) (Z). These serve as instruments for the outcome proxy W. | required |
| proxy_w | list of str | Outcome-inducing confounding proxy variable(s) (W). Endogenous regressors in the linear bridge. | required |
| covariates | list of str | Measured baseline covariates X (exogenous controls). | None |
| bridge | str | Functional form of the outcome-confounding bridge. Only 'linear' is currently supported. Kernel-based bridges (Mastouri et al. 2021) and sieve/RKHS non-parametric bridges (Deaner 2018) are planned for a future release and will then be accepted values of this argument. Passing any other string raises an error. | 'linear' |
| n_boot | int | If > 0, use a nonparametric bootstrap SE (resampling rows; not cluster-robust). If 0, use the closed-form 2SLS sandwich SE (homoskedastic). | 0 |
| alpha | float | | 0.05 |
| seed | int | | None |

Returns:
| Type | Description |
|---|---|
| CausalResult | |
Examples:
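A minimal NumPy sketch of linear 2SLS on the outcome bridge, run on simulated data. It illustrates the technique only and is not statspai's implementation: the helper `two_sls`, the data-generating process, and every variable name are assumptions made for this example.

```python
import numpy as np


def two_sls(y, exog, endog, instruments):
    """2SLS with homoskedastic standard errors (illustrative helper).

    exog        : exogenous regressors, including the intercept
    endog       : endogenous regressors (here the outcome proxy W)
    instruments : excluded instruments (here the treatment proxy Z)
    Coefficients are returned in the order [exog..., endog...].
    """
    X = np.column_stack([exog, endog])        # second-stage regressors
    Q = np.column_stack([exog, instruments])  # full instrument matrix
    # First stage: project the regressors onto the instrument space.
    X_hat = Q @ np.linalg.lstsq(Q, X, rcond=None)[0]
    # Second stage: regress y on the projected regressors.
    beta = np.linalg.lstsq(X_hat, y, rcond=None)[0]
    resid = y - X @ beta                      # residuals use the original X
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X_hat.T @ X_hat)
    return beta, np.sqrt(np.diag(cov))


# Simulated data (assumed DGP): U confounds D and Y; Z and W are noisy proxies of U.
rng = np.random.default_rng(0)
n = 20_000
u = rng.normal(size=n)                            # unmeasured confounder
z = u + rng.normal(size=n)                        # treatment-inducing proxy
w = u + rng.normal(size=n)                        # outcome-inducing proxy
d = (u + rng.normal(size=n) > 0).astype(float)    # binary treatment
y = 2.0 * d + 1.5 * u + rng.normal(size=n)        # true ATE is 2.0

ones = np.ones(n)
exog = np.column_stack([ones, d])

# Naive OLS of y on d ignores U and is biased.
naive = np.linalg.lstsq(exog, y, rcond=None)[0][1]

# Outcome bridge: y on (1, d, w), with z instrumenting for w.
beta, se = two_sls(y, exog, w, z)
print(f"naive OLS: {naive:.3f}")
print(f"2SLS ATE : {beta[1]:.3f} (SE {se[1]:.3f})")
```

Going by the signature above, the analogous library call on a DataFrame holding these columns would be `proximal(df, y="y", treat="d", proxy_z=["z"], proxy_w=["w"])`.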
negative_control_outcome
negative_control_outcome(data: DataFrame, nco: str, treat: str, covariates: Optional[Sequence[str]] = None, alpha: float = 0.05) -> NegativeControlResult
Lipsitch-style NCO calibration.
Fit an OLS of the negative-control outcome nco on treat
and optional covariates. A coefficient significantly different
from zero signals residual confounding that the measured covariates
failed to control for.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | | required |
| nco | str | Negative-control outcome: a variable plausibly unaffected by the true treatment but sharing confounders with the real Y. | required |
| treat | str | Treatment indicator or exposure variable. | required |
| covariates | sequence of str | Measured confounders to condition on. | None |
| alpha | float | | 0.05 |

Returns:
| Type | Description |
|---|---|
| NegativeControlResult | |
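The check described above can be sketched with plain NumPy: regress the negative-control outcome on the treatment and test whether the treatment coefficient differs from zero. The helper `ols_coef_test`, the simulated data, and the normal-approximation p-value are assumptions of this sketch, not the library's code.

```python
import math

import numpy as np


def ols_coef_test(outcome, regressors):
    """OLS with an intercept; returns (coef, se, p) for the first regressor."""
    X = np.column_stack([np.ones(len(outcome)), regressors])
    beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    resid = outcome - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = math.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    z_stat = beta[1] / se
    # Two-sided p-value from the normal approximation.
    p = math.erfc(abs(z_stat) / math.sqrt(2))
    return beta[1], se, p


# Simulated data (assumed DGP): U drives both treatment and the NCO,
# while the treatment has no causal effect on the NCO.
rng = np.random.default_rng(1)
n = 5_000
u = rng.normal(size=n)
treat = (u + rng.normal(size=n) > 0).astype(float)
nco = 0.8 * u + rng.normal(size=n)

# Confounder omitted: the treatment coefficient picks up the path through U.
coef_raw, se_raw, p_raw = ols_coef_test(nco, treat)

# Confounder included as a covariate: the coefficient shrinks toward zero.
coef_adj, se_adj, p_adj = ols_coef_test(nco, np.column_stack([treat, u]))

print(f"unadjusted: coef={coef_raw:.3f}, p={p_raw:.3g}")
print(f"adjusted  : coef={coef_adj:.3f}, p={p_adj:.3g}")
```

A clearly non-zero unadjusted coefficient is the signal of residual confounding; it should fade once the relevant confounder is controlled for.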
negative_control_exposure
negative_control_exposure(data: DataFrame, y: str, nce: str, covariates: Optional[Sequence[str]] = None, alpha: float = 0.05) -> NegativeControlResult
Regress outcome on a negative-control exposure.
A significant coefficient on nce, which by design is assumed not to causally affect y, indicates residual confounding along the exposure axis (selection, measurement error, etc.).
double_negative_control
double_negative_control(data: DataFrame, y: str, treat: str, nce: str, nco: str, covariates: Optional[Sequence[str]] = None, alpha: float = 0.05) -> NegativeControlResult
Double negative control estimator (Miao et al. 2018; Shi et al. 2020).
Under the linear / index model:

    Y   = α0 + α_D D + α_U U + α_X X + ε_Y
    NCO = β0 + β_U U + β_X X + ε_W
    E[U | NCE, X, D] linear in (NCE, X, D)

(plus standard independence/exclusion conditions), the ATE is point-identified by IV-regressing Y on (D, NCO, X) using (D, NCE, X) as instruments: NCE instruments for the proxy NCO, breaking the dependence on U. The coefficient on D is the de-biased ATE.
This is implemented as a just-identified 2SLS. Under the assumptions above, the fitted ATE is asymptotically unbiased and agrees with the closed form in Shi et al. (2020, §3).
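To see why the coefficient on D is the quantity of interest, one can solve the NCO equation for U and substitute it into the outcome equation. The derivation below is a sketch that additionally assumes a scalar U and β_U ≠ 0, neither of which is stated explicitly above.

```latex
\begin{align*}
U &= \frac{\mathrm{NCO} - \beta_0 - \beta_X X - \varepsilon_W}{\beta_U}
  && (\beta_U \neq 0) \\[4pt]
Y &= \Big(\alpha_0 - \tfrac{\alpha_U}{\beta_U}\,\beta_0\Big)
   + \alpha_D D
   + \tfrac{\alpha_U}{\beta_U}\,\mathrm{NCO}
   + \Big(\alpha_X - \tfrac{\alpha_U}{\beta_U}\,\beta_X\Big) X
   + \Big(\varepsilon_Y - \tfrac{\alpha_U}{\beta_U}\,\varepsilon_W\Big)
\end{align*}
```

The composite error contains ε_W, which is correlated with NCO, so OLS of Y on (D, NCO, X) is inconsistent. Instrumenting NCO with NCE removes that correlation, provided NCE is unrelated to ε_Y and ε_W, and the coefficient on D in the substituted equation is still α_D.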
proximal_regression
proximal_regression(data: DataFrame, y: str, treat: str, z_proxy: str, w_proxy: str, covariates: Optional[Sequence[str]] = None, alpha: float = 0.05, propensity_bounds: tuple = (0.02, 0.98)) -> ProximalRegResult
Doubly-robust regression-based PCI estimator for the ATE.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | | required |
| y | str | Outcome column. | required |
| treat | str | Binary treatment column. | required |
| z_proxy | str | Treatment-inducing confounding proxy Z. | required |
| w_proxy | str | Outcome-inducing confounding proxy W. | required |
| covariates | sequence of str | Measured covariates X. | None |
| alpha | float | | 0.05 |
| propensity_bounds | (float, float) | | (0.02, 0.98) |

Returns:
| Type | Description |
|---|---|
| ProximalRegResult | |
fortified_pci
fortified_pci(data: DataFrame, y: str, treat: str, proxy_z: List[str], proxy_w: List[str], covariates: Optional[List[str]] = None, alpha: float = 0.05, n_boot: int = 200, seed: int = 0) -> CausalResult
Fortified Proximal Causal Inference (doubly-robust PCI).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | | required |
| y | str | | required |
| treat | str | | required |
| proxy_z | list of str | Treatment-side proxies (instruments for W). | required |
| proxy_w | list of str | Outcome-side proxies (endogenous bridge regressors). | required |
| covariates | list of str | | None |
| alpha | float | | 0.05 |
| n_boot | int | Bootstrap reps for SE. | 200 |
| seed | int | | 0 |

Returns:
| Type | Description |
|---|---|
| CausalResult | ATE estimate that is doubly robust to bridge / outcome misspecification. |
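The n_boot argument sets the number of bootstrap replications used for the standard error. A minimal sketch of a nonparametric row bootstrap for a generic estimator follows. It assumes rows are resampled independently with replacement, as described for proximal's n_boot; whether fortified_pci resamples in exactly this way is an assumption. The helper `bootstrap_se` and the stand-in estimator `slope` are hypothetical names.

```python
import numpy as np


def bootstrap_se(estimator, data, n_boot=200, seed=0):
    """Nonparametric bootstrap SE: resample rows with replacement."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # row indices drawn with replacement
        draws[b] = estimator(data[idx])
    return draws.std(ddof=1)


def slope(data):
    """OLS slope of column 0 on column 1 (stand-in for a PCI estimator)."""
    y, x = data[:, 0], data[:, 1]
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


# Simulated data: y = x + noise, so the slope is 1 by construction.
rng = np.random.default_rng(3)
n = 2_000
x = rng.normal(size=n)
y = 1.0 * x + rng.normal(size=n)
data = np.column_stack([y, x])

se = bootstrap_se(slope, data, n_boot=200, seed=0)
print(f"slope = {slope(data):.3f}, bootstrap SE = {se:.4f}")
```

Fixing seed makes the resampling reproducible, which is why the bootstrap-based functions on this page default it to an integer.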
References
Yu, Shi & Tchetgen Tchetgen (2025). Fortified Proximal Causal Inference with Many Invalid Proxies. arXiv 2506.13152. [@yu2025fortified]
bidirectional_pci
bidirectional_pci(data: DataFrame, y: str, treat: str, proxy_z: List[str], proxy_w: List[str], covariates: Optional[List[str]] = None, alpha: float = 0.05, n_boot: int = 200, seed: int = 0) -> CausalResult
Bidirectional PCI: simultaneous outcome + treatment bridge.
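Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | Same as the argument of the same name in the PCI functions above. | required |
| y | str | Same as the argument of the same name in the PCI functions above. | required |
| treat | str | Same as the argument of the same name in the PCI functions above. | required |
| proxy_z | list of str | Same as the argument of the same name in the PCI functions above. | required |
| proxy_w | list of str | Same as the argument of the same name in the PCI functions above. | required |
| covariates | list of str | Same as the argument of the same name in the PCI functions above. | None |
| alpha | float | | 0.05 |
| n_boot | int | | 200 |
| seed | int | | 0 |

Returns:
| Type | Description |
|---|---|
| CausalResult | ATE estimate from the bidirectional moment condition. |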
pci_mtp
pci_mtp(data: DataFrame, y: str, treat: str, proxy_z: List[str], proxy_w: List[str], delta: float, covariates: Optional[List[str]] = None, alpha: float = 0.05, n_boot: int = 200, seed: int = 0) -> CausalResult
PCI for Modified Treatment Policies (continuous-shift effect).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | | required |
| y | str | Outcome variable. | required |
| treat | str | Continuous treatment variable. | required |
| proxy_z | list of str | Standard PCI proxies (treatment side, Z). | required |
| proxy_w | list of str | Standard PCI proxies (outcome side, W). | required |
| delta | float | MTP shift; estimand is E[Y(D + δ)] - E[Y(D)]. | required |
| covariates | list of str | | None |
| alpha | float | | 0.05 |
| n_boot | int | | 200 |
| seed | int | | 0 |

Returns:
| Type | Description |
|---|---|
| CausalResult | |
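For intuition about the estimand E[Y(D + δ)] - E[Y(D)]: if the outcome bridge is linear in the treatment, the shift effect reduces to δ times the bridge coefficient on D. The sketch below illustrates that special case on simulated data with a just-identified IV solve. It is not pci_mtp's implementation; the linear-bridge assumption, the data-generating process, and all variable names are assumptions of this example.

```python
import numpy as np

# Simulated data (assumed DGP): continuous treatment confounded by U.
rng = np.random.default_rng(2)
n = 20_000
u = rng.normal(size=n)                        # unmeasured confounder
z = u + rng.normal(size=n)                    # treatment-side proxy
w = u + rng.normal(size=n)                    # outcome-side proxy
d = 0.7 * u + rng.normal(size=n)              # continuous treatment
y = 1.2 * d + 1.5 * u + rng.normal(size=n)    # dose-response slope is 1.2

ones = np.ones(n)
X = np.column_stack([ones, d, w])             # bridge regressors; w is endogenous
Q = np.column_stack([ones, d, z])             # instruments; z stands in for w

# Just-identified IV estimator: solve (Q'X) beta = Q'y.
beta = np.linalg.solve(Q.T @ X, Q.T @ y)

delta = 0.5
# Under a linear bridge, shifting D by delta moves the mean outcome by delta * beta_D.
shift_effect = delta * beta[1]
print(f"estimated shift effect: {shift_effect:.3f} (truth {delta * 1.2:.3f})")
```

With a non-linear bridge the effect no longer factors this way and would instead be obtained by averaging the fitted bridge at D + δ and at D.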
select_pci_proxies
select_pci_proxies(data: DataFrame, y: str, treat: str, candidates: List[str], covariates: Optional[List[str]] = None, top_k: int = 2) -> ProxyScoreResult
Score and rank candidate proxies for PCI.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | | required |
| y | str | | required |
| treat | str | | required |
| candidates | list of str | All variables that could plausibly serve as proxies. | required |
| covariates | list of str | | None |
| top_k | int | Number of top candidates to recommend per side. | 2 |

Returns:
| Type | Description |
|---|---|
| ProxyScoreResult | |