Proximal causal inference: the full family
When you cannot observe the confounder, Proximal Causal Inference (PCI) uses two proxies of the unobserved confounder to identify the average treatment effect (ATE). StatsPAI ships the base estimator plus four 2025-2026 frontier variants that address bridge-function misspecification, policy-level (modified-treatment) interventions, simultaneous fitting of both bridges, and short-to-long-term extrapolation.
This guide walks through the whole PCI family with a single running example
and tells you which variant to reach for in which situation. Every function
below lives at top level: sp.proximal, sp.fortified_pci,
sp.bidirectional_pci, sp.pci_mtp, sp.double_negative_control,
sp.proximal_surrogate_index, sp.select_pci_proxies.
The setup
You have outcome Y, treatment D, observed covariates X, and an
unobserved confounder U. Back-door adjustment on X does not
identify the ATE because U still drives both D and Y after conditioning on X.
Proximal identification (Miao, Geng & Tchetgen Tchetgen, 2018; Tchetgen Tchetgen et al., 2024)
works if you can find two proxies of U:
- Z (treatment-inducing confounding proxy): depends on U, possibly on D; conditional on (U, D, X), Z is independent of Y.
- W (outcome-inducing confounding proxy): depends on U, possibly on Y; conditional on (U, X), W is independent of D.
Plus: a completeness condition linking (D, Z) to W (roughly, W
carries enough information about U to predict it nonparametrically given
(D, X)).
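For reference, the outcome bridge that the rest of this guide refers to can be written down explicitly. This is the standard formulation from Miao, Geng & Tchetgen Tchetgen (2018) expressed in this guide's notation; the binary-treatment form of the ATE is a simplification chosen here for illustration.

```latex
% Outcome bridge h: solves an integral equation in observables only
\mathbb{E}[\,Y \mid D, Z, X\,] = \mathbb{E}[\,h(W, D, X) \mid D, Z, X\,]

% Under the proxy conditions and completeness, for binary D
\mathrm{ATE} = \mathbb{E}[\,h(W, 1, X) - h(W, 0, X)\,]
```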
Choosing proxies is the hard part — see sp.select_pci_proxies below.
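The snippets in this guide assume a DataFrame `df` with columns Y, D, Z, W, X1 and X2. One way to simulate such data is sketched below. The data-generating process, its coefficients, and the helper name `simulate_pci_data` are assumptions made for illustration and are not part of StatsPAI. The last lines show that adjusting for X alone leaves the estimate well above the true effect of 2.0.

```python
import numpy as np
import pandas as pd


def simulate_pci_data(n=5000, tau=2.0, seed=0):
    """Simulate a linear DGP with one unobserved confounder U and two proxies."""
    rng = np.random.default_rng(seed)
    X1 = rng.normal(size=n)
    X2 = rng.normal(size=n)
    U = rng.normal(size=n)                    # unobserved confounder
    Z = U + 0.5 * X1 + rng.normal(size=n)     # treatment-side proxy: no direct effect on Y
    W = U + 0.5 * X2 + rng.normal(size=n)     # outcome-side proxy: not affected by D
    D = (0.8 * U + 0.3 * X1 + 0.3 * X2 + rng.normal(size=n) > 0).astype(float)
    Y = tau * D + 1.5 * U + 0.5 * X1 - 0.5 * X2 + rng.normal(size=n)
    # U is deliberately left out of the returned frame
    return pd.DataFrame({"Y": Y, "D": D, "Z": Z, "W": W, "X1": X1, "X2": X2})


df = simulate_pci_data()

# Back-door adjustment on X only: OLS of Y on (1, D, X1, X2)
design = np.column_stack([np.ones(len(df)), df[["D", "X1", "X2"]].to_numpy()])
coef, *_ = np.linalg.lstsq(design, df["Y"].to_numpy(), rcond=None)
naive_effect = coef[1]
print(f"true effect = 2.0, X-adjusted OLS estimate = {naive_effect:.2f}")
```

Because U raises both the chance of treatment and the outcome in this simulation, the X-only estimate is biased upward; the proxies Z and W are what the estimators below use to remove that bias.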
Canonical estimator: sp.proximal
Two-stage least squares (2SLS) on the outcome bridge, with Z acting as the
first-stage instrument for W. A good default when bridges
are plausibly linear and you have one Z and one W.
r = sp.proximal(
    data=df, y="Y", treat="D",
    proxy_z=["Z"], proxy_w=["W"],
    covariates=["X1", "X2"],
    bridge="linear",  # or "loglinear"
    n_boot=500,
)
r.summary()
References: Miao, Geng & Tchetgen Tchetgen (2018), Biometrika 105(4); Cui, Pu, Shi, Miao & Tchetgen Tchetgen (2024), JASA.
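To make the two stages concrete, here is a minimal NumPy sketch of proximal 2SLS with a linear bridge on simulated data. The data-generating process and the helper `ols` are assumptions for illustration; this is not StatsPAI's internal implementation, and it omits standard errors.

```python
import numpy as np

rng = np.random.default_rng(0)
n, tau = 20_000, 2.0

# Simulated data (illustrative): U is the unobserved confounder
X = rng.normal(size=(n, 2))
U = rng.normal(size=n)
Z = U + 0.5 * X[:, 0] + rng.normal(size=n)   # treatment-side proxy
W = U + 0.5 * X[:, 1] + rng.normal(size=n)   # outcome-side proxy
D = (0.8 * U + 0.3 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(size=n) > 0).astype(float)
Y = tau * D + 1.5 * U + 0.5 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=n)


def ols(design, target):
    """Least-squares coefficients of target on the columns of design."""
    return np.linalg.lstsq(design, target, rcond=None)[0]


ones = np.ones(n)

# Stage 1: predict W from the treatment-side proxy Z plus (D, X)
stage1 = np.column_stack([ones, D, Z, X])
W_hat = stage1 @ ols(stage1, W)

# Stage 2: regress Y on (D, W_hat, X); the coefficient on D is the effect
stage2 = np.column_stack([ones, D, W_hat, X])
pci_effect = ols(stage2, Y)[1]

# Comparison: OLS that adjusts for X only and ignores the proxies
naive_effect = ols(np.column_stack([ones, D, X]), Y)[1]

print(f"true = {tau:.2f}, X-only OLS = {naive_effect:.2f}, proximal 2SLS = {pci_effect:.2f}")
```

The sketch works because the simulated bridge really is linear in (D, W, X); with a nonlinear bridge the same two regressions would generally be biased, which is the situation the variants below are meant for.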
Fortified PCI: sp.fortified_pci (Yu, Shi & Tchetgen Tchetgen, 2025)
Standard PCI can be badly biased when either bridge is misspecified. Fortified PCI adds a stability constraint: for each proxy choice, it fits both bridges with a shared penalty that drives the solution toward an overlap region where both moment conditions are jointly satisfied. Empirically, this is the PCI variant most robust to moderate bridge-function misspecification (Yu, Shi & Tchetgen Tchetgen, 2025, Tables 2-4).
r = sp.fortified_pci(
    data=df, y="Y", treat="D",
    proxy_z=["Z"], proxy_w=["W"],
    covariates=["X1", "X2"],
)
When to reach for it: your point estimate from sp.proximal moves
noticeably when you add/remove a single covariate, or when bridge="linear"
vs bridge="loglinear" give very different answers.
Citation: Yu, Shi & Tchetgen Tchetgen (2025), arXiv:2506.13152.
Bidirectional PCI: sp.bidirectional_pci (Min, Zhang & Luo, 2025)
Standard PCI solves the outcome bridge (regress Y on D, W, X, with Z
serving as the instrument for W). Bidirectional PCI fits the outcome bridge and the
treatment bridge simultaneously in a single two-way regression, which:
- Reduces finite-sample bias when either bridge is weakly identified.
- Produces a natural doubly-robust GMM-style objective.
- Makes the identification assumptions symmetric, which is useful when you cannot decide which proxy is Z and which is W.
r = sp.bidirectional_pci(
    data=df, y="Y", treat="D",
    proxy_z=["Z"], proxy_w=["W"],
    covariates=["X1", "X2"],
)
Citation: Min, Zhang & Luo (2025), arXiv:2507.13965.
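The "treatment bridge" mentioned in this section has a standard definition in the semiparametric PCI literature (Cui et al., 2024). For a binary treatment and in this guide's notation it can be sketched as below. The symbol q is a notational assumption of this sketch, and the exact objective that sp.bidirectional_pci optimises may differ.

```latex
% Treatment bridge q: an inverse-propensity analogue built from Z instead of U
\mathbb{E}[\,q(Z, d, X) \mid W, D = d, X\,] = \frac{1}{\Pr(D = d \mid W, X)}, \qquad d \in \{0, 1\}

% Either bridge identifies the counterfactual mean
\mathbb{E}[\,Y(d)\,] = \mathbb{E}[\,h(W, d, X)\,] = \mathbb{E}[\,\mathbf{1}\{D = d\}\, q(Z, d, X)\, Y\,]
```

Having two routes to the same counterfactual mean is what makes a doubly-robust objective possible: the estimate remains consistent if either h or q is correctly specified.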
PCI for modified treatment policies: sp.pci_mtp (Olivas-Martinez, Gilbert & Rotnitzky, 2025)
Suppose you do not want the ATE under a static "D=1 vs D=0" contrast;
you want the effect of shifting the treatment distribution by some
amount δ — e.g., "raise the dose by 10% for everyone." PCI-MTP
identifies that modified treatment policy effect under the same
two-proxy structure as base PCI.
r = sp.pci_mtp(
    data=df, y="Y", treat="dose",
    proxy_z=["Z"], proxy_w=["W"],
    delta=0.10,  # shift treatment by +10%
    covariates=["X1", "X2"],
)
Why it matters: in dose-response or continuous-treatment settings, the "contrast between fixed levels" estimand is often uninteresting. MTP answers the policy-relevant question "what happens if we nudge treatment a bit?" under unobserved confounding.
Citation: Olivas-Martinez, Gilbert & Rotnitzky (2025), arXiv:2512.12038.
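To illustrate what a shift-policy estimand looks like, the sketch below computes a plug-in contrast from a fitted linear outcome bridge on simulated data. Several things here are assumptions for illustration: the shift is read as multiplicative (dose times 1 + delta), the bridge is linear and fitted by proximal 2SLS, and the data-generating process is invented. It is not the estimator from the cited paper, nor the internals of sp.pci_mtp.

```python
import numpy as np

rng = np.random.default_rng(1)
n, tau, delta = 20_000, 2.0, 0.10

# Simulated data (illustrative): continuous dose confounded by unobserved U
X = rng.normal(size=(n, 2))
U = rng.normal(size=n)
Z = U + 0.5 * X[:, 0] + rng.normal(size=n)
W = U + 0.5 * X[:, 1] + rng.normal(size=n)
dose = 5.0 + 0.8 * U + 0.3 * X[:, 0] + rng.normal(size=n)
Y = tau * dose + 1.5 * U + 0.5 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=n)


def ols(design, target):
    """Least-squares coefficients of target on the columns of design."""
    return np.linalg.lstsq(design, target, rcond=None)[0]


ones = np.ones(n)

# Fit a linear outcome bridge h(w, d, x) by proximal 2SLS (Z instruments W)
stage1 = np.column_stack([ones, dose, Z, X])
W_hat = stage1 @ ols(stage1, W)
beta = ols(np.column_stack([ones, dose, W_hat, X]), Y)


def bridge(w, d, x):
    """Fitted linear bridge evaluated at proxy w, dose d and covariates x."""
    return np.column_stack([ones, d, w, x]) @ beta


# Plug-in contrast: mean bridge under the shifted dose minus under the observed dose
shifted = (1.0 + delta) * dose
mtp_effect = np.mean(bridge(W, shifted, X) - bridge(W, dose, X))

# In this simulation the true shift effect is tau * delta * E[dose]
true_effect = tau * delta * 5.0
print(f"true shift effect = {true_effect:.2f}, plug-in estimate = {mtp_effect:.2f}")
```

With a linear bridge the contrast reduces to the dose coefficient times delta times the mean dose; the plug-in form is written out so the same pattern would carry over to a nonlinear bridge.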
Double negative controls: sp.double_negative_control
A simpler PCI special case used in epidemiology: one negative control
exposure (NCE, a treatment-like variable we know cannot affect Y) plus
one negative control outcome (NCO, an outcome-like variable we know
cannot be affected by D). These together identify the treatment effect
under additive structural assumptions.
r = sp.double_negative_control(
    data=df, y="Y", treat="D",
    nce="prior_dental_visits",  # placebo treatment
    nco="baseline_weight",      # placebo outcome
    covariates=["age", "sex"],
)
Use case: lower-tech than full PCI, and a good fit for initial analyses when
you have only one of each type of proxy. See also
sp.negative_control_outcome and sp.negative_control_exposure for the
single-control versions.
Long-term surrogate + PCI: sp.proximal_surrogate_index (Imbens, Kallus, Mao & Wang, 2025)
You ran a randomised experiment for 3 months but care about the 2-year
outcome. The classical solution (Athey-Chetty-Imbens-Kang, NBER WP 26463,
2019 surrogate index) needs the surrogates to fully mediate the long-term
effect — a strong assumption. Imbens, Kallus, Mao & Wang show that
combining the surrogate index with a PCI
layer on the observational data lets you drop the full-mediation
requirement: short-term surrogates play the role of W, observational
proxies play the role of Z, and the two together identify the long-term
ATE.
r = sp.proximal_surrogate_index(
    experimental=df_exp,    # short-term RCT
    observational=df_obs,   # long-follow-up observational cohort
    treatment="feature_flag",
    surrogates=["dau_90d", "retention_90d"],  # → W
    proxies=["pre_dau", "pre_purchase"],      # → Z
    long_term_outcome="revenue_24mo",
    covariates=["country", "cohort"],
)
This is also available as the surrogate_pci bridge in sp.bridge — see
Bridging theorems.
Citation: Imbens, Kallus, Mao & Wang (2025). "Long-term Causal Inference Under Persistent Confounding via Data Combination." Journal of the Royal Statistical Society Series B 87(2), 362-388. arXiv:2202.07234.
Picking the proxies: sp.select_pci_proxies
When you have a list of candidate proxy variables, this helper scores each on two PCI-relevant axes:
- Z score: how strongly is the candidate predicted by D after partialling out X? (Want this high for a good Z.)
- W score: how strongly does the candidate predict Y after partialling out D, X? (Want this high for a good W.)
It returns a ranked table of candidates so you can choose the top-scorers for each role.
ranks = sp.select_pci_proxies(
    data=df, y="Y", treat="D",
    candidates=["V1", "V2", "V3", "V4", "V5"],
    covariates=["X1", "X2"],
    top_k=2,
)
print(ranks.z_table) # best Z candidates
print(ranks.w_table) # best W candidates
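The two scores described above can be approximated by hand with squared partial correlations. The sketch below is one plausible reading of "partialling out", using residualisation; the helper names (`residualize`, `partial_r2`, `screen_candidates`) and the simulated data are hypothetical, and the scoring inside sp.select_pci_proxies may be defined differently.

```python
import numpy as np


def residualize(target, controls):
    """Residuals of target after least-squares projection on controls plus an intercept."""
    design = np.column_stack([np.ones(len(target)), controls])
    coef = np.linalg.lstsq(design, target, rcond=None)[0]
    return target - design @ coef


def partial_r2(a, b, controls):
    """Squared partial correlation between a and b given controls."""
    ra, rb = residualize(a, controls), residualize(b, controls)
    return float(np.corrcoef(ra, rb)[0, 1] ** 2)


def screen_candidates(candidates, Y, D, X):
    """Return {name: (z_score, w_score)} for each candidate column."""
    DX = np.column_stack([D, X])
    return {
        name: (partial_r2(v, D, X), partial_r2(v, Y, DX))
        for name, v in candidates.items()
    }


# Illustrative data: V1 and V2 track the hidden confounder U, V3 is pure noise
rng = np.random.default_rng(2)
n = 5_000
X = rng.normal(size=(n, 2))
U = rng.normal(size=n)
D = 0.8 * U + 0.3 * X[:, 0] + rng.normal(size=n)
Y = 2.0 * D + 1.5 * U + 0.5 * X[:, 1] + rng.normal(size=n)
candidates = {
    "V1": U + rng.normal(size=n),
    "V2": U + rng.normal(size=n),
    "V3": rng.normal(size=n),
}

scores = screen_candidates(candidates, Y, D, X)
for name, (z_score, w_score) in scores.items():
    print(f"{name}: z_score={z_score:.3f}, w_score={w_score:.3f}")
```

Note that any variable driven by U tends to score well on both axes, so scores like these can flag relevance but cannot confirm the exclusion restrictions; assigning a candidate to the Z or W role still needs subject-matter reasoning.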
When to use which: decision guide
Got exactly one Z and one W, bridges look roughly linear
→ sp.proximal (default)
Bridges sensitive to specification choice
→ sp.fortified_pci
Unsure which proxy should play Z vs W
→ sp.bidirectional_pci
Continuous treatment / want "shift policy" effect
→ sp.pci_mtp(delta=...)
Only have a negative control exposure + negative control outcome
→ sp.double_negative_control
Want long-term effect from short-term experiment
→ sp.proximal_surrogate_index
Have a pile of candidate proxies, not sure which to use
→ sp.select_pci_proxies first, then the above
Diagnostics every PCI analysis should report
- Bridge completeness: can Z predict W? (If not, identification fails.) Run sp.regress("W ~ Z + D + X", data=df) and check the F-stat on Z.
- Proxy independence: Z ⊥ Y | D, U, X. You cannot test this directly (because U is unobserved), but you can check the observable analogue Z ⊥ Y | D, X and interpret a non-zero coefficient as suggestive of U-mediated dependence.
- Bridge stability: rerun with bridge="linear" and bridge="loglinear"; substantial disagreement suggests misspecification.
- Sensitivity via sp.bridge(kind="cb_ipw", ...) on the (D, X) pair: if the back-door-on-X-only estimate equals the PCI estimate, your unobserved confounder might not be doing much work in this sample.
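The first diagnostic amounts to a partial F-test for Z in the regression of W on (Z, D, X). A minimal NumPy sketch of that statistic follows; the helpers `rss` and `partial_f` and the simulated data are assumptions for illustration and do not reproduce the output format of sp.regress.

```python
import numpy as np


def rss(design, target):
    """Residual sum of squares from least squares of target on design."""
    coef = np.linalg.lstsq(design, target, rcond=None)[0]
    resid = target - design @ coef
    return float(resid @ resid)


def partial_f(W, Z, D, X):
    """F-statistic for adding Z to the regression of W on (1, D, X)."""
    n = len(W)
    Z = np.asarray(Z).reshape(n, -1)
    restricted = np.column_stack([np.ones(n), D, X])
    full = np.column_stack([restricted, Z])
    q = Z.shape[1]              # number of proxies being tested
    dof = n - full.shape[1]     # residual degrees of freedom of the full model
    rss_r, rss_f = rss(restricted, W), rss(full, W)
    return ((rss_r - rss_f) / q) / (rss_f / dof)


# Illustrative data: one proxy shares U with W, the other does not
rng = np.random.default_rng(3)
n = 5_000
X = rng.normal(size=(n, 2))
U = rng.normal(size=n)
D = 0.8 * U + 0.3 * X[:, 0] + rng.normal(size=n)
W = U + 0.5 * X[:, 1] + rng.normal(size=n)
Z_relevant = U + rng.normal(size=n)
Z_irrelevant = rng.normal(size=n)

f_relevant = partial_f(W, Z_relevant, D, X)
f_irrelevant = partial_f(W, Z_irrelevant, D, X)
print(f"F (relevant Z) = {f_relevant:.1f}, F (irrelevant Z) = {f_irrelevant:.1f}")
```

A common weak-instrument rule of thumb treats a first-stage F below roughly 10 as a warning sign; whether that threshold is appropriate for a given PCI analysis is a judgment call rather than something this sketch establishes.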
This guide is current for StatsPAI ≥ 1.5.0. All functions are stable
and registered in sp.list_functions().
For Agents
Pre-conditions
- at least one treatment-side proxy Z (independent of outcome given U, D, X)
- at least one outcome-side proxy W (independent of treatment given U, X)
- proxy_z and proxy_w measure the same unmeasured confounder U from different angles
- n ≥ 1000 (2SLS on proxies is noisy in small samples)
Identifying assumptions
- Existence of an outcome bridge function h(w, d, x) that recovers E[Y(d) | U, X]
- Z and W are conditionally independent given U and (D, X)
- Z ⊥ Y | U, D, X (exclusion on Z)
- W ⊥ D | U, X (exclusion on W)
- Z is relevant for W given D, X (bridge first stage)
Failure modes → recovery
| Symptom | Exception | Remedy | Try next |
|---|---|---|---|
| First-stage (Z → W) too weak | statspai.AssumptionWarning | Try richer Z or more proxies; without first-stage strength the bridge is underidentified. | sp.iv |
| Proxies collapse to nearly-constant | statspai.DataInsufficient | Proxy variation insufficient; redesign measurement or fall back to sensitivity analysis (sp.sensemakr). | sp.sensemakr |
| Estimate highly sensitive to bridge specification | statspai.AssumptionWarning | Report multiple bridge families; compare with sp.negative_control_outcome / sp.negative_control_exposure. | sp.negative_control_outcome |
Alternatives (ranked)
- sp.negative_control_outcome
- sp.negative_control_exposure
- sp.double_negative_control
- sp.iv
- sp.sensemakr
Typical minimum N: 1000