statspai.deepiv

deepiv

DeepIV: Deep Learning Instrumental Variables (Hartford et al. 2017).

Uses a two-stage neural network approach:

Stage 1: A Mixture Density Network (MDN) estimates P(T | Z, X).

Stage 2: A response network minimises the counterfactual loss using Monte Carlo samples from the learned treatment distribution.
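As a rough illustration of the Monte Carlo step in Stage 2, the sketch below draws treatments from a per-observation Gaussian mixture and averages a response function over those draws. It assumes Stage 1 has already produced mixture weights, means, and standard deviations. The names `sample_mixture`, `counterfactual_prediction`, and `h` are hypothetical and are not part of statspai.

```python
import numpy as np

def sample_mixture(pi, mu, sigma, n_samples, rng):
    """Draw n_samples treatments per row from a Gaussian mixture.

    pi, mu, sigma have shape (n_obs, n_components); each row of pi sums to 1.
    Returns an array of shape (n_obs, n_samples).
    """
    n_obs, n_comp = pi.shape
    # Inverse-CDF draw of a component index for every (row, sample) pair.
    cdf = np.cumsum(pi, axis=1)
    u = rng.random((n_obs, n_samples))
    comp = (u[:, :, None] > cdf[:, None, :]).sum(axis=2)
    comp = np.minimum(comp, n_comp - 1)  # guard against round-off in the CDF
    rows = np.arange(n_obs)[:, None]
    return rng.normal(mu[rows, comp], sigma[rows, comp])

def counterfactual_prediction(h, t_samples, x):
    """Monte Carlo estimate of E[h(T, X) | Z, X]: average h over sampled treatments."""
    preds = np.stack(
        [h(t_samples[:, j], x) for j in range(t_samples.shape[1])], axis=1
    )
    return preds.mean(axis=1)

rng = np.random.default_rng(0)
n_obs = 4
# Assumed Stage-1 output: the same two-component mixture for every row.
pi = np.tile([0.3, 0.7], (n_obs, 1))
mu = np.tile([-1.0, 2.0], (n_obs, 1))
sigma = np.full((n_obs, 2), 0.5)
x = np.zeros(n_obs)

def h(t, x):
    # Hypothetical response function standing in for the Stage-2 network.
    return 2.0 * t + x

t_samples = sample_mixture(pi, mu, sigma, n_samples=2000, rng=rng)
y_hat = counterfactual_prediction(h, t_samples, x)
```

In this toy setup the mixture mean is 0.3 * (-1) + 0.7 * 2 = 1.1, so `y_hat` should sit near 2.2 for every row, up to Monte Carlo noise.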

References

Hartford, J., Lewis, G., Leyton-Brown, K., & Taddy, M. (2017). "Deep IV: A Flexible Approach for Counterfactual Prediction." Proceedings of the 34th International Conference on Machine Learning (ICML).

DeepIV

Deep Instrumental Variables estimator (Hartford et al. 2017).

Follows the same defaults as Microsoft EconML's DeepIVEstimator. See the module docstring for a discussion of the biased vs unbiased gradient trade-off and when to prefer modern alternatives.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | Input data. | required |
| y | str | Outcome variable. | required |
| treat | str | Endogenous treatment variable (continuous). | required |
| instruments | list of str | Excluded instruments. | required |
| covariates | list of str | Exogenous controls. | required |
| n_components | int | Gaussian mixture components in the MDN. | 10 |
| hidden_layers | tuple of int | Hidden layer sizes. | (128, 64) |
| first_stage_epochs | int | Stage 1 training epochs. | 100 |
| second_stage_epochs | int | Stage 2 training epochs. | 100 |
| n_samples | int | MC samples per observation used to form the Stage-2 residual. | 1 |
| n_gradient_samples | int | Independent additional samples for the gradient path. 0 reproduces EconML's default (biased but variance-regularized); >= 1 activates Hartford et al.'s paired-sample unbiased gradient estimator. | 0 |
| batch_size | int | Mini-batch size. | 256 |
| learning_rate | float | Adam learning rate. | 0.001 |
| alpha | float | Significance level. | 0.05 |
| random_state | int | Random seed for reproducibility. | 42 |
| verbose | bool | Print training progress. | False |

fit

fit() -> CausalResult

Fit the DeepIV model and return causal effect estimates.

Returns:

Type Description
CausalResult

effect

effect(t0: float, t1: float, X: Optional[ndarray] = None) -> ndarray

Estimate E[Y(t1) - Y(t0) | X] for given treatment levels.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| t0 | float | Baseline treatment value (original scale). | required |
| t1 | float | Counterfactual treatment value (original scale). | required |
| X | ndarray | Covariates. If None, uses training data. | None |

Returns:

| Type | Description |
|---|---|
| ndarray | Individual-level treatment effects. |
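To make the contrast E[Y(t1) - Y(t0) | X] concrete, the sketch below evaluates a response function at two treatment levels for every covariate row and takes the difference. It is a minimal illustration only: `effect_sketch` and the response function `h` are hypothetical, and the sketch does not reproduce how statspai's fitted network or any internal rescaling works.

```python
import numpy as np

def effect_sketch(h, t0, t1, X):
    """Return h(t1, x) - h(t0, x) for each row x of X."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Evaluate the response function at each treatment level for every row.
    return h(np.full(n, t1), X) - h(np.full(n, t0), X)

def h(t, X):
    # Hypothetical response function: the slope in t depends on the first covariate.
    return (1.0 + 0.5 * X[:, 0]) * t

X = np.array([[0.0], [2.0]])
tau = effect_sketch(h, t0=1.0, t1=3.0, X=X)
print(tau)  # [2. 4.]
```

Each entry of `tau` is an individual-level effect: the first row has slope 1 and the second has slope 2, so moving the treatment from 1 to 3 gives effects of 2 and 4.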

deepiv

deepiv(data: DataFrame, y: str, treat: str, instruments: List[str], covariates: List[str], n_components: int = 10, hidden_layers: Tuple[int, ...] = (128, 64), first_stage_epochs: int = 100, second_stage_epochs: int = 100, n_samples: int = 1, n_gradient_samples: int = 0, batch_size: int = 256, learning_rate: float = 0.001, alpha: float = 0.05, random_state: int = 42, verbose: bool = False) -> CausalResult

Estimate causal effects using Deep Instrumental Variables.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | Input data. | required |
| y | str | Outcome variable name. | required |
| treat | str | Endogenous treatment variable name (continuous). | required |
| instruments | list of str | Excluded instrument variable names. | required |
| covariates | list of str | Exogenous control variable names. | required |
| n_components | int | Number of Gaussian mixture components in the MDN (stage 1). | 10 |
| hidden_layers | tuple of int | Hidden layer sizes for both networks. | (128, 64) |
| first_stage_epochs | int | Training epochs for the treatment model (MDN). | 100 |
| second_stage_epochs | int | Training epochs for the response model. | 100 |
| n_samples | int | Number of MC samples per observation used to form the Stage-2 residual (y - h(t, x)). | 1 |
| n_gradient_samples | int | Number of independent additional samples used for the gradient path. When 0 (default), a single set of samples is used, matching Microsoft EconML's default behaviour and producing a biased but variance-regularized gradient estimator. When > 0, two independent sample sets are drawn and the unbiased paired-sample estimator mean((y - h(p, x)) * (y - h(p', x))) from Hartford et al. Section 3.2 is used instead. Set to 1 or higher if you specifically need unbiased gradients (e.g. for asymptotic consistency arguments). | 0 |
| batch_size | int | Mini-batch size. | 256 |
| learning_rate | float | Adam learning rate. | 0.001 |
| alpha | float | Significance level for confidence interval. | 0.05 |
| random_state | int | Random seed for reproducibility. | 42 |
| verbose | bool | Print training progress. | False |

Returns:

Type Description
CausalResult

Notes

Use the single-sample default (n_gradient_samples=0) for most applications: it matches EconML, trains roughly 2x faster, and the implicit variance regularization often helps in small samples.

Use n_gradient_samples >= 1 when you need the unbiased gradient (e.g. for theoretical guarantees or when training with very large batches where the bias dominates).

For high-dimensional covariates or weak instruments, consider DeepGMM, DFIV, or DualIV instead. See the module docstring for references.
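The difference between the two n_gradient_samples settings can be seen on simulated numbers. The sketch below compares the single-sample squared residual with the paired-sample product mean((y - h(p, x)) * (y - h(p', x))) when the sampled predictions are noisy. It is a toy illustration at the level of loss values, not statspai's training loop. The function names and the simulated predictions are assumptions.

```python
import numpy as np

def single_sample_loss(y, h_a):
    """Squared residual built from one set of sampled predictions."""
    return np.mean((y - h_a) ** 2)

def paired_sample_loss(y, h_a, h_b):
    """Product of residuals from two independent sets of sampled predictions."""
    return np.mean((y - h_a) * (y - h_b))

rng = np.random.default_rng(0)
n = 200_000
y = np.full(n, 1.0)

# Simulated predictions h(t, x) at two independent treatment draws:
# both have mean 0.5 and variance 4 (standard deviation 2).
h_a = 0.5 + 2.0 * rng.standard_normal(n)
h_b = 0.5 + 2.0 * rng.standard_normal(n)

# Loss we would get if E[h(T, x)] were known exactly.
target = (1.0 - 0.5) ** 2  # 0.25

biased = single_sample_loss(y, h_a)          # close to target + Var(h) = 4.25
unbiased = paired_sample_loss(y, h_a, h_b)   # close to target = 0.25
```

The single-sample loss carries an extra term equal to the variance of the sampled predictions, which is where the implicit variance regularization comes from. The paired product removes that term because the two draws are independent, at the cost of a second set of forward passes.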

Examples:

>>> import statspai as sp
>>> # Default settings
>>> result = sp.deepiv(
...     df, y='lwage', treat='educ',
...     instruments=['nearc4'],
...     covariates=['exper', 'expersq'],
... )
>>> print(result.summary())

>>> # Custom architecture with unbiased gradient
>>> result = sp.deepiv(
...     df, y='sales', treat='price',
...     instruments=['cost_shifter'],
...     covariates=['demand_controls'],
...     n_components=20,
...     hidden_layers=(256, 128, 64),
...     first_stage_epochs=200,
...     n_gradient_samples=1,   # paired-sample unbiased gradient
... )