`statspai.causal_discovery`¶

causal_discovery ¶

Causal Discovery: Learning causal structure from observational data.

Algorithms

NOTEARS : NO TEARS continuous optimisation for DAG learning (Zheng et al. 2018). Formulates structure learning as a smooth optimisation problem with an acyclicity constraint.
PC Algorithm : Constraint-based causal discovery using conditional independence tests (Spirtes, Glymour, Scheines 2000). Learns a CPDAG (completed partially directed acyclic graph).
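
The acyclicity constraint mentioned for NOTEARS can be illustrated with plain NumPy. Zheng et al. (2018) use h(W) = tr(exp(W ∘ W)) − d, which is zero exactly when the weighted graph W has no directed cycles. The sketch below is not the statspai implementation: the function name `acyclicity` is hypothetical, and a truncated power series stands in for a library matrix exponential.

```python
import numpy as np

def acyclicity(W: np.ndarray, n_terms: int = 30) -> float:
    """h(W) = tr(exp(W * W)) - d, evaluated with a truncated power series."""
    d = W.shape[0]
    A = W * W                      # elementwise square, so every entry is >= 0
    term = np.eye(d)               # A^0 / 0!
    total = np.trace(term)
    for k in range(1, n_terms):
        term = term @ A / k        # A^k / k!
        total += np.trace(term)
    return float(total - d)

# Chain 0 -> 1 -> 2 is acyclic, so h vanishes.
W_dag = np.array([[0.0, 0.8, 0.0],
                  [0.0, 0.0, 0.7],
                  [0.0, 0.0, 0.0]])

# Adding 2 -> 0 closes a cycle, so h becomes strictly positive.
W_cyc = W_dag.copy()
W_cyc[2, 0] = 0.5

h_dag = acyclicity(W_dag)
h_cyc = acyclicity(W_cyc)
```

Because h is smooth in W, an optimiser can push it towards zero with an augmented-Lagrangian penalty, which is presumably what the `h_tol` and `rho_max` parameters below control.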

References

Zheng, X., Aragam, B., Ravikumar, P., & Xing, E. P. (2018). DAGs with NO TEARS: Continuous Optimization for Structure Learning. Advances in Neural Information Processing Systems, 31. [@zheng2018dags]

Spirtes, P., Glymour, C., & Scheines, R. (2000). Causation, Prediction, and Search (2nd ed.). MIT Press. [@spirtes2000causation]

NOTEARS ¶

NOTEARS: Continuous optimization for DAG structure learning.

Parameters:

Name	Type	Default
`data`	`DataFrame`	required
`variables`	`list of str`	`None`
`lambda1`	`float`	`0.1`
`max_iter`	`int`	`100`
`h_tol`	`float`	`1e-08`
`rho_max`	`float`	`1e+16`
`w_threshold`	`float`	`0.3`
`random_state`	`int`	`42`

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> from statspai.causal_discovery.notears import NOTEARS
>>> rng = np.random.default_rng(0)
>>> n = 500
>>> X = rng.normal(size=n)
>>> Z = 0.8 * X + rng.normal(size=n) * 0.5
>>> M = 0.7 * Z + rng.normal(size=n) * 0.5
>>> Y = 0.6 * M + rng.normal(size=n) * 0.5
>>> df = pd.DataFrame({'X': X, 'Z': Z, 'M': M, 'Y': Y})
>>> est = NOTEARS(data=df, variables=['X', 'Z', 'M', 'Y'])
>>> result = est.fit()
>>> bool(result['n_edges'] >= 0)
True

References

[@zheng2018dags]

fit ¶

fit() -> Dict[str, Any]

Run NOTEARS and return the learned DAG.

summary ¶

summary() -> str

Print a summary of the learned DAG.

PCAlgorithm ¶

PC Algorithm for causal discovery.

Parameters:

Name	Type	Default
`data`	`DataFrame`	required
`variables`	`list of str`	`None`
`alpha`	`float`	`0.05`
`max_cond_size`	`int`	`None`
`ci_test`	`str`	`'fisherz'`

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> from statspai.causal_discovery.pc import PCAlgorithm
>>> rng = np.random.default_rng(0)
>>> n = 500
>>> X = rng.normal(size=n)
>>> Z = 0.8 * X + rng.normal(size=n) * 0.5
>>> M = 0.7 * Z + rng.normal(size=n) * 0.5
>>> Y = 0.6 * M + rng.normal(size=n) * 0.5
>>> df = pd.DataFrame({'X': X, 'Z': Z, 'M': M, 'Y': Y})
>>> est = PCAlgorithm(data=df, variables=['X', 'Z', 'M', 'Y'])
>>> result = est.fit()
>>> bool(result['n_edges'] >= 0)
True

References

[@spirtes2000causation]

fit ¶

fit() -> Dict[str, Any]

Run the PC algorithm and return learned structure.

summary ¶

summary() -> str

Print a summary of the learned structure.

LiNGAMResult `dataclass` ¶

Bases: ResultProtocolMixin

Result of a `lingam` (DirectLiNGAM) fit.

Attributes:

Name	Type	Description
`order`	`list of int`	Causal order, most exogenous variable first (column indices).
`adjacency`	`ndarray`	`B[i, j]` is the direct structural effect of variable `j` on `i`.
`names`	`list of str`	Variable names, aligned with the columns of `adjacency`.
`residuals`	`ndarray`	`(n, k)` structural residuals after removing recovered effects.
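
The `order` and `adjacency` attributes constrain each other: with `B[i, j]` read as the effect of `j` on `i`, reindexing `B` by the causal order should leave a strictly lower-triangular matrix. A minimal sketch with a hand-written matrix (hypothetical values, no statspai call):

```python
import numpy as np

# Hypothetical 3-variable result: variable 2 -> variable 0 -> variable 1,
# so variable 2 is the most exogenous and comes first in the order.
order = [2, 0, 1]
B = np.zeros((3, 3))
B[0, 2] = 1.5    # effect of variable 2 on variable 0
B[1, 0] = -0.7   # effect of variable 0 on variable 1

# Reindex rows and columns by causal order.
B_ordered = B[np.ix_(order, order)]

# Effects may only run from earlier to later variables in the order,
# so nothing may sit on or above the diagonal.
is_consistent = bool(np.allclose(np.triu(B_ordered), 0.0))
```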

Examples:

>>> import statspai as sp
>>> import numpy as np, pandas as pd
>>> rng = np.random.default_rng(1)
>>> n = 400
>>> x0 = rng.uniform(-1, 1, n) ** 3
>>> x1 = 1.5 * x0 + rng.exponential(0.4, n) - 0.4
>>> df = pd.DataFrame({"x0": x0, "x1": x1})
>>> res = sp.lingam(df)
>>> isinstance(res, sp.LiNGAMResult)
True
>>> res.to_frame().shape                # adjacency as a labelled DataFrame
(2, 2)
>>> bool(len(res.edges(threshold=0.5)) >= 1)
True

GESResult `dataclass` ¶

Bases: ResultProtocolMixin

Result of `ges`: a CPDAG (Markov equivalence class).

Attributes:

Name	Type	Description
`adjacency`	`ndarray`	`(p, p)` CPDAG adjacency; `adj[i, j]` and `adj[j, i]` both nonzero denotes an undirected edge `i --- j`.
`names`	`list of str`	Variable names, aligned with `adjacency` rows/columns.
`bic`	`float`	Total BIC of the recovered graph (lower is better).

Examples:

>>> import statspai as sp
>>> import numpy as np, pandas as pd
>>> rng = np.random.default_rng(0)
>>> x = rng.normal(size=200)
>>> y = 2.0 * x + rng.normal(size=200)
>>> df = pd.DataFrame({"x": x, "y": y})
>>> res = sp.ges(df)
>>> isinstance(res, sp.GESResult)
True
>>> bool(res.to_frame().shape == (2, 2))
True

FCIResult `dataclass` ¶

Bases: ResultProtocolMixin

Partial Ancestral Graph (PAG) learned by `fci`.

Attributes:

Name	Type	Description
`variables`	`list of str`	Variable names (graph nodes).
`skeleton`	`DataFrame`	Undirected adjacency matrix over `variables`.
`pag_left, pag_right`	`DataFrame`	Edge marks on the i-side and j-side of each edge (i, j).
`edges`	`list of tuple`	Human-readable `(i, label, j)` edges, e.g. `("X", "-->", "Y")`.
`separating_sets`	`dict`	CI-test separating sets keyed by variable-name pairs.
`n_obs`	`int`	Number of complete observations used.
`alpha`	`float`	Significance level of the CI tests.
`ci_test`	`str`	Name of the conditional-independence test.

Examples:

>>> import statspai as sp
>>> import numpy as np, pandas as pd
>>> rng = np.random.default_rng(0)
>>> n = 500
>>> x = rng.normal(size=n)
>>> m = x + rng.normal(size=n)
>>> y = m + rng.normal(size=n)
>>> data = pd.DataFrame({"X": x, "M": m, "Y": y})
>>> res = sp.fci(data)
>>> bool(res.skeleton.shape == (3, 3))
True

ICPResult `dataclass` ¶

Bases: ResultProtocolMixin

Result of an Invariant Causal Prediction run (`icp`).

Attributes:

Name	Type	Description
`parents`	`set of str`	Provably-causal parents of `Y` -- the intersection of every subset accepted by the invariance test. Empty when no informative subset is accepted (the conservative answer).
`accepted_subsets`	`list of frozenset`	All candidate subsets that passed the level-alpha invariance test.
`rejection_reason`	`dict`	Maps each rejected subset to a human-readable reason string.
`alpha`	`float`	Family-wise significance level used.
`coefficients`	`dict`	`var -> (lo, hi)` 95% confidence interval for each parent's coefficient in the pooled OLS fit on `parents`.
`method`	`str`	`"linear"` or `"nonlinear"`.
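
The relationship between `accepted_subsets` and `parents` described above is a plain set intersection. The sketch below uses hypothetical accepted subsets and a hypothetical helper name, `icp_parents`; it does not call statspai.

```python
from functools import reduce

# Hypothetical subsets that passed the invariance test.
accepted_subsets = [frozenset({"X1"}), frozenset({"X1", "X2"})]

def icp_parents(accepted):
    """Intersect all accepted subsets; return an empty set if none were accepted."""
    if not accepted:
        return set()
    return set(reduce(frozenset.intersection, accepted))

parents = icp_parents(accepted_subsets)

# If the empty subset is itself accepted, the intersection is empty:
# the data do not force any variable to be a parent.
parents_conservative = icp_parents(accepted_subsets + [frozenset()])
```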

Examples:

>>> import statspai as sp
>>> import numpy as np
>>> import pandas as pd
>>> rng = np.random.default_rng(0)
>>> n = 400
>>> env = np.r_[np.zeros(n // 2, dtype=int), np.ones(n // 2, dtype=int)]
>>> x1 = 3.0 * env + rng.normal(size=n)
>>> y = 1.5 * x1 + rng.normal(size=n)
>>> x2 = y + rng.normal(size=n)
>>> X = pd.DataFrame({"X1": x1, "X2": x2})
>>> res = sp.icp(X, y, env)
>>> isinstance(res, sp.ICPResult)
True
>>> sorted(res.parents)
['X1']

PCMCIResult `dataclass` ¶

Bases: ResultProtocolMixin

PCMCI output — lag-specific adjacency + discovered links.

Returned by `pcmci`. Bundles the lag-specific p-value tensor (`p_matrix`), the partial-correlation strengths (`val_matrix`), the boolean adjacency decision tensor, and the effective sample size. Call `discovered_links()` for a tidy DataFrame of the significant lagged links and `summary()` for a one-screen report.

Examples:

A two-variable system where lagged GDP drives inflation:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> rng = np.random.default_rng(0)
>>> T = 120
>>> gdp = np.zeros(T)
>>> inflation = np.zeros(T)
>>> eg = rng.normal(0, 1, T)
>>> ei = rng.normal(0, 1, T)
>>> for t in range(1, T):
...     gdp[t] = 0.5 * gdp[t - 1] + eg[t]
...     inflation[t] = 0.4 * gdp[t - 1] + 0.3 * inflation[t - 1] + ei[t]
>>> df = pd.DataFrame({"gdp": gdp, "inflation": inflation})
>>> res = sp.pcmci(df, tau_max=2, pc_alpha=0.05)
>>> isinstance(res, sp.PCMCIResult)
True
>>> links = res.discovered_links()
>>> list(links.columns)
['source', 'target', 'lag', 'partial_corr', 'p_value']
>>> bool(((links["source"] == "gdp") &
...       (links["target"] == "inflation")).any())
True

discovered_links ¶

discovered_links() -> DataFrame

Return a DataFrame of significant links sorted by strength.
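
As a rough illustration of how such a table could be derived from the p-value and strength tensors, the sketch below assumes both have shape `(N, N, tau_max + 1)` indexed as `[source, target, lag]`. That layout, the `alpha` cut-off, and the numbers are assumptions for illustration, not the documented internals.

```python
import numpy as np
import pandas as pd

names = ["gdp", "inflation"]
alpha = 0.05

# ASSUMED layout: [source, target, lag]; values are made up.
val = np.zeros((2, 2, 3))
p = np.ones((2, 2, 3))
val[0, 1, 1], p[0, 1, 1] = 0.45, 0.001    # gdp(t-1) -> inflation(t)
val[0, 0, 1], p[0, 0, 1] = 0.50, 0.0005   # gdp(t-1) -> gdp(t)
val[1, 1, 1], p[1, 1, 1] = 0.10, 0.30     # not significant at alpha

# Keep entries whose p-value falls below alpha.
rows = [
    {
        "source": names[i],
        "target": names[j],
        "lag": int(lag),
        "partial_corr": float(val[i, j, lag]),
        "p_value": float(p[i, j, lag]),
    }
    for i, j, lag in zip(*np.nonzero(p < alpha))
]

# Sort by absolute strength, strongest link first.
links = (
    pd.DataFrame(rows, columns=["source", "target", "lag", "partial_corr", "p_value"])
    .sort_values("partial_corr", key=np.abs, ascending=False)
    .reset_index(drop=True)
)
```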

LPCMCIResult `dataclass` ¶

Bases: ResultProtocolMixin

Output of `lpcmci`.

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> rng = np.random.default_rng(31)
>>> T = 200
>>> X = np.zeros(T); Y = np.zeros(T); Z = np.zeros(T)
>>> for t in range(1, T):
...     X[t] = 0.5 * X[t - 1] + rng.normal(0, 0.5)
...     Y[t] = 0.4 * X[t - 1] + 0.3 * Y[t - 1] + rng.normal(0, 0.5)
...     Z[t] = 0.6 * Y[t - 1] + rng.normal(0, 0.5)
>>> df = pd.DataFrame({"X": X, "Y": Y, "Z": Z})
>>> res = sp.lpcmci(df, variables=["X", "Y", "Z"], tau_max=2, alpha=0.05)
>>> isinstance(res, sp.LPCMCIResult)
True
>>> res.edge_types[1, 0, 1]        # X --> Y at lag 1
'-->'
>>> list(res.to_frame().columns)
['lag', 'from', 'to', 'type', 'p_value']

to_frame ¶

to_frame() -> DataFrame

Long-format edges DataFrame.

DYNOTEARSResult `dataclass` ¶

Bases: ResultProtocolMixin

Output of `dynotears`.

Examples:

>>> import statspai as sp
>>> import numpy as np
>>> import pandas as pd
>>> rng = np.random.default_rng(0)
>>> T = 60
>>> x = np.zeros(T); z = np.zeros(T); w = np.zeros(T)
>>> for t in range(1, T):
...     x[t] = 0.6 * x[t - 1] + rng.normal(0, 0.3)
...     z[t] = 0.5 * x[t - 1] + rng.normal(0, 0.3)
...     w[t] = 0.4 * z[t] + rng.normal(0, 0.3)
>>> df = pd.DataFrame({"x": x, "z": z, "w": w})
>>> res = sp.dynotears(df, lag=1, threshold=0.1)
>>> res.variables
['x', 'z', 'w']
>>> res.lag
1
>>> edges = res.to_frame()
>>> bool(set(["lag", "from", "to", "coef"]).issubset(
...     edges.columns
... )) if len(edges) else True
True

to_frame ¶

to_frame() -> DataFrame

Long-format edges DataFrame (|coef| > threshold).

pc_algorithm ¶

pc_algorithm(data: DataFrame, variables: Optional[List[str]] = None, alpha: float = 0.05, max_cond_size: Optional[int] = None, ci_test: str = 'fisherz', forbidden: Optional[List[Tuple[str, str]]] = None, required: Optional[List[Tuple[str, str]]] = None) -> Dict[str, Any]

Learn causal structure using the PC algorithm.

Parameters:

Name	Type	Description	Default
`data`	`DataFrame`	Observational data (n_samples x d_variables).	required
`variables`	`list of str`	Column names to use. If None, uses all numeric columns.	`None`
`alpha`	`float`	Significance level for conditional independence tests. Lower alpha = sparser graph (fewer edges).	`0.05`
`max_cond_size`	`int`	Maximum conditioning set size. If None, goes up to d-2.	`None`
`ci_test`	`str`	Conditional independence test: 'fisherz' (partial correlation) or 'hsic' (kernel-based, non-linear).	`'fisherz'`
`forbidden`	`list of (str, str)`	Background knowledge: edges that must NOT appear in the final graph (treated as undirected — both `(a, b)` and `(b, a)` are forbidden when either is given). The skeleton phase keeps these absent regardless of CI test outcomes.	`None`
`required`	`list of (str, str)`	Background knowledge: directed edges `a -> b` that must appear in the CPDAG. The skeleton phase preserves them regardless of CI rejection, and the orientation phase pins their direction.	`None`

Returns:

Type	Description
`dict`	Learned structure, with the keys listed below.

Key	Type	Description
`'skeleton'`	`pd.DataFrame`	Undirected adjacency matrix (0/1).
`'cpdag'`	`pd.DataFrame`	CPDAG adjacency matrix. `cpdag[i, j] = 1` means `i -> j`. If both `cpdag[i, j] = 1` and `cpdag[j, i] = 1`, the edge is undirected (`i -- j`).
`'edges'`	`list of tuples`	Directed edges as `(parent, child)` tuples.
`'undirected_edges'`	`list of tuples`	Undirected edges as `(node1, node2)` tuples.
`'separating_sets'`	`dict`	`{(i, j): set}` of separating sets for removed edges.
`'variables'`	`list of str`	
`'n_edges'`	`int`	
`'n_obs'`	`int`	
`'alpha'`	`float`	
`'ci_test'`	`str`	

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> import statspai as sp
>>> rng = np.random.default_rng(0)
>>> n = 500
>>> X = rng.normal(size=n)
>>> Z = 0.8 * X + rng.normal(size=n) * 0.5
>>> M = 0.7 * Z + rng.normal(size=n) * 0.5
>>> Y = 0.6 * M + rng.normal(size=n) * 0.5
>>> df = pd.DataFrame({'X': X, 'Z': Z, 'M': M, 'Y': Y})
>>> result = sp.pc_algorithm(df, variables=['X', 'Z', 'M', 'Y'])
>>> bool(result['n_edges'] >= 0)  # CPDAG edge count
True
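
The CPDAG encoding in the return value (`cpdag[i, j] = 1` for `i -> j`, both directions set for an undirected edge) can be decoded by hand. The sketch below builds a small hypothetical CPDAG as a DataFrame and splits it into directed and undirected edges. It mirrors the documented encoding only and is not the library's own code.

```python
import pandas as pd

variables = ["W", "X", "Z", "Y"]
cpdag = pd.DataFrame(0, index=variables, columns=variables)

# Hypothetical graph: collider X -> Y <- Z plus an undirected edge W -- X.
cpdag.loc["X", "Y"] = 1
cpdag.loc["Z", "Y"] = 1
cpdag.loc["W", "X"] = 1
cpdag.loc["X", "W"] = 1

directed, undirected = [], []
for a_idx, a in enumerate(variables):
    for b in variables[a_idx + 1:]:          # visit each unordered pair once
        fwd, bwd = cpdag.loc[a, b], cpdag.loc[b, a]
        if fwd and bwd:
            undirected.append((a, b))        # both marks set: a -- b
        elif fwd:
            directed.append((a, b))          # a -> b
        elif bwd:
            directed.append((b, a))          # b -> a
```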

nonlinear_icp ¶

nonlinear_icp(X: DataFrame | ndarray, y: ndarray, environment: ndarray, alpha: float = 0.05, **kw: Any) -> ICPResult

Alias for `icp(..., method='nonlinear')` (Heinze-Deml et al. 2018).

The nonlinear variant swaps the linear mean/variance invariance test for a two-sample Kolmogorov-Smirnov test on the residual distributions, so it detects departures from invariance beyond the first two moments.

Examples:

>>> import statspai as sp
>>> import numpy as np
>>> import pandas as pd
>>> rng = np.random.default_rng(0)
>>> n = 400
>>> env = np.r_[np.zeros(n // 2, dtype=int), np.ones(n // 2, dtype=int)]
>>> x1 = 3.0 * env + rng.normal(size=n)
>>> y = 1.5 * x1 + rng.normal(size=n)
>>> x2 = y + rng.normal(size=n)
>>> X = pd.DataFrame({"X1": x1, "X2": x2})
>>> res = sp.nonlinear_icp(X, y, env, alpha=0.05)
>>> sorted(res.parents)
['X1']

partial_corr_pvalue ¶

partial_corr_pvalue(x: ndarray, y: ndarray, Z: Optional[ndarray] = None) -> float

Partial-correlation p-value for H0: X ⟂ Y | Z.

Residualises X and Y on Z via OLS, then applies a Fisher-z transform to the residual correlation with df = n - |Z| - 2.

Examples:

Two variables sharing a common driver z are marginally correlated but conditionally independent given z:

>>> import numpy as np
>>> import statspai as sp
>>> rng = np.random.default_rng(0)
>>> z = rng.normal(size=200)
>>> x = z + rng.normal(size=200)
>>> y = z + rng.normal(size=200)
>>> bool(sp.partial_corr_pvalue(x, y) < 0.05)        # marginally dependent
True
>>> bool(sp.partial_corr_pvalue(x, y, z) > 0.05)     # independent given z
True
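
For readers who want to see the mechanics, here is a standalone sketch of a residualise-then-Fisher-z test. The name `fisher_z_pvalue` is hypothetical, and the sketch uses the textbook scaling sqrt(n − |Z| − 3), which may differ slightly from the degrees-of-freedom convention stated above.

```python
import math
import numpy as np

def fisher_z_pvalue(x, y, Z=None):
    """Two-sided p-value for zero partial correlation of x and y given Z."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    if Z is None:
        k = 0
        design = np.ones((n, 1))                     # intercept only
    else:
        Z = np.asarray(Z, dtype=float).reshape(n, -1)
        k = Z.shape[1]
        design = np.column_stack([np.ones(n), Z])    # intercept + conditioning set
    # Residualise x and y on the design matrix via least squares.
    rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    r = float(np.corrcoef(rx, ry)[0, 1])
    r = min(max(r, -0.999999), 0.999999)             # keep atanh finite
    z_stat = math.atanh(r) * math.sqrt(n - k - 3)    # textbook Fisher-z scaling
    return math.erfc(abs(z_stat) / math.sqrt(2.0))   # equals 2 * (1 - Phi(|z|))

rng = np.random.default_rng(0)
z = rng.normal(size=200)
x = z + rng.normal(size=200)
y = z + rng.normal(size=200)
p_marginal = fisher_z_pvalue(x, y)
p_conditional = fisher_z_pvalue(x, y, z)
```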

to_networkx ¶

to_networkx(adjacency: ndarray, names: Sequence[str], directed: bool = True, threshold: float = 0.0) -> Any

Build a `networkx.DiGraph` (or `networkx.Graph`) from an adjacency matrix. Edge weight equals the matrix entry.

Requires the optional networkx dependency.

to_dot ¶

to_dot(adjacency: ndarray, names: Sequence[str], directed: bool = True, threshold: float = 0.0, title: Optional[str] = None, digits: int = 2) -> str

Render a Graphviz DOT-format string for the DAG.

Edge labels are weights rounded to digits; positive edges are drawn solid, negative edges dashed (matches the bnlearn convention).

plot_dag ¶

plot_dag(adjacency: ndarray, names: Sequence[str], *, directed: bool = True, threshold: float = 0.0, layout: str = 'circular', ax: Optional[Any] = None, edge_labels: bool = False, title: Optional[str] = None, figsize: Tuple[float, float] = (6.0, 6.0), node_color: str = '#e8f0fe', pos_edge_color: str = '#1f77b4', neg_edge_color: str = '#d62728', digits: int = 2) -> tuple[Any, Any]

Draw the DAG with Matplotlib + NetworkX.

Parameters:

Name	Type	Description	Default
`adjacency`	`(k, k) ndarray`		required
`names`	`sequence of str`		required
`directed`	`bool`		`True`
`threshold`	`float`	Drop edges with `\|w\| <= threshold`.	`0.0`
`layout`	`('circular', 'spring', 'kamada_kawai', 'shell', 'graphviz')`	`"graphviz"` requires pygraphviz; falls back to "spring" when missing.	`"circular"`
`ax`	`matplotlib Axes`		`None`
`edge_labels`	`bool`	Annotate edges with the weight (rounded to `digits`).	`False`
`title`	`str`		`None`
`figsize`	`(w, h)`		`(6.0, 6.0)`
`node_color`	`str`		`'#e8f0fe'`
`pos_edge_color`	`str`		`'#1f77b4'`
`neg_edge_color`	`str`		`'#d62728'`
`digits`	`int`		`2`

Returns:

Type	Description
`(fig, ax)`

edge_list ¶

edge_list(adjacency: ndarray, names: Sequence[str], threshold: float = 0.0, directed: bool = True) -> List[Tuple[str, str, float]]

Extract a sorted [(parent, child, weight), ...] list.

Edges with |w| ≤ threshold are dropped. Output is sorted by descending |weight| for stable display.
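
The thresholding and ordering rule can be sketched as follows. The helper name is hypothetical, and the sketch assumes `adjacency[i, j]` holds the weight of the edge `names[i] -> names[j]`; check the orientation convention of the matrix you pass in, since some result objects on this page store effects the other way round.

```python
import numpy as np

def edge_list_sketch(adjacency, names, threshold=0.0):
    # ASSUMPTION: adjacency[i, j] is the weight of names[i] -> names[j].
    A = np.asarray(adjacency, dtype=float)
    edges = [
        (names[i], names[j], float(A[i, j]))
        for i, j in zip(*np.nonzero(np.abs(A) > threshold))   # drop |w| <= threshold
    ]
    # Strongest edges first, by absolute weight.
    return sorted(edges, key=lambda e: -abs(e[2]))

A = np.array([[0.0, 0.8, 0.05],
              [0.0, 0.0, -0.9],
              [0.0, 0.0, 0.0]])
edges = edge_list_sketch(A, ["X", "Z", "Y"], threshold=0.1)
```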

shd ¶

shd(estimated: ndarray, truth: ndarray, threshold: float = 0.0) -> int

Structural Hamming Distance between two adjacency matrices.

Counts the number of edge insertions / deletions / reversals required to transform $\hat{A}$ into the true DAG. Both inputs are binarised at `|·| > threshold` first.

Reference: Tsamardinos, Brown, Aliferis (2006). "The max-min hill-climbing Bayesian network structure learning algorithm." Machine Learning 65(1): 31-78. DOI: 10.1007/s10994-006-6889-7.
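
A minimal sketch of the count, assuming a reversed edge is charged as a single operation (the helper name `shd_sketch` is hypothetical and this is not necessarily how statspai computes it):

```python
import numpy as np

def shd_sketch(estimated, truth, threshold=0.0):
    # Binarise both matrices at |.| > threshold.
    E = (np.abs(np.asarray(estimated, dtype=float)) > threshold).astype(int)
    T = (np.abs(np.asarray(truth, dtype=float)) > threshold).astype(int)
    diff = np.abs(E - T)
    # A reversed edge differs at both [i, j] and [j, i]. Symmetrise and cap
    # at 1 so each unordered pair contributes at most one operation.
    pair_diff = np.minimum(diff + diff.T, 1)
    return int(np.triu(pair_diff).sum())

# True chain 0 -> 1 -> 2.
truth = np.array([[0, 1, 0],
                  [0, 0, 1],
                  [0, 0, 0]])
# Estimate keeps 0 -> 1, reverses 1 -> 2, and adds a spurious 0 -> 2.
estimated = np.array([[0, 1, 1],
                      [0, 0, 0],
                      [0, 1, 0]])

distance = shd_sketch(estimated, truth)   # one reversal + one extra edge
```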
