
statspai.causal_discovery

causal_discovery

Causal Discovery: Learning causal structure from observational data.

Algorithms
  • NOTEARS : NO TEARS continuous optimisation for DAG learning (Zheng et al. 2018). Formulates structure learning as a smooth optimisation problem with an acyclicity constraint.

  • PC Algorithm : Constraint-based causal discovery using conditional independence tests (Spirtes, Glymour, Scheines 2000). Learns a CPDAG (completed partially directed acyclic graph).
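To make the acyclicity constraint concrete: Zheng et al. (2018) use h(W) = tr(exp(W ∘ W)) - d, which is zero exactly when the weighted adjacency matrix W describes a DAG. The sketch below evaluates that function with NumPy. The name `notears_h` and the truncated power series used in place of a matrix-exponential routine are illustrative assumptions, not part of statspai.

```python
import numpy as np


def notears_h(W: np.ndarray, n_terms: int = 30) -> float:
    """Hypothetical helper: h(W) = tr(exp(W * W)) - d from Zheng et al. (2018).

    The matrix exponential is approximated by a truncated power series,
    which is adequate for small matrices with moderate weights.
    """
    d = W.shape[0]
    A = W * W                      # element-wise (Hadamard) square, non-negative
    term = np.eye(d)               # current series term A^k / k!, starting at k = 0
    total = np.eye(d)              # running sum of the series
    for k in range(1, n_terms + 1):
        term = term @ A / k
        total += term
    return float(np.trace(total) - d)


# A three-node chain 0 -> 1 -> 2 is acyclic, so h should vanish.
W_dag = np.array([[0.0, 0.8, 0.0],
                  [0.0, 0.0, -1.2],
                  [0.0, 0.0, 0.0]])
h_dag = notears_h(W_dag)

# A two-node cycle 0 -> 1 -> 0 violates the constraint, so h is positive.
W_cycle = np.array([[0.0, 1.0],
                    [1.0, 0.0]])
h_cycle = notears_h(W_cycle)
```

Because h is smooth in W, an optimiser can drive it towards zero. This is presumably what the `h_tol` and `rho_max` parameters of NOTEARS below control, although this page does not spell that out.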

References

Zheng, X., Aragam, B., Ravikumar, P., & Xing, E. P. (2018). DAGs with NO TEARS: Continuous Optimization for Structure Learning. Advances in Neural Information Processing Systems, 31.

Spirtes, P., Glymour, C., & Scheines, R. (2000). Causation, Prediction, and Search (2nd ed.). MIT Press.

NOTEARS

NOTEARS: Continuous optimization for DAG structure learning.

Parameters:

Name Type Description Default
data DataFrame
required
variables list of str
None
lambda1 float
0.1
max_iter int
100
h_tol float
1e-08
rho_max float
1e+16
w_threshold float
0.3
random_state int
42

fit

fit() -> Dict[str, Any]

Run NOTEARS and return the learned DAG.

summary

summary() -> str

Print a summary of the learned DAG.

PCAlgorithm

PC Algorithm for causal discovery.

Parameters:

Name Type Description Default
data DataFrame
required
variables list of str
None
alpha float
0.05
max_cond_size int
None
ci_test str
'fisherz'

fit

fit() -> Dict[str, Any]

Run the PC algorithm and return learned structure.

summary

summary() -> str

Print a summary of the learned structure.

PCMCIResult dataclass

PCMCI output — lag-specific adjacency + discovered links.

discovered_links

discovered_links() -> DataFrame

Return a DataFrame of significant links sorted by strength.

LPCMCIResult dataclass

Output of lpcmci().

to_frame

to_frame() -> DataFrame

Long-format edges DataFrame.

DYNOTEARSResult dataclass

Output of dynotears().

to_frame

to_frame() -> DataFrame

Long-format edges DataFrame (|coef| > threshold).

pc_algorithm

pc_algorithm(data: DataFrame, variables: Optional[List[str]] = None, alpha: float = 0.05, max_cond_size: Optional[int] = None, ci_test: str = 'fisherz', forbidden: Optional[List[Tuple[str, str]]] = None, required: Optional[List[Tuple[str, str]]] = None) -> Dict[str, Any]

Learn causal structure using the PC algorithm.

Parameters:

Name Type Description Default
data DataFrame

Observational data (n_samples x d_variables).

required
variables list of str

Column names to use. If None, uses all numeric columns.

None
alpha float

Significance level for the conditional independence tests. A lower alpha yields a sparser graph (fewer edges).

0.05
max_cond_size int

Maximum conditioning set size. If None, goes up to d-2.

None
ci_test str

Conditional independence test: 'fisherz' (partial correlation) or 'hsic' (kernel-based, non-linear).

'fisherz'
forbidden list of (str, str)

Background knowledge: edges that must NOT appear in the final graph (treated as undirected — both (a, b) and (b, a) are forbidden when either is given). The skeleton phase keeps these absent regardless of CI test outcomes.

None
required list of (str, str)

Background knowledge: directed edges a -> b that must appear in the CPDAG. The skeleton phase preserves them regardless of CI rejection, and the orientation phase pins their direction.

None

Returns:

Type Description
dict

A dict with the following keys:
  • 'skeleton' : pd.DataFrame. Undirected adjacency matrix (0/1).
  • 'cpdag' : pd.DataFrame. CPDAG adjacency matrix. cpdag[i,j] = 1 means i -> j. If both cpdag[i,j] = 1 and cpdag[j,i] = 1, the edge is undirected (i -- j).
  • 'edges' : list of tuples. Directed edges as (parent, child) tuples.
  • 'undirected_edges' : list of tuples. Undirected edges as (node1, node2) tuples.
  • 'separating_sets' : dict. {(i, j): set} of separating sets for removed edges.
  • 'variables' : list of str
  • 'n_edges' : int
  • 'n_obs' : int
  • 'alpha' : float
  • 'ci_test' : str

Examples:

>>> import statspai as sp
>>> result = sp.pc_algorithm(df, variables=['X', 'Z', 'M', 'Y'])
>>> print(result['edges'])       # directed edges
>>> print(result['cpdag'])       # CPDAG adjacency matrix
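
The CPDAG encoding described under Returns (cpdag[i,j] = 1 for i -> j, both entries set for an undirected edge) can be decoded by hand. The following is a minimal sketch that works on a plain NumPy array; `split_cpdag_edges` is a hypothetical helper, not a statspai function, and it assumes the matrix rows and columns follow the order of `names`.

```python
import numpy as np


def split_cpdag_edges(cpdag, names):
    """Hypothetical helper: split a CPDAG adjacency matrix into edge lists."""
    C = np.asarray(cpdag) != 0
    directed, undirected = [], []
    k = len(names)
    for i in range(k):
        for j in range(i + 1, k):          # visit each unordered pair once
            if C[i, j] and C[j, i]:
                undirected.append((names[i], names[j]))
            elif C[i, j]:
                directed.append((names[i], names[j]))   # i -> j
            elif C[j, i]:
                directed.append((names[j], names[i]))   # j -> i
    return directed, undirected


names = ["X", "Z", "M", "Y"]
cpdag = np.zeros((4, 4), dtype=int)
cpdag[0, 1] = cpdag[1, 0] = 1   # X -- Z (undirected)
cpdag[1, 3] = 1                 # Z -> Y
cpdag[2, 3] = 1                 # M -> Y

directed, undirected = split_cpdag_edges(cpdag, names)
```

With a result from pc_algorithm, the same idea would apply to result['cpdag'].to_numpy() together with result['variables'], assuming the DataFrame is indexed in that order.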

nonlinear_icp

nonlinear_icp(X, y, environment, alpha: float = 0.05, **kw) -> ICPResult

Alias for icp(..., method='nonlinear') -- Heinze-Deml et al. 2018.

partial_corr_pvalue

partial_corr_pvalue(x: ndarray, y: ndarray, Z: Optional[ndarray] = None) -> float

Partial-correlation p-value for H0: X ⟂ Y | Z.

Residualises X and Y on Z via OLS, then applies a Fisher-z transform to the residual correlation with df = n - |Z| - 2.
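
The procedure above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the library's code: the name `partial_corr_pvalue_sketch` is hypothetical, an intercept is added to the OLS design, and the statistic uses the textbook scaling sqrt(n - |Z| - 3) for the Fisher-z transform, which may differ from the degrees-of-freedom convention statspai applies internally.

```python
import math

import numpy as np


def partial_corr_pvalue_sketch(x, y, Z=None):
    """Hypothetical re-implementation of a Fisher-z partial-correlation test."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    n = x.size
    if Z is None:
        k = 0
        rx, ry = x - x.mean(), y - y.mean()
    else:
        Z = np.asarray(Z, dtype=float).reshape(n, -1)
        k = Z.shape[1]
        design = np.column_stack([np.ones(n), Z])   # intercept + conditioning set
        # OLS residuals of x and y on the conditioning variables
        rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
        ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    r = float(rx @ ry / math.sqrt(float(rx @ rx) * float(ry @ ry)))
    r = max(min(r, 1.0 - 1e-12), -1.0 + 1e-12)      # keep atanh finite
    z_stat = math.sqrt(n - k - 3) * math.atanh(r)   # textbook Fisher-z scaling (assumption)
    return math.erfc(abs(z_stat) / math.sqrt(2.0))  # two-sided normal p-value


# X and Y share the common cause Z and are independent given Z.
rng = np.random.default_rng(0)
z = rng.normal(size=500)
x = z + rng.normal(size=500)
y = z + rng.normal(size=500)

p_marginal = partial_corr_pvalue_sketch(x, y)         # ignores Z
p_conditional = partial_corr_pvalue_sketch(x, y, z)   # conditions on Z
```

In this simulated example the marginal test should reject independence decisively, while the conditional p-value reflects only the noise left after removing Z.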

to_networkx

to_networkx(adjacency: ndarray, names: Sequence[str], directed: bool = True, threshold: float = 0.0)

Build a networkx.DiGraph (or networkx.Graph) from an adjacency matrix. Edge weight equals the matrix entry.

Requires the optional networkx dependency.

to_dot

to_dot(adjacency: ndarray, names: Sequence[str], directed: bool = True, threshold: float = 0.0, title: Optional[str] = None, digits: int = 2) -> str

Render a Graphviz DOT-format string for the DAG.

Edge labels are weights rounded to digits; positive edges are drawn solid, negative edges dashed (matches the bnlearn convention).
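
As an illustration of the solid/dashed convention, the sketch below assembles a DOT string by hand. The function `dot_sketch` is hypothetical, the exact text that to_dot emits is not documented on this page, and the code assumes adjacency[i, j] holds the weight of the edge names[i] -> names[j].

```python
import numpy as np


def dot_sketch(adjacency, names, threshold=0.0, digits=2):
    """Hypothetical helper: build a Graphviz DOT string for a weighted DAG."""
    A = np.asarray(adjacency, dtype=float)
    lines = ["digraph G {"]
    for name in names:
        lines.append(f'  "{name}";')
    k = len(names)
    for i in range(k):
        for j in range(k):
            w = float(A[i, j])
            if abs(w) <= threshold:
                continue                                  # edge dropped
            style = "solid" if w > 0 else "dashed"        # sign decides line style
            lines.append(
                f'  "{names[i]}" -> "{names[j]}" '
                f'[label="{w:.{digits}f}", style={style}];'
            )
    lines.append("}")
    return "\n".join(lines)


A = np.zeros((3, 3))
A[0, 1] = 0.5     # X -> Y, positive weight
A[1, 2] = -0.75   # Y -> Z, negative weight
dot = dot_sketch(A, ["X", "Y", "Z"])
```

The resulting string can be passed to the Graphviz `dot` command-line tool for rendering.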

plot_dag

plot_dag(adjacency: ndarray, names: Sequence[str], *, directed: bool = True, threshold: float = 0.0, layout: str = 'circular', ax: Optional[Any] = None, edge_labels: bool = False, title: Optional[str] = None, figsize: Tuple[float, float] = (6.0, 6.0), node_color: str = '#e8f0fe', pos_edge_color: str = '#1f77b4', neg_edge_color: str = '#d62728', digits: int = 2)

Draw the DAG with Matplotlib + NetworkX.

Parameters:

Name Type Description Default
adjacency (k, k) ndarray
required
names sequence of str
required
directed bool
True
threshold float

Drop edges with |w| <= threshold.

0.0
layout ('circular', 'spring', 'kamada_kawai', 'shell', 'graphviz')

"graphviz" requires pygraphviz; falls back to "spring" when missing.

"circular"
ax matplotlib Axes
None
edge_labels bool

Annotate edges with the weight (rounded to digits).

False
title str
None
figsize (w, h)
(6.0, 6.0)
node_color str
'#e8f0fe'
pos_edge_color str
'#1f77b4'
neg_edge_color str
'#d62728'
digits int
2

Returns:

Type Description
(fig, ax)

edge_list

edge_list(adjacency: ndarray, names: Sequence[str], threshold: float = 0.0, directed: bool = True) -> List[Tuple[str, str, float]]

Extract a sorted [(parent, child, weight), ...] list.

Edges with |w| ≤ threshold are dropped. Output is sorted by descending |weight| for stable display.
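
The thresholding and ordering rules can be reproduced with a short NumPy sketch. `edge_list_sketch` is a hypothetical stand-in that covers only the directed case and assumes adjacency[i, j] is the weight of names[i] -> names[j]. How the library handles directed=False is not shown here.

```python
import numpy as np


def edge_list_sketch(adjacency, names, threshold=0.0):
    """Hypothetical helper: (parent, child, weight) tuples sorted by |weight|."""
    A = np.asarray(adjacency, dtype=float)
    rows, cols = np.nonzero(np.abs(A) > threshold)   # keep only |w| > threshold
    edges = [(names[i], names[j], float(A[i, j])) for i, j in zip(rows, cols)]
    return sorted(edges, key=lambda e: -abs(e[2]))   # strongest edges first


A = np.zeros((3, 3))
A[0, 1] = 0.2     # X -> Y
A[1, 2] = -0.9    # Y -> Z
A[0, 2] = 0.05    # X -> Z, below the threshold used here
edges = edge_list_sketch(A, ["X", "Y", "Z"], threshold=0.1)
```

Sorting on the absolute value keeps strong negative effects at the top of the list rather than at the bottom.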

shd

shd(estimated: ndarray, truth: ndarray, threshold: float = 0.0) -> int

Structural Hamming Distance between two adjacency matrices.

Counts the number of edge insertions, deletions, and reversals required to transform the estimated adjacency matrix Â into the true DAG. Both inputs are first binarised at |·| > threshold.

Reference: Tsamardinos, Brown, Aliferis (2006). "The max-min hill-climbing Bayesian network structure learning algorithm." Machine Learning 65(1): 31-78. DOI: 10.1007/s10994-006-6889-7.
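
One common way to compute this distance is to compare the two graphs pair by pair, charging one unit whenever the edge status between two nodes differs, so that a reversal costs one rather than two. The sketch below follows that convention. `shd_sketch` is a hypothetical helper, and the library's own counting rules may differ in edge cases such as bidirected entries.

```python
import numpy as np


def shd_sketch(estimated, truth, threshold=0.0):
    """Hypothetical helper: Structural Hamming Distance, reversals counted once."""
    E = np.abs(np.asarray(estimated, dtype=float)) > threshold   # binarise
    T = np.abs(np.asarray(truth, dtype=float)) > threshold
    d = E.shape[0]
    distance = 0
    for i in range(d):
        for j in range(i + 1, d):
            # Compare the edge status of the unordered pair {i, j}.
            if (bool(E[i, j]), bool(E[j, i])) != (bool(T[i, j]), bool(T[j, i])):
                distance += 1
    return distance


truth = np.zeros((3, 3))
truth[0, 1] = 1.0       # 0 -> 1
truth[1, 2] = 1.0       # 1 -> 2

estimated = np.zeros((3, 3))
estimated[1, 0] = 0.9   # reversed: 1 -> 0 instead of 0 -> 1
estimated[1, 2] = 0.7   # correct
estimated[0, 2] = 0.4   # extra edge 0 -> 2

distance = shd_sketch(estimated, truth)
self_distance = shd_sketch(truth, truth)
```

Here the reversed edge and the extra edge each contribute one unit, and a graph compared with itself has distance zero.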