dodiscover.toporder.DAS#
- class dodiscover.toporder.DAS(eta_G=0.001, eta_H=0.001, alpha=0.05, prune=True, das_cutoff=None, n_splines=10, splines_degree=3, min_parents=5, max_parents=20)[source]#
The DAS (Discovery At Scale) algorithm for causal discovery.
DAS [1] infer the topological ordering using SCORE [2]. Then it finds edges in the graph by inspection of the non diagonal entries of the Hessian of the log likelihood. A final, computationally cheap, pruning step is performed with CAM pruning [3]. The method assumes Additive Noise Model and Gaussianity of the noise terms. DAS is a highly scalable method, allowing to run causal discovery on thousands of nodes. It reduces the computational complexity of the pruning method of an order of magnitude with respect to SCORE.
- Parameters:
- eta_G: float, optional
Regularization parameter for Stein gradient estimator, default is 0.001.
- eta_H
float
, optional Regularization parameter for Stein Hessian estimator, default is 0.001.
- alpha
float
, optional Alpha cutoff value for variable selection with hypothesis testing over regression coefficients, default is 0.05.
- prunebool, optional
If True (default), apply CAM-pruning after finding the topological order. If False, DAS is equivalent to SCORE.
- das_cutoff
float
, optional Alpha value for hypothesis testing in preliminary DAS pruning. If None (default), it is set equal to
alpha
.- n_splines
int
, optional Number of splines to use for the feature function, default is 10. Automatically decreased in case of insufficient samples
- splines_degree: int, optional
Order of spline to use for the feature function, default is 3.
- min_parents
int
, optional Minimum number of edges retained by DAS preliminary pruning step, default is 5. min_parents <= 5 doesn’t significantly affects execution time, while increasing the accuracy.
- max_parents
int
, optional Maximum number of parents allowed for a single node, default is 20. Given that CAM pruning is inefficient for > ~20 nodes, larger values are not advised. The value of max_parents should be decrease under the assumption of sparse graphs.
Notes
Prior knowledge about the included and excluded directed edges in the output DAG is supported. It is not possible to provide explicit constraints on the relative positions of nodes in the topological ordering. However, explicitly including a directed edge in the DAG defines an implicit constraint on the relative position of the nodes in the topological ordering (i.e. if directed edge
(i,j)
is encoded in the graph, nodei
will precede nodej
in the output order).References
Methods
hessian
(X, eta_G, eta_H)Stein estimator of the Hessian of log p(x).
hessian_diagonal
(X, eta_G, eta_H)Stein estimator of the diagonal of the Hessian matrix of log p(x).
learn_graph
(data_df, context)Fit topological order based causal discovery algorithm on input data.
prune
(X, A_dense, G_included, G_excluded)Prune the dense adjacency matrix
A_dense
from spurious edges.score
(X, eta_G[, K, nablaK])Stein gradient estimator of the score, i.e. gradient log p(x).
- hessian(X, eta_G, eta_H)#
Stein estimator of the Hessian of log p(x).
The Hessian matrix is efficiently estimated by exploitation of the Stein identity. Implements [2].
- Parameters:
- X
np.ndarray
of shape (n_samples, n_nodes) I.i.d. samples from p(X) joint distribution.
- eta_G: float
regularization parameter for ridge regression in Stein gradient estimator.
- eta_H: float
regularization parameter for ridge regression in Stein hessian estimator.
- X
- Returns:
- H
np.ndarray
Stein estimator of the Hessian matrix of log p(x).
- H
References
- hessian_diagonal(X, eta_G, eta_H)#
Stein estimator of the diagonal of the Hessian matrix of log p(x).
- Parameters:
- X
np.ndarray
(n_samples, n_nodes) I.i.d. samples from p(X) joint distribution.
- eta_G: float
regularization parameter for ridge regression in Stein gradient estimator.
- eta_H: float
regularization parameter for ridge regression in Stein hessian estimator.
- X
- Returns:
- H_diag
np.ndarray
Stein estimator of the diagonal of the Hessian matrix of log p(x).
- H_diag
- learn_graph(data_df, context)#
Fit topological order based causal discovery algorithm on input data.
- Parameters:
- data_df
pd.DataFrame
Datafame of the input data.
- context: Context
The context of the causal discovery problem.
- data_df
- prune(X, A_dense, G_included, G_excluded)#
Prune the dense adjacency matrix
A_dense
from spurious edges.Use sparse regression over the matrix of the data
X
for variable selection over the edges in the dense (potentially fully connected) adjacency matrixA_dense
- Parameters:
- X
np.ndarray
of shape (n_samples, n_nodes) Matrix of the data.
- A_dense
np.ndarray
of shape (n_nodes, n_nodes) Dense adjacency matrix to be pruned.
- G_included
nx.DiGraph
Graph with edges that are required to be included in the output. It encodes assumptions and prior knowledge about the causal graph.
- G_excluded
nx.DiGraph
Graph with edges that are required to be excluded from the output. It encodes assumptions and prior knowledge about the causal graph.
- X
- Returns:
- A
np.ndarray
The pruned adjacency matrix output of the causal discovery algorithm.
- A
- score(X, eta_G, K=None, nablaK=None)#
Stein gradient estimator of the score, i.e. gradient log p(x).
The Stein gradient estimator [4] exploits the Stein identity for efficient estimate of the score function.
- Parameters:
- X
np.ndarray
of shape (n_samples, n_nodes) I.i.d. samples from p(X) joint distribution.
- eta_G: float
regularization parameter for ridge regression in Stein gradient estimator.
- K
np.ndarray
of shape (n_samples, n_samples) Gaussian kernel evaluated at X, by default None. If
K
is None, it is computed inside of the method.- nablaK
np.ndarray
of shape (n_samples, ) <nabla, K> evaluated dot product, by default None. If
nablaK
is None, it is computed inside of the method.
- X
- Returns:
- G
np.ndarray
Stein estimator of the score function.
- G
References