dodiscover.toporder.NoGAM

class dodiscover.toporder.NoGAM(n_crossval=5, ridge_alpha=0.01, ridge_gamma=0.1, eta_G=0.001, eta_H=0.001, alpha=0.05, prune=True, n_splines=10, splines_degree=3, pns=False, pns_num_neighbors=None, pns_threshold=1)

The NoGAM (Not only Gaussian Additive Model) algorithm for causal discovery.

NoGAM [1] builds a topological ordering iteratively: at each step it identifies a leaf node by predicting the entries of the gradient of the log-likelihood (the score) from estimated residuals. It then prunes the fully connected DAG implied by the ordering with CAM pruning [2]. The method assumes an additive noise model but does not require any assumption on the distribution of the noise terms.
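To make the "fully connected DAG" concrete, here is a minimal sketch of how a topological order can be turned into a dense adjacency matrix before pruning. The helper name `full_dag_from_order` is hypothetical, and the conventions (ancestors listed first, `A[i, j] = 1` meaning an edge from i to j) are assumptions made for illustration, not a description of the library's internals.

```python
import numpy as np


def full_dag_from_order(order):
    """Return the dense adjacency matrix of the fully connected DAG for `order`.

    Assumed conventions: `order` lists nodes from ancestors to descendants,
    and A[i, j] = 1 encodes the directed edge i -> j.
    """
    order = np.asarray(order, dtype=int)
    d = len(order)
    A = np.zeros((d, d), dtype=int)
    for pos, node in enumerate(order):
        # Every node points to all nodes that come later in the order.
        A[node, order[pos + 1:]] = 1
    return A


A_dense = full_dag_from_order([2, 0, 1])
```

With the order `[2, 0, 1]`, the matrix contains the edges 2 -> 0, 2 -> 1 and 0 -> 1; pruning then removes the edges judged spurious.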

Parameters:

n_crossval: int, optional: Number of cross-validation folds, default is 5. Residuals of each variable in the graph are estimated with scikit-learn's KernelRidge regressor. To avoid overfitting, n_crossval models are trained, and each one predicts the residuals only on the held-out portion of the data it did not see during fitting. A KernelRidge regressor with ‘rbf’ kernel is likewise used to predict the entries of the gradient of the log-likelihood from the estimated residuals.
ridge_alpha: float, optional: Alpha value for the KernelRidge regressor with ‘rbf’ kernel, default is 0.01. ridge_alpha is used to fit both the regressor for the residuals estimation (Equation (14) [1]) and the regressor that estimates the score entries from the estimated residuals.
ridge_gamma: float, optional: Gamma value for the KernelRidge regressor with ‘rbf’ kernel, default is 0.1. ridge_gamma is used to fit both the regressor for the residuals estimation (Equation (20) [1]) and the regressor that estimates the score entries from the estimated residuals.
eta_G: float, optional: Regularization parameter for Stein gradient estimator, default is 0.001.
eta_H: float, optional: Regularization parameter for Stein Hessian estimator, default is 0.001.
alpha: float, optional: Alpha cutoff value for variable selection with hypothesis testing over regression coefficients, default is 0.05.
prune: bool, optional: If True (default), apply CAM-pruning after finding the topological order.
n_splines: int, optional: Number of splines to use for the feature function, default is 10. Automatically decreased in case of insufficient samples.
splines_degree: int, optional: Order of spline to use for the feature function, default is 3.
pns: bool, optional: If True, perform Preliminary Neighbour Search (PNS) before the CAM pruning step, default is False. Allows scaling CAM pruning and ordering to large graphs.
pns_num_neighbors: int, optional: Number of neighbors to use for PNS. If None (default) use all variables.
pns_threshold: float, optional: Threshold to use for PNS, default is 1.
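The cross-validated residual estimation controlled by n_crossval, ridge_alpha and ridge_gamma can be sketched as follows. This is a NumPy stand-in that solves kernel ridge regression in closed form rather than calling scikit-learn, and it assumes that each variable is regressed on all the remaining variables; the function names `rbf_kernel` and `crossval_residuals` are hypothetical and not part of dodiscover.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """RBF kernel matrix exp(-gamma * ||a - b||^2) between rows of A and B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def crossval_residuals(X, n_crossval=5, ridge_alpha=0.01, ridge_gamma=0.1, seed=0):
    """Out-of-fold residuals of each column of X given the other columns."""
    n, d = X.shape
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_crossval)
    R = np.zeros((n, d), dtype=float)
    for j in range(d):
        others = np.delete(np.arange(d), j)  # assumed predictors: all other variables
        for test_idx in folds:
            train_idx = np.setdiff1d(np.arange(n), test_idx)
            X_train, X_test = X[train_idx][:, others], X[test_idx][:, others]
            # Closed-form kernel ridge fit: (K + alpha * I) coef = y
            K_train = rbf_kernel(X_train, X_train, ridge_gamma)
            coef = np.linalg.solve(
                K_train + ridge_alpha * np.eye(len(train_idx)), X[train_idx, j]
            )
            # Residuals are computed only on samples unseen during fitting.
            prediction = rbf_kernel(X_test, X_train, ridge_gamma) @ coef
            R[test_idx, j] = X[test_idx, j] - prediction
    return R


rng = np.random.default_rng(42)
x0 = rng.normal(size=200)
x1 = np.sin(x0) + 0.1 * rng.normal(size=200)
X = np.column_stack([x0, x1])
R = crossval_residuals(X)
```

Because every residual comes from a model that never saw the corresponding sample, the residuals are not artificially shrunk by overfitting.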

Notes

Prior knowledge about the included and excluded directed edges in the output DAG is supported. It is not possible to provide explicit constraints on the relative positions of nodes in the topological ordering. However, explicitly including a directed edge in the DAG defines an implicit constraint on the relative position of the nodes in the topological ordering (i.e. if directed edge (i,j) is encoded in the graph, node i will precede node j in the output order).
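The implicit ordering constraint described above can be checked with a few lines of plain Python. The helper `order_respects_edges` is hypothetical and only illustrates the property; it assumes the order lists nodes from ancestors to descendants and that edges are given as (i, j) pairs meaning i -> j.

```python
def order_respects_edges(order, included_edges):
    """Return True if every included edge (i, j) has i before j in `order`."""
    position = {node: idx for idx, node in enumerate(order)}
    return all(position[i] < position[j] for i, j in included_edges)


included = [(0, 2), (1, 2)]
ok = order_respects_edges([0, 1, 2], included)        # both edges point forward
violated = order_respects_edges([2, 0, 1], included)  # node 2 precedes its parents
```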

Methods

`hessian`(X, eta_G, eta_H)	Stein estimator of the Hessian of log p(x).
`hessian_diagonal`(X, eta_G, eta_H)	Stein estimator of the diagonal of the Hessian matrix of log p(x).
`learn_graph`(data_df, context)	Fit topological order based causal discovery algorithm on input data.
`prune`(X, A_dense, G_included, G_excluded)	Prune the dense adjacency matrix `A_dense` from spurious edges.
`score`(X, eta_G[, K, nablaK])	Stein gradient estimator of the score, i.e. gradient log p(x).

hessian(X, eta_G, eta_H)

Stein estimator of the Hessian of log p(x).

The Hessian matrix is estimated efficiently by exploiting the Stein identity, following [3].

Parameters:

X: np.ndarray of shape (n_samples, n_nodes): I.i.d. samples from the joint distribution p(X).
eta_G: float: Regularization parameter for ridge regression in the Stein gradient estimator.
eta_H: float: Regularization parameter for ridge regression in the Stein Hessian estimator.

Returns:

H: np.ndarray: Stein estimator of the Hessian matrix of log p(x).

hessian_diagonal(X, eta_G, eta_H)

Stein estimator of the diagonal of the Hessian matrix of log p(x).

Parameters:

X: np.ndarray of shape (n_samples, n_nodes): I.i.d. samples from the joint distribution p(X).
eta_G: float: Regularization parameter for ridge regression in the Stein gradient estimator.
eta_H: float: Regularization parameter for ridge regression in the Stein Hessian estimator.

Returns:

H_diag: np.ndarray: Stein estimator of the diagonal of the Hessian matrix of log p(x).

learn_graph(data_df, context)

Fit the topological-order-based causal discovery algorithm on the input data.

Parameters:

data_df: pd.DataFrame: DataFrame of the input data.
context: Context: The context of the causal discovery problem.

prune(X, A_dense, G_included, G_excluded)

Prune the dense adjacency matrix A_dense from spurious edges.

Uses sparse regression on the data matrix X to perform variable selection over the edges of the dense (potentially fully connected) adjacency matrix A_dense.

Parameters:

X: np.ndarray of shape (n_samples, n_nodes): Matrix of the data.
A_dense: np.ndarray of shape (n_nodes, n_nodes): Dense adjacency matrix to be pruned.
G_included: nx.DiGraph: Graph with edges that are required to be included in the output. It encodes assumptions and prior knowledge about the causal graph.
G_excluded: nx.DiGraph: Graph with edges that are required to be excluded from the output. It encodes assumptions and prior knowledge about the causal graph.

Returns:

A: np.ndarray: The pruned adjacency matrix output of the causal discovery algorithm.

score(X, eta_G, K=None, nablaK=None)

Stein gradient estimator of the score, i.e. gradient log p(x).

The Stein gradient estimator [4] exploits the Stein identity to estimate the score function efficiently.

Parameters:

X: np.ndarray of shape (n_samples, n_nodes): I.i.d. samples from the joint distribution p(X).
eta_G: float: Regularization parameter for ridge regression in the Stein gradient estimator.
K: np.ndarray of shape (n_samples, n_samples): Gaussian kernel evaluated at X, by default None. If K is None, it is computed inside the method.
nablaK: np.ndarray of shape (n_samples, ): Evaluated <nabla, K> dot product, by default None. If nablaK is None, it is computed inside the method.

Returns:

G: np.ndarray: Stein estimator of the score function.
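As an illustration of the idea behind this method, the following NumPy sketch estimates the score with a ridge-regularized Stein gradient estimator. It is not the library's implementation: the function name `stein_score`, the plain RBF kernel and the median-distance bandwidth are assumptions made for this example.

```python
import numpy as np


def stein_score(X, eta_G=0.001):
    """Ridge-regularized Stein estimate of grad log p(x) at each sample."""
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]  # diff[i, k] = x_i - x_k
    sq_dist = (diff ** 2).sum(axis=-1)
    # Bandwidth from the median pairwise distance (assumed heuristic).
    s = np.median(np.sqrt(sq_dist[np.triu_indices(n, k=1)]))
    K = np.exp(-sq_dist / (2 * s ** 2))
    # <nabla, K>[i, j] = sum_k dK(x_i, x_k) / dx_k^j
    #                  = sum_k K[i, k] * (x_i^j - x_k^j) / s^2
    nablaK = np.einsum("ik,ikj->ij", K, diff) / s ** 2
    # Stein identity: K @ G is approximately -<nabla, K>; solve with ridge term.
    return -np.linalg.solve(K + eta_G * np.eye(n), nablaK)


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
G = stein_score(X, eta_G=0.01)
```

For standard normal samples the true score is -x, so the estimate should decrease as x increases.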