1.4.1. dodiscover.constraint.PsiFCI

class dodiscover.constraint.PsiFCI(ci_estimator, cd_estimator, alpha=0.05, min_cond_set_size=None, max_cond_set_size=None, max_combinations=None, condsel_method=ConditioningSetSelection.NBRS, apply_orientations=True, keep_sorted=False, max_iter=1000, max_path_length=None, pds_condsel_method=ConditioningSetSelection.PDS, known_intervention_targets=False, n_jobs=None)

Interventional (Psi) FCI algorithm.

The I-FCI (or Psi-FCI) algorithm accepts multiple datasets, which may be observational and/or interventional, under either known (I-FCI) or unknown (Psi-FCI) intervention targets. The API consolidates both algorithms in this one class; select the setting with the known_intervention_targets parameter. See [1] for more information on I-FCI and [2] for more information on Psi-FCI.

The Psi-FCI algorithm is complete for the Psi-PAG equivalence class. However, I-FCI has not been shown to be complete for the I-PAG equivalence class. Note that the I-FCI algorithm may change without notice.

Parameters:

ci_estimator (BaseConditionalIndependenceTest): The conditional independence test function. The arguments of the estimator should be data, node, node to compare, conditioning set of nodes, and any additional keyword arguments.
cd_estimator (BaseConditionalDiscrepancyTest): The conditional discrepancy test function.
alpha (float, optional): The significance level for the conditional independence test, by default 0.05.
min_cond_set_size (int, optional): Minimum size of the conditioning set. By default None, which is treated as 0. Used to constrain the computation spent on the algorithm.
max_cond_set_size (int, optional): Maximum size of the conditioning set, by default None. Used to limit the computation spent on the algorithm.
max_combinations (int, optional): The maximum number of conditional independence tests to run from the set of possible conditioning sets. By default None, which means the algorithm will check all possible conditioning sets. If max_combinations=n is set, then for every conditioning set size ‘p’, at most ‘n’ CI tests are run before ‘p’ is incremented. To control the size of ‘p’, see min_cond_set_size and max_cond_set_size. This can be used in conjunction with the keep_sorted parameter to test only the “strongest” dependences.
condsel_method (ConditioningSetSelection): The method to use for selecting the conditioning sets. Must be one of (‘neighbors’, ‘complete’, ‘neighbors_path’). See Notes for more details.
apply_orientations (bool): Whether or not to apply orientation rules given the learned skeleton graph and separating set per pair of variables. If True (default), will apply Zhang’s orientation rules R0-10, orienting colliders and certain arrowheads and tails [3].
keep_sorted (bool): Whether or not to keep the considered conditioning set variables in sorted dependency order. If True, will sort the existing dependencies of each variable from strongest to weakest (i.e. largest CI test statistic value to smallest). The conditioning set is chosen lexicographically based on the sorted test statistic values of ‘ith Pa(X) -> X’, for each possible parent node of ‘X’. This can be used in conjunction with the max_combinations parameter to test only the “strongest” dependences. By default False.
max_iter (int): The maximum number of iterations through the graph to apply orientation rules.
max_path_length (int, optional): The maximum length of any discriminating path, or None if unlimited.
pds_condsel_method (ConditioningSetSelection): The method to use for selecting the conditioning sets using PDS. Must be one of (‘pds’, ‘pds_path’). See Notes for more details.
known_intervention_targets (bool, optional): If True, runs the I-FCI algorithm. If False, runs the Psi-FCI algorithm. By default False.
n_jobs (int, optional): The number of parallel jobs to run. If -1, the number of jobs is set to the number of cores. If 1 is given, no parallel computing code is used at all. By default None, which means 1.
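
The interaction between min_cond_set_size, max_cond_set_size and max_combinations can be pictured with a small standalone sketch. This is not dodiscover's implementation; the helper name `candidate_cond_sets` and its arguments are hypothetical, and the sketch only assumes that conditioning sets are drawn from a node's neighbors in order of increasing size, with at most `max_combinations` sets considered per size.

```python
from itertools import combinations, islice


def candidate_cond_sets(neighbors, min_size=0, max_size=None, max_combinations=None):
    """Yield (size, conditioning_set) pairs in order of increasing size.

    Hypothetical helper for illustration only; not part of dodiscover.
    """
    neighbors = list(neighbors)
    # A missing upper bound means "up to all neighbors".
    if max_size is None:
        max_size = len(neighbors)
    max_size = min(max_size, len(neighbors))

    for size in range(min_size, max_size + 1):
        sets_of_size = combinations(neighbors, size)
        # islice(..., None) keeps every set; an integer caps the count per size.
        for cond_set in islice(sets_of_size, max_combinations):
            yield size, set(cond_set)


# Three neighbors, sets of size 0..2, at most two sets per size.
cond_sets = list(
    candidate_cond_sets(["a", "b", "c"], max_size=2, max_combinations=2)
)
```

With three neighbors and no cap there would be eight candidate sets; the cap of two per size and the size limit of two reduce this to five.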

Notes

Selection bias is unsupported because it is still an active research area.

Methods

`evaluate_edge`(data, X, Y[, Z])	Test any specific edge for X || Y | Z.
`learn_graph`(data, context)	Learn the relevant causal graph equivalence class.
`learn_skeleton`(data, context[, sep_set])	Learns the skeleton of a causal DAG using pairwise (conditional) independence testing.
`orient_edges`(graph)	Apply orientations to edges using logical rules.
`orient_unshielded_triples`(graph, sep_set)	Orient colliders given a graph and separation set.

convert_skeleton_graph

evaluate_edge(data, X, Y, Z=None)

Test any specific edge for X || Y | Z.

Parameters:

data (pd.DataFrame): The dataset.
X (column): A column in data.
Y (column): A column in data.
Z (set, optional): A set of columns in data, by default None.

Returns:

test_stat (float): Test statistic.
pvalue (float): The p-value.
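
As a rough sketch of how the returned p-value relates to the alpha parameter, the usual convention in constraint-based discovery is to treat X and Y as conditionally independent when the test fails to reject at level alpha. The helper `is_independent` below is hypothetical and not part of dodiscover.

```python
def is_independent(pvalue, alpha=0.05):
    """Return True when the CI test fails to reject independence at level alpha.

    Hypothetical helper for illustration only; not part of dodiscover.
    """
    return pvalue > alpha


# A large p-value gives no evidence of dependence, so the edge would be dropped.
drop_edge = is_independent(0.30)
# A small p-value indicates dependence, so the edge would be kept.
keep_edge = not is_independent(0.001)
```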

learn_graph(data, context)

Learn the relevant causal graph equivalence class.

We take all pairwise combinations of the datasets and construct an F-node for each pair.

Parameters:

data (List[pd.DataFrame]): The list of different datasets assigned to different environments. We assume the first dataset is always observational.
context (Context): The context with interventional assumptions.

Returns:

self (PsiFCI): The fitted learner.
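
The pairing of datasets described for learn_graph can be sketched with the standard library. The helper `f_node_pairs` and the `("F", (i, j))` labels are hypothetical naming chosen for illustration; the labels dodiscover actually assigns to F-nodes may differ.

```python
from itertools import combinations


def f_node_pairs(n_datasets):
    """Return one illustrative F-node label per unordered pair of dataset indices.

    Hypothetical helper and label format; not part of dodiscover.
    """
    return [("F", (i, j)) for i, j in combinations(range(n_datasets), 2)]


# Index 0 stands for the observational dataset, 1 and 2 for interventional ones.
pairs = f_node_pairs(3)
```

Under this sketch, n datasets give n(n-1)/2 pairs, so three datasets yield three F-node labels.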

learn_skeleton(data, context, sep_set=None)

Learns the skeleton of a causal DAG using pairwise (conditional) independence testing.

Encodes the skeleton via an undirected graph, networkx.Graph.

Parameters:

data (pd.DataFrame): The dataset.
context (Context): A context object.
sep_set (dict of dict of list of set): The separating set.

Returns:

skel_graph (nx.Graph): The undirected graph of the causal graph’s skeleton.
sep_set (dict of dict of list of set): The separating set per pair of variables.

Notes

Learning the skeleton of a causal DAG uses (conditional) independence testing to determine which variables are (in)dependent. This specific algorithm exhaustively compares pairs of adjacent variables.
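
The "dict of dict of list of set" shape of sep_set can be illustrated with plain Python containers. The helper `record_separation` is hypothetical, and storing each separating set under both orderings of the pair is an assumption of this sketch rather than documented behavior.

```python
from collections import defaultdict

# sep_set[x][y] is a list of sets, each one a conditioning set that separated x and y.
sep_set = defaultdict(lambda: defaultdict(list))


def record_separation(sep_set, x, y, cond_set):
    """Store a separating set for the pair (x, y) in both directions.

    Hypothetical helper for illustration only; not part of dodiscover.
    """
    sep_set[x][y].append(set(cond_set))
    sep_set[y][x].append(set(cond_set))


record_separation(sep_set, "X", "Y", {"Z"})
record_separation(sep_set, "X", "Y", {"W", "Z"})
```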

orient_edges(graph)

Apply orientations to edges using logical rules.

Parameters:

graph (EquivalenceClass): Causal graph.

Raises:

NotImplementedError: All constraint-based discovery algorithms must implement this.

orient_unshielded_triples(graph, sep_set)

Orient colliders given a graph and separation set.

Parameters:

graph (EquivalenceClass): The partial ancestral graph (PAG).
sep_set (SeparatingSet): The separating set between any two nodes.
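
The textbook collider rule behind this step can be sketched without dodiscover: for an unshielded triple X - M - Y (X and Y not adjacent), M is marked as a collider when it does not appear in the separating set of X and Y. The helper `find_colliders`, the adjacency-dict graph representation, and the choice to require that M appears in none of the recorded separating sets are assumptions of this sketch, not the library's implementation.

```python
from itertools import combinations


def find_colliders(adjacency, sep_set):
    """Return (x, mid, y) triples that the collider rule would orient as x -> mid <- y.

    `adjacency` maps each node to the set of its neighbors; `sep_set[x][y]` is a
    list of separating sets. Hypothetical helper; not part of dodiscover.
    """
    colliders = []
    for mid, nbrs in adjacency.items():
        for x, y in combinations(sorted(nbrs), 2):
            if y in adjacency[x]:
                continue  # x and y are adjacent, so the triple is shielded
            # Assumption: mid must be absent from every recorded separating set.
            if all(mid not in s for s in sep_set[x][y]):
                colliders.append((x, mid, y))
    return colliders


# Skeleton X - Z - Y, where X and Y were separated by the empty set.
adjacency = {"X": {"Z"}, "Y": {"Z"}, "Z": {"X", "Y"}}
sep_set_empty = {"X": {"Y": [set()]}, "Y": {"X": [set()]}}
colliders = find_colliders(adjacency, sep_set_empty)
```

If X and Y had instead been separated by {"Z"}, the same call would return no colliders, since Z is then part of the separating set.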