dowhy.causal_identifier package#
Submodules#
dowhy.causal_identifier.auto_identifier module#
- class dowhy.causal_identifier.auto_identifier.AutoIdentifier(estimand_type: EstimandType, backdoor_adjustment: BackdoorAdjustment = BackdoorAdjustment.BACKDOOR_DEFAULT, optimize_backdoor: bool = False, costs: List | None = None)[source]#
Bases:
object
Class that implements different identification methods.
Currently supports backdoor and instrumental variable identification methods. The identification is based on the causal graph provided.
This class exists for backwards compatibility with CausalModel. It will be deprecated in the future in favor of the function identify_effect_auto().
- class dowhy.causal_identifier.auto_identifier.BackdoorAdjustment(value)[source]#
Bases:
Enum
An enumeration.
- BACKDOOR_DEFAULT = 'default'#
- BACKDOOR_EFFICIENT = 'efficient-adjustment'#
- BACKDOOR_EXHAUSTIVE = 'exhaustive-search'#
- BACKDOOR_MAX = 'maximal-adjustment'#
- BACKDOOR_MIN = 'minimal-adjustment'#
- BACKDOOR_MINCOST_EFFICIENT = 'efficient-mincost-adjustment'#
- BACKDOOR_MIN_EFFICIENT = 'efficient-minimal-adjustment'#
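Since the members listed above are plain `Enum` values, a configuration string can be mapped to a member by value. The sketch below uses a local stand-in enum (`BackdoorAdjustmentMirror`, a hypothetical name) that copies the documented values so it runs without DoWhy installed. It assumes, but does not verify, that the same value lookup works on the real `BackdoorAdjustment` class. `parse_adjustment` is an illustrative helper, not part of the library.

```python
from enum import Enum


class BackdoorAdjustmentMirror(Enum):
    """Local stand-in mirroring the documented values (not DoWhy's class)."""

    BACKDOOR_DEFAULT = "default"
    BACKDOOR_EFFICIENT = "efficient-adjustment"
    BACKDOOR_EXHAUSTIVE = "exhaustive-search"
    BACKDOOR_MAX = "maximal-adjustment"
    BACKDOOR_MIN = "minimal-adjustment"
    BACKDOOR_MINCOST_EFFICIENT = "efficient-mincost-adjustment"
    BACKDOOR_MIN_EFFICIENT = "efficient-minimal-adjustment"


def parse_adjustment(name: str) -> BackdoorAdjustmentMirror:
    """Map a string such as 'minimal-adjustment' to its enum member."""
    try:
        return BackdoorAdjustmentMirror(name)
    except ValueError:
        valid = ", ".join(member.value for member in BackdoorAdjustmentMirror)
        raise ValueError(f"unknown adjustment {name!r}; expected one of: {valid}") from None


choice = parse_adjustment("efficient-adjustment")
```

Looking members up by value keeps user-facing configuration independent of the Python attribute names.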
- class dowhy.causal_identifier.auto_identifier.EstimandType(value)[source]#
Bases:
Enum
An enumeration.
- NONPARAMETRIC_ATE = 'nonparametric-ate'#
- NONPARAMETRIC_CDE = 'nonparametric-cde'#
- NONPARAMETRIC_NDE = 'nonparametric-nde'#
- NONPARAMETRIC_NIE = 'nonparametric-nie'#
- dowhy.causal_identifier.auto_identifier.build_backdoor_estimands_dict(treatment_names: List[str], outcome_names: List[str], observed_nodes: List[str], backdoor_sets: List[str], estimands_dict: Dict)[source]#
Build the final dict for backdoor sets by filtering unobserved variables if needed.
- dowhy.causal_identifier.auto_identifier.construct_backdoor_estimand(treatment_name: List[str], outcome_name: List[str], common_causes: List[str])[source]#
- dowhy.causal_identifier.auto_identifier.construct_frontdoor_estimand(treatment_name: List[str], outcome_name: List[str], frontdoor_variables_names: List[str])[source]#
- dowhy.causal_identifier.auto_identifier.construct_iv_estimand(treatment_name: List[str], outcome_name: List[str], instrument_names: List[str])[source]#
- dowhy.causal_identifier.auto_identifier.construct_mediation_estimand(estimand_type: EstimandType, action_nodes: List[str], outcome_nodes: List[str], mediator_nodes: List[str])[source]#
- dowhy.causal_identifier.auto_identifier.find_valid_adjustment_sets(graph: DiGraph, action_nodes: List[str], outcome_nodes: List[str], observed_nodes: List[str], backdoor_paths: List, bdoor_graph: DiGraph, dseparation_algo: str, backdoor_sets: List, filt_eligible_variables: List, backdoor_adjustment: BackdoorAdjustment, max_iterations: int)[source]#
- dowhy.causal_identifier.auto_identifier.get_default_backdoor_set_id(graph: DiGraph, action_nodes: List[str], outcome_nodes: List[str], backdoor_sets_dict: Dict)[source]#
- dowhy.causal_identifier.auto_identifier.identify_ate_effect(graph: DiGraph, action_nodes: List[str], outcome_nodes: List[str], observed_nodes: List[str], backdoor_adjustment: BackdoorAdjustment, optimize_backdoor: bool, estimand_type: EstimandType, costs: List, conditional_node_names: List[str] | None = None)[source]#
- dowhy.causal_identifier.auto_identifier.identify_backdoor(graph: DiGraph, action_nodes: List[str], outcome_nodes: List[str], observed_nodes: List[str], backdoor_adjustment: BackdoorAdjustment, include_unobserved: bool = False, dseparation_algo: str = 'default', direct_effect: bool = False)[source]#
- dowhy.causal_identifier.auto_identifier.identify_cde_effect(graph: DiGraph, action_nodes: List[str], outcome_nodes: List[str], observed_nodes: List[str], backdoor_adjustment: BackdoorAdjustment, estimand_type: EstimandType)[source]#
Identify controlled direct effect. For a definition, see VanderWeele (2011). Controlled direct and mediated effects: definition, identification and bounds. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4193506/
Using do-calculus rules, identification yields an adjustment set. It is based on the principle that under a graph where the direct edge from treatment to outcome is removed, conditioning on the adjustment set should d-separate treatment and outcome.
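The principle stated above can be checked by hand on a small graph. The following is a minimal sketch, not DoWhy's implementation: it tests d-separation with the textbook ancestral moral graph construction, using only the standard library. The graph, the node names (`T`, `M`, `W`, `Y`) and the helper `d_separated` are all hypothetical and chosen for illustration.

```python
from itertools import combinations


def d_separated(edges, xs, ys, zs):
    """Check whether zs d-separates xs from ys in a DAG of (parent, child) edges."""
    xs, ys, zs = set(xs), set(ys), set(zs)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)

    # 1. Keep only the ancestors of xs | ys | zs (a node is its own ancestor).
    keep, stack = set(), list(xs | ys | zs)
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents.get(node, ()))

    # 2. Moralize: connect each node to its parents and marry co-parents.
    neighbours = {node: set() for node in keep}
    for child in keep:
        pars = parents.get(child, set())
        for p in pars:
            neighbours[p].add(child)
            neighbours[child].add(p)
        for a, b in combinations(pars, 2):
            neighbours[a].add(b)
            neighbours[b].add(a)

    # 3. Delete zs and test whether any x can still reach any y.
    seen, stack = set(), [x for x in xs if x not in zs]
    while stack:
        node = stack.pop()
        if node in ys:
            return False
        if node in seen:
            continue
        seen.add(node)
        stack.extend(n for n in neighbours[node] if n not in zs and n not in seen)
    return True


# Hypothetical graph: W confounds T and Y, M mediates, and T -> Y is the direct edge.
edges = [("W", "T"), ("W", "Y"), ("T", "M"), ("M", "Y"), ("T", "Y")]
no_direct = [edge for edge in edges if edge != ("T", "Y")]

blocks_all = d_separated(no_direct, {"T"}, {"Y"}, {"W", "M"})  # True
misses_mediator = d_separated(no_direct, {"T"}, {"Y"}, {"W"})  # False
```

In this example `{W, M}` qualifies as an adjustment set for the controlled direct effect, while `{W}` alone leaves the path through `M` open.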
- dowhy.causal_identifier.auto_identifier.identify_effect_auto(graph: DiGraph, action_nodes: str | List[str], outcome_nodes: str | List[str], observed_nodes: str | List[str], estimand_type: EstimandType, conditional_node_names: List[str] | None = None, backdoor_adjustment: BackdoorAdjustment = BackdoorAdjustment.BACKDOOR_DEFAULT, optimize_backdoor: bool = False, costs: List | None = None) IdentifiedEstimand [source]#
Main method that returns an identified estimand (if one exists).
If estimand_type is the non-parametric ATE, backdoor, instrumental variable and frontdoor identification methods are used to check whether an identified estimand exists, based on the causal graph.
- Parameters:
optimize_backdoor – if True, uses an optimised algorithm to compute the backdoor sets
costs – non-negative costs associated with variables in the graph. Only used for estimand_type=’nonparametric-ate’ and backdoor_adjustment=’efficient-mincost-adjustment’. If no costs are provided by the user, and backdoor_adjustment=’efficient-mincost-adjustment’, costs are assumed to be equal to one for all variables in the graph.
conditional_node_names – variables that are used to determine treatment. If none are provided, it is assumed that the intervention is static.
- Returns:
target estimand, an instance of the IdentifiedEstimand class
- dowhy.causal_identifier.auto_identifier.identify_efficient_backdoor(graph: DiGraph, action_nodes: List[str], outcome_nodes: List[str], observed_nodes: List[str], backdoor_adjustment: BackdoorAdjustment, costs: List, conditional_node_names: List[str] | None = None)[source]#
Method implementing algorithms to compute efficient backdoor sets, as described in Rotnitzky and Smucler (2020), Smucler, Sapienza and Rotnitzky (2021) and Smucler and Rotnitzky (2022).
For backdoor_adjustment=’efficient-adjustment’, computes an optimal backdoor set, that is, a backdoor set comprised of observable variables that yields non-parametric estimators of the interventional mean with the smallest asymptotic variance among those that are based on observable backdoor sets. This optimal backdoor set always exists when no variables are latent, and the algorithm is guaranteed to compute it in this case. Under a non-parametric graphical model with latent variables, such a backdoor set can fail to exist. When certain sufficient conditions under which it is known that such a backdoor set exists are not satisfied, an error is raised.
For backdoor_adjustment=’efficient-minimal-adjustment’, computes an optimal minimal backdoor set, that is, a minimal backdoor set comprised of observable variables that yields non-parametric estimators of the interventional mean with the smallest asymptotic variance among those that are based on observable minimal backdoor sets.
For backdoor_adjustment=’efficient-mincost-adjustment’, computes an optimal minimum cost backdoor set, that is, a minimum cost backdoor set comprised of observable variables that yields non-parametric estimators of the interventional mean with the smallest asymptotic variance among those that are based on observable minimum cost backdoor sets. The cost of a backdoor set is defined as the sum of the costs of the variables that comprise it.
The various optimal backdoor sets computed by this method are not only optimal under non-parametric graphical models and non-parametric estimators of interventional mean, but also under linear graphical models and OLS estimators, per results in Henckel, Perkovic and Maathuis (2020).
- Parameters:
costs – a list with non-negative costs associated with variables in the graph. Only used for estimand_type=’nonparametric-ate’ and backdoor_adjustment=’efficient-mincost-adjustment’. If no costs are provided by the user, and backdoor_adjustment=’efficient-mincost-adjustment’, costs are assumed to be equal to one for all variables in the graph. The structure of the list should be of the form [(node, {“cost”: x}) for node in nodes].
conditional_node_names – variables that are used to determine treatment. If none are provided, it is assumed that the intervention sets the treatment to a constant.
- Returns:
backdoor_sets, a list of dictionaries, with each dictionary having as values a backdoor set.
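As a small illustration of the `costs` structure described above, the sketch below builds a list of `(node, {"cost": x})` pairs and sums the cost of a candidate backdoor set. The helpers `unit_costs` and `set_cost` and the node names are hypothetical. Only the list shape and the rule that a set's cost is the sum of its variables' costs come from the documentation.

```python
def unit_costs(nodes):
    """Build a costs list in the documented shape, with a cost of one per node."""
    return [(node, {"cost": 1}) for node in nodes]


def set_cost(costs, backdoor_set):
    """Sum the costs of the variables that make up a backdoor set."""
    lookup = {node: attrs["cost"] for node, attrs in costs}
    return sum(lookup[node] for node in backdoor_set)


costs = unit_costs(["W1", "W2", "W3"])
costs[2] = ("W3", {"cost": 5})  # make W3 more expensive to measure

cheap = set_cost(costs, {"W1", "W2"})  # 1 + 1 = 2
pricey = set_cost(costs, {"W3"})       # 5
```

With unit costs the cost of a set equals its size, so a minimum cost set is also a set of minimum cardinality.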
- dowhy.causal_identifier.auto_identifier.identify_frontdoor(graph: DiGraph, action_nodes: List[str], outcome_nodes: List[str], observed_nodes: List[str], dseparation_algo: str = 'default')[source]#
Find a valid frontdoor variable set if it exists.
- dowhy.causal_identifier.auto_identifier.identify_mediation(graph: DiGraph, action_nodes: List[str], outcome_nodes: List[str])[source]#
Find a valid mediator if it exists.
Currently only supports a single variable mediator set.
- dowhy.causal_identifier.auto_identifier.identify_mediation_first_stage_confounders(graph: DiGraph, action_nodes: List[str], outcome_nodes: List[str], mediator_nodes: List[str], observed_nodes: List[str], backdoor_adjustment: BackdoorAdjustment)[source]#
- dowhy.causal_identifier.auto_identifier.identify_mediation_second_stage_confounders(graph: DiGraph, action_nodes: List[str], mediator_nodes: List[str], outcome_nodes: List[str], observed_nodes: List[str], backdoor_adjustment: BackdoorAdjustment)[source]#
- dowhy.causal_identifier.auto_identifier.identify_nde_effect(graph: DiGraph, action_nodes: List[str], outcome_nodes: List[str], observed_nodes: List[str], backdoor_adjustment: BackdoorAdjustment, estimand_type: EstimandType)[source]#
- dowhy.causal_identifier.auto_identifier.identify_nie_effect(graph: DiGraph, action_nodes: List[str], outcome_nodes: List[str], observed_nodes: List[str], backdoor_adjustment: BackdoorAdjustment, estimand_type: EstimandType)[source]#
dowhy.causal_identifier.backdoor module#
- class dowhy.causal_identifier.backdoor.Backdoor(graph, nodes1, nodes2)[source]#
Bases:
object
Class for optimized implementation of Backdoor variable search between the source nodes and the target nodes.
- class dowhy.causal_identifier.backdoor.HittingSetAlgorithm(list_of_sets, colliders={})[source]#
Bases:
object
Class for the Hitting Set Algorithm to obtain an approximate minimal set of backdoor variables to condition on for each node pair.
- Parameters:
list_of_sets – List of sets such that each set comprises nodes representing a single backdoor path between a source node and a target node.
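To make the idea of a hitting set concrete, here is a minimal sketch of a generic greedy heuristic. It is not the algorithm used by `HittingSetAlgorithm` and it ignores the `colliders` argument entirely. The function name `greedy_hitting_set` and the node names are hypothetical. Each input set stands for the nodes on one backdoor path, and the result contains at least one node from every non-empty set.

```python
def greedy_hitting_set(list_of_sets):
    """Pick nodes until every set (one per backdoor path) contains a chosen node."""
    # Empty sets cannot be hit by any node, so they are skipped.
    remaining = [set(s) for s in list_of_sets if s]
    chosen = set()
    while remaining:
        counts = {}
        for path_nodes in remaining:
            for node in path_nodes:
                counts[node] = counts.get(node, 0) + 1
        # Take the node lying on the most unhit paths; break ties by name.
        best = max(sorted(counts), key=counts.get)
        chosen.add(best)
        remaining = [s for s in remaining if best not in s]
    return chosen


paths = [{"A", "B"}, {"B", "C"}, {"D"}]
hitting = greedy_hitting_set(paths)  # {"B", "D"}
```

The greedy choice gives an approximation only: it always returns a valid hitting set, but not necessarily one of minimum size.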
dowhy.causal_identifier.efficient_backdoor module#
- class dowhy.causal_identifier.efficient_backdoor.EfficientBackdoor(graph: DiGraph, action_nodes: List[str], outcome_nodes: List[str], observed_nodes: List[str], conditional_node_names=None, costs=None)[source]#
Bases:
object
Implements methods for finding optimal (efficient) backdoor sets.
- Parameters:
graph – nx.DiGraph A causal graph.
costs – list A list with non-negative costs associated with variables in the graph. Only used for estimand_type=’nonparametric-ate’ and method_name=’efficient-mincost-adjustment’. If no costs are provided by the user, and method_name=’efficient-mincost-adjustment’, costs are assumed to be equal to one for all variables in the graph. The structure of the list should be of the form [(node, {“cost”: x}) for node in nodes].
conditional_node_names – list A list with variables that are used to determine treatment. If none are provided, it is assumed that the intervention sets the treatment to a constant.
- ancestors_all(nodes)[source]#
Method to compute the set of all ancestors of a set of nodes. A node is always an ancestor of itself.
- Parameters:
nodes – list A list of nodes in the graph.
- Returns ancestors:
set The set of nodes that are ancestors of nodes in nodes.
- backdoor_graph(G)[source]#
Method to compute the proper back-door graph associated with treatment and outcome.
- Parameters:
G – nx.DiGraph A directed acyclic graph.
- Returns Gbd:
nx.DiGraph The proper backdoor graph of G.
- build_D()[source]#
Returns the D flow network associated with treatment, outcome, conditional and observable variables. If a node does not have a ‘cost’ attribute, this function will assume the cost is infinity.
See Smucler and Rotnitzky (2022), Journal of Causal Inference, for the full definition of this flow network.
- Returns D:
nx.DiGraph The D flow network.
- build_H0()[source]#
Returns the H0 graph associated with treatment, outcome, conditional and observable variables. See Smucler, Sapienza and Rotnitzky (2021), Biometrika, for the full definition of this graph.
- Returns H0:
nx.Graph The H0 graph.
- build_H1()[source]#
Returns the H1 graph associated with treatment, outcome, conditional and observable variables. See Smucler, Sapienza and Rotnitzky (2021), Biometrika, for the full definition of this graph.
- Returns H1:
nx.Graph The H1 graph.
- causal_vertices()[source]#
Method to compute the set of all vertices that lie in a causal path between treatment and outcome.
- Returns causal_vertices:
set A set with vertices lying on some causal path between treatment and outcome.
- compute_smallest_mincut()[source]#
Returns a min-cut in the flow network D associated with treatment, outcome, conditional and observable variables that is contained in any other min-cut.
- Returns S_c:
set The min-cut with the above property.
- forbidden()[source]#
Method to compute the forbidden set with respect to treatment and outcome.
- Returns forbidden:
set The forbidden set.
- h_operator(S)[source]#
Given a set S of vertices in the flow network D, returns the operator h(S), a set of vertices in the undirected graph H1.
See Smucler and Rotnitzky (2022), Journal of Causal Inference, for the full definition of this operator.
- Parameters:
S – set A set of vertices in the flow network D associated with treatment, outcome, conditional and observable variables.
- Returns Z:
set The set obtained from applying the h operator to S.
- ignore()[source]#
Method to compute the set of ignorable vertices with respect to treatment, outcome, conditional and observable variables. Used in the construction of the H0 and H1 graphs. See Smucler, Sapienza and Rotnitzky (2021), Biometrika, for the full definition of this set.
- Returns ignore:
set The set of ignorable vertices.
- optimal_adj_set()[source]#
Returns the optimal adjustment set with respect to treatment, outcome, conditional and observable variables.
If the sufficient conditions for the existence of the optimal adjustment set outlined in Smucler, Sapienza and Rotnitzky (2021), Biometrika, do not hold, an error is raised.
- Returns:
optimal: set The optimal adjustment set.
- optimal_mincost_adj_set()[source]#
Returns the optimal minimum cost adjustment set with respect to treatment, outcome, conditional and observable variables.
Note that when the costs are constant, this is the optimal adjustment set among those of minimum cardinality.
- Returns:
optimal_mincost: set The optimal minimum cost adjustment set.
- optimal_minimal_adj_set()[source]#
Returns the optimal minimal adjustment set with respect to treatment, outcome, conditional and observable variables.
- Returns:
optimal_minimal: set The optimal minimal adjustment set.
- unblocked(H, Z)[source]#
Method to compute the unblocked set of Z with respect to treatment. See Smucler, Sapienza and Rotnitzky (2021), Biometrika, for the full definition of this set.
- Parameters:
H – nx.Graph An undirected graph.
Z – list A list with nodes in the graph H.
- Returns unblocked:
set The unblocked set.
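Two of the graph helpers documented for this class, `ancestors_all` and `backdoor_graph`, can be sketched with the standard library alone. This is an illustrative reading of the standard definitions, not the class's actual code: edges are plain `(parent, child)` tuples instead of an `nx.DiGraph`, and `proper_backdoor_edges` is a hypothetical name. It assumes that the proper back-door graph is obtained by removing the first edge of every directed path from treatment to outcome that leaves the treatment set and never re-enters it.

```python
def ancestors_all(edges, nodes):
    """Return every ancestor of the given nodes; a node is its own ancestor."""
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found, stack = set(), list(nodes)
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(parents.get(node, ()))
    return found


def proper_backdoor_edges(edges, treatment, outcome):
    """Drop each edge that starts a proper causal path from treatment to outcome."""
    treatment, outcome = set(treatment), set(outcome)
    # Nodes that reach the outcome along directed paths avoiding the treatment.
    avoiding = [(p, c) for p, c in edges if p not in treatment and c not in treatment]
    reach_outcome = ancestors_all(avoiding, outcome) - treatment
    return [
        (p, c) for p, c in edges
        if not (p in treatment and c in reach_outcome)
    ]


# Hypothetical graph: W confounds T and Y, M mediates, C is a side effect of T.
edges = [("W", "T"), ("W", "Y"), ("T", "M"), ("M", "Y"), ("T", "Y"), ("T", "C")]

anc = ancestors_all(edges, ["M"])                   # {"M", "T", "W"}
pbd = proper_backdoor_edges(edges, ["T"], ["Y"])    # T -> M and T -> Y removed
```

The edge `T -> C` survives because `C` does not lie on any causal path to `Y`, so only the edges that begin a causal path are cut.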
dowhy.causal_identifier.id_identifier module#
- class dowhy.causal_identifier.id_identifier.IDExpression[source]#
Bases:
object
Class for storing a causal estimand, as a result of the identification step using the ID algorithm. The object stores a list of estimators (self._product) whose product must be obtained and a list of variables (self._sum) over which the product must be marginalized.
- add_product(element: Dict | IDExpression)[source]#
Add an estimator to the product list.
- Parameters:
element – Estimator to append to the product list.
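The sum-of-products structure described for this class can be pictured with a small stand-in. The sketch below is not DoWhy's `IDExpression`: the class name `SumProductSketch`, the `add_sum` and `render` methods, and the `"outcome"` / `"given"` dictionary keys are all assumptions made for illustration. Only the idea of a product list plus a list of variables to marginalize over comes from the documentation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class SumProductSketch:
    """Stand-in for the described structure (not DoWhy's IDExpression)."""

    product: List[Union[Dict, "SumProductSketch"]] = field(default_factory=list)
    sum_over: List[str] = field(default_factory=list)

    def add_product(self, element):
        """Append one factor (a dict or a nested expression) to the product list."""
        self.product.append(element)

    def add_sum(self, variables):
        """Record variables to marginalize over, skipping repeats."""
        for var in variables:
            if var not in self.sum_over:
                self.sum_over.append(var)

    def render(self) -> str:
        """Format the expression as 'Sum over {...}: factor * factor'."""
        factors = []
        for element in self.product:
            if isinstance(element, SumProductSketch):
                factors.append("[" + element.render() + "]")
            else:
                given = ",".join(element["given"])
                bar = "|" + given if given else ""
                factors.append("P(" + ",".join(element["outcome"]) + bar + ")")
        body = " * ".join(factors)
        if self.sum_over:
            return "Sum over {" + ",".join(self.sum_over) + "}: " + body
        return body


# Backdoor-style expression: sum over W of P(Y|T,W) * P(W).
expr = SumProductSketch()
expr.add_product({"outcome": ["Y"], "given": ["T", "W"]})
expr.add_product({"outcome": ["W"], "given": []})
expr.add_sum(["W"])
text = expr.render()
```

Allowing a nested expression as a factor mirrors the `Dict | IDExpression` type accepted by `add_product`.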
- class dowhy.causal_identifier.id_identifier.IDIdentifier[source]#
Bases:
object
This class exists for backwards compatibility with CausalModel. It will be deprecated in the future in favor of the function identify_effect_id().
- dowhy.causal_identifier.id_identifier.identify_effect_id(graph: DiGraph, action_nodes: str | List[str], outcome_nodes: str | List[str]) IDExpression [source]#
Implementation of the ID algorithm. Reference: https://ftp.cs.ucla.edu/pub/stat_ser/shpitser-thesis.pdf (the pseudocode is given on page 40).
- Parameters:
treatment_names – OrderedSet comprising names of treatment variables.
outcome_names – OrderedSet comprising names of outcome variables.
- Returns:
target estimand, an instance of the IDExpression class.
dowhy.causal_identifier.identified_estimand module#
- class dowhy.causal_identifier.identified_estimand.IdentifiedEstimand(identifier, treatment_variable, outcome_variable, estimand_type=None, estimands=None, backdoor_variables=None, instrumental_variables=None, frontdoor_variables=None, mediator_variables=None, mediation_first_stage_confounders=None, mediation_second_stage_confounders=None, default_backdoor_id=None, identifier_method=None, no_directed_path=False)[source]#
Bases:
object
Class for storing a causal estimand, typically as a result of the identification step.
- get_backdoor_variables(key: str | None = None)[source]#
Return a list containing the backdoor variables.
If the calling estimator method is a backdoor method, return the backdoor variables corresponding to its target estimand. Otherwise, return the backdoor variables for the default backdoor estimand.
dowhy.causal_identifier.identify_effect module#
- class dowhy.causal_identifier.identify_effect.CausalIdentifier(*args, **kwargs)[source]#
Bases:
Protocol
Protocol that defines a CausalIdentifier. All CausalIdentifiers must conform to at least this list of methods.
This class exists for backwards compatibility with CausalModel. It will be deprecated in the future in favor of the function identify_effect_auto().
- identify_effect(graph: DiGraph, action_nodes: str | List[str], outcome_nodes: str | List[str], **kwargs)[source]#
Identify the causal effect to be estimated based on a causal graph.
- Parameters:
graph – Causal graph to be analyzed
action_nodes – name of the treatment
outcome_nodes – name of the outcome
**kwargs – Additional parameters required by the identify_effect of a specific CausalIdentifier, for example conditional_node_names in AutoIdentifier or node_names in IDIdentifier
- Returns:
a probability expression (estimand) for the causal effect if identified, else NULL
- dowhy.causal_identifier.identify_effect.identify_effect(graph: DiGraph, action_nodes: str | List[str], outcome_nodes: str | List[str], observed_nodes: str | List[str]) IdentifiedEstimand [source]#
Identify the causal effect to be estimated based on a causal graph
- Parameters:
graph – Causal graph to be analyzed
action_nodes – name of the treatment
outcome_nodes – name of the outcome
- Returns:
a probability expression (estimand) for the causal effect if identified, else NULL
Module contents#
- class dowhy.causal_identifier.AutoIdentifier(estimand_type: EstimandType, backdoor_adjustment: BackdoorAdjustment = BackdoorAdjustment.BACKDOOR_DEFAULT, optimize_backdoor: bool = False, costs: List | None = None)[source]#
Bases:
object
Class that implements different identification methods.
Currently supports backdoor and instrumental variable identification methods. The identification is based on the causal graph provided.
This class exists for backwards compatibility with CausalModel. It will be deprecated in the future in favor of the function identify_effect_auto().
- class dowhy.causal_identifier.BackdoorAdjustment(value)[source]#
Bases:
Enum
An enumeration.
- BACKDOOR_DEFAULT = 'default'#
- BACKDOOR_EFFICIENT = 'efficient-adjustment'#
- BACKDOOR_EXHAUSTIVE = 'exhaustive-search'#
- BACKDOOR_MAX = 'maximal-adjustment'#
- BACKDOOR_MIN = 'minimal-adjustment'#
- BACKDOOR_MINCOST_EFFICIENT = 'efficient-mincost-adjustment'#
- BACKDOOR_MIN_EFFICIENT = 'efficient-minimal-adjustment'#
- class dowhy.causal_identifier.EstimandType(value)[source]#
Bases:
Enum
An enumeration.
- NONPARAMETRIC_ATE = 'nonparametric-ate'#
- NONPARAMETRIC_CDE = 'nonparametric-cde'#
- NONPARAMETRIC_NDE = 'nonparametric-nde'#
- NONPARAMETRIC_NIE = 'nonparametric-nie'#
- class dowhy.causal_identifier.IDIdentifier[source]#
Bases:
object
This class exists for backwards compatibility with CausalModel. It will be deprecated in the future in favor of the function identify_effect_id().
- class dowhy.causal_identifier.IdentifiedEstimand(identifier, treatment_variable, outcome_variable, estimand_type=None, estimands=None, backdoor_variables=None, instrumental_variables=None, frontdoor_variables=None, mediator_variables=None, mediation_first_stage_confounders=None, mediation_second_stage_confounders=None, default_backdoor_id=None, identifier_method=None, no_directed_path=False)[source]#
Bases:
object
Class for storing a causal estimand, typically as a result of the identification step.
- get_backdoor_variables(key: str | None = None)[source]#
Return a list containing the backdoor variables.
If the calling estimator method is a backdoor method, return the backdoor variables corresponding to its target estimand. Otherwise, return the backdoor variables for the default backdoor estimand.
- dowhy.causal_identifier.construct_backdoor_estimand(treatment_name: List[str], outcome_name: List[str], common_causes: List[str])[source]#
- dowhy.causal_identifier.construct_frontdoor_estimand(treatment_name: List[str], outcome_name: List[str], frontdoor_variables_names: List[str])[source]#
- dowhy.causal_identifier.construct_iv_estimand(treatment_name: List[str], outcome_name: List[str], instrument_names: List[str])[source]#
- dowhy.causal_identifier.identify_effect(graph: DiGraph, action_nodes: str | List[str], outcome_nodes: str | List[str], observed_nodes: str | List[str]) IdentifiedEstimand [source]#
Identify the causal effect to be estimated based on a causal graph
- Parameters:
graph – Causal graph to be analyzed
action_nodes – name of the treatment
outcome_nodes – name of the outcome
- Returns:
a probability expression (estimand) for the causal effect if identified, else NULL
- dowhy.causal_identifier.identify_effect_auto(graph: DiGraph, action_nodes: str | List[str], outcome_nodes: str | List[str], observed_nodes: str | List[str], estimand_type: EstimandType, conditional_node_names: List[str] | None = None, backdoor_adjustment: BackdoorAdjustment = BackdoorAdjustment.BACKDOOR_DEFAULT, optimize_backdoor: bool = False, costs: List | None = None) IdentifiedEstimand [source]#
Main method that returns an identified estimand (if one exists).
If estimand_type is the non-parametric ATE, backdoor, instrumental variable and frontdoor identification methods are used to check whether an identified estimand exists, based on the causal graph.
- Parameters:
optimize_backdoor – if True, uses an optimised algorithm to compute the backdoor sets
costs – non-negative costs associated with variables in the graph. Only used for estimand_type=’nonparametric-ate’ and backdoor_adjustment=’efficient-mincost-adjustment’. If no costs are provided by the user, and backdoor_adjustment=’efficient-mincost-adjustment’, costs are assumed to be equal to one for all variables in the graph.
conditional_node_names – variables that are used to determine treatment. If none are provided, it is assumed that the intervention is static.
- Returns:
target estimand, an instance of the IdentifiedEstimand class
- dowhy.causal_identifier.identify_effect_id(graph: DiGraph, action_nodes: str | List[str], outcome_nodes: str | List[str]) IDExpression [source]#
Implementation of the ID algorithm. Reference: https://ftp.cs.ucla.edu/pub/stat_ser/shpitser-thesis.pdf (the pseudocode is given on page 40).
- Parameters:
treatment_names – OrderedSet comprising names of treatment variables.
outcome_names – OrderedSet comprising names of outcome variables.
- Returns:
target estimand, an instance of the IDExpression class.