Getting Started
===============

Installation
^^^^^^^^^^^^

The simplest installation is through pip or conda:

.. tab-set-code::

    .. code-block:: pip

        pip install dowhy

    .. code-block:: conda

        conda install -c conda-forge dowhy

Further installation scenarios and instructions can be found at :doc:`install`.

"Hello causal inference world"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In this section, we show the "Hello world" version of DoWhy. DoWhy is based on a simple unifying language for causal
inference that combines two powerful frameworks: graphical causal models (GCM) and potential outcomes (PO). It uses
graph-based criteria and do-calculus for modeling assumptions and identifying a non-parametric causal effect.

To get you started, we introduce two of the many features DoWhy offers.

Effect inference
----------------

For effect estimation, DoWhy switches to methods based primarily on potential outcomes. To do so, DoWhy offers a
simple four-step recipe consisting of modeling, identification, estimation, and refutation:

.. code:: python

    from dowhy import CausalModel
    import dowhy.datasets

    # Generate some sample data
    data = dowhy.datasets.linear_dataset(
        beta=10,
        num_common_causes=5,
        num_instruments=2,
        num_samples=10000)

    # Step 1: Create a causal model from the data and given graph.
    model = CausalModel(
        data=data["df"],
        treatment=data["treatment_name"],
        outcome=data["outcome_name"],
        graph=data["gml_graph"])

    # Step 2: Identify causal effect and return target estimands
    identified_estimand = model.identify_effect()

    # Step 3: Estimate the target estimand using a statistical method.
    estimate = model.estimate_effect(identified_estimand,
                                     method_name="backdoor.propensity_score_matching")

    # Step 4: Refute the obtained estimate using multiple robustness checks.
    refute_results = model.refute_estimate(identified_estimand, estimate,
                                           method_name="random_common_cause")

To understand what these four steps mean (and why we need four steps), the best place to learn more is the user
guide's :doc:`../user_guide/causal_tasks/estimating_causal_effects/index` chapter. Alternatively, you can dive into
the code and explore basic features in :doc:`../example_notebooks/dowhy_simple_example`.

For estimation of conditional effects, you can also use methods from EconML through the same API; see
:doc:`../example_notebooks/dowhy-conditional-treatment-effects`.

Graphical causal model-based inference
---------------------------------------

For features like root cause analysis, point-wise counterfactual inference, structural analysis and similar, DoWhy
uses graphical causal models. The language of graphical causal models again allows a variety of causal questions to
be answered. DoWhy's API for answering these questions follows a simple three-step recipe:

.. code:: python

    import networkx as nx, numpy as np, pandas as pd
    from dowhy import gcm

    # Let's generate some "normal" data we assume we're given from our problem domain:
    X = np.random.normal(loc=0, scale=1, size=1000)
    Y = 2 * X + np.random.normal(loc=0, scale=1, size=1000)
    Z = 3 * Y + np.random.normal(loc=0, scale=1, size=1000)
    data = pd.DataFrame(dict(X=X, Y=Y, Z=Z))

    # Step 1: Model our system:
    causal_model = gcm.StructuralCausalModel(nx.DiGraph([('X', 'Y'), ('Y', 'Z')]))
    gcm.auto.assign_causal_mechanisms(causal_model, data)

    # Step 2: Train our causal model with the data from above:
    gcm.fit(causal_model, data)

    # Step 3: Perform a causal analysis. For instance, root cause analysis, where we observe
    anomalous_sample = pd.DataFrame(dict(X=[0.1], Y=[6.2], Z=[19]))  # Here, Y is the root cause.
    # ... and would like to answer the question:
    # "Which node is the root cause of the anomaly in Z?":
    anomaly_attribution = gcm.attribute_anomalies(causal_model, "Z", anomalous_sample)

If you want to learn more about this and other GCM features, we recommend starting with
:doc:`../user_guide/modeling_gcm/index` in the user guide or checking out
:doc:`../example_notebooks/gcm_basic_example`.

Further resources
^^^^^^^^^^^^^^^^^

There are further resources available:

- An introductory tutorial on causal inference
- A comprehensive tutorial on Causal Inference and Counterfactual Reasoning at the ACM Knowledge Discovery and Data
  Mining 2018 conference
- A video introduction to the four steps of causal inference and its implications for machine learning from
  Microsoft Research: Foundations of causal inference and its impacts on machine learning
- The PDF book Elements of Causal Inference
- Draft chapters of an upcoming book: Causal reasoning: Fundamentals and machine learning applications
- A blog post describing one of DoWhy's root cause analysis algorithms via graphical causal models: New method
  identifies the root causes of statistical outliers
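
As a closing intuition for the root cause example in the graphical causal model section above, the following
minimal sketch shows why ``Y`` is called the root cause of the anomalous sample. It is not how
``gcm.attribute_anomalies`` works internally, and it does not use DoWhy at all: it assumes the true linear mechanisms
from the data-generating code (``Y = 2 * X + noise``, ``Z = 3 * Y + noise``, all noise standard normal) are known and
inverts them by hand. The names ``noise``, ``z_scores`` and ``likely_root_cause`` are illustrative only.

.. code:: python

    # Anomalous observation taken from the example above.
    anomalous = {"X": 0.1, "Y": 6.2, "Z": 19.0}

    # Assumption: the mechanisms are known exactly (in practice DoWhy fits them from data).
    # Inverting each mechanism recovers the noise term that produced the observed value.
    noise = {
        "X": anomalous["X"],                       # X = N_X
        "Y": anomalous["Y"] - 2 * anomalous["X"],  # Y = 2 * X + N_Y
        "Z": anomalous["Z"] - 3 * anomalous["Y"],  # Z = 3 * Y + N_Z
    }

    # Every noise term has unit variance, so its absolute value is already a z-score.
    z_scores = {node: abs(value) for node, value in noise.items()}

    # The node with the most extreme noise term is the most plausible root cause.
    likely_root_cause = max(z_scores, key=z_scores.get)

    print(z_scores)
    print(likely_root_cause)

The noise at ``Y`` is roughly six standard deviations from zero, while the noise at ``X`` and ``Z`` stays well below
one, so the unusual value of ``Z`` is explained by what happened at ``Y`` rather than by ``Z``'s own mechanism.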