Example to demonstrate optimized backdoor variable search for Causal Identification
This notebook compares the runtime of causal identification using the vanilla backdoor search with that of the optimized backdoor search, and demonstrates the speedup obtained with the latter.
import time
import random
from networkx.linalg.graphmatrix import adjacency_matrix
import numpy as np
import pandas as pd
import networkx as nx
import dowhy
from dowhy import CausalModel
from dowhy.utils import graph_operations
import dowhy.datasets
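Create Random Graph
In this section, we create a random directed graph with n = 10 nodes in which each possible edge is included with probability p = 0.5. Keeping only the edges (u, v) with u < v guarantees that the result is acyclic. The graph is then converted to DOT format, and an empty DataFrame with one column per node is created, since identification depends only on the graph structure.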
n = 10
p = 0.5
G = nx.generators.random_graphs.fast_gnp_random_graph(n, p, directed=True)
graph = nx.DiGraph([(u,v) for (u,v) in G.edges() if u<v])
nodes = []
for i in graph.nodes:
    # Node names are used as strings for the graph labels and DataFrame columns.
    nodes.append(str(i))
adjacency_matrix = np.asarray(nx.to_numpy_array(graph))
graph_dot = graph_operations.adjacency_matrix_to_graph(adjacency_matrix, nodes)
graph_dot = graph_operations.str_to_dot(graph_dot.source)
print("Graph Generated.")
df = pd.DataFrame(columns=nodes)
print("Dataframe Generated.")
Graph Generated.
Dataframe Generated.
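The `u < v` filter in the cell above is what makes the random graph acyclic: every retained edge points from a lower-numbered node to a higher-numbered one, so no directed cycle can form. The following is a minimal, dependency-free sketch of that idea using only the Python standard library. `random_dag_edges` and `is_acyclic` are hypothetical helper names introduced for illustration; they are not part of NetworkX or DoWhy.

```python
import random
from graphlib import CycleError, TopologicalSorter  # requires Python 3.9+


def random_dag_edges(n, p, seed=None):
    """Sample each ordered pair (u, v), u != v, with probability p; keep only u < v."""
    rng = random.Random(seed)
    return [
        (u, v)
        for u in range(n)
        for v in range(n)
        if u != v and rng.random() < p and u < v
    ]


def is_acyclic(n, edges):
    """Return True if the directed graph on nodes 0..n-1 has no directed cycle."""
    predecessors = {node: set() for node in range(n)}
    for u, v in edges:
        predecessors[v].add(u)
    try:
        # static_order() raises CycleError if the graph contains a cycle.
        list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return False
    return True


edges = random_dag_edges(n=10, p=0.5, seed=0)
print(is_acyclic(10, edges))                     # True: the filtered graph is a DAG
print(is_acyclic(3, [(0, 1), (1, 2), (2, 0)]))   # False: a 3-cycle is rejected
```

With NetworkX already imported, the same check on the notebook's `graph` object could be done with `nx.is_directed_acyclic_graph(graph)`.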
Testing optimized backdoor search
In this section, we compare the runtimes for causal identification using vanilla backdoor search and the optimized backdoor search.
start = time.time()
# I. Create a causal model from the data and given graph.
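# Pick two distinct nodes so that the treatment and the outcome never coincide.
treatment, outcome = (str(v) for v in random.sample(range(n), 2))
model = CausalModel(data=df, treatment=treatment, outcome=outcome, graph=graph_dot)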
time1 = time.time()
print("Time taken for initializing model =", time1-start)
# II. Identify causal effect and return target estimands
identified_estimand = model.identify_effect()
time2 = time.time()
print("Time taken for vanilla identification =", time2-time1)
# III. Identify causal effect using the optimized backdoor implementation
identified_estimand = model.identify_effect(optimize_backdoor=True)
end = time.time()
print("Time taken for optimized backdoor identification =", end-time2)
Time taken for initializing model = 0.0044956207275390625
Time taken for vanilla identification = 0.00021505355834960938
Time taken for optimized backdoor identification = 0.00013136863708496094
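In this run, the optimized backdoor search completes identification in about 0.13 ms, compared with about 0.22 ms for the vanilla implementation, a reduction of roughly 39%. The graph has only 10 nodes and each call is timed once, so the absolute numbers will vary from run to run.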
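Since each identification call finishes in well under a millisecond, one way to get a steadier comparison is to repeat each call several times and compare medians. The sketch below does this with the standard library only. `time_call` is a hypothetical helper, and the `sum(range(...))` workloads are stand-ins for the two `model.identify_effect(...)` calls so that the example runs without DoWhy installed.

```python
import statistics
import time


def time_call(fn, repeats=20):
    """Run fn() `repeats` times and return the list of elapsed times in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution clock
        fn()
        durations.append(time.perf_counter() - start)
    return durations


# Stand-in workloads. In the notebook these would be, for example,
#   lambda: model.identify_effect()
#   lambda: model.identify_effect(optimize_backdoor=True)
slow = time_call(lambda: sum(range(200_000)))
fast = time_call(lambda: sum(range(1_000)))

print("median slow =", statistics.median(slow))
print("median fast =", statistics.median(fast))
```

The median is less sensitive to one-off delays, such as a garbage-collection pause, than a single measurement.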