2.3.1. pywhy_stats.conditional_ksample.bregman
Bregman (conditional) discrepancy test.
Also known as a conditional k-sample test, where the null hypothesis is that the conditional distributions are equal across different population groups. The Bregman test assesses conditional divergence using correntropy.
2.3.1.1. Returns
- PValueResult
The result of the test, which includes the test statistic and p-value.
Functions
| condind | Test whether Y conditioned on X is invariant across the groups. |
- condind(X, Y, group_ind, kernel=None, null_sample_size=1000, propensity_model=None, propensity_weights=None, centered=False, n_jobs=None, random_seed=None)
Test whether Y conditioned on X is invariant across the groups.
For testing conditional independence on continuous data, we compute Bregman divergences [1]. This specifically tests the (conditional) invariance null hypothesis :math:`P_{Z=1}(Y|X) = P_{Z=0}(Y|X)`, where Z is the group indicator.
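To make the null hypothesis concrete, the following sketch simulates data in which the conditional distribution of Y given X is either identical across the two groups or shifted in one of them. The helper `simulate` and its `shift` parameter are hypothetical names introduced only for illustration; they are not part of pywhy_stats.

```python
import numpy as np


def simulate(n_samples=200, shift=0.0, seed=0):
    """Draw (X, Y, group_ind); P(Y|X) differs across groups only if shift != 0.

    Hypothetical helper for illustration, not part of pywhy_stats.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, 1))
    # Group membership Z in {0, 1}. It may depend on X without violating
    # the null, because the null only concerns Y given X.
    prob_z = 1.0 / (1.0 + np.exp(-X[:, 0]))
    group_ind = (rng.uniform(size=n_samples) < prob_z).astype(int)
    # Y depends on X in the same way in both groups, plus a group-specific shift.
    noise = rng.normal(scale=0.5, size=(n_samples, 1))
    Y = 2.0 * X + shift * group_ind[:, None] + noise
    return X, Y, group_ind


# shift=0.0: the null P_{Z=1}(Y|X) = P_{Z=0}(Y|X) holds by construction.
X_null, Y_null, z_null = simulate(shift=0.0)
# shift=1.5: the conditional mean of Y differs between the groups.
X_alt, Y_alt, z_alt = simulate(shift=1.5)
```

The resulting arrays have the shapes described in the parameter list, so they could serve as `X`, `Y`, and `group_ind` inputs. One would expect the test to reject more readily for the shifted data, although that depends on sample size and kernel choice.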
- Parameters:
- X : array_like of shape (n_samples, n_features_x)
Data for variable X, which can be multidimensional.
- Y : array_like of shape (n_samples, n_features_y)
Data for variable Y, which can be multidimensional.
- group_ind : array_like of shape (n_samples,)
Data for the group indicator Z. This assigns each sample to a group, indicated by 0 or 1.
- kernel_X : Callable[[array_like], array_like]
The kernel function for X. By default, the RBF kernel is used for continuous data and the delta kernel for categorical data. Note that we currently only consider string values as categorical data.
- kernel_Y : Callable[[array_like], array_like]
The kernel function for Y. By default, the RBF kernel is used for continuous data and the delta kernel for categorical data. Note that we currently only consider string values as categorical data.
- null_sample_size : int
The number of samples to generate for the bootstrap distribution to approximate the p-value, by default 1000.
- propensity_model : Optional[sklearn.base.BaseEstimator], optional
The propensity model used to estimate the propensity scores, by default None.
- propensity_weights : Optional[array_like], optional
The propensity weights to use, by default None, which means that the propensity scores will be estimated from the propensity_model.
- centered : bool
Whether the kernel matrix should be centered, by default False.
- n_jobs : Optional[int], optional
The number of jobs to run in parallel, by default None.
- random_seed : Optional[int], optional
Random seed, by default None.
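The role of `null_sample_size` can be illustrated with a generic sketch of how a p-value is approximated from resampled null statistics. The function `pvalue_from_null` is a hypothetical helper, and the add-one convention it uses is a common choice for resampling tests; the documentation above does not state which formula the library itself applies.

```python
def pvalue_from_null(observed_stat, null_stats):
    """Approximate a one-sided p-value from resampled null statistics.

    Hypothetical helper for illustration. It counts how many null
    statistics are at least as large as the observed one; the +1 terms
    keep the estimate strictly positive.
    """
    null_stats = list(null_stats)
    n_exceed = sum(1 for s in null_stats if s >= observed_stat)
    return (1 + n_exceed) / (1 + len(null_stats))


# Two of the four null statistics are >= 2.0, so the estimate is 3 / 5.
example_pvalue = pvalue_from_null(2.0, [0.0, 1.0, 2.0, 3.0])
```

Under this convention, the smallest attainable p-value with 1000 null samples is 1/1001, so a larger `null_sample_size` gives finer resolution at the cost of more computation.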
Notes
Any callable can be given to create the kernel matrix. For instance, to use a particular kernel from sklearn:
kernel_X = :func:`sklearn.metrics.pairwise.polynomial_kernel`
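As a minimal sketch of such a callable, the function below takes a data array and returns the corresponding square kernel (Gram) matrix, using NumPy only. The name `polynomial_gram` and its defaults are hypothetical. That the test accepts a single-argument callable of this form is an assumption based on the `Callable[[array_like], array_like]` annotation in the parameter list.

```python
import numpy as np


def polynomial_gram(data, degree=2, coef0=1.0):
    """Return the (n_samples, n_samples) polynomial kernel matrix of `data`.

    Hypothetical helper for illustration. Computes
    (gamma * <x_i, x_j> + coef0) ** degree with gamma = 1 / n_features.
    """
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        # Treat a 1-D input as a single feature column.
        data = data[:, None]
    gamma = 1.0 / data.shape[1]
    return (gamma * (data @ data.T) + coef0) ** degree


# Gram matrix for two orthogonal unit vectors.
K = polynomial_gram([[1.0, 0.0], [0.0, 1.0]])
```

A function like this could then be supplied as the kernel argument in place of the default RBF kernel, provided the assumed single-argument calling convention matches what the test expects.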
References