gcastle

gCastle is the fundamental package for causal structure learning with Python.

gCastle

Chinese version

Version 1.0.3 released.

Version 1.0.3 was scheduled for release on 2022/08/08.

Introduction

gCastle is a causal structure learning toolchain developed by Huawei Noah's Ark Lab. The package contains various functionality related to causal learning and evaluation, including:

Data generation and processing: data simulation, data reading operators, and data pre-processing operators (such as prior injection and variable selection).
Causal structure learning: causal structure learning methods, including both classic and recently developed methods, especially gradient-based ones that can handle large problems.
Evaluation metrics: commonly used metrics for causal structure learning, including F1, SHD, FDR, TPR, NNZ, etc.
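
To make the metric names above concrete, here is a minimal sketch that computes simplified versions of TPR, FDR, F1, SHD, and NNZ for two binary adjacency matrices. It is not gCastle's own implementation (use `castle.metrics.MetricsDAG` for that). The function name `dag_metrics`, the convention that entry `[i][j] == 1` means an edge from node i to node j, and the exact definitions (for example, a reversed edge counting once towards SHD) are assumptions of this sketch.

```python
def dag_metrics(pred, true):
    """Compare two binary adjacency matrices given as lists of lists.

    Assumption of this sketch: entry [i][j] == 1 means a directed edge i -> j.
    """
    d = len(true)
    tp = fp = fn = 0
    for i in range(d):
        for j in range(d):
            if pred[i][j] and true[i][j]:
                tp += 1          # edge predicted and present
            elif pred[i][j]:
                fp += 1          # edge predicted but absent (includes reversed edges)
            elif true[i][j]:
                fn += 1          # edge present but not predicted

    # SHD: node pairs whose edge status differs (missing, extra, or reversed
    # edges each count once per pair).
    shd = 0
    for i in range(d):
        for j in range(i + 1, d):
            if (pred[i][j], pred[j][i]) != (true[i][j], true[j][i]):
                shd += 1

    nnz = tp + fp                                  # number of predicted edges
    tpr = tp / (tp + fn) if tp + fn else 0.0       # recall
    fdr = fp / nnz if nnz else 0.0
    precision = tp / nnz if nnz else 0.0
    f1 = 2 * precision * tpr / (precision + tpr) if precision + tpr else 0.0
    return {"tpr": tpr, "fdr": fdr, "f1": f1, "shd": shd, "nnz": nnz}


# True graph: 0 -> 1 -> 2. Prediction: 0 -> 1, 2 -> 1 (reversed), 0 -> 2 (extra).
true_dag = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
pred_dag = [[0, 1, 1], [0, 0, 0], [0, 1, 0]]
print(dag_metrics(pred_dag, true_dag))
```

Lower SHD and FDR are better, while higher TPR and F1 are better; NNZ simply reports how many edges were predicted.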

Algorithm List

Algorithm	Category	Description	Status
PC	IID/Constraint-based	A classic causal discovery algorithm based on conditional independence tests	v1.0.3
ANM	IID/Function-based	Nonlinear causal discovery with additive noise models	v1.0.3
DirectLiNGAM	IID/Function-based	A direct learning algorithm for linear non-Gaussian acyclic model (LiNGAM)	v1.0.3
ICALiNGAM	IID/Function-based	An ICA-based learning algorithm for linear non-Gaussian acyclic model (LiNGAM)	v1.0.3
GES	IID/Score-based	A classical Greedy Equivalence Search algorithm	v1.0.3
PNL	IID/Function-based	Causal discovery based on the post-nonlinear causal assumption	v1.0.3
NOTEARS	IID/Gradient-based	A gradient-based algorithm for linear data models (typically with least-squares loss)	v1.0.3
NOTEARS-MLP	IID/Gradient-based	A gradient-based algorithm using neural network modeling for non-linear causal relationships	v1.0.3
NOTEARS-SOB	IID/Gradient-based	A gradient-based algorithm using Sobolev space modeling for non-linear causal relationships	v1.0.3
NOTEARS-LOW-RANK	IID/Gradient-based	Adapting NOTEARS for large problems with low-rank causal graphs	v1.0.3
DAG-GNN	IID/Gradient-based	DAG Structure Learning with Graph Neural Networks	v1.0.3
GOLEM	IID/Gradient-based	A more efficient version of NOTEARS that can reduce the number of optimization iterations	v1.0.3
GraNDAG	IID/Gradient-based	A gradient-based algorithm using neural network modeling for non-linear additive noise data	v1.0.3
MCSL	IID/Gradient-based	A gradient-based algorithm for non-linear additive noise data by learning the binary adjacency matrix	v1.0.3
GAE	IID/Gradient-based	A gradient-based algorithm using graph autoencoder to model non-linear causal relationships	v1.0.3
RL	IID/Gradient-based	An RL-based algorithm that can work with flexible score functions (including non-smooth ones)	v1.0.3
CORL	IID/Gradient-based	An RL- and order-based algorithm that improves the efficiency and scalability of previous RL-based approaches	v1.0.3
TTPM	EventSequence/Function-based	A causal structure learning algorithm based on Topological Hawkes process for spatio-temporal event sequences	v1.0.3
HPCI	EventSequence/Hybrid	A causal structure learning algorithm based on Hawkes process and CI tests for event sequences	Under development

Installation

Dependencies

gCastle requires:

python (>= 3.6, <= 3.9)
tqdm (>= 4.48.2)
numpy (>= 1.19.1)
pandas (>= 0.22.0)
scipy (>= 1.7.3)
scikit-learn (>= 0.21.1)
matplotlib (>= 2.1.2)
networkx (>= 2.5)
torch (>= 1.9.0)
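
As a quick way to check an environment against the minimum versions listed above, the following sketch uses only the standard library. It assumes Python 3.8 or later (for `importlib.metadata`); the helper names `version_tuple`, `meets_minimum`, and `check_dependencies` are hypothetical and not part of gCastle, and the version parsing is deliberately simplistic.

```python
from importlib import metadata  # standard library from Python 3.8

# Minimum versions copied from the dependency list above.
MINIMUM_VERSIONS = {
    "tqdm": "4.48.2",
    "numpy": "1.19.1",
    "pandas": "0.22.0",
    "scipy": "1.7.3",
    "scikit-learn": "0.21.1",
    "matplotlib": "2.1.2",
    "networkx": "2.5",
    "torch": "1.9.0",
}


def version_tuple(version):
    """Turn '1.19.1' or '1.9.0+cu111' into a tuple of leading integers."""
    parts = []
    for piece in version.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))   # pad so that '2.5' compares equal to '2.5.0'
    b += (0,) * (width - len(b))
    return a >= b


def check_dependencies(minimums=MINIMUM_VERSIONS):
    """Return a list of human-readable problems (empty if all checks pass)."""
    problems = []
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (need >= {minimum})")
            continue
        if not meets_minimum(installed, minimum):
            problems.append(f"{name}: found {installed}, need >= {minimum}")
    return problems


if __name__ == "__main__":
    for line in check_dependencies() or ["All listed dependencies look fine."]:
        print(line)
```

In practice `pip` resolves these requirements automatically; a check like this is mainly useful for diagnosing a pre-existing environment.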

PIP installation

pip install gcastle==1.0.3

Usage Example (PC algorithm)

from castle.common import GraphDAG
from castle.metrics import MetricsDAG
from castle.datasets import IIDSimulation, DAG
from castle.algorithms import PC

# data simulation: generate a random weighted DAG (the true causal graph) and sample training data from it
weighted_random_dag = DAG.erdos_renyi(n_nodes=10, n_edges=10, 
                                      weight_range=(0.5, 2.0), seed=1)
dataset = IIDSimulation(W=weighted_random_dag, n=2000, method='linear', 
                        sem_type='gauss')
true_causal_matrix, X = dataset.B, dataset.X

# structure learning
pc = PC()
pc.learn(X)

# plot predict_dag and true_dag
GraphDAG(pc.causal_matrix, true_causal_matrix, 'result')

# calculate metrics
mt = MetricsDAG(pc.causal_matrix, true_causal_matrix)
print(mt.metrics)
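
To illustrate what the data simulation step in the example above does conceptually, here is a minimal pure-Python sketch of sampling from a linear structural equation model with Gaussian noise. It is not gCastle's `IIDSimulation` code. The function names, the convention that `W[i][j]` is the weight of the edge from node i to node j, and the unit noise scale are assumptions of this sketch.

```python
import random


def topological_order(W):
    """Order the nodes so that parents come before children (Kahn's algorithm)."""
    d = len(W)
    indegree = [sum(1 for i in range(d) if W[i][j] != 0) for j in range(d)]
    ready = [j for j in range(d) if indegree[j] == 0]
    order = []
    while ready:
        i = ready.pop(0)
        order.append(i)
        for j in range(d):
            if W[i][j] != 0:
                indegree[j] -= 1
                if indegree[j] == 0:
                    ready.append(j)
    if len(order) != d:
        raise ValueError("W contains a cycle, so it is not a DAG")
    return order


def simulate_linear_gauss(W, n, noise_scale=1.0, seed=0):
    """Draw n samples where each variable is a weighted sum of its parents plus noise."""
    rng = random.Random(seed)
    d = len(W)
    order = topological_order(W)
    samples = []
    for _ in range(n):
        x = [0.0] * d
        for j in order:
            # Parents of j were already assigned because we follow the topological order.
            parent_term = sum(W[i][j] * x[i] for i in range(d))
            x[j] = parent_term + rng.gauss(0.0, noise_scale)
        samples.append(x)
    return samples


# Chain 0 -> 1 -> 2 with edge weights 1.5 and -0.8.
W = [[0, 1.5, 0], [0, 0, -0.8], [0, 0, 0]]
X = simulate_linear_gauss(W, n=5, seed=1)
print(len(X), len(X[0]))
```

A structure learning algorithm such as PC receives only the sample matrix and tries to recover which entries of the weight matrix are nonzero.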

Visit examples for additional usage examples.

Citation

If you find gCastle useful in your research, please consider citing the following paper:

@misc{zhang2021gcastle,
  title={gCastle: A Python Toolbox for Causal Discovery}, 
  author={Keli Zhang and Shengyu Zhu and Marcus Kalander and Ignavier Ng and Junjian Ye and Zhitang Chen and Lujia Pan},
  year={2021},
  eprint={2111.15155},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}

Next Up & Contributing

This is the first released version of gCastle. We will keep improving and extending the code and documentation. We welcome new contributors of all experience levels; guidelines on how to contribute code will be published soon. If you have questions or suggestions (such as contributing new algorithms, optimizing code, or improving documentation), please submit an issue here. We will reply as soon as possible.
