DaRE RF: Data Removal-Enabled Random Forests
dare-rf is a Python library that implements machine unlearning for random forests, enabling efficient removal of training data without retraining from scratch. It is built with Cython and designed to scale to large datasets.
Installation
pip install dare-rf
Usage
Simple example of removing a single training instance:
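import dare
import numpy as np

# Training data: five instances with two binary features
X_train = np.array([[0, 1], [0, 1], [0, 1], [1, 0], [1, 0]])
y_train = np.array([1, 1, 1, 0, 1])

# Test instance
X_test = np.array([[1, 0]])

# Train a DaRE forest
rf = dare.Forest(n_estimators=100,
                 max_depth=3,
                 k=5,      # number of thresholds considered per attribute
                 topd=0,   # number of top layers made of random nodes
                 random_state=1)
rf.fit(X_train, y_train)

# Predicted class probabilities before deletion
rf.predict_proba(X_test)

# Delete the training instance at index 3: features [1, 0], label 0
rf.delete(3)

# Predicted class probabilities after deletion
rf.predict_proba(X_test)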
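To see what the deletion should change, the sketch below uses plain NumPy (not dare-rf) to compute the label frequencies among training rows identical to the test instance, before and after dropping row 3. The helper `label_frequencies` is a hypothetical name introduced for illustration. It is not part of the library and does not reflect how the forest computes its predictions; it only shows the effect of removing that instance from the data.

```python
import numpy as np

X_train = np.array([[0, 1], [0, 1], [0, 1], [1, 0], [1, 0]])
y_train = np.array([1, 1, 1, 0, 1])
x_query = np.array([1, 0])


def label_frequencies(X, y, x, n_classes=2):
    """Hypothetical helper: class frequencies among rows of X identical to x."""
    mask = np.all(X == x, axis=1)
    counts = np.bincount(y[mask], minlength=n_classes)
    return counts / counts.sum()


# Rows 3 and 4 match [1, 0] and carry labels 0 and 1.
before = label_frequencies(X_train, y_train, x_query)

# Emulate retraining from scratch without row 3 by dropping it from the data.
X_kept = np.delete(X_train, 3, axis=0)
y_kept = np.delete(y_train, 3)
after = label_frequencies(X_kept, y_kept, x_query)

print(before)  # [0.5 0.5]
print(after)   # [0. 1.]
```

A model that has truly unlearned instance 3 should behave like one trained only on the remaining four rows, so its prediction for `[1, 0]` would be expected to shift toward class 1 in the same way. That expectation is an assumption of this sketch and has not been checked against dare-rf's output here.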
License
Apache License 2.0.
Reference
Brophy and Lowd. Machine Unlearning for Random Forests. ICML 2021.
@inproceedings{brophy2021machine,
title={Machine Unlearning for Random Forests},
author={Brophy, Jonathan and Lowd, Daniel},
booktitle={International Conference on Machine Learning},
pages={1092--1104},
year={2021},
organization={PMLR}
}