perpetual 0.9.2 · PyPI

A self-generalizing gradient boosting machine that doesn't need hyperparameter optimization

Perpetual

PerpetualBooster is a gradient boosting machine (GBM) algorithm that, unlike other GBM algorithms, doesn't need hyperparameter optimization. Similar to AutoML libraries, it has a single budget parameter: increasing the budget increases the predictive power of the algorithm and gives better results on unseen data. Start with a small budget (e.g. 0.5) and increase it (e.g. to 1.0) once you are confident in your features. If further increasing the budget brings no improvement, you are already extracting the most predictive power out of your data.
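As background, the core gradient-boosting idea that PerpetualBooster builds on can be sketched in a few lines: each stage fits a weak learner to the residuals of the current ensemble. The toy sketch below (plain Python, decision stumps on a 1-D feature, squared loss) illustrates GBMs in general; it is not Perpetual's generalization algorithm or its budget mechanism.

```python
# Toy gradient boosting with squared loss (illustration only, NOT Perpetual's algorithm).
# Each stage fits a one-split decision stump to the residuals of the current model.

def fit_stump(x, r):
    """Find the split threshold and leaf values minimizing squared error on residuals r."""
    best = None
    for t in sorted(set(x)):
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        if not left or not right:
            continue  # a split must leave points on both sides
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = sum((ri - lv) ** 2 for ri in left) + sum((ri - rv) ** 2 for ri in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    _, t, lv, rv = best
    return lambda xi, t=t, lv=lv, rv=rv: lv if xi <= t else rv

def boost(x, y, n_stages=50, lr=0.1):
    """Fit n_stages stumps, each on the residuals left by the previous stages."""
    pred = [0.0] * len(y)
    stumps = []
    for _ in range(n_stages):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + lr * stump(xi) for pi, xi in zip(pred, x)]
    return lambda xi: sum(lr * s(xi) for s in stumps)

# A step function: the ensemble converges to the mean of each segment.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.0, 1.0, 1.0, 3.0, 3.0, 3.0]
model = boost(x, y)
```

A real GBM grows full trees on multi-dimensional data; Perpetual additionally controls the number of stages and tree growth through its budget parameter instead of fixed hyperparameters.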

Usage

You can use the algorithm as in the example below. Check the examples folders in the repository for both Rust and Python.

from perpetual import PerpetualBooster

# X: feature matrix, y: target values
model = PerpetualBooster(objective="SquaredLoss", budget=0.5)
model.fit(X, y)
y_pred = model.predict(X)

Documentation

Documentation is available for both the Python API and the Rust API (see the links in the repository).

Benchmark

PerpetualBooster vs. Optuna + LightGBM

Hyperparameter optimization with plain GBM algorithms usually takes around 100 iterations. PerpetualBooster achieves the same accuracy in a single run, yielding up to a 100x speed-up at the same accuracy across different budget levels and different datasets.

The following table summarizes the results for the California Housing dataset (regression):

| Perpetual budget | LightGBM n_estimators | Perpetual mse | LightGBM mse | Speed-up wall time | Speed-up cpu time |
|---|---|---|---|---|---|
| 1.0 | 100 | 0.192 | 0.192 | 54x | 56x |
| 1.5 | 300 | 0.188 | 0.188 | 59x | 58x |
| 2.1 | 1000 | 0.185 | 0.186 | 42x | 41x |

The following table summarizes the results for the Cover Types dataset (classification):

| Perpetual budget | LightGBM n_estimators | Perpetual log loss | LightGBM log loss | Speed-up wall time | Speed-up cpu time |
|---|---|---|---|---|---|
| 0.9 | 100 | 0.091 | 0.084 | 72x | 78x |

The results can be reproduced using the scripts in the examples folder.

PerpetualBooster vs. AutoGluon

PerpetualBooster is a GBM but behaves like AutoML, so it is also benchmarked against AutoGluon (v1.2, best quality preset), the current leader on the AutoML benchmark. The 10 OpenML datasets with the most rows were selected for both regression and classification tasks.

The results are summarized in the following table for regression tasks:

| OpenML Task | Perpetual Training Duration | Perpetual Inference Duration | Perpetual RMSE | AutoGluon Training Duration | AutoGluon Inference Duration | AutoGluon RMSE |
|---|---|---|---|---|---|---|
| Airlines_DepDelay_10M | 518 | 11.3 | 29.0 | 520 | 30.9 | 28.8 |
| bates_regr_100 | 3421 | 15.1 | 1.084 | OOM | OOM | OOM |
| BNG(libras_move) | 1956 | 4.2 | 2.51 | 1922 | 97.6 | 2.53 |
| BNG(satellite_image) | 334 | 1.6 | 0.731 | 337 | 10.0 | 0.721 |
| COMET_MC | 44 | 1.0 | 0.0615 | 47 | 5.0 | 0.0662 |
| friedman1 | 275 | 4.2 | 1.047 | 278 | 5.1 | 1.487 |
| poker | 38 | 0.6 | 0.256 | 41 | 1.2 | 0.722 |
| subset_higgs | 868 | 10.6 | 0.420 | 870 | 24.5 | 0.421 |
| BNG(autoHorse) | 107 | 1.1 | 19.0 | 107 | 3.2 | 20.5 |
| BNG(pbc) | 48 | 0.6 | 836.5 | 51 | 0.2 | 957.1 |
| average | 465 | 3.9 | - | 464 | 19.7 | - |

PerpetualBooster outperformed AutoGluon on 8 out of 10 regression tasks, training equally fast and inferring 5.1x faster.
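The 5.1x inference figure can be checked directly from the averages in the table above. Note that the average row is computed over the 9 tasks that AutoGluon completed (bates_regr_100, where it ran out of memory, is excluded):

```python
# Inference durations from the regression table, excluding bates_regr_100 (AutoGluon OOM).
perpetual_infer = [11.3, 4.2, 1.6, 1.0, 4.2, 0.6, 10.6, 1.1, 0.6]
autogluon_infer = [30.9, 97.6, 10.0, 5.0, 5.1, 1.2, 24.5, 3.2, 0.2]

perp_avg = sum(perpetual_infer) / len(perpetual_infer)  # matches the table's 3.9
ag_avg = sum(autogluon_infer) / len(autogluon_infer)    # matches the table's 19.7
speedup = ag_avg / perp_avg                             # roughly 5.05, i.e. the ~5.1x claim
```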

The results are summarized in the following table for classification tasks:

| OpenML Task | Perpetual Training Duration | Perpetual Inference Duration | Perpetual AUC | AutoGluon Training Duration | AutoGluon Inference Duration | AutoGluon AUC |
|---|---|---|---|---|---|---|
| BNG(spambase) | 70.1 | 2.1 | 0.671 | 73.1 | 3.7 | 0.669 |
| BNG(trains) | 89.5 | 1.7 | 0.996 | 106.4 | 2.4 | 0.994 |
| breast | 13699.3 | 97.7 | 0.991 | 13330.7 | 79.7 | 0.949 |
| Click_prediction_small | 89.1 | 1.0 | 0.749 | 101.0 | 2.8 | 0.703 |
| colon | 12435.2 | 126.7 | 0.997 | 12356.2 | 152.3 | 0.997 |
| Higgs | 3485.3 | 40.9 | 0.843 | 3501.4 | 67.9 | 0.816 |
| SEA(50000) | 21.9 | 0.2 | 0.936 | 25.6 | 0.5 | 0.935 |
| sf-police-incidents | 85.8 | 1.5 | 0.687 | 99.4 | 2.8 | 0.659 |
| bates_classif_100 | 11152.8 | 50.0 | 0.864 | OOM | OOM | OOM |
| prostate | 13699.9 | 79.8 | 0.987 | OOM | OOM | OOM |
| average | 3747.0 | 34.0 | - | 3699.2 | 39.0 | - |

PerpetualBooster outperformed AutoGluon on 10 out of 10 classification tasks, training equally fast and inferring 1.1x faster.

PerpetualBooster demonstrates greater robustness compared to AutoGluon, successfully training on all 20 tasks, whereas AutoGluon encountered out-of-memory errors on 3 of those tasks.

The results can be reproduced using our automlbenchmark fork.

Installation

The package can be installed directly from PyPI:

pip install perpetual

Using conda-forge:

conda install conda-forge::perpetual

To use in a Rust project and to get the package from crates.io:

cargo add perpetual

Contribution

Contributions are welcome. Check CONTRIBUTING.md for guidelines.

Paper

PerpetualBooster prevents overfitting with a generalization algorithm. A paper explaining how the algorithm works is in progress; in the meantime, check our blog post for a high-level introduction.

Keywords

rust
