shap-hypetune
A Python package for simultaneous Hyperparameters Tuning and Features Selection for Gradient Boosting Models.

Overview
Hyperparameters tuning and features selection are two common steps in every machine learning pipeline. Most of the time they are computed separately and independently, which may result in suboptimal performance and a more time-consuming process.
shap-hypetune aims to combine hyperparameters tuning and features selection in a single pipeline, optimizing the number of features while searching for the optimal parameters configuration. Hyperparameters Tuning or Features Selection can also be carried out as standalone operations.
shap-hypetune main features:
- designed for gradient boosting models, such as LGBMModel or XGBModel;
- developed to be integrable with the scikit-learn ecosystem;
- effective in both classification and regression tasks;
- customizable training process, supporting early stopping and all the other fitting options available in the standard algorithm APIs;
- ranking feature selection algorithms: Recursive Feature Elimination (RFE), Recursive Feature Addition (RFA), or Boruta;
- classical boosting-based feature importances or SHAP feature importances (the latter can also be computed on the eval_set);
- grid search, random search, or Bayesian search (via hyperopt);
- parallelized computations with joblib.
Installation
pip install --upgrade shap-hypetune
lightgbm and xgboost are not required dependencies: install only the one matching your estimator. The module itself depends only on NumPy, shap, scikit-learn, and hyperopt. Python 3.6 or above is supported.
Usage
from shaphypetune import BoostSearch, BoostRFE, BoostRFA, BoostBoruta
Hyperparameters Tuning
BoostSearch(
estimator,
param_grid=None,
greater_is_better=False,
n_iter=None,
sampling_seed=None,
verbose=1,
n_jobs=None
)
Feature Selection (RFE)
BoostRFE(
estimator,
min_features_to_select=None,
step=1,
param_grid=None,
greater_is_better=False,
importance_type='feature_importances',
train_importance=True,
n_iter=None,
sampling_seed=None,
verbose=1,
n_jobs=None
)
Feature Selection (BORUTA)
BoostBoruta(
estimator,
perc=100,
alpha=0.05,
max_iter=100,
early_stopping_boruta_rounds=None,
param_grid=None,
greater_is_better=False,
importance_type='feature_importances',
train_importance=True,
n_iter=None,
sampling_seed=None,
verbose=1,
n_jobs=None
)
Feature Selection (RFA)
BoostRFA(
estimator,
min_features_to_select=None,
step=1,
param_grid=None,
greater_is_better=False,
importance_type='feature_importances',
train_importance=True,
n_iter=None,
sampling_seed=None,
verbose=1,
n_jobs=None
)
Full examples in the notebooks folder.