
MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models
MetaSklearn is a flexible and extensible Python library that brings metaheuristic optimization to
hyperparameter tuning of scikit-learn
models. It provides a seamless interface to optimize hyperparameters
using nature-inspired algorithms from the Mealpy library.
It is designed to be user-friendly and efficient, making it easy to integrate into your machine learning workflow.
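Features:
- Hyperparameter optimization with metaheuristic algorithms from mealpy.
- Integration with PerMetrics for rich evaluation metrics.
- Scikit-learn compatible API: .fit(), .predict(), .score()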
Install the latest version using pip:
pip install metasklearn
After that, check the version to ensure successful installation:
$ python
>>> import metasklearn
>>> metasklearn.__version__
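If you prefer to check the installation from a script rather than an interactive session, the standard library can report the installed version without importing the package. This is a minimal sketch; the helper name `installed_version` is illustrative and not part of MetaSklearn.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("metasklearn")
    print("metasklearn version:", found if found else "not installed")
```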
MetaSklearn defines a custom MetaSearchCV class that wraps your model and performs hyperparameter tuning using
any optimizer supported by Mealpy. The framework evaluates model performance using either
scikit-learn's metrics or additional ones from the PerMetrics library.
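To make the idea concrete, here is a minimal, library-free sketch of what a metaheuristic hyperparameter search does conceptually: it keeps a population of candidate settings, scores each one, and perturbs the better candidates to propose new ones. This is not MetaSklearn's or Mealpy's implementation; `toy_objective` is a hypothetical stand-in for a cross-validated model error, and `evolve` and `clip` are illustrative names.

```python
import random


def toy_objective(params):
    """Hypothetical stand-in for a cross-validated error (lower is better).

    The minimum is at C=10, gamma=0.1; a real search would train a model here.
    """
    return (params["C"] - 10.0) ** 2 + (params["gamma"] - 0.1) ** 2


def clip(value, low, high):
    """Keep a candidate value inside its search bounds."""
    return max(low, min(high, value))


def evolve(objective, bounds, epochs=50, pop_size=20, seed=42):
    """Very small elitist evolutionary loop over float hyperparameters."""
    rng = random.Random(seed)
    # Start from random candidates drawn uniformly inside the bounds.
    population = [
        {name: rng.uniform(lb, ub) for name, (lb, ub) in bounds.items()}
        for _ in range(pop_size)
    ]
    for _ in range(epochs):
        population.sort(key=objective)
        survivors = population[: pop_size // 2]  # keep the better half
        children = []
        for parent in survivors:
            # Mutate each parent with Gaussian noise scaled to the range width.
            child = {
                name: clip(parent[name] + rng.gauss(0.0, 0.05 * (ub - lb)), lb, ub)
                for name, (lb, ub) in bounds.items()
            }
            children.append(child)
        population = survivors + children
    return min(population, key=objective)


best = evolve(toy_objective, {"C": (0.1, 100.0), "gamma": (1e-4, 1.0)})
print(best, toy_objective(best))
```

In MetaSearchCV the role of `toy_objective` is played by the cross-validated score of the wrapped estimator, and the search loop is delegated to the chosen Mealpy optimizer.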
from sklearn.svm import SVR
from sklearn.datasets import load_diabetes
from metasklearn import MetaSearchCV, FloatVar, StringVar, Data
## Load data object
X, y = load_diabetes(return_X_y=True)
data = Data(X, y)
## Split train and test
data.split_train_test(test_size=0.2, random_state=42, inplace=True)
print(data.X_train.shape, data.X_test.shape)
## Scaling dataset
data.X_train, scaler_X = data.scale(data.X_train, scaling_methods=("standard", "minmax"))
data.X_test = scaler_X.transform(data.X_test)
data.y_train, scaler_y = data.scale(data.y_train, scaling_methods=("standard", "minmax"))
data.y_train = data.y_train.ravel()
data.y_test = scaler_y.transform(data.y_test.reshape(-1, 1)).ravel()
# Define parameter bounds for SVR
# For comparison, a GridSearchCV-style grid would be written as:
# param_grid = {
#     "C": [0.1, 100],
#     "gamma": [1e-4, 1],
#     "kernel": ["linear", "rbf", "poly"]
# }
# The equivalent MetaSearchCV search space is a list of variables:
param_bounds = [
    FloatVar(lb=0.1, ub=100., name="C"),  # C must be strictly positive
    FloatVar(lb=1e-4, ub=1., name="gamma"),
    StringVar(valid_sets=("linear", "rbf", "poly"), name="kernel")
]
# Initialize and fit MetaSearchCV
searcher = MetaSearchCV(
    estimator=SVR(),
    param_bounds=param_bounds,
    task_type="regression",
    optim="BaseGA",
    optim_params={"epoch": 20, "pop_size": 30, "name": "GA"},
    cv=3,
    scoring="MSE",  # or another regression metric name, e.g. "RMSE"
    seed=42,
    n_jobs=2,
    verbose=True,
    mode='single', n_workers=None, termination=None
)
searcher.fit(data.X_train, data.y_train)
print("Best parameters (Regression):", searcher.best_params)
print("Best model: ", searcher.best_estimator)
print("Best score during searching: ", searcher.best_score)
# Make prediction after re-fit
y_pred = searcher.predict(data.X_test)
print("Test score:", searcher.score(data.X_test, data.y_test))
print("Test metrics: ", searcher.scores(data.X_test, data.y_test, list_metrics=("RMSE", "R", "KGE", "NNSE")))
from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from metasklearn import MetaSearchCV, FloatVar, StringVar
# Load dataset
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Define parameter bounds for SVC
# For comparison, a GridSearchCV-style grid would be written as:
# param_grid = {
#     "C": [0.1, 100],
#     "gamma": [1e-4, 1],
#     "kernel": ["linear", "rbf", "poly"]
# }
# The equivalent MetaSearchCV search space is a list of variables:
param_bounds = [
    FloatVar(lb=0.1, ub=100., name="C"),  # C must be strictly positive
    FloatVar(lb=1e-4, ub=1., name="gamma"),
    StringVar(valid_sets=("linear", "rbf", "poly"), name="kernel")
]
# Initialize and fit MetaSearchCV
searcher = MetaSearchCV(
    estimator=SVC(),
    param_bounds=param_bounds,
    task_type="classification",
    optim="BaseGA",
    optim_params={"epoch": 20, "pop_size": 30, "name": "GA"},
    cv=3,
    scoring="AS",  # or any custom scoring like "F1_macro"
    seed=42,
    n_jobs=2,
    verbose=True,
    mode='single', n_workers=None, termination=None
)
searcher.fit(X_train, y_train)
print("Best parameters (Classification):", searcher.best_params)
print("Best model: ", searcher.best_estimator)
print("Best score during searching: ", searcher.best_score)
# Make prediction after re-fit
y_pred = searcher.predict(X_test)
print("Test Accuracy:", searcher.score(X_test, y_test))
print("Test Score: ", searcher.scores(X_test, y_test, list_metrics=("AS", "RS", "PS", "F1S")))
As the examples show, MetaSearchCV is used like any other scikit-learn estimator, such as a random forest or a decision tree: call fit() on the training data, then predict() and score() on the test data.
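The commented GridSearchCV-style dictionaries in the examples can be converted to a MetaSearchCV search space mechanically. The helper below is a hypothetical sketch, not part of MetaSklearn: it treats a two-element numeric list as lower and upper bounds and a list of strings as a categorical choice. The variable constructors are passed in as arguments so the function has no dependencies; the keyword arguments it uses (`lb`, `ub`, `valid_sets`, `name`) mirror the FloatVar and StringVar calls shown in this document.

```python
def grid_to_param_bounds(grid, float_var, string_var):
    """Convert {"name": [low, high]} or {"name": [choices...]} to a list of variables.

    `float_var` and `string_var` are constructors with the keyword signatures
    used in this README (for example metasklearn.FloatVar and metasklearn.StringVar).
    """
    param_bounds = []
    for name, values in grid.items():
        is_number = [
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        ]
        if values and all(isinstance(v, str) for v in values):
            # A non-empty list of strings is a categorical choice.
            param_bounds.append(string_var(valid_sets=tuple(values), name=name))
        elif len(values) == 2 and all(is_number):
            # Two numbers are read as [lower bound, upper bound].
            low, high = sorted(values)
            param_bounds.append(float_var(lb=float(low), ub=float(high), name=name))
        else:
            raise ValueError(f"Cannot interpret search values for {name!r}: {values!r}")
    return param_bounds


# Stand-in constructors so the sketch runs without metasklearn installed.
def fake_float_var(lb, ub, name):
    return ("FloatVar", lb, ub, name)


def fake_string_var(valid_sets, name):
    return ("StringVar", valid_sets, name)


grid = {"C": [0.1, 100], "gamma": [1e-4, 1], "kernel": ["linear", "rbf", "poly"]}
converted = grid_to_param_bounds(grid, fake_float_var, fake_string_var)
print(converted)
```

With MetaSklearn installed, the stand-ins can be replaced by the real classes: `grid_to_param_bounds(grid, FloatVar, StringVar)`.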
This section explains how to use the different variable types provided by MetaSklearn when defining hyperparameter
search spaces for MetaSearchCV. Each variable type suits a different kind of hyperparameter.
IntegerVar - Integer Variable
from metasklearn import IntegerVar
var = IntegerVar(lb=1, ub=100, name="n_estimators")
Used for discrete numerical parameters like number of neighbors in KNN, number of estimators in ensembles, etc.
FloatVar - Float/Continuous Variable
from metasklearn import FloatVar
var = FloatVar(lb=0.001, ub=1.0, name="learning_rate")
Used for continuous numerical parameters such as learning_rate, C, gamma, etc.
StringVar - Categorical/String Variable
from metasklearn import StringVar
var = StringVar(valid_sets=("linear", "poly", "rbf"), name="kernel")
Used for string parameters with a limited number of choices, e.g., kernel in SVM. None can also be included as one of the valid values.
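Metaheuristic optimizers typically work on numeric vectors, so a categorical choice such as `kernel` has to be encoded as a number and decoded back. The sketch below shows one common approach, index encoding with rounding and clipping. It illustrates the idea under that assumption and does not describe how StringVar is implemented; `decode_choice` is a hypothetical name.

```python
def decode_choice(position, valid_sets):
    """Map a real-valued position to one of the allowed choices.

    The position is rounded to the nearest index and clipped to the valid range,
    so values slightly outside [0, len(valid_sets) - 1] still decode safely.
    """
    index = int(round(position))
    index = max(0, min(len(valid_sets) - 1, index))
    return valid_sets[index]


kernels = ("linear", "poly", "rbf", None)  # None can be one of the choices
for position in (-0.4, 0.6, 2.2, 7.9):
    print(position, "->", decode_choice(position, kernels))
```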
BinaryVar - Binary Variable (0 or 1)
from metasklearn import BinaryVar
var = BinaryVar(n_vars=1, name="feature_selected")
Used in binary feature selection problems or any 0/1-based decision.
BoolVar - Boolean Variable (True or False)
from metasklearn import BoolVar
var = BoolVar(n_vars=1, name="use_bias")
Used for Boolean-type arguments such as fit_intercept, use_bias, etc.
CategoricalVar - A set of mixed discrete variables such as int, float, string, None
from metasklearn import CategoricalVar
var = CategoricalVar(valid_sets=((3., None, "alpha"), (5, 12, 32), ("auto", "exp", "sin")), name="categorical")
This variable type is useful when a hyperparameter takes its value from a predefined set of mixed types (int, float, string, bool, None).
SequenceVar - Variables as tuple, list, or set
from metasklearn import SequenceVar
var = SequenceVar(valid_sets=((10, ), (20, 15), (30, 10, 5)), return_type=list, name="hidden_layer_sizes")
This type of variable is useful for defining hyperparameters that represent sequences, such as the sizes of hidden layers in a neural network.
PermutationVar - Permutation Variable
from metasklearn import PermutationVar
var = PermutationVar(valid_set=(1, 2, 5, 10), name="job_order")
Used for optimization problems involving permutations, like scheduling or routing.
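A common way to let a continuous optimizer search over orderings is random-key encoding: each item gets a real-valued key, and sorting the keys yields a permutation. The sketch below illustrates that technique on the `job_order` values; whether PermutationVar uses exactly this scheme internally is an assumption, and `decode_permutation` is a hypothetical helper.

```python
def decode_permutation(keys, valid_set):
    """Order the items of valid_set by their real-valued keys (smallest first)."""
    if len(keys) != len(valid_set):
        raise ValueError("Need exactly one key per item")
    order = sorted(range(len(keys)), key=lambda i: keys[i])
    return [valid_set[i] for i in order]


jobs = (1, 2, 5, 10)
print(decode_permutation([0.3, 0.1, 0.9, 0.5], jobs))  # [2, 1, 10, 5]
```

Any vector of distinct keys decodes to a valid ordering, so the optimizer never produces duplicated or missing items.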
TransferBinaryVar - Transfer Binary Variable
from metasklearn import TransferBinaryVar
var = TransferBinaryVar(n_vars=1, tf_func="vstf_01", lb=-8., ub=8., all_zeros=True, name="transfer_binary")
Used in binary search spaces where a transfer function (tf_func) maps a continuous value in [lb, ub] to 0 or 1, so that metaheuristics designed for continuous spaces can be applied.
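Transfer functions let optimizers built for continuous spaces handle binary decisions: a real value in [lb, ub] is squashed into a probability, which is then compared with a random draw. The sketch below uses |tanh(x)|, a standard V-shaped transfer function. It is an illustration only: the exact formula behind tf_func="vstf_01" is not stated in this document, and the function names are hypothetical.

```python
import math
import random


def v_shaped(x):
    """V-shaped transfer function: 0 at x = 0, approaching 1 as |x| grows."""
    return abs(math.tanh(x))


def to_binary(x, rng):
    """Return 1 with probability v_shaped(x), otherwise 0."""
    return 1 if rng.random() < v_shaped(x) else 0


rng = random.Random(0)
for x in (-8.0, -1.0, 0.0, 1.0, 8.0):
    print(f"x={x:+.1f}  p={v_shaped(x):.4f}  bit={to_binary(x, rng)}")
```

Values near the bounds (here -8 and 8) almost always produce 1, while values near zero almost always produce 0.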
TransferBoolVar - Transfer Boolean Variable
from metasklearn import TransferBoolVar
var = TransferBoolVar(n_vars=1, tf_func="vstf_01", lb=-8., ub=8., name="transfer_bool")
Used in Boolean search spaces where a transfer function (tf_func) maps a continuous value in [lb, ub] to True or False.
from metasklearn import (IntegerVar, FloatVar, StringVar, BinaryVar, BoolVar,
                         PermutationVar, CategoricalVar, SequenceVar, TransferBinaryVar, TransferBoolVar)
param_bounds = [
    IntegerVar(lb=1, ub=20, name="n_neighbors"),
    FloatVar(lb=0.001, ub=1.0, name="alpha"),
    StringVar(valid_sets=["uniform", "distance"], name="weights"),
    BinaryVar(name="use_feature"),
    BoolVar(name="fit_bias"),
    PermutationVar(valid_set=(1, 2, 5, 10), name="job_order"),
    CategoricalVar(valid_sets=[0.1, "relu", False, None, 3], name="activation_choice"),
    SequenceVar(valid_sets=((10,), (20, 10), (30, 50, 5)), name="mixed_choice"),
    TransferBinaryVar(name="bin_transfer"),
    TransferBoolVar(name="bool_transfer")
]
Use this format when designing hyperparameter spaces for advanced models in MetaSearchCV.
MetaSklearn integrates all metaheuristic algorithms from Mealpy.
You can pass any optimizer name or an instantiated optimizer object to MetaSearchCV. For more details, please refer to the Mealpy documentation.
You can use custom scoring functions from:
- scikit-learn: the names returned by sklearn.metrics.get_scorer_names()
- PerMetrics: permetrics.RegressionMetric and permetrics.ClassificationMetric
For details on the PerMetrics library, please refer to its documentation.
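As a quick way to see which scikit-learn scorer names exist in your environment, you can query `sklearn.metrics.get_scorer_names()`. The helper below is a small sketch with a hypothetical name (`sklearn_scorer_names`); it returns an empty list when scikit-learn, or a version that provides this function, is not installed. Whether a particular name is accepted by the scoring argument of MetaSearchCV should be checked against the MetaSklearn documentation.

```python
def sklearn_scorer_names(contains=""):
    """List scikit-learn scorer names, optionally filtered by a substring."""
    try:
        from sklearn.metrics import get_scorer_names
    except ImportError:
        # scikit-learn is missing or too old to provide get_scorer_names.
        return []
    return sorted(name for name in get_scorer_names() if contains in name)


print(sklearn_scorer_names("f1"))
```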
Documentation is available at: https://metasklearn.readthedocs.io
You can build the documentation locally:
cd docs
make html
You can run unit tests using:
pytest tests/
We welcome contributions to MetaSklearn! If you have suggestions, improvements, or bug fixes, feel free to fork
the repository, create a pull request, or open an issue.
This project is licensed under the GPLv3 License. See the LICENSE file for more details.
Please include these citations if you plan to use this library:
@software{thieu20250510MetaSklearn,
author = {Nguyen Van Thieu},
title = {MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models},
month = June,
year = 2025,
doi = {10.6084/m9.figshare.28978805},
url = {https://github.com/thieu1995/MetaSklearn}
}
Developed by: Thieu @ 2025