mlmodels
This repository is a model zoo for PyTorch, TensorFlow, Keras, Gluon, LightGBM and scikit-learn models. It provides a lightweight functional interface that wraps recent and state-of-the-art deep learning and ML models, as well as hyper-parameter search, across platforms. The interface follows the logic of sklearn: fit, predict, transform, metrics, save, load, etc.
Recent models are available in the following fields:
- Time Series,
- Text classification,
- Vision,
- Image generation, text generation,
- Gradient Boosting, Automatic Machine Learning tuning,
- Hyper-parameter search.
To turn script/research code into reusable batch code with minimal changes, we use a functional interface instead of pure OOP. A functional style reduces the amount of code needed, which suits scientific computing: we can focus on the computation rather than on the design. It is also easy to maintain for medium-sized projects.
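As a rough illustration of the functional style described above, the sketch below shows a module in which the model is a thin container and `fit` and `predict` are plain functions taking parameter dictionaries. Everything in it (the mean-predictor logic and the `values` and `n_rows` keys) is hypothetical; it only mirrors the shape of the interface, not the actual mlmodels code.

```python
"""Hypothetical stand-in for one model file exposing a functional interface."""


class Model:
    """Thin container: holds parameters and state, no training logic."""

    def __init__(self, model_pars=None, data_pars=None, compute_pars=None):
        self.model_pars = model_pars or {}
        self.mean = None  # state filled in by fit()


def fit(model, data_pars=None, compute_pars=None, out_pars=None):
    """Train the model and return (model, session); the session is unused here."""
    values = data_pars["values"]  # hypothetical key holding the training targets
    model.mean = sum(values) / len(values)
    return model, None


def predict(model, sess=None, data_pars=None, compute_pars=None, out_pars=None):
    """Predict the training mean for each requested row."""
    n_rows = data_pars.get("n_rows", 1)  # hypothetical key
    return [model.mean] * n_rows


model = Model(model_pars={"name": "mean_baseline"})
model, sess = fit(model, data_pars={"values": [1.0, 2.0, 3.0]})
ypred = predict(model, sess, data_pars={"n_rows": 2})
print(ypred)
```

Because the training logic lives in module-level functions, a research script can often be wrapped by moving its training loop into `fit` and its inference code into `predict`, without designing a class hierarchy.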
A collection of Deep Learning and Machine Learning research papers is available in this repository.
Benefits:
Having a standard framework for both machine learning and deep learning models is a step towards automatic machine learning. The model zoo, spanning PyTorch, TensorFlow and Keras, removes the dependency on one specific framework and enables richer possibilities in model benchmarking and reuse.
A unique and simple interface, zero boilerplate code (!), and recent state-of-the-art models/frameworks are the main strengths of MLMODELS. The emphasis is on traditional machine learning algorithms as well as recent state-of-the-art deep learning algorithms.
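To make the benchmarking benefit concrete, here is a minimal sketch of a loop that runs the same `Model`/`fit`/`predict` calls on several modules and collects one score per module. The stand-in modules built by `make_constant_module` are hypothetical placeholders for real model files, and the `y` key in `data_pars` is an assumption made for this example.

```python
from types import SimpleNamespace


def make_constant_module(constant):
    """Build a hypothetical stand-in module that always predicts `constant`."""

    def fit(model, data_pars=None, compute_pars=None, out_pars=None):
        return model, None

    def predict(model, sess=None, data_pars=None, compute_pars=None, out_pars=None):
        return [constant] * len(data_pars["y"])

    return SimpleNamespace(Model=lambda **kw: SimpleNamespace(**kw), fit=fit, predict=predict)


def mean_abs_error(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def benchmark(modules, data_pars):
    """Run identical fit/predict calls on every module and score each one."""
    scores = {}
    for name, module in modules.items():
        model = module.Model(model_pars={}, data_pars=data_pars, compute_pars={})
        model, sess = module.fit(model, data_pars=data_pars)
        ypred = module.predict(model, sess, data_pars=data_pars)
        scores[name] = mean_abs_error(data_pars["y"], ypred)
    return scores


modules = {
    "always_zero": make_constant_module(0.0),
    "always_two": make_constant_module(2.0),
}
scores = benchmark(modules, data_pars={"y": [1.0, 2.0, 3.0]})
print(scores)
```

The loop never inspects which framework sits behind a module, which is what makes cross-framework comparison possible under a shared interface.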
Deep learning is considered very useful for processing high-dimensional data. In applications such as computer vision, natural language processing, object detection, facial recognition and speech recognition, it has brought significant improvements and outstanding results.
The usage guide can be found here.
Model List :
Time Series:
- MILA, Nbeats: 2019, Advanced interpretable Time Series Neural Network, [Link]
- Amazon Deep AR: 2019, Multi-variate Time Series Neural Network, [Link]
- Facebook Prophet: 2017, Time Series prediction, [Link]
- ARMDN, Advanced Multi-variate Time series Prediction: 2019, Associative and Recurrent Mixture Density Networks for time series, [Link]
- LSTM Neural Network prediction: Stacked Bidirectional and Unidirectional LSTM Recurrent Neural Network for Network-wide Traffic Speed Prediction, [Link]
NLP:
- Sentence Transformers: 2019, Embedding of full sentences using BERT, [Link]
- Transformers Classifier: Using Transformer for Text Classification, [Link]
- TextCNN Pytorch: 2016, Text CNN Classifier, [Link]
- TextCNN Keras: 2016, Text CNN Classifier, [Link]
- Bi-directional Conditional Random Field LSTM for Named Entity Recognition, [Link]
- DRMM: Deep Relevance Matching Model for Ad-hoc Retrieval, [Link]
- DRMMTKS: Deep Top-K Relevance Matching Model for Ad-hoc Retrieval, [Link]
- ARC-I: Convolutional Neural Network Architectures for Matching Natural Language Sentences, [Link]
- ARC-II: Convolutional Neural Network Architectures for Matching Natural Language Sentences, [Link]
TABULAR:
- LightGBM: Light Gradient Boosting
- AutoML Gluon: 2020, AutoML in Gluon, MxNet using LightGBM, CatBoost
- Auto-Keras: 2020, Automatic Keras model selection
- All sklearn models:
  linear_model.ElasticNet, linear_model.ElasticNetCV, linear_model.Lars, linear_model.LarsCV, linear_model.Lasso, linear_model.LassoCV, linear_model.LassoLars, linear_model.LassoLarsCV, linear_model.LassoLarsIC, linear_model.OrthogonalMatchingPursuit, linear_model.OrthogonalMatchingPursuitCV
  svm.LinearSVC, svm.LinearSVR, svm.NuSVC, svm.NuSVR, svm.OneClassSVM, svm.SVC, svm.SVR, svm.l1_min_c
  neighbors.KNeighborsClassifier, neighbors.KNeighborsRegressor, neighbors.KNeighborsTransformer
- Binary Neural Prediction from tabular data
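The sklearn entries above are given as dotted names such as `linear_model.Lasso`. One way such a name could be turned into a class is sketched below with `importlib`. This is an assumption about how a wrapper might resolve names, not a description of how mlmodels does it, and `resolve_class` is a hypothetical helper. The demo runs against the standard library so it does not require scikit-learn.

```python
import importlib


def resolve_class(dotted_name, root="sklearn"):
    """Resolve e.g. "linear_model.Lasso" to the object sklearn.linear_model.Lasso."""
    submodule, _, attr_name = dotted_name.rpartition(".")
    module_path = f"{root}.{submodule}" if submodule else root
    module = importlib.import_module(module_path)
    return getattr(module, attr_name)


if __name__ == "__main__":
    # Demonstrated on the standard library so the sketch runs without scikit-learn.
    decoder_cls = resolve_class("decoder.JSONDecoder", root="json")
    print(decoder_cls.__name__)
```

With scikit-learn installed, `resolve_class("svm.SVC")` would return the `sklearn.svm.SVC` class. Note that not every listed name is a class: `svm.l1_min_c` is a function, so a real wrapper would need to treat it differently.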
VISION:
Vision Models (pre-trained):
- alexnet: SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size, [Link]
- densenet121: Adversarial Perturbations Prevail in the Y-Channel of the YCbCr Color Space, [Link]
- densenet169: Classification of TrashNet Dataset Based on Deep Learning Models, [Link]
- densenet201: Utilization of DenseNet201 for diagnosis of breast abnormality, [Link]
- densenet161: Automated classification of histopathology images using transfer learning, [Link]
- inception_v3: Menfish Classification Based on Inception_V3 Convolutional Neural Network, [Link]
- resnet18: Leveraging the VTA-TVM Hardware-Software Stack for FPGA Acceleration of 8-bit ResNet-18 Inference, [Link]
- resnet34: Automated Pavement Crack Segmentation Using Fully Convolutional U-Net with a Pretrained ResNet-34 Encoder, [Link]
- resnet50: Extremely Large Minibatch SGD: Training ResNet-50 on ImageNet in 15 Minutes, [Link]
- resnet101: Classification of Cervical MR Images using ResNet101, [Link]
- resnet152: Deep neural networks show an equivalent and often superior performance to dermatologists in onychomycosis diagnosis: Automatic construction of onychomycosis datasets by region-based convolutional deep neural network, [Link]
More resources are available here
######################################################################################
① Installation Guide:
(A) Using pre-installed Setup (one click) :
Read-more
(C) Using Colab :
Read-more
Initialize template and Tests
This copies the template, dataset and examples to your folder:
ml_models --init /yourworkingFolder/
To test hyper-parameter search:
ml_optim
To test model fitting:
ml_models
Actual test runs
Read-more
Usage in Jupyter/Colab
Read-more
Command Line tools:
Read-more
Model List
Read-more
How to add a new model
Read-more
Index of functions/methods
Read-more
Define model and data definitions
import mlmodels
from mlmodels.models import module_load

# ncol_input and ncol_output (number of input and output columns) must be defined for your dataset
model_uri = "model_tf.1_lstm.py"
model_pars = {"num_layers": 1, "size": ncol_input, "size_layer": 128,
              "output_size": ncol_output, "timestep": 4}
data_pars = {"data_path": "/folder/myfile.csv", "data_type": "pandas"}
compute_pars = {"learning_rate": 0.001}
out_pars = {"path": "ztest_1lstm/", "model_path": "ztest_1lstm/model/"}
save_pars = {"path": "ztest_1lstm/model/"}
load_pars = {"path": "ztest_1lstm/model/"}

module = module_load(model_uri=model_uri)  # load the model definition file
model = module.Model(model_pars=model_pars, data_pars=data_pars, compute_pars=compute_pars)  # create the model instance
model, sess = module.fit(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)  # train
metrics_val = module.fit_metrics(model, sess, data_pars, compute_pars, out_pars)  # compute metrics
ypred = module.predict(model, sess, data_pars, compute_pars, out_pars)  # predict
AutoML example with AutoGluon
import mlmodels
import autogluon as ag
model_uri = "model_gluon.gluon_automl.py"
data_pars = {"train": True, "uri_type": "amazon_aws", "dt_name": "Inc"}
model_pars = {
    "model_type": "tabular",
    "learning_rate": ag.space.Real(1e-4, 1e-2, default=5e-4, log=True),
    "activation": ag.space.Categorical("relu", "softrelu", "tanh"),
    "layers": ag.space.Categorical([100], [1000], [200, 100], [300, 200, 100]),
    "dropout_prob": ag.space.Real(0.0, 0.5, default=0.1),
    "num_boost_round": 10,
    # note: default=36 lies outside [26, 30]; choose a default within the range
    "num_leaves": ag.space.Int(lower=26, upper=30, default=36),
}
compute_pars = {
    "hp_tune": True,
    "num_epochs": 10,
    "time_limits": 120,
    "num_trials": 5,
    "search_strategy": "skopt",
}
out_pars = {"out_path": "dataset/"}
from mlmodels.models import module_load
module = module_load( model_uri= model_uri )
model = module.Model(model_pars=model_pars, compute_pars=compute_pars)
model, sess = module.fit(model, data_pars=data_pars, model_pars=model_pars, compute_pars=compute_pars, out_pars=out_pars)
ypred = module.predict(model, data_pars, compute_pars, out_pars)
RandomForest example in Scikit-learn (Example notebook)
# import library
import mlmodels
#### Define model and data definitions
model_uri = "model_sklearn.sklearn.py"
model_pars = {"model_name": "RandomForestClassifier", "max_depth" : 4 , "random_state":0}
data_pars = {'mode': 'test', 'path': "../mlmodels/dataset", 'data_type' : 'pandas' }
compute_pars = {'return_pred_not': False}
out_pars = {'path' : "../ztest"}
#### Load Parameters and Train
from mlmodels.models import module_load
module = module_load( model_uri= model_uri ) # Load file definition
model = module.Model(model_pars=model_pars, data_pars=data_pars, compute_pars=compute_pars) # Create Model instance
model, sess = module.fit(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars) # fit the model
#### Inference
ypred = module.predict(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars) # predict pipeline
TextCNN example in Keras
import mlmodels
model_uri = "model_keras.textcnn.py"
data_pars = {"path" : "../mlmodels/dataset/text/imdb.csv", "train": 1, "maxlen":400, "max_features": 10}
model_pars = {"maxlen":400, "max_features": 10, "embedding_dims":50}
compute_pars = {"engine": "adam", "loss": "binary_crossentropy", "metrics": ["accuracy"] ,
"batch_size": 32, "epochs":1, 'return_pred_not':False}
out_pars = {"path": "ztest/model_keras/textcnn/"}
from mlmodels.models import module_load
module = module_load( model_uri= model_uri )
model = module.Model(model_pars=model_pars, data_pars=data_pars, compute_pars=compute_pars)
module.fit(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)
data_pars['train'] = 0
ypred = module.predict(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)
Import library and functions
import mlmodels
from mlmodels.models import module_load
from mlmodels.util import load_config
model_uri = "model_tf.1_lstm.py"
module = module_load( model_uri= model_uri )
model_pars, data_pars, compute_pars, out_pars = module.get_params(param_pars={
'choice':'json',
'config_mode':'test',
'data_path':'../mlmodels/example/1_lstm.json'
})
model = module.Model(model_pars=model_pars, data_pars=data_pars, compute_pars=compute_pars)
model, sess = module.fit(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)
ypred = module.predict(model, sess=sess, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)
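The `get_params` call above selects a block of parameters from a JSON file by `config_mode`. The layout of that file is not shown in this document; the sketch below assumes, purely for illustration, that the top-level keys are mode names and that each mode holds the four parameter dictionaries. The helper `load_params` and the example values are hypothetical.

```python
import json

# Hypothetical config: top-level keys are modes, each holding the four dicts.
CONFIG_TEXT = """
{
  "test": {
    "model_pars":   {"num_layers": 1, "size_layer": 128, "timestep": 4},
    "data_pars":    {"data_path": "/folder/myfile.csv", "data_type": "pandas"},
    "compute_pars": {"learning_rate": 0.001},
    "out_pars":     {"path": "ztest_1lstm/"}
  }
}
"""

PAR_KEYS = ("model_pars", "data_pars", "compute_pars", "out_pars")


def load_params(config_text, config_mode):
    """Return (model_pars, data_pars, compute_pars, out_pars) for one mode."""
    config = json.loads(config_text)
    if config_mode not in config:
        raise KeyError(f"config_mode {config_mode!r} not found; available: {sorted(config)}")
    block = config[config_mode]
    return tuple(block[key] for key in PAR_KEYS)


model_pars, data_pars, compute_pars, out_pars = load_params(CONFIG_TEXT, "test")
print(model_pars["size_layer"], compute_pars["learning_rate"])
```

Keeping one block per mode would let a small "test" configuration and a full training configuration live side by side in the same file.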
Using Scikit-learn's SVM for Titanic Problem from json file (Example notebook, JSON file)
Import library and functions
import mlmodels
from mlmodels.models import module_load
from mlmodels.util import load_config
model_uri = "model_sklearn.sklearn.py"
module = module_load( model_uri= model_uri )
model_pars, data_pars, compute_pars, out_pars = module.get_params(param_pars={
'choice':'json',
'config_mode':'test',
'data_path':'../mlmodels/example/sklearn_titanic_svm.json'
})
model = module.Model(model_pars=model_pars, data_pars=data_pars, compute_pars=compute_pars)
model, sess = module.fit(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)
ypred = module.predict(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)
ypred
import pandas as pd
from sklearn.metrics import roc_auc_score
y = pd.read_csv('../mlmodels/dataset/tabular/titanic_train_preprocessed.csv')['Survived'].values
roc_auc_score(y, ypred)
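`roc_auc_score`, used above, measures how often a positive example receives a higher score than a negative one. As a minimal sketch of that definition (a direct pairwise count, not scikit-learn's implementation), with made-up labels and scores:

```python
def pairwise_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


auc = pairwise_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)  # 3 of the 4 positive/negative pairs are ordered correctly -> 0.75
```

If `ypred` holds hard class labels rather than probabilities or scores, many pairs tie and the resulting AUC is coarser.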
Using Scikit-learn's Random Forest for Titanic Problem from json file (Example notebook, JSON file)
Import library and functions
import mlmodels
from mlmodels.models import module_load
from mlmodels.util import load_config
model_uri = "model_sklearn.sklearn.py"
module = module_load( model_uri= model_uri )
model_pars, data_pars, compute_pars, out_pars = module.get_params(param_pars={
'choice':'json',
'config_mode':'test',
'data_path':'../mlmodels/example/sklearn_titanic_randomForest.json'
})
model = module.Model(model_pars=model_pars, data_pars=data_pars, compute_pars=compute_pars)
model, sess = module.fit(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)
ypred = module.predict(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)
ypred
import pandas as pd
from sklearn.metrics import roc_auc_score
y = pd.read_csv('../mlmodels/dataset/tabular/titanic_train_preprocessed.csv')['Survived'].values
roc_auc_score(y, ypred)
Using Autogluon for Titanic Problem from json file (Example notebook, JSON file)
Import library and functions
import mlmodels
from mlmodels.models import module_load
from mlmodels.util import load_config
model_uri = "model_gluon.gluon_automl.py"
module = module_load( model_uri= model_uri )
model_pars, data_pars, compute_pars, out_pars = module.get_params(
choice='json',
config_mode= 'test',
data_path= '../mlmodels/example/gluon_automl.json'
)
model = module.Model(model_pars=model_pars, compute_pars=compute_pars)
model = module.fit(model, model_pars=model_pars, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)
model.model.fit_summary()
ypred = module.predict(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)
model.model.model_performance
import pandas as pd
from sklearn.metrics import roc_auc_score
y = pd.read_csv('../mlmodels/dataset/tabular/titanic_train_preprocessed.csv')['Survived'].values
roc_auc_score(y, ypred)
Using hyper-params (optuna) for Titanic Problem from json file (Example notebook, JSON file)
Import library and functions
from mlmodels.models import module_load
from mlmodels.optim import optim
from mlmodels.util import params_json_load, path_norm
model_uri = "model_sklearn.sklearn.py"
config_path = path_norm( 'example/hyper_titanic_randomForest.json' )
config_mode = "test"
hypermodel_pars, model_pars, data_pars, compute_pars, out_pars = params_json_load(config_path, config_mode= config_mode)
print( hypermodel_pars, model_pars, data_pars, compute_pars, out_pars)
module = module_load( model_uri= model_uri )
model_pars_update = optim(
model_uri = model_uri,
hypermodel_pars = hypermodel_pars,
model_pars = model_pars,
data_pars = data_pars,
compute_pars = compute_pars,
out_pars = out_pars
)
model = module.Model(model_pars=model_pars_update, data_pars=data_pars, compute_pars=compute_pars)
model, sess = module.fit(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)
ypred = module.predict(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)
ypred
import pandas as pd
from sklearn.metrics import roc_auc_score
y = pd.read_csv( path_norm('dataset/tabular/titanic_train_preprocessed.csv') )
y = y['Survived'].values
roc_auc_score(y, ypred)
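The `optim` call above returns updated model parameters after a hyper-parameter search. To illustrate the general idea only, here is a sketch of plain random search over integer ranges. It is not Optuna's algorithm and not the mlmodels implementation; `random_search`, `toy_objective` and the search-space format are hypothetical.

```python
import random


def random_search(objective, search_space, base_pars, n_trials=20, seed=0):
    """Sample integer parameters uniformly and keep the best-scoring set."""
    rng = random.Random(seed)
    best_pars, best_score = None, float("-inf")
    for _ in range(n_trials):
        trial = dict(base_pars)  # start from the fixed parameters
        for name, (low, high) in search_space.items():
            trial[name] = rng.randint(low, high)  # bounds are inclusive
        score = objective(trial)
        if score > best_score:
            best_pars, best_score = trial, score
    return best_pars, best_score


def toy_objective(pars):
    """Hypothetical score that peaks at max_depth == 5."""
    return -abs(pars["max_depth"] - 5)


base = {"model_name": "RandomForestClassifier", "random_state": 0}
best_pars, best_score = random_search(toy_objective, {"max_depth": (2, 10)}, base)
print(best_pars, best_score)
```

In a real search the objective would train the model with the trial parameters and return a validation metric; the returned dictionary then plays the role of `model_pars_update`.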
Using LightGBM for Titanic Problem from json file (Example notebook, JSON file)
Import library and functions
import mlmodels
from mlmodels.models import module_load
from mlmodels.util import path_norm_dict, path_norm
import json
model_uri = "model_sklearn.model_lightgbm.py"
module = module_load( model_uri= model_uri)
data_path = '../dataset/json/lightgbm_titanic.json'
with open(data_path, mode='r') as f:
    pars = json.load(f)
for key, pdict in pars.items():
    globals()[key] = path_norm_dict(pdict)  # defines model_pars, data_pars, ... at module level
model = module.Model(model_pars, data_pars, compute_pars)
model, session = module.fit(model, data_pars, compute_pars, out_pars)
ypred = module.predict(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)
ypred
metrics_val = module.fit_metrics(model, data_pars, compute_pars, out_pars)
metrics_val
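The LightGBM example above injects each top-level key of the JSON file into `globals()`. A possible alternative, sketched here under the assumption that the file's top-level keys are `model_pars`, `data_pars`, `compute_pars` and `out_pars`, is to unpack them into explicit names. The `normalize` function is a hypothetical stand-in for `path_norm_dict` and simply copies the dictionary; the JSON content is made up.

```python
import json

# Hypothetical file content; the real JSON file is not shown in this document.
PARS_TEXT = """
{
  "model_pars":   {"objective": "binary"},
  "data_pars":    {"path": "dataset/tabular/train.csv"},
  "compute_pars": {},
  "out_pars":     {"path": "ztest/"}
}
"""

EXPECTED_KEYS = ("model_pars", "data_pars", "compute_pars", "out_pars")


def normalize(pdict):
    """Hypothetical stand-in for path_norm_dict: returns a shallow copy unchanged."""
    return dict(pdict)


def unpack_pars(pars):
    """Return the four parameter dicts in a fixed order, failing early if one is missing."""
    missing = [key for key in EXPECTED_KEYS if key not in pars]
    if missing:
        raise KeyError(f"missing parameter blocks: {missing}")
    return tuple(normalize(pars[key]) for key in EXPECTED_KEYS)


model_pars, data_pars, compute_pars, out_pars = unpack_pars(json.loads(PARS_TEXT))
print(model_pars, out_pars)
```

Explicit unpacking makes the expected names visible to readers and linters, and a missing block fails with a clear error instead of a later `NameError`.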
Using model_tch.torchhub from json file
import mlmodels
from mlmodels.models import module_load
from mlmodels.util import path_norm_dict, path_norm, params_json_load
import json
model_uri = "model_tch.torchhub.py"
config_path = path_norm( 'model_tch/torchhub_cnn.json' )
config_mode = "test"
hypermodel_pars, model_pars, data_pars, compute_pars, out_pars = params_json_load(config_path, config_mode= config_mode)
print( hypermodel_pars, model_pars, data_pars, compute_pars, out_pars)
module = module_load( model_uri)
model = module.Model(model_pars, data_pars, compute_pars)
model, session = module.fit(model, data_pars, compute_pars, out_pars)
metrics_val = module.fit_metrics(model, data_pars, compute_pars, out_pars)
print(metrics_val)
ypred = module.predict(model, session, data_pars, compute_pars, out_pars)
print(ypred)
Using model_keras.ardmn from json file
import mlmodels
from mlmodels.models import module_load
from mlmodels.util import path_norm_dict, path_norm, params_json_load
import json
model_uri = "model_keras.ardmn.py"
config_path = path_norm( 'model_keras/ardmn.json' )
config_mode = "test"
hypermodel_pars, model_pars, data_pars, compute_pars, out_pars = params_json_load(config_path, config_mode= config_mode)
print( hypermodel_pars, model_pars, data_pars, compute_pars, out_pars)
module = module_load( model_uri)
model = module.Model(model_pars, data_pars, compute_pars)
model, session = module.fit(model, data_pars, compute_pars, out_pars)
metrics_val = module.fit_metrics(model, data_pars, compute_pars, out_pars)
print(metrics_val)
ypred = module.predict(model, session, data_pars, compute_pars, out_pars)
print(ypred)
module.save(model, save_pars ={ 'path': out_pars['path'] +"/model/"})
model2 = module.load(load_pars ={ 'path': out_pars['path'] +"/model/"})
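The `save` and `load` calls above take a dictionary with a `path` entry. What a model file does with that path is not shown here; the sketch below assumes a simple pickle-based round trip to illustrate the calling convention. The file name `model.pkl` and both function bodies are hypothetical.

```python
import os
import pickle
import tempfile


def save(model, save_pars):
    """Pickle `model` to <path>/model.pkl, creating the folder if needed."""
    os.makedirs(save_pars["path"], exist_ok=True)
    with open(os.path.join(save_pars["path"], "model.pkl"), "wb") as f:
        pickle.dump(model, f)


def load(load_pars):
    """Load the object previously written by save() from <path>/model.pkl."""
    with open(os.path.join(load_pars["path"], "model.pkl"), "rb") as f:
        return pickle.load(f)


# Round trip in a temporary folder so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model")
    save({"weights": [0.1, 0.2]}, save_pars={"path": path})
    model2 = load(load_pars={"path": path})

print(model2)
```

Pickle files should only be loaded from trusted sources, and deep learning frameworks usually provide their own serialization formats, which a real model file would more likely use.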