
Nyoka is a Python library for comprehensive support of the latest PMML (PMML 4.4) standard. Using Nyoka, Data Scientists can export a large number of Machine Learning models from popular Python frameworks into PMML by either using any of the numerous included ready-to-use exporters or by creating their own exporter for specialized/individual model types by simply calling a sequence of constructors.
Besides about 500 Python classes, each covering a PMML tag with all constructor parameters/attributes as defined in the standard, Nyoka also provides a growing number of convenience classes and functions that make the Data Scientist’s life easier, for example by reading or writing any PMML file in one line of code from within your favorite Python environment.
Nyoka comes to you with the complete source code in Python, extended HTML documentation for the classes/functions, and a growing number of Jupyter Notebook tutorials that help you familiarize yourself with the way Nyoka supports you in using PMML as your favorite Data Science transport file format.
Read the documentation at Nyoka Documentation.
linear_model.LinearRegression
linear_model.LogisticRegression
linear_model.RidgeClassifier
linear_model.SGDClassifier
discriminant_analysis.LinearDiscriminantAnalysis
tree.DecisionTreeClassifier
tree.DecisionTreeRegressor
svm.SVC
svm.SVR
svm.LinearSVC
svm.LinearSVR
svm.OneClassSVM
naive_bayes.GaussianNB
ensemble.RandomForestRegressor
ensemble.RandomForestClassifier
ensemble.GradientBoostingRegressor
ensemble.GradientBoostingClassifier
ensemble.IsolationForest
neural_network.MLPClassifier
neural_network.MLPRegressor
neighbors.KNeighborsClassifier
neighbors.KNeighborsRegressor
cluster.KMeans
preprocessing.StandardScaler
preprocessing.MinMaxScaler
preprocessing.RobustScaler
preprocessing.MaxAbsScaler
preprocessing.LabelEncoder
preprocessing.Imputer
preprocessing.Binarizer
preprocessing.PolynomialFeatures
preprocessing.LabelBinarizer
preprocessing.OneHotEncoder
feature_extraction.text.TfidfVectorizer
feature_extraction.text.CountVectorizer
decomposition.PCA
sklearn_pandas.CategoricalImputer (from the sklearn_pandas library)
tsa.arima_model.ARIMA
tsa.arima.model.ARIMA (extension of SARIMAX)
tsa.statespace.SARIMAX
tsa.statespace.VARMAX
tsa.statespace.ExponentialSmoothing
nyoka requires:
You can install nyoka using:
pip install --upgrade nyoka
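After installing, it can be useful to confirm which version is present in the active environment. The sketch below uses only the standard library; `installed_version` is a hypothetical helper name chosen for illustration and is not part of Nyoka.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed nyoka version, or None if nyoka is not installed.
print(installed_version("nyoka"))
```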
Nyoka contains separate exporters for each library, e.g., scikit-learn, keras, xgboost, etc.
library | exporter |
---|---|
scikit-learn | skl_to_pmml |
xgboost | xgboost_to_pmml |
lightgbm | lgb_to_pmml |
statsmodels | StatsmodelsToPmml & ExponentialSmoothingToPmml |
The main module of Nyoka is nyoka. To use it for your model, import the specific exporter from nyoka:
from nyoka import skl_to_pmml, lgb_to_pmml #... so on
The workflow is as follows (for example, a Decision Tree Classifier with StandardScaler):
Create scikit-learn's Pipeline object and populate it with any pre-processing steps and the model object.
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier
from sklearn.preprocessing import StandardScaler
pipeline_obj = Pipeline([
("scaler",StandardScaler()),
("model",DecisionTreeClassifier())
])
Call the Pipeline.fit(X,y) method to train the model.
from sklearn.datasets import load_iris
iris_data = load_iris()
X = iris_data.data
y = iris_data.target
features = iris_data.feature_names
pipeline_obj.fit(X,y)
Use the specific exporter and pass it the pipeline object, the feature names of the training dataset, the target name, and the desired name of the PMML file. If the target name is not given, the default value target is used. Similarly, the PMML file name defaults to from_sklearn.pmml, from_xgboost.pmml, or from_lighgbm.pmml, depending on the exporter.
from nyoka import skl_to_pmml
skl_to_pmml(pipeline=pipeline_obj,col_names=features,target_name="species",pmml_f_name="decision_tree.pmml")
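Because PMML is an XML format, an exported file can be sanity-checked with the standard library alone. The sketch below lists the field names declared in a PMML DataDictionary. The embedded PMML text is a hand-written, truncated stand-in (an assumption for illustration, not actual Nyoka output), and `data_field_names` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written, truncated stand-in for an exported file (assumption);
# a real export also contains a Header and the model element.
SAMPLE_PMML = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <DataDictionary numberOfFields="3">
    <DataField name="sepal length (cm)" optype="continuous" dataType="double"/>
    <DataField name="sepal width (cm)" optype="continuous" dataType="double"/>
    <DataField name="species" optype="categorical" dataType="integer"/>
  </DataDictionary>
</PMML>
"""


def data_field_names(pmml_text):
    """Return the names of all DataField elements in a PMML document."""
    root = ET.fromstring(pmml_text)
    names = []
    for element in root.iter():
        # Tags are namespace-qualified ("{namespace}DataField"), so compare
        # only the local part of the tag.
        if element.tag.rsplit("}", 1)[-1] == "DataField":
            names.append(element.get("name"))
    return names


print(data_field_names(SAMPLE_PMML))
```

To check a file written by an exporter, the same function could be given the file's contents, for example `data_field_names(open("decision_tree.pmml").read())`.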
import pandas as pd
from statsmodels.tsa.arima_model import ARIMA
from nyoka import StatsmodelsToPmml
sales_data = pd.read_csv('sales-cars.csv', index_col=0, parse_dates = True)
model = ARIMA(sales_data, order = (4, 1, 2))
result = model.fit()
StatsmodelsToPmml(result,"Sales_cars_ARIMA.pmml")
Example Jupyter notebooks can be found in nyoka/examples. These files contain code showing how to use the different exporters.
Exporting scikit-learn models into PMML
Exporting XGBoost model into PMML
Exporting LightGBM model into PMML
Exporting statsmodels model into PMML
Nyoka contains one submodule called preprocessing. This module contains the preprocessing classes implemented by Nyoka. Currently there is only one preprocessing class, Lag.
Lag is a preprocessing class implemented by Nyoka. When used inside scikit-learn's pipeline, it applies an aggregation function to the given features of the dataset by combining the previous value records. It takes two arguments: aggregation and value.
The valid aggregation functions are "min", "max", "sum", "avg", "median", "product" and "stddev".
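The idea of aggregating over previous records can be illustrated in plain Python. This is a minimal sketch and not Nyoka's implementation. It assumes the window covers the `value` records strictly before the current one, that positions without a full window yield None, and that "stddev" means the population standard deviation; Nyoka's actual boundary handling and formulas may differ. `lag_aggregate` is a hypothetical helper name.

```python
import math
import statistics

# Keys mirror the aggregation names listed above; the functions are
# illustrative stand-ins, not Nyoka's own code.
AGGREGATIONS = {
    "min": min,
    "max": max,
    "sum": sum,
    "avg": statistics.mean,
    "median": statistics.median,
    "product": math.prod,
    "stddev": statistics.pstdev,  # assumption: population standard deviation
}


def lag_aggregate(values, aggregation, value):
    """Aggregate the `value` records preceding each position in `values`.

    Assumption: positions with fewer than `value` predecessors yield None.
    """
    func = AGGREGATIONS[aggregation]
    result = []
    for i in range(len(values)):
        if i < value:
            result.append(None)
        else:
            result.append(func(values[i - value:i]))
    return result


print(lag_aggregate([1, 2, 3, 4, 5, 6], "sum", 3))
# → [None, None, None, 6, 9, 12]
```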
To use Lag -
from nyoka.preprocessing import Lag
lag_obj = Lag(aggregation="sum", value=5)
'''
This takes the previous 5 values and performs `sum` on them. When used inside a Pipeline, it is applied to all columns.
When used inside a DataFrameMapper, it is applied only to the columns listed in that DataFrameMapper.
'''
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier
from nyoka.preprocessing import Lag
pipeline_obj = Pipeline([
("lag",Lag(aggregation="sum",value=5)),
("model",DecisionTreeClassifier())
])
pip uninstall nyoka
You can ask questions at:
Please note that this project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
These tools are provided as-is and without warranty or support. They do not constitute part of the Software AG product suite. Users are free to use, fork and modify them, subject to the license agreement. While Software AG welcomes contributions, we cannot guarantee to include every contribution in the master project.