.. image:: https://github.com/CDonnerer/xgboost-distribution/actions/workflows/test.yml/badge.svg?branch=main
  :target: https://github.com/CDonnerer/xgboost-distribution/actions/workflows/test.yml

.. image:: https://coveralls.io/repos/github/CDonnerer/xgboost-distribution/badge.svg?branch=main
  :target: https://coveralls.io/github/CDonnerer/xgboost-distribution?branch=main

.. image:: https://img.shields.io/badge/code%20style-black-000000.svg
  :target: https://github.com/psf/black

.. image:: https://readthedocs.org/projects/xgboost-distribution/badge/?version=latest
  :target: https://xgboost-distribution.readthedocs.io/en/latest/?badge=latest
  :alt: Documentation Status

.. image:: https://img.shields.io/pypi/v/xgboost-distribution.svg
  :alt: PyPI-Server
  :target: https://pypi.org/project/xgboost-distribution/

====================
xgboost-distribution
====================

XGBoost for probabilistic prediction. Like `NGBoost`_, but `faster`_, and in the `XGBoost scikit-learn API`_.

.. image:: https://raw.githubusercontent.com/CDonnerer/xgboost-distribution/main/imgs/xgb_dist.png
  :align: center
  :width: 600px
  :alt: XGBDistribution example

Installation
============

.. code-block:: console

  $ pip install xgboost-distribution

Dependencies_:

.. code-block::

  python_requires = >=3.8

  install_requires =
      scikit-learn
      xgboost>=2.1.0

Usage
=====

XGBDistribution follows the `XGBoost scikit-learn API`_, with an additional keyword argument specifying the distribution, which is fit via `Maximum Likelihood Estimation`_:

.. code-block:: python

  from sklearn.datasets import fetch_california_housing
  from sklearn.model_selection import train_test_split

  from xgboost_distribution import XGBDistribution


  data = fetch_california_housing()
  X, y = data.data, data.target
  X_train, X_test, y_train, y_test = train_test_split(X, y)

  model = XGBDistribution(
      distribution="normal",
      n_estimators=500,
      early_stopping_rounds=10
  )
  model.fit(X_train, y_train, eval_set=[(X_test, y_test)])

See the documentation_ for all available distributions.

After fitting, we can predict the parameters of the distribution:

.. code-block:: python

  preds = model.predict(X_test)
  mean, std = preds.loc, preds.scale

Note that this returns a namedtuple_ of `numpy arrays`_, one for each parameter of the distribution (we use the `scipy stats`_ naming conventions for the parameters; see e.g. `scipy.stats.norm`_ for the normal distribution).
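
As a usage sketch, the predicted parameters can be passed to ``scipy.stats.norm`` to obtain, for instance, a central 95% prediction interval per sample. The arrays below are made-up stand-ins for ``preds.loc`` and ``preds.scale``, and the ``Prediction`` namedtuple is a hypothetical name that only mimics the shape of the returned object:

.. code-block:: python

  from collections import namedtuple

  import numpy as np
  from scipy.stats import norm

  # Hypothetical stand-in for the object returned by model.predict(X_test)
  Prediction = namedtuple("Prediction", ["loc", "scale"])
  preds = Prediction(loc=np.array([0.0, 1.0]), scale=np.array([1.0, 2.0]))

  # Central 95% interval of each predicted normal distribution
  lower, upper = norm.interval(0.95, loc=preds.loc, scale=preds.scale)

  # Probability that each target falls below an (arbitrary) threshold of 1.5
  prob_below = norm.cdf(1.5, loc=preds.loc, scale=preds.scale)

Because the parameter names match the scipy conventions, the same pattern should carry over to other scipy distributions whose parameters are ``loc`` and ``scale``.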

NGBoost performance comparison
==============================

XGBDistribution follows the method shown in the NGBoost_ library, using natural gradients to estimate the parameters of the distribution.
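
To make the idea concrete, here is a minimal sketch of a natural gradient for a single observation under a normal distribution. It assumes the distribution is parameterized by ``(loc, log(scale))``; the function name ``normal_natural_gradient`` is hypothetical, and the code illustrates the technique rather than reproducing either library's implementation:

.. code-block:: python

  def normal_natural_gradient(y, loc, scale):
      """Natural gradient of the normal NLL w.r.t. (loc, log(scale))."""
      var = scale ** 2
      # Ordinary gradient of -log p(y | loc, scale)
      grad_loc = (loc - y) / var
      grad_log_scale = 1.0 - (y - loc) ** 2 / var
      # The Fisher information in this parameterization is diag(1 / var, 2),
      # so multiplying by its inverse rescales each component separately
      return grad_loc * var, grad_log_scale / 2.0


  nat_loc, nat_log_scale = normal_natural_gradient(y=2.0, loc=0.0, scale=2.0)

In this sketch the ``loc`` component reduces to ``loc - y``, independent of the current scale, which is one way to see why natural gradients can behave more stably than ordinary gradients when the predicted variance is very small or very large.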

Below, we show a performance comparison of XGBDistribution and the NGBoost_ NGBRegressor, using the California Housing dataset and estimating normal distributions. While the performance of the two models is fairly similar (measured by the negative log-likelihood of a normal distribution and the RMSE), XGBDistribution is around 15x faster (timed on both fit and predict steps):

.. image:: https://raw.githubusercontent.com/CDonnerer/xgboost-distribution/main/imgs/performance_comparison.png
  :align: center
  :width: 600px
  :alt: XGBDistribution vs NGBoost

Please see the `experiments page`_ for results across various datasets.
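
For readers who want to compute the two metrics on their own predictions, the following sketch shows one way to do so. The helper names ``normal_nll`` and ``rmse`` are hypothetical, the input arrays are made up, and averaging over samples is an assumption (the exact aggregation used in the experiments is not stated here):

.. code-block:: python

  import numpy as np


  def normal_nll(y, loc, scale):
      """Mean negative log-likelihood of y under N(loc, scale**2)."""
      var = scale ** 2
      return np.mean(0.5 * np.log(2 * np.pi * var) + (y - loc) ** 2 / (2 * var))


  def rmse(y, loc):
      """Root mean squared error of the predicted means."""
      return np.sqrt(np.mean((y - loc) ** 2))


  # Made-up values standing in for y_test, preds.loc and preds.scale
  y = np.array([1.0, 2.0, 3.0])
  loc = np.array([1.0, 2.0, 5.0])
  scale = np.array([1.0, 1.0, 2.0])

  nll_value = normal_nll(y, loc, scale)
  rmse_value = rmse(y, loc)

The negative log-likelihood accounts for the predicted scale as well as the mean, whereas the RMSE only measures the quality of the point prediction.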

Full XGBoost features
=====================

XGBDistribution offers the full set of XGBoost features available in the `XGBoost scikit-learn API`_, allowing, for example, probabilistic regression with `monotonic constraints`_:

.. image:: https://raw.githubusercontent.com/CDonnerer/xgboost-distribution/main/imgs/monotone_constraint.png
  :align: center
  :width: 600px
  :alt: XGBDistribution monotonic constraints
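
A rough sketch of how such a constraint might be specified is given below. It assumes that ``XGBDistribution`` forwards the XGBoost ``monotone_constraints`` keyword unchanged; both helper functions are hypothetical names, the feature list is an arbitrary subset chosen for illustration, and how the constraint acts on each distribution parameter is not covered here:

.. code-block:: python

  def monotone_constraints_string(feature_names, directions):
      """Build a string such as "(1,0,-1)" from per-feature directions.

      1 means increasing, -1 decreasing, 0 (the default) unconstrained.
      """
      values = (directions.get(name, 0) for name in feature_names)
      return "(" + ",".join(str(v) for v in values) + ")"


  def make_constrained_model(constraints):
      # Imported lazily so the helper above can be used on its own
      from xgboost_distribution import XGBDistribution

      return XGBDistribution(
          distribution="normal",
          monotone_constraints=constraints,
      )


  # Illustrative feature subset; constrain only the first feature
  features = ["MedInc", "HouseAge", "AveRooms"]
  constraints = monotone_constraints_string(features, {"MedInc": 1})

Calling ``make_constrained_model(constraints)`` and then ``fit`` as in the usage example would train with the constraint applied, provided the number of entries matches the number of columns in the training data.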

Acknowledgements
================

This package would not exist without the excellent work from:

  • NGBoost_ - Which demonstrated how gradient boosting with natural gradients can be used to estimate parameters of distributions. Much of the gradient calculation code was adapted from there.

  • XGBoost_ - Which provides the gradient boosting algorithms used here; in particular, the sklearn APIs were taken as a blueprint.

.. _pyscaffold-notes:

Note
====

This project has been set up using PyScaffold 4.0.1. For details and usage information on PyScaffold see https://pyscaffold.org/.

.. _ngboost: https://github.com/stanfordmlgroup/ngboost
.. _faster: https://xgboost-distribution.readthedocs.io/en/latest/experiments.html
.. _xgboost scikit-learn api: https://xgboost.readthedocs.io/en/latest/python/python_api.html#module-xgboost.sklearn
.. _dependencies: https://github.com/CDonnerer/xgboost-distribution/blob/feature/update-linting/setup.cfg#L37
.. _monotonic constraints: https://xgboost.readthedocs.io/en/latest/tutorials/monotonic.html
.. _scipy.stats.norm: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html
.. _LAPACK gesv: https://www.netlib.org/lapack/lug/node71.html
.. _xgboost: https://github.com/dmlc/xgboost
.. _documentation: https://xgboost-distribution.readthedocs.io/en/latest/api/xgboost_distribution.XGBDistribution.html#xgboost_distribution.XGBDistribution
.. _experiments page: https://xgboost-distribution.readthedocs.io/en/latest/experiments.html
.. _numpy arrays: https://numpy.org/doc/stable/reference/generated/numpy.array.html
.. _scipy stats: https://docs.scipy.org/doc/scipy/reference/stats.html
.. _namedtuple: https://docs.python.org/3/library/collections.html#collections.namedtuple
.. _maximum likelihood estimation: https://en.wikipedia.org/wiki/Maximum_likelihood_estimation
