distfit is a Python package for probability density fitting of univariate distributions for random variables.
With the random variable as input, distfit can find the best fit for parametric, non-parametric, and discrete distributions.
- For the parametric approach, distfit can determine the best fit across 89 theoretical distributions. To score the fit, one of the goodness-of-fit statistics can be used, such as RSS/SSE, Wasserstein, Kolmogorov-Smirnov (KS), or Energy. After finding the best-fitting theoretical distribution, the loc, scale, and arg parameters are returned, such as the mean and standard deviation for a normal distribution.
- For the non-parametric approach, distfit provides two methods: the quantile method and the percentile method. Both assume that the data do not follow a specific probability distribution. The quantile method models the quantiles of the data, whereas the percentile method models the percentiles.
- For datasets that contain discrete values, distfit provides an option for discrete fitting. The best fit is then derived using the binomial distribution.
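To make the idea of scoring a parametric fit with RSS concrete, here is a minimal sketch that uses only numpy and scipy. It is not distfit's internal code. The candidate set, the bin count, and the synthetic data are assumptions chosen for illustration. Each candidate is fitted to the data, its density is compared with a normalised histogram, and the candidate with the smallest residual sum of squares wins.

```python
import numpy as np
from scipy import stats

# Synthetic data: 5000 draws from a normal distribution (illustrative only).
rng = np.random.default_rng(42)
X = rng.normal(loc=0.0, scale=2.0, size=5000)

# Empirical density: normalised histogram evaluated at the bin centres.
density, edges = np.histogram(X, bins=50, density=True)
centers = (edges[:-1] + edges[1:]) / 2

# A small, hand-picked candidate set (an assumption of this sketch).
candidates = {"norm": stats.norm, "expon": stats.expon, "uniform": stats.uniform}

results = {}
for name, dist in candidates.items():
    params = dist.fit(X)              # fitted (loc, scale) for these three candidates
    pdf = dist.pdf(centers, *params)  # theoretical density at the bin centres
    rss = float(np.sum((density - pdf) ** 2))
    results[name] = {"params": params, "rss": rss}

# The best fit is the candidate with the lowest RSS.
best_name = min(results, key=lambda k: results[k]["rss"])
best_loc, best_scale = results[best_name]["params"]
print(best_name, round(best_loc, 2), round(best_scale, 2))
```

For the normal distribution, the returned loc and scale correspond to the mean and standard deviation mentioned above.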
⭐️ Star this repo if you like it ⭐️
Installation
Install distfit from PyPI
pip install distfit
Install from GitHub source (beta version)
pip install git+https://github.com/erdogant/distfit
Check version
import distfit
print(distfit.__version__)
The following functions are available after installation:
from distfit import distfit
dfit = distfit()
dfit.fit_transform(X)
dfit.predict(y)
dfit.plot()
Examples
After we have a fitted model, we can make some predictions using the theoretical distributions.
After making some predictions, we can plot again but now the predictions are automatically included.
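As a rough illustration of what a prediction on a fitted theoretical distribution can involve, the sketch below fits a normal distribution with scipy and computes tail probabilities for new values. This is an assumption about the general technique, not distfit's implementation. The significance level, the labels "up", "down", and "none", and the sample values are chosen here for illustration, and no multiple-testing correction is applied.

```python
import numpy as np
from scipy import stats

# Training data and new observations (both synthetic, for illustration).
rng = np.random.default_rng(0)
X = rng.normal(loc=0.0, scale=2.0, size=1000)
y = np.array([-8.0, -1.0, 0.0, 1.5, 9.0])

# Fit a normal distribution and freeze it with the estimated parameters.
loc, scale = stats.norm.fit(X)
fitted = stats.norm(loc=loc, scale=scale)

# Two-sided test: each tail receives alpha / 2.
alpha = 0.05
p_low = fitted.cdf(y)   # probability of a value at or below y
p_high = fitted.sf(y)   # probability of a value above y

# Label each new value by the tail it falls in, if any.
labels = np.where(p_high < alpha / 2, "up",
                  np.where(p_low < alpha / 2, "down", "none"))

for value, label in zip(y, labels):
    print(value, label)
```

Values far out in either tail are flagged, while values near the centre of the fitted distribution are not.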
The full list of distributions is listed here: https://erdogant.github.io/distfit/pages/html/Parametric.html
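# Generate 10,000 samples from a binomial distribution with n=8 trials and success probability p=0.5
from scipy.stats import binom
n = 8
p = 0.5
X = binom(n, p).rvs(10000)
print(X)
# Fit a discrete distribution to the samples
from distfit import distfit
dfit = distfit(method='discrete')
dfit.fit_transform(X)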
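To show how a binomial fit on discrete data might be derived, here is a minimal sketch that uses numpy and scipy only. It is an assumption about one possible approach, not a description of what distfit does internally. The helper `binomial_rss`, the method-of-moments estimate of p, and the range of candidate n values are all hypothetical choices made for this illustration.

```python
import numpy as np
from scipy.stats import binom

# Same setup as the example above, with a fixed seed for reproducibility.
X = binom(8, 0.5).rvs(10000, random_state=0)

def binomial_rss(X, n):
    """RSS between the empirical PMF of X and a Binomial(n, p) PMF.

    p is estimated by the method of moments: mean(X) / n.
    """
    p = X.mean() / n
    k = np.arange(n + 1)
    empirical = np.bincount(X, minlength=n + 1) / X.size
    theoretical = binom.pmf(k, n, p)
    return float(np.sum((empirical - theoretical) ** 2)), p

# n cannot be smaller than the largest observed value; scan a few candidates above it.
n_min = int(X.max())
scores = {n: binomial_rss(X, n) for n in range(n_min, n_min + 10)}

# Keep the candidate n with the lowest RSS, together with its estimated p.
best_n = min(scores, key=lambda n: scores[n][0])
best_rss, best_p = scores[best_n]
print(best_n, round(best_p, 3))
```

The recovered n and p can be compared with the values used to generate the samples as a sanity check on the fit.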
Contributors
Setting up and maintaining distfit has been possible thanks to users and contributors.
Citation
Please cite distfit in your publications if this is useful for your research. See the column on the right for citation information.
Maintainer
- Erdogan Taskesen, github: erdogant
- Contributions are welcome.
- If you wish to buy me a coffee for this work, it is much appreciated :)