
Security News
Axios Supply Chain Attack Reaches OpenAI macOS Signing Pipeline, Forces Certificate Rotation
OpenAI rotated macOS signing certificates after a malicious Axios package reached its CI pipeline in a broader software supply chain attack.
Mixture-Models
Advanced tools
A one-stop Python library for fitting a wide range of mixture models such as Mixture of Gaussians, Students'-T, Factor-Analyzers, Parsimonious Gaussians, MCLUST, etc.
While there are several packages in R and Python which support various kinds of mixture-models, each one of them has their own API and syntax. Further, in almost all those libraries, the inference proceeds via Expectation-Maximization (a Quasi first order method) which makes them unsuitable for high-dimensional data.
This library attempts to provide a seamless and unified interface for fitting a wide-range of mixture models. Unlike many existing packages that rely on Expectation-Maximization for inference, our approach leverages Automatic Differentiation tools and gradient-based optimization which makes it well equipped to handle high-dimensional data and second order optimization routines.
Installation is straightforward:
pip install Mixture-Models
The estimation procedure consists of 3 simple steps:
### Simulate some dummy data using the built-in function make_pinwheel
data = make_pinwheel(radial_std=0.3, tangential_std=0.05, num_classes=3,
num_per_class=100, rate=0.4,rs=npr.RandomState(0))
### Plot the three clusters
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(data[:, 0], data[:, 1], 'k.')
plt.show()
### STEP 1 - Choose a mixture model to fit on your data
my_model = GMM(data)
### STEP 2 - Initialize your model with some parameters
init_params = my_model.init_params(num_components = 3,scale = 0.5)
### STEP 3 - Learn the parameters using some optimization routine
params_store = my_model.fit(init_params,"Newton-CG")
Once the model is trained on the data (which is a numpy matrix of shape (num_datapoints, num_dim)),
post-hoc analysis can be performed:
for params in params_store:
print("likelihood",my_model.likelihood(params))
print("aic,bic",my_model.aic(params),my_model.bic(params))
np.array(my_model.labels(data,params_store[-1])) ## final predicted labels
Example notebooks are available on the project Github repo.
There are more than 30+ different mixture-models, spread across five model families, currently supported by the library. Here is a brief overview of the different model families supported:
GMM: Standard Gaussian mixture model
GMM_Constrainted: GMM with common covariance across componentsMclust: MCLUST family of constrained GMMsMFA: Mixture-of-factor analyzers
PGMM: Parsimonious GMM extension with constraintsTMM: Mixture of t-distributionsThe project repo 'Examples' folder includes more detailed illustrations for all these models, as well as a README.md for advanced users who want to fit custom mixture models,
or tinker with the settings for the above procedure.
Currently, four main gradient based optimizers are available:
"grad_descent": Stochastic Gradient Descent (SGD) with momentum"rms_prop": Root-mean-squared propagation (RMS-Prop)"adam": Adaptive moments (ADAM)"Newton-CG": Newton-Conjugate Gradient (Newton CG)The details about each optimizer and its optional input parameters are given in the PDF in the 'Examples' folder.
The output of fit method is the set of all points in the parameter space that the optimizer has traversed during the optimization i.e. list of parameters with the final entry in the list being the final fitted solution.
We have a detailed notebook Optimizers_illustration.ipynb in the 'Examples' folder on Github.
We welcome contributions to our library. Our code base is highly modularized, making it easy for new contributors to extend its capabilities and add support for additional models. If you are interested in contributing to the library, check out the contribution guide.
If you're unsure where to start, check out our open issues for inspiration on the kind of problems you can work on. Alternately, you could also open a new issue so we can discuss the best strategy for integrating your work.
If you use this package, please consider citing our research as
@article{kasa2020model, title={Model-based Clustering using Automatic Differentiation: Confronting Misspecification and High-Dimensional Data}, author={Kasa, Siva Rajesh and Rajan, Vaibhav}, journal={arXiv preprint arXiv:2007.12786}, year={2020} }
FAQs
A Python library for fitting mixture models using gradient based inference
We found that Mixture-Models demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.
Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Security News
OpenAI rotated macOS signing certificates after a malicious Axios package reached its CI pipeline in a broader software supply chain attack.

Security News
Open source is under attack because of how much value it creates. It has been the foundation of every major software innovation for the last three decades. This is not the time to walk away from it.

Security News
Socket CEO Feross Aboukhadijeh breaks down how North Korea hijacked Axios and what it means for the future of software supply chain security.