bindata 

A Python port of the R library of the same name,
bindata,
based on the paper
"Generation of correlated artificial binary data"
by Friedrich Leisch, Andreas Weingessel, and Kurt Hornik.
The library fully replicates the existing R package
and provides the following functions:
bincorr2commonprob
check_commonprob (check.commonprob in R)
commonprob2sigma
condprob
ra2ba
rmvbin
simul_commonprob (simul.commonprob in R)
Precomputed SimulVals (obtained via Monte Carlo simulation) are also available.
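For orientation, the approach described in the paper generates binary variates by thresholding correlated normal variates. The following is a minimal sketch of that idea for two variables, using only the standard library. The helper `threshold_normal_pair` and its parameters are hypothetical and not part of bindata. Note that `normal_corr` is the correlation of the latent normals, which in general differs from the correlation of the resulting binary variables.

```python
import math
import random
from statistics import NormalDist


def threshold_normal_pair(n, margprob, normal_corr, seed=0):
    """Draw n pairs of binary variates by thresholding a bivariate normal.

    Hypothetical helper for illustration; not bindata's implementation.
    `normal_corr` is the correlation of the latent normals, NOT of the
    binary output.
    """
    rng = random.Random(seed)
    # Thresholds chosen so that P(Z_i <= t_i) = margprob[i].
    t0, t1 = (NormalDist().inv_cdf(p) for p in margprob)
    scale = math.sqrt(1.0 - normal_corr ** 2)
    sample = []
    for _ in range(n):
        z0 = rng.gauss(0.0, 1.0)
        # Build z1 so that corr(z0, z1) equals normal_corr.
        z1 = normal_corr * z0 + scale * rng.gauss(0.0, 1.0)
        sample.append((int(z0 <= t0), int(z1 <= t1)))
    return sample


pairs = threshold_normal_pair(50_000, margprob=(0.3, 0.9), normal_corr=0.5)
means = [sum(col) / len(pairs) for col in zip(*pairs)]
print(means)  # sample marginals, close to the requested 0.3 and 0.9
```

The sketch takes the latent normal correlation as given. Finding the normal correlation that yields a desired binary correlation is the harder step.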
Installation
bindata can be installed with pip:
pip install bindata
How to
Generate uncorrelated variates
import bindata as bnd
margprob = [0.3, 0.9]
X = bnd.rmvbin(N=100_000, margprob=margprob)
Now let's verify the sample marginals and correlations:
import numpy as np
print(X.mean(0))
print(np.corrcoef(X, rowvar=False))
[0.30102 0.9009 ]
[[ 1.         -0.00101357]
 [-0.00101357  1.        ]]
Generate correlated variates
From a correlation matrix
corr = np.array([[1., -0.25, -0.0625],
                 [-0.25, 1., 0.25],
                 [-0.0625, 0.25, 1.]])
commonprob = bnd.bincorr2commonprob(margprob=[0.2, 0.5, 0.8],
                                    bincorr=corr)
X = bnd.rmvbin(margprob=np.diag(commonprob),
               commonprob=commonprob, N=100_000)
print(X.mean(0))
print(np.corrcoef(X, rowvar=False))
[0.1996 0.50148 0.80076]
[[ 1.         -0.25552    -0.05713501]
 [-0.25552     1.          0.24412401]
 [-0.05713501  0.24412401  1.        ]]
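The conversion from correlations to joint probabilities follows from the definition of correlation for binary variables: P(X_i = 1, X_j = 1) = r_ij * sqrt(p_i (1 - p_i) p_j (1 - p_j)) + p_i p_j. The sketch below applies that formula in plain Python. The helper `corr_to_commonprob` is hypothetical and only illustrates the relationship; it is not how bindata necessarily implements `bincorr2commonprob`.

```python
import math


def corr_to_commonprob(margprob, bincorr):
    """Joint probabilities P(X_i = 1, X_j = 1) from marginals and correlations.

    Hypothetical helper for illustration; not bindata's implementation.
    """
    n = len(margprob)
    # The standard deviation of a Bernoulli(p) variable is sqrt(p * (1 - p)).
    sd = [math.sqrt(p * (1.0 - p)) for p in margprob]
    return [
        [bincorr[i][j] * sd[i] * sd[j] + margprob[i] * margprob[j]
         for j in range(n)]
        for i in range(n)
    ]


corr = [[1.0, -0.25, -0.0625],
        [-0.25, 1.0, 0.25],
        [-0.0625, 0.25, 1.0]]
commonprob = corr_to_commonprob([0.2, 0.5, 0.8], corr)
for row in commonprob:
    print([round(p, 4) for p in row])
```

For these inputs the off-diagonal joint probabilities come out at approximately 0.05, 0.15, and 0.45, and the diagonal reproduces the marginals.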
From a joint probability matrix
commonprob = [[1/2, 1/5, 1/6],
              [1/5, 1/2, 1/6],
              [1/6, 1/6, 1/2]]
X = bnd.rmvbin(N=100_000, commonprob=commonprob)
print(X.mean(0))
print(np.corrcoef(X, rowvar=False))
[0.50076 0.50289 0.49718]
[[ 1.         -0.20195239 -0.33343712]
 [-0.20195239  1.         -0.34203855]
 [-0.33343712 -0.34203855  1.        ]]
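As a sanity check on the sample correlations, the correlations implied by a joint probability matrix can be computed directly: the diagonal holds the marginals p_i, and r_ij = (p_ij - p_i p_j) / sqrt(p_i (1 - p_i) p_j (1 - p_j)). The helper `commonprob_to_corr` below is a hypothetical illustration in plain Python, not a bindata function.

```python
import math


def commonprob_to_corr(commonprob):
    """Correlation matrix implied by a joint probability matrix.

    Hypothetical helper for illustration; not part of bindata.
    """
    n = len(commonprob)
    # The diagonal entries P(X_i = 1, X_i = 1) are the marginals.
    margprob = [commonprob[i][i] for i in range(n)]
    sd = [math.sqrt(p * (1.0 - p)) for p in margprob]
    return [
        [(commonprob[i][j] - margprob[i] * margprob[j]) / (sd[i] * sd[j])
         for j in range(n)]
        for i in range(n)
    ]


commonprob = [[1/2, 1/5, 1/6],
              [1/5, 1/2, 1/6],
              [1/6, 1/6, 1/2]]
implied = commonprob_to_corr(commonprob)
for row in implied:
    print([round(r, 4) for r in row])
```

For the matrix above the implied correlations are -0.2 between the first two variables and about -0.3333 for the pairs involving the third, which is in line with the sample estimates.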
For more comprehensive information, please consult
the documentation.
Acknowledgements
Author
Luca Mingarelli, 2022