SurPyval - Survival Analysis in Python
Yet another Python survival analysis tool.
This is another pure-Python survival analysis tool, so why was it needed? The intent of this package is to mimic the scipy API as closely as possible, with a simple .fit() method for any type of distribution (parametric or non-parametric); other survival analysis packages don't completely mimic that API. Further, at the time of writing, there is no other package that can take an arbitrary combination of observed, censored, and truncated data. Finally, SurPyval is unique in that it can be used with multiple parametric estimation methods. This allows an analyst to estimate the parameters of a distribution with another method if one method fails. The parametric methods available are Maximum Likelihood Estimation (MLE), Probability Plotting (MPP), Mean Square Error (MSE), Method of Moments (MOM), and Maximum Product of Spacing (MPS). For each type of estimator, SurPyval can take the following types of input data:
Method | Para/Non-Para | Observed | Censored | Truncated |
---|---|---|---|---|
MLE | Parametric | Yes | Yes | Yes |
MPP | Parametric | Yes | Yes | Limited |
MSE | Parametric | Yes | Yes | Limited |
MOM | Parametric | Yes | No | No |
MPS | Parametric | Yes | Yes | No |
Kaplan-Meier | Non-Parametric | Yes | Right only | Left only |
Nelson-Aalen | Non-Parametric | Yes | Right only | Left only |
Fleming-Harrington | Non-Parametric | Yes | Right only | Left only |
Turnbull | Non-Parametric | Yes | Yes | Yes |
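To make the difference between the MLE and MOM rows above concrete, the following is a minimal pure-Python sketch, not SurPyval's implementation, of two estimators for the rate of an exponential distribution. The function names, and the convention that `c` is 0 for an observed failure and 1 for a right-censored value, are assumptions made for this illustration only.

```python
def exponential_mle(x, c=None):
    """Maximum likelihood estimate of an exponential rate.

    x: observed or censoring times.
    c: flags, assumed here to be 0 = observed failure, 1 = right censored.
    """
    c = c if c is not None else [0] * len(x)
    if any(ci not in (0, 1) for ci in c):
        raise ValueError("this sketch handles observed and right-censored data only")
    failures = sum(1 for ci in c if ci == 0)
    if failures == 0:
        raise ValueError("at least one observed failure is required")
    # Every item contributes its time on test; only failures count as events.
    return failures / sum(x)


def exponential_mom(x, c=None):
    """Method of moments estimate: match the sample mean to 1 / rate."""
    if c is not None and any(ci != 0 for ci in c):
        raise ValueError("method of moments needs fully observed data")
    return len(x) / sum(x)


times = [2.0, 4.0, 6.0, 8.0]
flags = [0, 0, 1, 1]  # last two items were still running when the test stopped

rate_mle = exponential_mle(times, flags)
rate_mom_all_observed = exponential_mom(times)
```

With fully observed data the two estimators agree. Once some values are right censored, only the likelihood-based estimate can use them, which is why the table lists MOM as observed-only.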
SurPyval also offers many distributions, and because of its flexible implementation, adding new distributions is easy. Further, the power of SurPyval lies in its robust parameter estimation. As a result, distributions that are supported on the half real line can be offset to make a three- or four-parameter version. The currently available distributions are:
Distribution | Offsetable |
---|---|
Weibull | Yes |
Normal | No |
LogNormal | Yes |
Gamma | Yes |
Beta | No |
Uniform | No |
Exponential | Yes |
Exponentiated Weibull | Yes |
Gumbel | No |
Logistic | No |
LogLogistic | Yes |
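As an illustration of what an offset means, here is a minimal sketch of the survival function of a three-parameter Weibull distribution. The parameter names `alpha` (scale), `beta` (shape), and `gamma` (offset) are chosen for this example and are not a statement about SurPyval's internals.

```python
import math


def weibull_sf(x, alpha, beta, gamma=0.0):
    """Survival function of a Weibull distribution shifted by gamma.

    With gamma = 0 this is the usual two-parameter Weibull on x > 0.
    With gamma != 0 the support starts at gamma instead of at zero.
    """
    if x <= gamma:
        # No failures can occur before the offset.
        return 1.0
    return math.exp(-(((x - gamma) / alpha) ** beta))


# Two-parameter and offset versions evaluated one scale length past the start.
sf_no_offset = weibull_sf(10.0, alpha=10.0, beta=2.0)
sf_offset = weibull_sf(15.0, alpha=10.0, beta=2.0, gamma=5.0)
```

Both calls evaluate the distribution at the same distance from the start of its support, so they return the same survival probability. The offset only moves where the distribution begins.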
This project spawned from a reliability engineering project. Reliability engineers have a long history of estimating parameters from a probability plot, and SurPyval continues this tradition by ensuring that any parametric distribution can have its estimate plotted on a probability plot. These visualisations enable an analyst to get a sense of the goodness of fit of the parametric distribution against the non-parametric estimate.
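The idea behind a Weibull probability plot can be sketched in a few lines of plain Python. For a Weibull distribution, ln(-ln(1 - F(x))) is a straight line in ln(x), with slope equal to the shape parameter. The sketch below assumes fully observed data and uses Benard's approximation for the plotting positions; both are simplifying assumptions for illustration, and the helper names are hypothetical rather than SurPyval functions.

```python
import math


def weibull_plot_points(x):
    """Return (ln x, ln(-ln(1 - F))) pairs for fully observed failure times."""
    xs = sorted(x)
    n = len(xs)
    points = []
    for i, xi in enumerate(xs, start=1):
        # Benard's approximation to the median rank of the i-th failure.
        F = (i - 0.3) / (n + 0.4)
        points.append((math.log(xi), math.log(-math.log(1.0 - F))))
    return points


def fit_line(points):
    """Ordinary least squares fit of v = slope * u + intercept."""
    n = len(points)
    mean_u = sum(u for u, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    num = sum((u - mean_u) * (v - mean_v) for u, v in points)
    den = sum((u - mean_u) ** 2 for u, _ in points)
    slope = num / den
    return slope, mean_v - slope * mean_u


points = weibull_plot_points([16, 34, 53, 75, 93, 120])
slope, intercept = fit_line(points)

# On Weibull paper the slope is the shape and the intercept is -shape * ln(scale).
shape = slope
scale = math.exp(-intercept / slope)
```

If the transformed points fall close to the fitted line, the Weibull distribution is a plausible description of the data; systematic curvature suggests a different distribution or an offset.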
Install and Quick Intro
SurPyval can be installed from PyPI using pip:
pip install surpyval
If you're familiar with survival analysis and Weibull plotting, the following is a quick start:
from surpyval import Weibull
from surpyval.datasets import BoforsSteel
data = BoforsSteel.df
x = data['x']
n = data['n']
model = Weibull.fit(x=x, n=n, offset=True)
model.plot();
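In the quick start, `x` and `n` are passed as separate arguments. Assuming `n` holds the number of times each value in `x` occurs (an assumption about the data layout), raw observations could be collapsed into that form along these lines. `to_counts` is a hypothetical helper written for this example, not part of SurPyval.

```python
from collections import Counter


def to_counts(observations):
    """Collapse repeated observations into unique values and their counts."""
    counts = Counter(observations)
    x = sorted(counts)
    n = [counts[value] for value in x]
    return x, n


# Five raw measurements, three of which share the same value.
x, n = to_counts([3, 1, 3, 2, 3])
```

The resulting `x` and `n` lists have the same length and line up element by element, which is the shape the quick start passes to `fit`.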
Documentation
SurPyval is well documented in the main documentation, which continues to improve.
Contact
Email derryn if you would like to request a feature or see how SurPyval could be used for you.