phitter


Find the best probability distribution for your dataset and simulate processes and queues

  • 1.0.0
  • PyPI

Phitter analyzes datasets and determines the analytical probability distributions that best represent them. It evaluates over 80 probability distributions, both continuous and discrete, using 3 goodness-of-fit tests, and provides interactive visualizations. For each selected probability distribution, a standard modeling guide is provided along with spreadsheets that detail the methodology for using the chosen distribution in data science, operations research, and artificial intelligence.

In addition, Phitter offers the capability to perform process simulations, allowing users to graph and observe minimum times for specific observations. It also supports queue simulations with flexibility to configure various parameters, such as the number of servers, maximum population size, system capacity, and different queue disciplines, including First-In-First-Out (FIFO), Last-In-First-Out (LIFO), and priority-based service (PBS).

This repository contains the implementation of the Python library and the kernel of Phitter Web.

Installation

Requirements

python: >=3.9

PyPI

pip install phitter

Usage

1. Fit Notebook Tutorials

| Tutorial | Notebooks |
| --- | --- |
| Fit Continuous | Open In Colab |
| Fit Discrete | Open In Colab |
| Fit Accelerate [Sample>100K] | Open In Colab |
| Fit Specific Distribution | Open In Colab |
| Working Distribution | Open In Colab |

2. Simulation Notebook Tutorials

Pending

Documentation

Documentation Fit Module

General Fit

import phitter

## Define your dataset
data: list[int | float] = [...]

## Make a continuous fit using Phitter
phi = phitter.PHITTER(data)
phi.fit()

Full continuous implementation

import phitter

## Define your dataset
data: list[int | float] = [...]

## Make a continuous fit using Phitter
phi = phitter.PHITTER(
    data=data,
    fit_type="continuous",
    num_bins=15,
    confidence_level=0.95,
    minimum_sse=1e-2,
    distributions_to_fit=["beta", "normal", "fatigue_life", "triangular"],
)
phi.fit(n_workers=6)
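
The `num_bins` and `minimum_sse` arguments suggest that each candidate is scored by a sum of squared errors (SSE) between a histogram of the data and the fitted density. The sketch below illustrates that idea with the standard library only. It is an assumption about the general technique, not Phitter's implementation, and `histogram_density` and `sse_against_normal` are hypothetical helper names.

```python
import random
from statistics import NormalDist, fmean, stdev


def histogram_density(data, num_bins):
    """Return (bin_centers, densities) for an equal-width histogram."""
    low, high = min(data), max(data)
    width = (high - low) / num_bins
    counts = [0] * num_bins
    for x in data:
        # Clamp the maximum value into the last bin.
        index = min(int((x - low) / width), num_bins - 1)
        counts[index] += 1
    centers = [low + (i + 0.5) * width for i in range(num_bins)]
    densities = [count / (len(data) * width) for count in counts]
    return centers, densities


def sse_against_normal(data, num_bins=15):
    """Fit a normal by sample mean and stdev, then score it against the histogram."""
    fitted = NormalDist(fmean(data), stdev(data))
    centers, densities = histogram_density(data, num_bins)
    return sum((d - fitted.pdf(c)) ** 2 for c, d in zip(centers, densities))


rng = random.Random(0)
normal_data = [rng.gauss(10, 2) for _ in range(5000)]
skewed_data = [rng.expovariate(0.5) for _ in range(5000)]

sse_normal = sse_against_normal(normal_data)
sse_skewed = sse_against_normal(skewed_data)
print(sse_normal, sse_skewed)
```

A normal candidate scores a much lower SSE on the normal sample than on the skewed one, which is the kind of ranking a threshold such as `minimum_sse` could act on.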

Full discrete implementation

import phitter

## Define your dataset
data: list[int | float] = [...]

## Make a discrete fit using Phitter
phi = phitter.PHITTER(
    data=data,
    fit_type="discrete",
    confidence_level=0.95,
    minimum_sse=1e-2,
    distributions_to_fit=["binomial", "geometric"],
)
phi.fit(n_workers=2)

Phitter: properties and methods

import phitter

## Define your dataset
data: list[int | float] = [...]

## Make a fit using Phitter
phi = phitter.PHITTER(data)
phi.fit(n_workers=2)

## Global methods and properties
phi.summarize(k: int) -> pandas.DataFrame
phi.summarize_info(k: int) -> pandas.DataFrame
phi.best_distribution -> dict
phi.sorted_distributions_sse -> dict
phi.not_rejected_distributions -> dict
phi.df_sorted_distributions_sse -> pandas.DataFrame
phi.df_not_rejected_distributions -> pandas.DataFrame

## Specific distribution methods and properties
phi.get_parameters(id_distribution: str) -> dict
phi.get_test_chi_square(id_distribution: str) -> dict
phi.get_test_kolmogorov_smirnov(id_distribution: str) -> dict
phi.get_test_anderson_darling(id_distribution: str) -> dict
phi.get_sse(id_distribution: str) -> float
phi.get_n_test_passed(id_distribution: str) -> int
phi.get_n_test_null(id_distribution: str) -> int

Histogram Plot

import phitter
data: list[int | float] = [...]
phi = phitter.PHITTER(data)
phi.fit()

phi.plot_histogram()

Histogram PDF Distributions Plot

import phitter
data: list[int | float] = [...]
phi = phitter.PHITTER(data)
phi.fit()

phi.plot_histogram_distributions()

Histogram PDF Distribution Plot

import phitter
data: list[int | float] = [...]
phi = phitter.PHITTER(data)
phi.fit()

phi.plot_distribution("beta")

ECDF Plot

import phitter
data: list[int | float] = [...]
phi = phitter.PHITTER(data)
phi.fit()

phi.plot_ecdf()

ECDF Distribution Plot

import phitter
data: list[int | float] = [...]
phi = phitter.PHITTER(data)
phi.fit()

phi.plot_ecdf_distribution("beta")

QQ Plot

import phitter
data: list[int | float] = [...]
phi = phitter.PHITTER(data)
phi.fit()

phi.qq_plot("beta")

QQ - Regression Plot

import phitter
data: list[int | float] = [...]
phi = phitter.PHITTER(data)
phi.fit()

phi.qq_plot_regression("beta")
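
A QQ plot pairs the theoretical quantiles of a fitted distribution with the sorted observations, and a regression line through those points should have a slope near 1 and an intercept near 0 when the fit is good. The sketch below shows that computation with the standard library. The helpers `qq_points` and `least_squares_line` are hypothetical, and the plotting positions `(i + 0.5) / n` are one common choice, not necessarily the one Phitter uses.

```python
import random
from statistics import NormalDist, fmean, stdev


def qq_points(data, distribution):
    """Pair theoretical quantiles of `distribution` with the sorted data."""
    ordered = sorted(data)
    n = len(ordered)
    # Plotting positions (i + 0.5) / n avoid the undefined quantiles at 0 and 1.
    theoretical = [distribution.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


def least_squares_line(points):
    """Return (slope, intercept) of the ordinary least-squares line."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_mean, y_mean = fmean(xs), fmean(ys)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in points)
    denominator = sum((x - x_mean) ** 2 for x in xs)
    slope = numerator / denominator
    return slope, y_mean - slope * x_mean


rng = random.Random(2)
data = [rng.gauss(100, 15) for _ in range(2000)]
fitted = NormalDist(fmean(data), stdev(data))

slope, intercept = least_squares_line(qq_points(data, fitted))
print(round(slope, 3), round(intercept, 3))
```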

Working with distributions: Methods and properties

import phitter

distribution = phitter.continuous.BETA({"alpha": 5, "beta": 3, "A": 200, "B": 1000})

## cdf, pdf, and ppf accept a float or numpy.ndarray; discrete distributions provide pmf instead of pdf. Parameter notation is given in the description of each distribution
distribution.cdf(752) # -> 0.6242831129533498
distribution.pdf(388) # -> 0.0002342575686629883
distribution.ppf(0.623) # -> 751.5512889417921
distribution.sample(2) # -> [550.800114   514.85410326]

## STATS
distribution.mean # -> 700.0
distribution.variance # -> 16666.666666666668
distribution.standard_deviation # -> 129.09944487358058
distribution.skewness # -> -0.3098386676965934
distribution.kurtosis # -> 2.5854545454545454
distribution.median # -> 708.707130841534
distribution.mode # -> 733.3333333333333
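
The statistics listed above can be cross-checked against the standard closed-form moments of a beta distribution rescaled to an interval. The sketch below assumes that `A` and `B` are the lower and upper bounds of the support, which is consistent with the reported mean of 700. `beta_4p_moments` is a hypothetical helper, not part of Phitter.

```python
import math


def beta_4p_moments(alpha, beta, A, B):
    """Closed-form moments of a beta distribution rescaled to [A, B]."""
    total = alpha + beta
    scale = B - A
    skewness = (
        2 * (beta - alpha) * math.sqrt(total + 1)
        / ((total + 2) * math.sqrt(alpha * beta))
    )
    excess_kurtosis = (
        6 * ((alpha - beta) ** 2 * (total + 1) - alpha * beta * (total + 2))
        / (alpha * beta * (total + 2) * (total + 3))
    )
    return {
        "mean": A + scale * alpha / total,
        "variance": scale**2 * alpha * beta / (total**2 * (total + 1)),
        "skewness": skewness,
        # Kurtosis reported without subtracting 3, matching the values above.
        "kurtosis": 3 + excess_kurtosis,
        # The mode formula is valid only when alpha > 1 and beta > 1.
        "mode": A + scale * (alpha - 1) / (total - 2),
    }


moments = beta_4p_moments(alpha=5, beta=3, A=200, B=1000)
for name, value in moments.items():
    print(name, value)
```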

Continuous Distributions

1. PDF File Documentation Continuous Distributions
2. Resources Continuous Distributions
| Distribution | Phitter Playground | Excel File | Google Sheets Files |
| --- | --- | --- | --- |
| alpha | ▶️phitter:alpha | 📊alpha.xlsx | 🌐gs:alpha |
| arcsine | ▶️phitter:arcsine | 📊arcsine.xlsx | 🌐gs:arcsine |
| argus | ▶️phitter:argus | 📊argus.xlsx | 🌐gs:argus |
| beta | ▶️phitter:beta | 📊beta.xlsx | 🌐gs:beta |
| beta_prime | ▶️phitter:beta_prime | 📊beta_prime.xlsx | 🌐gs:beta_prime |
| beta_prime_4p | ▶️phitter:beta_prime_4p | 📊beta_prime_4p.xlsx | 🌐gs:beta_prime_4p |
| bradford | ▶️phitter:bradford | 📊bradford.xlsx | 🌐gs:bradford |
| burr | ▶️phitter:burr | 📊burr.xlsx | 🌐gs:burr |
| burr_4p | ▶️phitter:burr_4p | 📊burr_4p.xlsx | 🌐gs:burr_4p |
| cauchy | ▶️phitter:cauchy | 📊cauchy.xlsx | 🌐gs:cauchy |
| chi_square | ▶️phitter:chi_square | 📊chi_square.xlsx | 🌐gs:chi_square |
| chi_square_3p | ▶️phitter:chi_square_3p | 📊chi_square_3p.xlsx | 🌐gs:chi_square_3p |
| dagum | ▶️phitter:dagum | 📊dagum.xlsx | 🌐gs:dagum |
| dagum_4p | ▶️phitter:dagum_4p | 📊dagum_4p.xlsx | 🌐gs:dagum_4p |
| erlang | ▶️phitter:erlang | 📊erlang.xlsx | 🌐gs:erlang |
| erlang_3p | ▶️phitter:erlang_3p | 📊erlang_3p.xlsx | 🌐gs:erlang_3p |
| error_function | ▶️phitter:error_function | 📊error_function.xlsx | 🌐gs:error_function |
| exponential | ▶️phitter:exponential | 📊exponential.xlsx | 🌐gs:exponential |
| exponential_2p | ▶️phitter:exponential_2p | 📊exponential_2p.xlsx | 🌐gs:exponential_2p |
| f | ▶️phitter:f | 📊f.xlsx | 🌐gs:f |
| f_4p | ▶️phitter:f_4p | 📊f_4p.xlsx | 🌐gs:f_4p |
| fatigue_life | ▶️phitter:fatigue_life | 📊fatigue_life.xlsx | 🌐gs:fatigue_life |
| folded_normal | ▶️phitter:folded_normal | 📊folded_normal.xlsx | 🌐gs:folded_normal |
| frechet | ▶️phitter:frechet | 📊frechet.xlsx | 🌐gs:frechet |
| gamma | ▶️phitter:gamma | 📊gamma.xlsx | 🌐gs:gamma |
| gamma_3p | ▶️phitter:gamma_3p | 📊gamma_3p.xlsx | 🌐gs:gamma_3p |
| generalized_extreme_value | ▶️phitter:gen_extreme_value | 📊gen_extreme_value.xlsx | 🌐gs:gen_extreme_value |
| generalized_gamma | ▶️phitter:gen_gamma | 📊gen_gamma.xlsx | 🌐gs:gen_gamma |
| generalized_gamma_4p | ▶️phitter:gen_gamma_4p | 📊gen_gamma_4p.xlsx | 🌐gs:gen_gamma_4p |
| generalized_logistic | ▶️phitter:gen_logistic | 📊gen_logistic.xlsx | 🌐gs:gen_logistic |
| generalized_normal | ▶️phitter:gen_normal | 📊gen_normal.xlsx | 🌐gs:gen_normal |
| generalized_pareto | ▶️phitter:gen_pareto | 📊gen_pareto.xlsx | 🌐gs:gen_pareto |
| gibrat | ▶️phitter:gibrat | 📊gibrat.xlsx | 🌐gs:gibrat |
| gumbel_left | ▶️phitter:gumbel_left | 📊gumbel_left.xlsx | 🌐gs:gumbel_left |
| gumbel_right | ▶️phitter:gumbel_right | 📊gumbel_right.xlsx | 🌐gs:gumbel_right |
| half_normal | ▶️phitter:half_normal | 📊half_normal.xlsx | 🌐gs:half_normal |
| hyperbolic_secant | ▶️phitter:hyperbolic_secant | 📊hyperbolic_secant.xlsx | 🌐gs:hyperbolic_secant |
| inverse_gamma | ▶️phitter:inverse_gamma | 📊inverse_gamma.xlsx | 🌐gs:inverse_gamma |
| inverse_gamma_3p | ▶️phitter:inverse_gamma_3p | 📊inverse_gamma_3p.xlsx | 🌐gs:inverse_gamma_3p |
| inverse_gaussian | ▶️phitter:inverse_gaussian | 📊inverse_gaussian.xlsx | 🌐gs:inverse_gaussian |
| inverse_gaussian_3p | ▶️phitter:inverse_gaussian_3p | 📊inverse_gaussian_3p.xlsx | 🌐gs:inverse_gaussian_3p |
| johnson_sb | ▶️phitter:johnson_sb | 📊johnson_sb.xlsx | 🌐gs:johnson_sb |
| johnson_su | ▶️phitter:johnson_su | 📊johnson_su.xlsx | 🌐gs:johnson_su |
| kumaraswamy | ▶️phitter:kumaraswamy | 📊kumaraswamy.xlsx | 🌐gs:kumaraswamy |
| laplace | ▶️phitter:laplace | 📊laplace.xlsx | 🌐gs:laplace |
| levy | ▶️phitter:levy | 📊levy.xlsx | 🌐gs:levy |
| loggamma | ▶️phitter:loggamma | 📊loggamma.xlsx | 🌐gs:loggamma |
| logistic | ▶️phitter:logistic | 📊logistic.xlsx | 🌐gs:logistic |
| loglogistic | ▶️phitter:loglogistic | 📊loglogistic.xlsx | 🌐gs:loglogistic |
| loglogistic_3p | ▶️phitter:loglogistic_3p | 📊loglogistic_3p.xlsx | 🌐gs:loglogistic_3p |
| lognormal | ▶️phitter:lognormal | 📊lognormal.xlsx | 🌐gs:lognormal |
| maxwell | ▶️phitter:maxwell | 📊maxwell.xlsx | 🌐gs:maxwell |
| moyal | ▶️phitter:moyal | 📊moyal.xlsx | 🌐gs:moyal |
| nakagami | ▶️phitter:nakagami | 📊nakagami.xlsx | 🌐gs:nakagami |
| non_central_chi_square | ▶️phitter:non_central_chi_square | 📊non_central_chi_square.xlsx | 🌐gs:non_central_chi_square |
| non_central_f | ▶️phitter:non_central_f | 📊non_central_f.xlsx | 🌐gs:non_central_f |
| non_central_t_student | ▶️phitter:non_central_t_student | 📊non_central_t_student.xlsx | 🌐gs:non_central_t_student |
| normal | ▶️phitter:normal | 📊normal.xlsx | 🌐gs:normal |
| pareto_first_kind | ▶️phitter:pareto_first_kind | 📊pareto_first_kind.xlsx | 🌐gs:pareto_first_kind |
| pareto_second_kind | ▶️phitter:pareto_second_kind | 📊pareto_second_kind.xlsx | 🌐gs:pareto_second_kind |
| pert | ▶️phitter:pert | 📊pert.xlsx | 🌐gs:pert |
| power_function | ▶️phitter:power_function | 📊power_function.xlsx | 🌐gs:power_function |
| rayleigh | ▶️phitter:rayleigh | 📊rayleigh.xlsx | 🌐gs:rayleigh |
| reciprocal | ▶️phitter:reciprocal | 📊reciprocal.xlsx | 🌐gs:reciprocal |
| rice | ▶️phitter:rice | 📊rice.xlsx | 🌐gs:rice |
| semicircular | ▶️phitter:semicircular | 📊semicircular.xlsx | 🌐gs:semicircular |
| t_student | ▶️phitter:t_student | 📊t_student.xlsx | 🌐gs:t_student |
| t_student_3p | ▶️phitter:t_student_3p | 📊t_student_3p.xlsx | 🌐gs:t_student_3p |
| trapezoidal | ▶️phitter:trapezoidal | 📊trapezoidal.xlsx | 🌐gs:trapezoidal |
| triangular | ▶️phitter:triangular | 📊triangular.xlsx | 🌐gs:triangular |
| uniform | ▶️phitter:uniform | 📊uniform.xlsx | 🌐gs:uniform |
| weibull | ▶️phitter:weibull | 📊weibull.xlsx | 🌐gs:weibull |
| weibull_3p | ▶️phitter:weibull_3p | 📊weibull_3p.xlsx | 🌐gs:weibull_3p |

Discrete Distributions

1. PDF File Documentation Discrete Distributions
2. Resources Discrete Distributions
| Distribution | Phitter Playground | Excel File | Google Sheets Files |
| --- | --- | --- | --- |
| bernoulli | ▶️phitter:bernoulli | 📊bernoulli.xlsx | 🌐gs:bernoulli |
| binomial | ▶️phitter:binomial | 📊binomial.xlsx | 🌐gs:binomial |
| geometric | ▶️phitter:geometric | 📊geometric.xlsx | 🌐gs:geometric |
| hypergeometric | ▶️phitter:hypergeometric | 📊hypergeometric.xlsx | 🌐gs:hypergeometric |
| logarithmic | ▶️phitter:logarithmic | 📊logarithmic.xlsx | 🌐gs:logarithmic |
| negative_binomial | ▶️phitter:negative_binomial | 📊negative_binomial.xlsx | 🌐gs:negative_binomial |
| poisson | ▶️phitter:poisson | 📊poisson.xlsx | 🌐gs:poisson |
| uniform | ▶️phitter:uniform | 📊uniform.xlsx | 🌐gs:uniform |

Benchmarks

Fit time continuous distributions

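| Sample Size / Workers | 1 | 2 | 6 | 10 | 20 |
| --- | --- | --- | --- | --- | --- |
| 1K | 8.2981 | 7.1242 | 8.9667 | 9.9287 | 16.2246 |
| 10K | 20.8711 | 14.2647 | 10.5612 | 11.6004 | 17.8562 |
| 100K | 152.6296 | 97.2359 | 57.7310 | 51.6182 | 53.2313 |
| 500K | 914.9291 | 640.8153 | 370.0323 | 267.4597 | 257.7534 |
| 1M | 1580.8501 | 972.3985 | 573.5429 | 496.5569 | 425.7809 |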
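
To read the table above in terms of parallel speedup, the short sketch below divides the single-worker fit time by the time at each worker count for the smallest and largest sample sizes. The figures are copied from the table; `speedups` is a hypothetical helper.

```python
# Fit times copied from the table above, keyed by sample size and worker count.
fit_times = {
    "1K": {1: 8.2981, 2: 7.1242, 6: 8.9667, 10: 9.9287, 20: 16.2246},
    "1M": {1: 1580.8501, 2: 972.3985, 6: 573.5429, 10: 496.5569, 20: 425.7809},
}


def speedups(times):
    """Speedup of each worker count relative to a single worker."""
    baseline = times[1]
    return {workers: baseline / elapsed for workers, elapsed in times.items()}


for size, times in fit_times.items():
    fastest = min(times, key=times.get)
    rounded = {workers: round(s, 2) for workers, s in speedups(times).items()}
    print(size, "fastest with", fastest, "workers:", rounded)
```

For small samples the extra workers add more overhead than they save, while for 1M observations the fit keeps getting faster up to 20 workers.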

Fit time discrete distributions

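| Sample Size / Workers | 1 | 2 | 4 |
| --- | --- | --- | --- |
| 1K | 0.1688 | 2.6402 | 2.8719 |
| 10K | 0.4462 | 2.4452 | 3.0471 |
| 100K | 4.5598 | 6.3246 | 7.5869 |
| 500K | 19.0172 | 21.8047 | 19.8420 |
| 1M | 39.8065 | 29.8360 | 30.2334 |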

Estimation time parameters continuous distributions

| Distribution / Sample Size | 1K | 10K | 100K | 500K | 1M | 10M |
| --- | --- | --- | --- | --- | --- | --- |
| alpha | 0.3345 | 0.4625 | 2.5933 | 18.3856 | 39.6533 | 362.2951 |
| arcsine | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| argus | 0.0559 | 0.2050 | 2.2472 | 13.3928 | 41.5198 | 362.2472 |
| beta | 0.1880 | 0.1790 | 0.1940 | 0.2110 | 0.1800 | 0.3134 |
| beta_prime | 0.1766 | 0.7506 | 7.6039 | 40.4264 | 85.0677 | 812.1323 |
| beta_prime_4p | 0.0720 | 0.3630 | 3.9478 | 20.2703 | 40.2709 | 413.5239 |
| bradford | 0.0110 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0010 |
| burr | 0.0733 | 0.6931 | 5.5425 | 36.7684 | 79.8269 | 668.2016 |
| burr_4p | 0.1552 | 0.7981 | 8.4716 | 44.4549 | 87.7292 | 858.0035 |
| cauchy | 0.0090 | 0.0160 | 0.1581 | 1.1052 | 2.1090 | 21.5244 |
| chi_square | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| chi_square_3p | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| dagum | 0.3381 | 0.8278 | 9.6907 | 45.5855 | 98.6691 | 917.6713 |
| dagum_4p | 0.3646 | 1.3307 | 13.3437 | 70.9462 | 140.9371 | 1396.3368 |
| erlang | 0.0010 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| erlang_3p | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| error_function | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| exponential | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| exponential_2p | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| f | 0.0592 | 0.2948 | 2.6920 | 18.9458 | 29.9547 | 402.2248 |
| fatigue_life | 0.0352 | 0.1101 | 1.7085 | 9.0090 | 20.4702 | 186.9631 |
| folded_normal | 0.0020 | 0.0020 | 0.0020 | 0.0022 | 0.0033 | 0.0040 |
| frechet | 0.1313 | 0.4359 | 5.7031 | 39.4202 | 43.2469 | 671.3343 |
| f_4p | 0.3269 | 0.7517 | 0.6183 | 0.6037 | 0.5809 | 0.2073 |
| gamma | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| gamma_3p | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| generalized_extreme_value | 0.0833 | 0.2054 | 2.0337 | 10.3301 | 22.1340 | 243.3120 |
| generalized_gamma | 0.0298 | 0.0178 | 0.0227 | 0.0236 | 0.0170 | 0.0241 |
| generalized_gamma_4p | 0.0371 | 0.0116 | 0.0732 | 0.0725 | 0.0707 | 0.0730 |
| generalized_logistic | 0.1040 | 0.1073 | 0.1037 | 0.0819 | 0.0989 | 0.0836 |
| generalized_normal | 0.0154 | 0.0736 | 0.7367 | 2.4831 | 5.9752 | 55.2417 |
| generalized_pareto | 0.3189 | 0.8978 | 8.9370 | 51.3813 | 101.6832 | 1015.2933 |
| gibrat | 0.0328 | 0.0432 | 0.4287 | 2.7159 | 5.5721 | 54.1702 |
| gumbel_left | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0010 | 0.0010 |
| gumbel_right | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| half_normal | 0.0010 | 0.0000 | 0.0000 | 0.0010 | 0.0000 | 0.0000 |
| hyperbolic_secant | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| inverse_gamma | 0.0308 | 0.0632 | 0.7233 | 5.0127 | 10.7885 | 99.1316 |
| inverse_gamma_3p | 0.0787 | 0.1472 | 1.6513 | 11.1161 | 23.4587 | 227.6125 |
| inverse_gaussian | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| inverse_gaussian_3p | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| johnson_sb | 0.2966 | 0.7466 | 4.0707 | 40.2028 | 56.2130 | 728.2447 |
| johnson_su | 0.0070 | 0.0010 | 0.0010 | 0.0143 | 0.0010 | 0.0010 |
| kumaraswamy | 0.0164 | 0.0120 | 0.0130 | 0.0123 | 0.0125 | 0.0150 |
| laplace | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| levy | 0.0100 | 0.0314 | 0.2296 | 1.1365 | 2.7211 | 26.4966 |
| loggamma | 0.0085 | 0.0050 | 0.0050 | 0.0070 | 0.0062 | 0.0080 |
| logistic | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| loglogistic | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| loglogistic_3p | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| lognormal | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0010 | 0.0000 |
| maxwell | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0010 |
| moyal | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| nakagami | 0.0000 | 0.0030 | 0.0213 | 0.1215 | 0.2649 | 2.2457 |
| non_central_chi_square | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| non_central_f | 0.0190 | 0.0182 | 0.0210 | 0.0192 | 0.0190 | 0.0200 |
| non_central_t_student | 0.0874 | 0.0822 | 0.0862 | 0.1314 | 0.2516 | 0.1781 |
| normal | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| pareto_first_kind | 0.0010 | 0.0030 | 0.0390 | 0.2494 | 0.5226 | 5.5246 |
| pareto_second_kind | 0.0643 | 0.1522 | 1.1722 | 10.9871 | 23.6534 | 201.1626 |
| pert | 0.0052 | 0.0030 | 0.0030 | 0.0040 | 0.0040 | 0.0092 |
| power_function | 0.0075 | 0.0040 | 0.0040 | 0.0030 | 0.0040 | 0.0040 |
| rayleigh | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| reciprocal | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| rice | 0.0182 | 0.0030 | 0.0040 | 0.0060 | 0.0030 | 0.0050 |
| semicircular | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| trapezoidal | 0.0083 | 0.0072 | 0.0073 | 0.0060 | 0.0070 | 0.0060 |
| triangular | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| t_student | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| t_student_3p | 0.3892 | 1.1860 | 11.2759 | 71.1156 | 143.1939 | 1409.8578 |
| uniform | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| weibull | 0.0010 | 0.0000 | 0.0000 | 0.0000 | 0.0010 | 0.0010 |
| weibull_3p | 0.0061 | 0.0040 | 0.0030 | 0.0040 | 0.0050 | 0.0050 |

Estimation time parameters discrete distributions

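| Distribution / Sample Size | 1K | 10K | 100K | 500K | 1M | 10M |
| --- | --- | --- | --- | --- | --- | --- |
| bernoulli | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| binomial | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| geometric | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| hypergeometric | 0.0773 | 0.0061 | 0.0030 | 0.0020 | 0.0030 | 0.0051 |
| logarithmic | 0.0210 | 0.0035 | 0.0171 | 0.0050 | 0.0030 | 0.0756 |
| negative_binomial | 0.0293 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| poisson | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| uniform | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |

Documentation Simulation Module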

Process Simulation

This module helps you understand your processes. To use it, create a process simulation instance:

from phitter import simulation

# Create a simulation process instance
simulation = simulation.ProcessSimulation()

Add processes to your simulation instance

There are two ways to add processes to your simulation instance:

  • Adding a process without preceding process (new branch)
  • Adding a process with preceding process (with previous ids)

Process without preceding process (new branch)

# Add a new process without preceding process
simulation.add_process(
    prob_distribution="normal",
    parameters={"mu": 5, "sigma": 2},
    process_id="first_process",
    number_of_products=10,
    number_of_servers=3,
    new_branch=True,
)

Process with preceding process (with previous ids)

# Add a new process with preceding process
simulation.add_process(
    prob_distribution="exponential",
    parameters={"lambda": 4},
    process_id="second_process",
    previous_ids=["first_process"],
)

All together and adding some new process

The order in which you add each process matters. You can add as many processes as you need.

# Add a new process without preceding process
simulation.add_process(
    prob_distribution="normal",
    parameters={"mu": 5, "sigma": 2},
    process_id="first_process",
    number_of_products=10,
    number_of_servers=3,
    new_branch=True,
)

# Add a new process with preceding process
simulation.add_process(
    prob_distribution="exponential",
    parameters={"lambda": 4},
    process_id="second_process",
    previous_ids=["first_process"],
)

# Add a new process with preceding process
simulation.add_process(
    prob_distribution="gamma",
    parameters={"alpha": 15, "beta": 3},
    process_id="third_process",
    previous_ids=["first_process"],
)

# Add a new process without preceding process
simulation.add_process(
    prob_distribution="exponential",
    parameters={"lambda": 4.3},
    process_id="fourth_process",
    new_branch=True,
)


# Add a new process with preceding process
simulation.add_process(
    prob_distribution="beta",
    parameters={"alpha": 1, "beta": 1, "A": 2, "B": 3},
    process_id="fifth_process",
    previous_ids=["second_process", "fourth_process"],
)

# Add a new process with preceding process
simulation.add_process(
    prob_distribution="normal",
    parameters={"mu": 15, "sigma": 2},
    process_id="sixth_process",
    previous_ids=["third_process", "fifth_process"],
)
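
To make the role of `previous_ids` and `new_branch` concrete, the sketch below computes finish times for the same six-process layout using only the standard library: a process starts once all of its preceding processes have finished, and a process on a new branch starts at time zero. This is a simplified illustration, not Phitter's implementation. It ignores `number_of_products` and `number_of_servers`, the samplers are stand-ins (with `beta` in the gamma case assumed to be a scale parameter), and `finish_times` and `simulate_total_times` are hypothetical names.

```python
import random

# Each entry maps a process id to (previous_ids, duration sampler).
# Ids and parameters mirror the example above; the samplers are stand-ins
# built on the standard library, not Phitter's distributions.
PROCESSES = {
    # Normal durations are truncated at zero to avoid negative times.
    "first_process": ([], lambda rng: max(0.0, rng.gauss(5, 2))),
    "second_process": (["first_process"], lambda rng: rng.expovariate(4)),
    "third_process": (["first_process"], lambda rng: rng.gammavariate(15, 3)),
    "fourth_process": ([], lambda rng: rng.expovariate(4.3)),
    # A beta(1, 1) on [2, 3] is a uniform distribution on [2, 3].
    "fifth_process": (["second_process", "fourth_process"], lambda rng: rng.uniform(2, 3)),
    "sixth_process": (["third_process", "fifth_process"], lambda rng: max(0.0, rng.gauss(15, 2))),
}


def finish_times(processes, rng):
    """Finish time of each process: latest predecessor finish plus own duration."""
    finished = {}
    # Insertion order matters: predecessors must be listed before successors.
    for process_id, (previous_ids, sample_duration) in processes.items():
        start = max((finished[p] for p in previous_ids), default=0.0)
        finished[process_id] = start + sample_duration(rng)
    return finished


def simulate_total_times(processes, number_of_simulations, seed=0):
    """Total completion time of the whole graph for each simulated scenario."""
    rng = random.Random(seed)
    return [
        max(finish_times(processes, rng).values())
        for _ in range(number_of_simulations)
    ]


total_times = simulate_total_times(PROCESSES, number_of_simulations=100)
print(min(total_times), max(total_times))
```

Because each start time depends on already computed finish times, this sketch also shows why the order in which processes are added matters.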

Visualize your processes

You can visualize your processes to see if what you're trying to simulate is your actual process.

# Graph your process
simulation.process_graph()

Simulation

Start Simulation

You can run the simulation to obtain a set of simulated time values, or you can build a confidence interval for your process.

Run Simulation

Simulate several scenarios of your complete process

# Run Simulation
simulation.run(number_of_simulations=100)

# After run
simulation: pandas.DataFrame

Review Simulation Metrics by Stage

If you want to review average time and standard deviation by stage run this line of code

# Review simulation metrics
simulation.simulation_metrics() -> pandas.DataFrame

Run confidence interval

If you want to have a confidence interval for the simulation metrics, run the following line of code

# Confidence interval for Simulation metrics
simulation.run_confidence_interval(
    confidence_level=0.99,
    number_of_simulations=100,
    replications=10,
) -> pandas.DataFrame
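
The `replications` and `confidence_level` arguments point to the usual replication approach: run the simulation several times independently, average each run, and build an interval around the mean of those averages. The sketch below shows that calculation with a normal approximation from the standard library. Whether Phitter uses a normal or a Student t quantile is not stated in the source, and `confidence_interval` and `replication_mean` are hypothetical helpers with a made-up stand-in for the simulated times.

```python
import random
from statistics import NormalDist, fmean, stdev


def confidence_interval(values, confidence_level=0.99):
    """Normal-approximation interval for the mean of `values`."""
    center = fmean(values)
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    half_width = z * stdev(values) / len(values) ** 0.5
    return center - half_width, center + half_width


def replication_mean(rng, number_of_simulations):
    # Stand-in for one replication: the average of simulated total times.
    return fmean(rng.gauss(60, 8) for _ in range(number_of_simulations))


rng = random.Random(3)
means = [replication_mean(rng, number_of_simulations=100) for _ in range(10)]
low, high = confidence_interval(means, confidence_level=0.99)
print(low, high)
```

With only 10 replications, a Student t quantile would give a somewhat wider interval than the normal approximation used here.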

Queue Simulation

If you need to simulate queues run the following code:

from phitter import simulation

# Create a simulation process instance
simulation = simulation.QueueingSimulation(
    a="exponential",
    a_paramters={"lambda": 5},
    s="exponential",
    s_parameters={"lambda": 20},
    c=3,
)

In this case we simulate arrivals (a) with an exponential distribution and service times (s) with an exponential distribution, using c = 3 servers.

By default, the maximum capacity k is infinite, the total population n is infinite, and the queue discipline d is FIFO. Because d is not set to "PBS", there is nothing to provide for pbs_distribution or pbs_parameters.
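
For intuition about what such a run involves, the sketch below simulates the same configuration (exponential arrivals at rate 5, exponential service at rate 20, 3 servers, FIFO, unlimited capacity and population) using only the standard library. It is a minimal illustration, not Phitter's implementation. The `lambda` parameters are assumed to be rates, and `simulate_fifo_queue` is a hypothetical function.

```python
import heapq
import random


def simulate_fifo_queue(arrival_rate, service_rate, servers, simulation_time, seed=0):
    """Simulate a FIFO multi-server queue with exponential arrivals and service."""
    rng = random.Random(seed)
    # Min-heap holding the time at which each server next becomes free.
    free_at = [0.0] * servers
    waits = []
    busy_time = 0.0
    clock = rng.expovariate(arrival_rate)
    while clock < simulation_time:
        # FIFO: the next customer in arrival order takes the earliest free server.
        earliest_free = heapq.heappop(free_at)
        start = max(clock, earliest_free)
        service = rng.expovariate(service_rate)
        heapq.heappush(free_at, start + service)
        waits.append(start - clock)
        busy_time += service
        clock += rng.expovariate(arrival_rate)
    return {
        "customers": len(waits),
        "mean_wait": sum(waits) / len(waits),
        "utilization": busy_time / (servers * simulation_time),
    }


result = simulate_fifo_queue(
    arrival_rate=5, service_rate=20, servers=3, simulation_time=2000
)
print(result)
```

The measured utilization should land close to the theoretical value of 5 / (3 × 20), and waiting times are very short because the servers are mostly idle.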

Run the simulation

If you want to have the simulation results

# Run simulation
results = simulation.run(simulation_time=2000)
results: pandas.DataFrame

If you want to see some metrics and probabilities from this simulation, use:

# Calculate metrics
simulation.metrics_summary() -> pandas.DataFrame

# Calculate probabilities
simulation.number_probability_summary() -> pandas.DataFrame

Run Confidence Interval for metrics and probabilities

If you want to have a confidence interval for your metrics and probabilities you should run the following line

# Calculate confidence interval for metrics and probabilities
probabilities, metrics = simulation.confidence_interval_metrics(
    simulation_time=2000,
    confidence_level=0.99,
    replications=10,
)

probabilities -> pandas.DataFrame
metrics -> pandas.DataFrame

Contribution

If you would like to contribute to the Phitter project, please create a pull request with your proposed changes or enhancements. All contributions are welcome!
