dolvins

Dolvin's Math and Stats Library

  • 0.0.5
  • PyPI

Dolvins

This project provides a set of functions and classes for optimization, probability, and statistical analysis, with a focus on handling multi-dimensional data, hyperplanes, and distribution analysis.


Installation

Dolvins is built on the following packages:

  • psutil

  • numpy

  • pandas

  • tqdm

  • scipy

To install Dolvins automatically with all its dependencies, please run:


pip install dolvins


Usage

General Math Functions

next_power_of_two(x: int) -> int

Returns the next power of two greater than or equal to x.

Arguments:

  • x (int): The input number.

Returns:

  • int: The next power of two.

Example:


x = 5

next_power = next_power_of_two(x)

print(next_power)



>> 8


round_down_to_nearest_power_of_two(x: int) -> int

Rounds x down to the nearest power of two, i.e., the largest power of two less than or equal to x.

Arguments:

  • x (int): The input number.

Returns:

  • int: The nearest power of two.

Example:


x = 10

nearest_power = round_down_to_nearest_power_of_two(x)

print(nearest_power)



>> 8
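Both power-of-two helpers can be sketched with integer bit operations. This is a minimal illustration, not the library's source: the names `next_pow2` and `prev_pow2` are hypothetical stand-ins, and the sketch assumes x >= 1.

```python
def next_pow2(x: int) -> int:
    """Smallest power of two >= x (hypothetical stand-in, assumes x >= 1)."""
    # (x - 1).bit_length() is the number of bits needed to reach x.
    return 1 << (x - 1).bit_length()


def prev_pow2(x: int) -> int:
    """Largest power of two <= x (hypothetical stand-in, assumes x >= 1)."""
    # bit_length() - 1 is the position of the highest set bit.
    return 1 << (x.bit_length() - 1)


print(next_pow2(5))   # 8
print(prev_pow2(10))  # 8
```

An input that is already a power of two is returned unchanged by both functions in this sketch.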


gcd_of_list(numbers: list) -> int

Returns the greatest common divisor (GCD) of a list of integers.

Arguments:

  • numbers (list): A list of integers.

Returns:

  • int: The GCD of the list.

Example:


numbers = [12, 15, 21]

gcd_result = gcd_of_list(numbers)

print(gcd_result)



>> 3
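The GCD of a list can be obtained by folding the two-argument GCD across the list. The sketch below uses only the standard library; `gcd_list` is a hypothetical name and may not match how the library implements `gcd_of_list`.

```python
from functools import reduce
from math import gcd


def gcd_list(numbers):
    """Fold math.gcd over the list (hypothetical stand-in, non-empty list)."""
    return reduce(gcd, numbers)


print(gcd_list([12, 15, 21]))  # 3
```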


Mathematical Objects

Hyperplane

A class representing a hyperplane.

Methods:

  • __init__(self, normal: np.array, coef: float)

    Initializes a Hyperplane object with a normal vector and coefficient.

    Arguments:

    • normal (np.array): The normal vector to the hyperplane.

    • coef (float): The coefficient of the hyperplane.

  • project_point(self, *point: float) -> np.array

    Projects a point onto the hyperplane.

    Arguments:

    • point (float): The coordinates of the point to project, passed as separate arguments.

    Returns:

    • np.array: The projected point.

Example:


import numpy as np

normal = np.array([1, 1, 1])

coef = 3

hyperplane = Hyperplane(normal, coef)

projected_point = hyperplane.project_point(2, 4, 0)

print(projected_point)



>> np.array([1, 2, 0])
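For reference, a standard orthogonal projection onto a hyperplane can be sketched as follows. This assumes the hyperplane is the set of points x with normal · x = coef, which the README does not state explicitly; `project_onto_hyperplane` is a hypothetical name. Under that assumption the point (2, 4, 0) maps to (1, 3, -1), which also lies on the plane but differs from the output printed above, so the library may follow a different projection convention.

```python
def project_onto_hyperplane(point, normal, coef):
    """Orthogonal projection onto {x : normal . x = coef} (assumed convention)."""
    dot_pn = sum(p * n for p, n in zip(point, normal))
    dot_nn = sum(n * n for n in normal)
    # Signed step along the normal needed to reach the plane.
    t = (dot_pn - coef) / dot_nn
    return [p - t * n for p, n in zip(point, normal)]


print(project_onto_hyperplane([2, 4, 0], [1, 1, 1], 3))  # [1.0, 3.0, -1.0]
```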


Probability and Random Variables Functions

sterlings_approximation(n: int) -> float

Returns an approximation of n! using Stirling's approximation.

Arguments:

  • n (int): The input number.

Returns:

  • float: The approximate factorial of n.

Example:


n = 10

approx_factorial = sterlings_approximation(n)

print(approx_factorial)



>> 3598695.6187410373
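Stirling's approximation is n! ≈ sqrt(2πn) · (n/e)^n. The following sketch evaluates that formula directly and compares it with the exact factorial; `stirling` is a hypothetical name, and the library's function may include additional correction terms.

```python
import math


def stirling(n: int) -> float:
    """Leading-order Stirling approximation of n! (hypothetical stand-in)."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n


approx = stirling(10)
exact = math.factorial(10)   # 3628800
print(approx)                # close to 3598695.62
print(approx / exact)        # about 0.9917
```

The relative error of the leading-order formula is roughly 1/(12n), so the approximation improves as n grows.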


permutate(n: int, r: int) -> int

Calculates the number of permutations of n objects taken r at a time, falling back to Stirling's approximation if n is too large.

Arguments:

  • n (int): Number of objects.

  • r (int): Number of objects chosen, where order matters.

Returns:

  • int: The number of permutations of n objects taken r at a time.

Example:


n = 5

r = 3

perm_result = permutate(n, r)

print(perm_result)



>> 60


combinate(n: int, r: int) -> int

Calculates combinations of n objects taken r at a time where order does not matter.

Arguments:

  • n (int): Number of objects.

  • r (int): Number of objects chosen, where order does not matter.

Returns:

  • int: The number of combinations of n objects taken r at a time.

Example:


n = 5

r = 3

comb_result = combinate(n, r)

print(comb_result)



>> 10
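The two counts above are related by nPr = nCr · r!, since each unordered selection of r objects can be arranged in r! ways. A quick check with the standard library (Python 3.8 or later), which computes exact integers rather than using Stirling's approximation:

```python
import math

n, r = 5, 3
perms = math.perm(n, r)   # n! / (n - r)!
combs = math.comb(n, r)   # n! / (r! * (n - r)!)

print(perms)  # 60
print(combs)  # 10
print(perms == combs * math.factorial(r))  # True
```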


discrete_distribution_prob(exp: pd.Series, obs: pd.Series) -> float

Calculates the exact probability of obtaining the observed distribution given the expected distribution. Note: scale does not matter, because exp is converted to probabilities, so the sum of obs does not need to match the sum of exp.

Arguments:

  • exp (pd.Series): The ground truth (expected) distribution.

  • obs (pd.Series): The observed distribution.

Returns:

  • float: The probability of observing the distribution.

Example:


import pandas as pd

exp = pd.Series([50, 50, 50])

obs = pd.Series([2, 1, 2])

prob = discrete_distribution_prob(exp, obs)

print(prob)



>> 0.1234
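The value in this example is consistent with a multinomial probability: exp is normalised to class probabilities (1/3 each), and the chance of the counts (2, 1, 2) is 5!/(2!·1!·2!) · (1/3)^5 = 30/243. The sketch below assumes that interpretation and uses plain lists instead of pd.Series; `multinomial_prob` is a hypothetical name, not the library's implementation.

```python
from math import factorial


def multinomial_prob(exp, obs):
    """Multinomial probability of obs given expected weights exp (assumed model)."""
    total_exp = sum(exp)
    probs = [e / total_exp for e in exp]   # scale of exp drops out here
    coef = factorial(sum(obs))
    for k in obs:
        coef //= factorial(k)              # stays an exact integer
    p = float(coef)
    for pi, k in zip(probs, obs):
        p *= pi ** k
    return p


print(multinomial_prob([50, 50, 50], [2, 1, 2]))  # about 0.12346 (30/243)
```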


generate_combinations(num_classes: int, num_obs: int) -> set

Returns the set of all ordered tuples of num_classes non-negative integers that add up to num_obs.

Arguments:

  • num_classes (int): Number of classes to choose from.

  • num_obs (int): The total that the class counts must sum to.

Returns:

  • set: The set of all possible combinations.

Example:


num_classes = 2

num_obs = 4

combinations = generate_combinations(num_classes, num_obs)

print(combinations)



>> {(0, 4), (1, 3), (2, 2), (3, 1), (4, 0)}
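One way to enumerate these tuples is recursively: fix the count for the first class, then enumerate the remaining classes over what is left. This is a sketch under that assumption, with the hypothetical name `weak_compositions`; the library may generate the set differently.

```python
def weak_compositions(num_classes, num_obs):
    """All tuples of num_classes non-negative ints summing to num_obs."""
    if num_classes == 1:
        return {(num_obs,)}
    result = set()
    for first in range(num_obs + 1):
        # Distribute the remainder over the other classes.
        for rest in weak_compositions(num_classes - 1, num_obs - first):
            result.add((first,) + rest)
    return result


print(sorted(weak_compositions(2, 4)))
# [(0, 4), (1, 3), (2, 2), (3, 1), (4, 0)]
```

The number of tuples is C(num_obs + num_classes - 1, num_classes - 1), which grows quickly with both arguments.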


generate_normal_exponent(mean: float, std_dev: float) -> Callable

Generates a function representing the exponent of a normal distribution with the specified mean and standard deviation.

Arguments:

  • mean (float): Mean (mu) of the normal distribution.

  • std_dev (float): Standard deviation (sigma) of the normal distribution.

Returns:

  • Callable: A function representing the exponent.

Example:


mean = 0

std_dev = 1

normal_exp = generate_normal_exponent(mean, std_dev)

normal_exp = the functional equivalent to $-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2$, where $\mu$ = mean and $\sigma$ = std_dev
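The returned callable can be pictured as a closure over mean and std_dev. The sketch below builds such a closure by hand; `make_normal_exponent` is a hypothetical name and only illustrates the formula given above.

```python
def make_normal_exponent(mean, std_dev):
    """Return x -> -0.5 * ((x - mean) / std_dev) ** 2 (hypothetical stand-in)."""
    def exponent(x):
        return -0.5 * ((x - mean) / std_dev) ** 2
    return exponent


normal_exp = make_normal_exponent(0, 1)
print(normal_exp(1))  # -0.5
print(normal_exp(2))  # -2.0
```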


generate_joint_pdf(exp: pd.Series, num_obs: int) -> Callable

Generates a joint probability density function (PDF) for all possible outcomes based on the expected distribution and the total number of observations.

Arguments:

  • exp (pd.Series): The ground truth (expected) distribution.

  • num_obs (int): The number of observations.

Returns:

  • Callable: The joint PDF function.

Explanation:

  1. Approximates each class's distribution with a Normal PDF.

  2. Multiplies the per-class approximations together to get a joint PDF.

Example:


import pandas as pd

exp = pd.Series([4, 6])

num_obs = 100

joint_pdf = generate_joint_pdf(exp, num_obs)

joint_pdf = the functional equivalent to $\frac{1}{\sqrt{2\pi \cdot 40 \cdot \frac{6}{10}} \, \sqrt{2\pi \cdot 60 \cdot \frac{4}{10}}} \cdot e^{-\frac{1}{2}\left(\frac{x - 40}{\sqrt{40 \cdot \frac{6}{10}}}\right)^2 - \frac{1}{2}\left(\frac{y - 60}{\sqrt{60 \cdot \frac{4}{10}}}\right)^2}$
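The formula in this example matches a product of per-class Normal densities with mean n·p and variance n·p·(1 - p), where p is the class's share of exp and n is num_obs. The sketch below reproduces that construction with plain lists; `make_joint_pdf` is a hypothetical name, and it assumes every class has 0 < p < 1 so that no variance is zero.

```python
import math


def make_joint_pdf(exp, num_obs):
    """Product of Normal approximations, one per class (assumed construction)."""
    total = sum(exp)
    probs = [e / total for e in exp]
    means = [num_obs * p for p in probs]
    variances = [num_obs * p * (1 - p) for p in probs]

    def joint_pdf(*xs):
        density = 1.0
        for x, mu, var in zip(xs, means, variances):
            density *= math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)
        return density

    return joint_pdf


joint_pdf = make_joint_pdf([4, 6], 100)
print(joint_pdf(40, 60))  # peak density, 1 / (2 * pi * 24), about 0.00663
```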


Calculus Functions

hyperplane_integration(f: Callable, hyperplane: list, max_val: float = None, chunk_size: int = "auto", num_samples: int = "auto", random_state: int = 42, pbar: Callable = None) -> float

Integrates a function (typically a PDF) over an N-dimensional hyperplane using quasi-Monte Carlo integration with Sobol sampling. Currently only supports integration in the positive quadrant, where all coordinates are non-negative.

Arguments:

  • f (Callable): The function to integrate.

  • hyperplane (object): The hyperplane over which to integrate.

  • max_val (float): The maximum value at which to cap integration (defaulted to None); any region in which the function exceeds that value is not counted.

  • chunk_size (int): The number of samples to handle at one time (defaulted to "auto").

  • num_samples (int): The total number of samples to draw (defaulted to "auto").

  • random_state (int): Random state to use to ensure the integration is deterministic.

  • pbar (tqdm): Progress bar to update with every chunk completed (defaulted to None)

Returns:

  • float: The result of integration.

Example:


import numpy as np

f = lambda x, y, z: x + y + z

hyperplane = Hyperplane(normal=np.array([1, 1, 1]), coef=3)

result = hyperplane_integration(f, hyperplane)

print(result)



>> 13.5
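The idea behind quasi-Monte Carlo integration can be sketched without any dependencies. Two things in this sketch are assumptions rather than the library's method: it uses a Halton sequence instead of Sobol sampling to stay within the standard library, and it parameterises the plane x + y + z = coef by (x, y) with no surface-area factor, which is the convention that reproduces the 13.5 printed above. The names `radical_inverse` and `qmc_integrate_on_plane` are hypothetical.

```python
def radical_inverse(i: int, base: int) -> float:
    """Van der Corput radical inverse of i in the given base, in [0, 1)."""
    inv, scale = 0.0, 1.0 / base
    while i > 0:
        inv += (i % base) * scale
        i //= base
        scale /= base
    return inv


def qmc_integrate_on_plane(f, coef: float, num_samples: int = 20000) -> float:
    """Integrate f(x, y, z) over x + y + z = coef with x, y, z >= 0.

    Assumption: z = coef - x - y, and no surface-area factor is applied.
    """
    total = 0.0
    for i in range(1, num_samples + 1):
        # Low-discrepancy (Halton) point in the square [0, coef] x [0, coef].
        x = coef * radical_inverse(i, 2)
        y = coef * radical_inverse(i, 3)
        z = coef - x - y
        if z >= 0:  # keep only points in the positive quadrant
            total += f(x, y, z)
    # Average value over the square times the square's area.
    return coef ** 2 * total / num_samples


result = qmc_integrate_on_plane(lambda x, y, z: x + y + z, coef=3)
print(result)  # close to 13.5
```

Because the point sequence is deterministic, repeated runs give the same estimate, which plays the role that random_state plays for the library's scrambled sampling.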


Distribution Analysis Functions

E(exp: pd.Series, obs: pd.Series, approximate: bool, chunk_size: int = "auto", num_samples: int = "auto", random_state: int = None) -> float

Performs an E-test on an expected distribution and observed distribution.

Arguments:

  • exp (pd.Series): The expected (ground-truth) distribution.

  • obs (pd.Series): The observed distribution.

  • approximate (bool): If False, the exact discrete probability is calculated; if True, an approximation is calculated based on continuous probability.

  • chunk_size (int): The number of samples to process at once (defaulted to "auto").

  • num_samples (int): The total number of samples to calculate; lower is faster but less precise.

  • random_state (int): If specified, leads to deterministic results.

Returns:

  • float: The E-value.

Explanation:

  • The E-test seeks to generate a more interpretable and accurate probability value (p-value) for testing the statistical difference between two distributions.

  • The E-test assumes the expected and observed distributions are identical and, under that assumption, calculates an E-value: the probability of receiving a distribution as extreme as or more extreme than the one observed.

  • Thus, the lower the E-value (i.e., the lower the chance of receiving a distribution that extreme if the distributions were in fact identical), the stronger the indication that the distributions are different.

  • The exact E-value can be calculated using discrete probability; however, a continuous probability estimate must be calculated in cases where there are many observations.

  • Note: time complexity in either case is exponential, so while the continuous approach can approximate larger numbers of observations, it may take a significant amount of time for massive samples without some method of scaling them down (to be researched).

Example:


import pandas as pd

exp = pd.Series([50, 50, 50])

obs = pd.Series([300, 300, 300])

e_value = E(exp, obs, approximate=True)

print(e_value)



>> 1.0





exp = pd.Series([50, 0, 0])

obs = pd.Series([100, 0, 0])

e_value = E(exp, obs, approximate=True)

print(e_value)



>> 0





exp = pd.Series([15, 15, 15])

obs = pd.Series([155, 145, 150])

e_value = E(exp, obs, approximate=True)

print(e_value)



>> 0.77743
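For small samples, the exact (discrete) variant described in the explanation can be sketched by enumerating every possible outcome and summing the probabilities of those that are no more likely than the observed one. Treating "as extreme or more extreme" as "probability less than or equal to that of the observation" is an assumption, as are the multinomial model and the hypothetical names below; the library's E function may define extremeness differently.

```python
from math import factorial


def compositions(k, n):
    """Yield every tuple of k non-negative integers summing to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(k - 1, n - first):
            yield (first,) + rest


def multinomial_prob(probs, counts):
    """Multinomial probability of counts under class probabilities probs."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)
    p = float(coef)
    for pi, c in zip(probs, counts):
        p *= pi ** c
    return p


def exact_e_value(exp, obs):
    """Sum of probabilities of outcomes no more likely than obs (assumed rule)."""
    total = sum(exp)
    probs = [e / total for e in exp]
    p_obs = multinomial_prob(probs, obs)
    e_value = 0.0
    for outcome in compositions(len(obs), sum(obs)):
        p = multinomial_prob(probs, outcome)
        if p <= p_obs * (1 + 1e-9):  # small tolerance for float ties
            e_value += p
    return e_value


print(exact_e_value([50, 50, 50], [2, 1, 2]))  # about 1.0, obs is a most likely outcome
print(exact_e_value([50, 50, 50], [5, 0, 0]))  # about 0.0123 (3/243)
```

The enumeration visits C(n + k - 1, k - 1) outcomes for k classes and n observations, which illustrates why the exact calculation becomes impractical as the number of observations grows.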



License

This project is licensed under the MIT License.
