robust-average

Registry: PyPI (pip)
Version: 0.1.6
Maintainers: 1

Robust Average

A Python package that intelligently selects the most robust average (mean, median, or mode) for price analysis based on outlier and skewness detection.

Installation

pip install robust-average

Quick Start

from robust_average import robust_average

# Example with clean data
prices = [97.87, 109.99, 129.99, 89.99, 119.99]
result = robust_average(prices)
print(f"Selected average: {result['value']} (method: {result['method']})")
# Output (approximately): Selected average: 109.57 (method: mean)

# Example with outliers
prices_with_outlier = [97.87, 109.99, 129.99, 89.99, 119.99, 500.00]
result = robust_average(prices_with_outlier)
print(f"Selected average: {result['value']} (method: {result['method']})")
# Output: Selected average: 114.99 (method: median), assuming the median is taken over all six prices

Features

  • Automatic Method Selection: Intelligently chooses between mean, median, or mode
  • Outlier Detection: Uses IQR method to identify and handle outliers
  • Skewness Analysis: Measures data distribution asymmetry
  • Transparent Results: Returns the method used and supporting statistics
  • Business Ready: Designed to support accurate price reporting for contracts and compliance

How It Works

The function uses a systematic approach to select the most appropriate average:

  • Outlier Detection (IQR Method):

    • Calculates Q1 (25th percentile) and Q3 (75th percentile)
    • Defines outliers as values outside [Q1 - 1.5×IQR, Q3 + 1.5×IQR]
  • Skewness Analysis:

    • Calculates skewness coefficient using scipy.stats.skew()
    • Values close to 0 indicate symmetric distribution
  • Decision Criteria:

    • Use MEAN if: No outliers AND |skewness| < 0.5
    • Use MEDIAN if: Outliers present OR |skewness| ≥ 0.5
    • Use MODE if: A single value appears in >50% of the dataset
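The criteria above can be sketched in plain Python. This is an illustrative reconstruction, not the package's source: the function names are hypothetical, the standard library stands in for numpy and scipy, quartiles use linear interpolation, and the mode rule is assumed to be checked first because the text does not say which rule wins when several apply. The skewness helper uses the biased formula m3 / m2**1.5, which is the default of scipy.stats.skew(), and returns 0.0 for constant data as a simplification.

```python
from collections import Counter
from statistics import mean, median, quantiles


def iqr_outliers(values):
    """Return the values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    if len(values) < 2:
        return []  # quartiles are undefined for fewer than two points
    # method="inclusive" interpolates linearly between sorted data points
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


def skewness(values):
    """Biased skewness coefficient m3 / m2**1.5 (0.0 for constant data)."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    return 0.0 if m2 == 0 else m3 / m2 ** 1.5


def choose_average(values):
    """Pick mean, median, or mode using the documented thresholds.

    Assumption: the mode rule takes precedence over the other two.
    """
    values = list(values)
    if not values:
        raise ValueError("at least one price is required")
    top_value, top_count = Counter(values).most_common(1)[0]
    if top_count > len(values) / 2:  # one value covers more than 50%
        return {"method": "mode", "value": top_value}
    if not iqr_outliers(values) and abs(skewness(values)) < 0.5:
        return {"method": "mean", "value": mean(values)}
    return {"method": "median", "value": median(values)}
```

On the Quick Start data, this sketch picks the mean for the five clean prices and switches to the median once 500.00 is added. Its median is taken over all six values, outlier included; the package itself may treat flagged outliers differently.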

🛠️ Process Flow

[Figure: Robust Average Process Flow]

API Reference

robust_average(prices, return_all_stats=False)

Parameters:

  • prices (list or pd.Series): List or Series of numeric prices
  • return_all_stats (bool, default False): If True, returns all computed statistics

Returns:

{
    'value': selected_average_value,
    'method': 'mean' | 'median' | 'mode',
    'mean': mean_value,
    'median': median_value,
    'mode': mode_value_or_None,
    'std': standard_deviation,
    'skew': skewness,
    'outliers': list_of_outlier_values,
    'count': number_of_prices
}
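As a rough illustration of consuming this structure, the helper below formats a one-line summary from a dict with the keys listed above. The helper name describe_result is hypothetical, and the example dict is hand-built for demonstration rather than actual package output. Keys other than 'value' and 'method' are read defensively, on the assumption that they may be absent when return_all_stats is False.

```python
def describe_result(result):
    """Format a short summary from a dict shaped like the documented return value."""
    line = f"{result['method']} = {result['value']:.2f}"
    # Supporting statistics may be missing, so only use them when present.
    if "count" in result:
        line += f" (n={result['count']})"
    outliers = result.get("outliers")
    if outliers is not None and len(outliers) > 0:
        line += f", outliers: {list(outliers)}"
    return line


# Hand-built example dict (not produced by the package).
example = {"value": 114.99, "method": "median", "count": 6, "outliers": [500.0]}
print(describe_result(example))
# prints: median = 114.99 (n=6), outliers: [500.0]
```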

Use Cases

  • Price Analysis: Ensure accurate average prices for reporting
  • Contract Negotiations: Use defensible statistics in pricing discussions
  • Compliance Reporting: Meet regulatory requirements with robust averages
  • Data Quality: Automatically handle messy, real-world price data

Requirements

  • Python >= 3.8
  • numpy >= 1.20.0
  • pandas >= 1.3.0
  • scipy >= 1.7.0

License

MIT License

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
