daffy

Function decorators for validating column names and data types of Pandas and Polars DataFrames

Version: 0.16.1

Maintainers: 1

Daffy - DataFrame Column Validator

Description

Working with DataFrames often means passing them through multiple transformation functions, which makes it easy to lose track of their structure over time. Daffy adds runtime validation and documentation to your DataFrame operations through simple decorators. You declare the expected columns and types directly on the function definition, so each function's input and output contract is explicit:

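from daffy import df_in, df_out

@df_in(columns=["price", "bedrooms", "location"])  # columns required on input
@df_out(columns=["price_per_room", "price_category"])  # columns required on output
def analyze_housing(houses_df):
    # Transform raw housing data into price analysis
    # (transformation omitted here; analyzed_df stands for its result)
    return analyzed_df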

Like type hints for DataFrames, Daffy helps you catch structural mismatches early and keeps your data pipeline documentation synchronized with the code. Compatible with both Pandas and Polars.

Key Features

Validate DataFrame columns at function entry and exit points
Support regex patterns for matching column names (e.g., "r/column_\d+/")
Check data types of columns
Control strictness of validation (allow or disallow extra columns)
Works with both Pandas and Polars DataFrames
Project-wide configuration via pyproject.toml
Integrated logging for DataFrame structure inspection
Enhanced type annotations for improved IDE and type checker support
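
The first few features above (entry-point validation, "r/.../" regex column names, and strict mode) can be illustrated with a small standard-library sketch. This is not Daffy's implementation: `check_columns`, `columns_in`, and the `SimpleNamespace` stand-in for a DataFrame are hypothetical names invented for this example, and the choice of `AssertionError` and full-string regex matching are assumptions.

```python
import functools
import re
from types import SimpleNamespace


def _matches(spec, name):
    """True if column `name` satisfies `spec` (a literal name or an "r/.../" regex)."""
    if spec.startswith("r/") and spec.endswith("/"):
        # Assumption: the pattern must match the whole column name.
        return re.fullmatch(spec[2:-1], name) is not None
    return spec == name


def check_columns(actual, expected, strict=False):
    """Raise AssertionError if the `actual` column names violate `expected`."""
    missing = [s for s in expected if not any(_matches(s, n) for n in actual)]
    if missing:
        raise AssertionError(f"Missing columns: {missing}. Got: {list(actual)}")
    if strict:
        # Strict mode: every actual column must be covered by some spec.
        extra = [n for n in actual if not any(_matches(s, n) for s in expected)]
        if extra:
            raise AssertionError(f"Unexpected columns: {extra}")


def columns_in(columns, strict=False):
    """Hypothetical decorator: validate the first argument's `.columns` on entry."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(df, *args, **kwargs):
            check_columns(list(df.columns), columns, strict)
            return func(df, *args, **kwargs)
        return wrapper
    return decorator


@columns_in(["id", r"r/column_\d+/"], strict=True)
def count_columns(df):
    return len(df.columns)


# Stand-ins for DataFrames: any object exposing a `.columns` sequence.
ok = SimpleNamespace(columns=["id", "column_1", "column_2"])
bad = SimpleNamespace(columns=["id", "column_1", "notes"])
```

Calling `count_columns(ok)` passes validation, while `count_columns(bad)` is rejected in strict mode because `notes` matches none of the declared columns. With `strict=False` the extra column would be allowed.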

Documentation

Usage Guide - Detailed usage instructions
Development Guide - Guide for contributing to Daffy
Changelog - Version history and release notes

Installation

Install with your favorite Python dependency manager:

pip install daffy

Quick Start

from daffy import df_in, df_out

@df_in(columns=["Brand", "Price"])  # Validate input DataFrame columns
@df_out(columns=["Brand", "Price", "Discount"])  # Validate output DataFrame columns
def apply_discount(cars_df):
    cars_df = cars_df.copy()
    cars_df["Discount"] = cars_df["Price"] * 0.1
    return cars_df

License

MIT
