dataforgetoolkit

A library that converts CSV files to other formats (such as DataFrames or other CSV/Excel files) based on a JSON mapping.

  • Version: 1.0.6
  • PyPI
  • Maintainers: 1

Project Title

DataForgeToolkit: Flexible Data Mapping for CSV/XLSX Files

Description

DataForgeToolkit is a Python library that converts CSV or Excel files into customized DataFrames according to a user-defined JSON mapping configuration. It is intended for structured data such as financial reports or customer datasets.

Features

Versatile File Support: Seamlessly process both CSV and Excel files, providing flexibility in handling various data formats commonly encountered in data analysis tasks.

Customizable Mapping: Define transformation mappings using a JSON file, allowing for precise specification of column names, data cleaning, and value substitutions tailored to your specific data requirements.

Efficient Data Processing: Automate data preprocessing tasks such as handling missing values, standardizing column names, and applying complex value mappings with ease.

Installation

  pip install dataforgetoolkit

Usage/Examples

Define Transformation Mapping:

Create a JSON file specifying the transformation mappings for your data. Define column mappings, specify new column names, and define value substitutions as needed.
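
As an illustration of this step, the mapping file can be generated from Python with the standard json module. This is only a sketch: the helper names write_mapping and load_mapping are hypothetical and not part of dataforgetoolkit, the key names follow the example in the JSON Transformation Mapping section, and treating all three keys as required is an assumption.

```python
import json
import tempfile
from pathlib import Path

# Assumption: every mapping entry needs these three keys (taken from the README example).
REQUIRED_KEYS = {"column", "new_name", "value_mappings"}


def write_mapping(mapping, path):
    """Serialize the mapping dict to a JSON file and return its path."""
    path = Path(path)
    path.write_text(json.dumps(mapping, indent=2), encoding="utf-8")
    return path


def load_mapping(path):
    """Read a mapping file and check that each entry has the expected keys."""
    mapping = json.loads(Path(path).read_text(encoding="utf-8"))
    for entry in mapping["transformation_mapping"]:
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            raise ValueError(f"mapping entry {entry} is missing {sorted(missing)}")
    return mapping


mapping = {
    "transformation_mapping": [
        {
            "column": "Gender",
            "new_name": "Sex",
            "value_mappings": [{"MALE": "M", "FEMALE": "F"}],
        }
    ]
}

# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    mapping_path = write_mapping(mapping, Path(tmp) / "mapping.json")
    round_tripped = load_mapping(mapping_path)
```

In real use the file would be written to a permanent location and that path passed as the second argument to the map function.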

Use the Toolkit:

Import datamapper from dataforgetoolkit in your Python script and call its map function to convert your report files:

    from dataforgetoolkit import datamapper

    # First argument: path to the report file (CSV or XLSX); second: path to the JSON mapping file
    datamapper.map('report file path csv / xlsx format', 'mapping json file path')

Access Mapped Data:

Access the transformed data as a DataFrame for further analysis or export to other formats.

Transformation Functions Available

DEFAULT_VALUE = "*"
FILTER_VALUE = "FILTER"
REPLACE_VALUE = "REPLACE_"
CONCAT_VALUE = "CONCAT"
UPPERCASE_VALUE = "UPPERCASE"
LOWERCASE_VALUE = "LOWERCASE"
REGEX_VALUE = "REGEX_"
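
The README does not describe how these markers are interpreted. Judging from the mapping example, REPLACE_ and REGEX_ appear to act as prefixes that carry an argument (for example REPLACE_usa), while the others are used as whole keys. The sketch below classifies a mapping key under that assumption. The function classify_key and the labels DEFAULT and LITERAL are hypothetical and not part of the library.

```python
# Marker strings copied from the list above.
DEFAULT_VALUE = "*"
FILTER_VALUE = "FILTER"
REPLACE_VALUE = "REPLACE_"
CONCAT_VALUE = "CONCAT"
UPPERCASE_VALUE = "UPPERCASE"
LOWERCASE_VALUE = "LOWERCASE"
REGEX_VALUE = "REGEX_"


def classify_key(key):
    """Split a value-mapping key into (operation, argument).

    Assumption: keys starting with REPLACE_ or REGEX_ carry their argument
    after the prefix; any key that matches no marker is a literal cell value.
    """
    for prefix, name in ((REPLACE_VALUE, "REPLACE"), (REGEX_VALUE, "REGEX")):
        if key.startswith(prefix):
            return name, key[len(prefix):]
    if key == DEFAULT_VALUE:
        return "DEFAULT", None
    if key in (FILTER_VALUE, CONCAT_VALUE, UPPERCASE_VALUE, LOWERCASE_VALUE):
        return key, None
    return "LITERAL", key


for example in ("*", "FILTER", "REPLACE_usa", "REGEX_[0-9]+", "MALE"):
    print(example, "->", classify_key(example))
```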

JSON Transformation Mapping

Transformation mappings are specified using a JSON file. Example:

{
  "transformation_mapping": [
    {
      "column": "Name",
      "new_name": "Student Name",
      "value_mappings": [{ "*": "Amit Singh" }]
    },
    {
      "column": "Age_Column",
      "new_name": "Age",
      "value_mappings": [{ "FILTER": "30" }]
    },
    {
      "column": "Location",
      "new_name": "Country",
      "value_mappings": [{ "REPLACE_usa": "United States of America" }]
    },
    {
      "column": "Gender",
      "new_name": "Sex",
      "value_mappings": [{ "MALE": "M", "FEMALE": "F" }]
    },
    {
      "column": "Zipcode_Column",
      "new_name": "Processed_Text_regex",
      "value_mappings": [{ "REPLACE_hello": "hi", "REGEX_[0-9]+": "NUMBER" }]
    }
  ]
}
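
To make the shape of such a mapping concrete, here is a pure-Python sketch that applies a trimmed version of it to rows stored as dictionaries. It is not the library's implementation, which works with DataFrames. The rule semantics are assumptions: "*" fills empty cells, FILTER keeps only rows whose cell equals the given value, REPLACE_ substitutes a substring, REGEX_ substitutes a pattern, and any other key is an exact-match substitution. CONCAT, UPPERCASE, and LOWERCASE are left out because the README gives no example of them. The names transform_cell and apply_mapping and the sample rows are hypothetical.

```python
import re

# Trimmed-down mapping in the same shape as the example above.
MAPPING = {
    "transformation_mapping": [
        {"column": "Name", "new_name": "Student Name",
         "value_mappings": [{"*": "Amit Singh"}]},
        {"column": "Age_Column", "new_name": "Age",
         "value_mappings": [{"FILTER": "30"}]},
        {"column": "Gender", "new_name": "Sex",
         "value_mappings": [{"MALE": "M", "FEMALE": "F"}]},
        {"column": "Zipcode_Column", "new_name": "Processed_Text_regex",
         "value_mappings": [{"REPLACE_hello": "hi", "REGEX_[0-9]+": "NUMBER"}]},
    ]
}


def transform_cell(cell, rules):
    """Return (new_value, keep_row) for one cell under the ASSUMED semantics."""
    value, keep = cell, True
    for key, target in rules.items():
        if key == "*":
            # Assumed: "*" supplies a default only when the cell is empty.
            if cell == "":
                value = target
        elif key == "FILTER":
            # Assumed: the row survives only if the cell equals the target.
            keep = keep and cell == target
        elif key.startswith("REPLACE_"):
            value = value.replace(key[len("REPLACE_"):], target)
        elif key.startswith("REGEX_"):
            value = re.sub(key[len("REGEX_"):], target, value)
        elif cell == key:
            # Exact match against the incoming cell, so substitutions do not chain.
            value = target
    return value, keep


def apply_mapping(rows, mapping):
    """Rename columns and transform values for every row that passes the filters."""
    result = []
    for row in rows:
        new_row, keep_row = {}, True
        for entry in mapping["transformation_mapping"]:
            value = str(row.get(entry["column"], ""))
            for rules in entry["value_mappings"]:
                value, keep = transform_cell(value, rules)
                keep_row = keep_row and keep
            new_row[entry["new_name"]] = value
        if keep_row:
            result.append(new_row)
    return result


rows = [
    {"Name": "", "Age_Column": "30", "Gender": "MALE", "Zipcode_Column": "hello 12345"},
    {"Name": "Riya", "Age_Column": "25", "Gender": "FEMALE", "Zipcode_Column": "hello 9"},
]
mapped = apply_mapping(rows, MAPPING)
print(mapped)
```

Under these assumptions the second row is dropped because its age is not 30, and the first row comes back with renamed columns and substituted values.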

Authors

  • @amitsingh

  • Software Engineer

Contributing

Contributions are always welcome!

Please adhere to this project's code of conduct.

To suggest a change, open a pull request (PR) or merge request (MR).

Used By

Intended audience: developers, testers, and BAs.
