data-prep-toolkit-transforms

Maintainers
4
Data Preparation Toolkit Transforms using Ray

1.1.2.post1

DPK Python Transforms

Installation

The transforms are delivered as a standard Python library, available on PyPI, and can be installed with pip:

python -m pip install data-prep-toolkit-transforms[all]

or

python -m pip install data-prep-toolkit-transforms[ray,all]

or

python -m pip install data-prep-toolkit-transforms[language]

Installing the Python transforms also installs data-prep-toolkit.

Installing the Ray transforms also installs data-prep-toolkit[ray].
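
After installing, it can be useful to confirm which distributions actually ended up in the environment. The following is a minimal sketch that uses only the standard library's importlib.metadata; the helper name installed_version is hypothetical and not part of the toolkit, and the distribution names are the ones mentioned in the installation notes above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    # Distribution names taken from the installation notes (assumed to match PyPI).
    for name in ("data-prep-toolkit-transforms", "data-prep-toolkit"):
        found = installed_version(name)
        print(f"{name}: {found if found is not None else 'not installed'}")
```

The check reads package metadata rather than importing the transforms, so it does not depend on any particular module layout inside the library.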

Release notes:

1.1.1.dev1

Include all code transforms as extra [code]

1.1.1.dev0

Refactored code transforms (code_quality, code2parquet, header_cleanser, license_select, proglang_select)
Added ml-filter and enrichment
Renamed PDF2Parquet to Docling2Parquet

1.0.1.dev1

Added GneissWeb transforms
Fixed fdedup on Windows

1.0.1.dev0

PR #979 (code_profiler)

1.0.0.a6

Added Profiler
Added Resize

1.0.0.a5

Added PII Redactor
Relaxed fasttext requirement to >= 0.9.2

1.0.0.a4

Added missing Ray implementations for lang_id, doc_quality, tokenization, and filter
Added Ray notebooks for lang_id, doc_quality, tokenization, and filter

1.0.0.a3

Added code_profiler

1.0.0.a2

Relaxed dependency on pandas (use the latest version or whatever the application has installed)
Relaxed dependency on requests (use the latest version or whatever the application has installed)

Keywords

transforms
