mltable

Contains the MLTable loading and authoring APIs for the mltable package.

  • 1.6.1
  • PyPI

Maintainers
2

mltable: machine learning table data toolkit

MLTable is a Python package that provides fast, flexible data loading functions designed to make accessing tabular data easy and intuitive. It abstracts the schema definition for tabular data so that the table is easier to materialize into a pandas DataFrame. MLTable works with delimited text files, Parquet files, Delta Lake tables, and JSON Lines files stored in a cloud object store or on local disk.

Main Features

Here are a few things that mltable does well:

  • Flexible sampling and filtering functionality on large data

  • Robust IO tools for loading data from flat files (CSV and delimited), Parquet files, Delta Lake, and JSON Lines files

  • Capturing and defining schema contained in flat files

  • Fast materialization of data into a pandas DataFrame
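
As a rough illustration of the features listed above, the sketch below loads delimited files, keeps a subset of columns, and materializes the first rows into a pandas DataFrame. The helper names `as_file_paths` and `load_sample` are hypothetical and only exist for this example; `from_delimited_files`, `keep_columns`, `take`, and `to_pandas_dataframe` are the mltable calls it relies on. The import is done lazily so that the small path helper can be used even where the package is not installed.

```python
from typing import Dict, List


def as_file_paths(files: List[str]) -> List[Dict[str, str]]:
    """Wrap plain paths or URIs in the {'file': ...} dicts that the
    mltable factory functions take for their `paths` argument."""
    return [{"file": f} for f in files]


def load_sample(files: List[str], columns: List[str], n_rows: int = 100):
    """Load delimited files, keep the given columns, and return the
    first n_rows rows as a pandas DataFrame (hypothetical helper)."""
    import mltable  # lazy import: requires `pip install mltable`

    tbl = mltable.from_delimited_files(paths=as_file_paths(files))
    tbl = tbl.keep_columns(columns)  # column projection
    tbl = tbl.take(n_rows)           # keep only the first n_rows rows
    return tbl.to_pandas_dataframe()
```

A call such as `load_sample(["./data.csv"], ["id", "label"], n_rows=50)` would then return a DataFrame, assuming a local `data.csv` with those columns exists. For a random subset rather than the first rows, `take_random_sample` could be swapped in for `take`.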

Getting started

You can install the MLTable package via pip.

pip install mltable

Please note that the MLTable package is pre-installed on AzureML compute instances.

Documentation

The official documentation is hosted on the "Working with tables" page.

An MLTable artifact's metadata file is called MLTable and adheres to the AzureML MLTable schema.
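
A minimal sketch of how that metadata file is typically consumed: `mltable.load` is pointed at the folder that contains the `MLTable` file, not at the file itself. The helpers `has_mltable_file` and `load_table` are hypothetical names introduced for this example, and the up-front check is an illustrative convenience rather than something the package requires.

```python
from pathlib import Path


def has_mltable_file(directory: str) -> bool:
    """Return True if `directory` contains a metadata file named MLTable."""
    return (Path(directory) / "MLTable").is_file()


def load_table(directory: str):
    """Load the table described by the MLTable file in a local directory
    and return it as a pandas DataFrame (hypothetical helper)."""
    if not has_mltable_file(directory):
        raise FileNotFoundError(f"No MLTable file found in {directory!r}")

    import mltable  # lazy import: requires `pip install mltable`

    tbl = mltable.load(directory)  # reads the MLTable metadata file
    return tbl.to_pandas_dataframe()
```

The existence check only applies to local folders; `mltable.load` also accepts cloud storage and data asset URIs, for which this sketch would need a different guard.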

Release History

1.6.1 (2024-01-24)

Features Added

  • Added authorization support
  • MLTable.save() bug fixes

1.5.0 (2023-08-14)

Features Added

  • MLTable.save() supports cloud storage.
  • from_delta_lake supports pulling latest version by default

Bugs Fixed

  • Fix support_multi_line issue for MLTable.from_delimited_files

1.4.1 (2023-06-19)

Bugs Fixed

  • Relaxing cryptography library dependency to allow versions greater than 41..

1.4.0 (2023-05-31)

Features Added

  • Updating runtime dependencies
  • Improved error handling and argument validation

1.3.0 (2023-04-07)

Features Added

  • bugfix (user error mapping, mltable save/load roundtrip)

1.2.0 (2023-02-22)

Features Added

  • bugfix (mltable save/load, validation schema)

1.1.0 (2023-01-26)

Features Added

  • bugfix (fix schema, flake8 errors)
  • improve logging and exception message

1.0.0 (2022-12-05)

Features Added

  • Factory APIs (from_delta_lake)
  • Authoring APIs (convert_column_types, save, skip, etc.)

0.1.0b4 (2022-10-05)

Features Added

  • Factory APIs (from_paths, from_delimited_files, from_parquet_files, from_json_lines_files)
  • Authoring APIs (keep_columns, drop_columns, take_random_sample, take, etc.)
  • Support for loading an mltable from a data asset URI

0.1.0b3 (2022-06-30)

0.1.0b2 (2022-05-23)

0.1.0b1 (2022-05-17)

Features Added

  • Initial public preview release to load into pandas dataframe
