Huge News!Announcing our $40M Series B led by Abstract Ventures.Learn More
Socket
Sign inDemoInstall
Socket

blosc2

Package Overview
Dependencies
Maintainers
3
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

blosc2

Python wrapper for the C-Blosc2 library

  • 2.7.1
  • PyPI
  • Socket score

Maintainers
3

============= Python-Blosc2

A Python wrapper for the extremely fast Blosc2 compression library

:Author: The Blosc development team :Contact: blosc@blosc.org :Github: https://github.com/Blosc/python-blosc2 :Actions: |actions| :PyPi: |version| :NumFOCUS: |numfocus| :Code of Conduct: |Contributor Covenant|

.. |version| image:: https://img.shields.io/pypi/v/blosc2.svg :target: https://pypi.python.org/pypi/blosc2 .. |Contributor Covenant| image:: https://img.shields.io/badge/Contributor%20Covenant-v2.0%20adopted-ff69b4.svg :target: https://github.com/Blosc/community/blob/master/code_of_conduct.md .. |numfocus| image:: https://img.shields.io/badge/powered%20by-NumFOCUS-orange.svg?style=flat&colorA=E1523D&colorB=007D8A :target: https://numfocus.org .. |actions| image:: https://github.com/Blosc/python-blosc2/actions/workflows/build.yml/badge.svg :target: https://github.com/Blosc/python-blosc2/actions/workflows/build.yml

What it is

C-Blosc2 <https://github.com/Blosc/c-blosc2>_ is the new major version of C-Blosc <https://github.com/Blosc/c-blosc>_, and is backward compatible with both the C-Blosc1 API and its in-memory format. Python-Blosc2 is a Python package that wraps C-Blosc2, the newest version of the Blosc compressor.

Currently Python-Blosc2 already reproduces the API of Python-Blosc <https://github.com/Blosc/python-blosc>, so it can be used as a drop-in replacement. However, there are a few exceptions for a full compatibility. <https://github.com/Blosc/python-blosc2/blob/main/RELEASE_NOTES.md#changes-from-python-blosc-to-python-blosc2>

In addition, Python-Blosc2 aims to leverage the full C-Blosc2 functionality to support super-chunks (SChunk <https://www.blosc.org/python-blosc2/reference/schunk_api.html>), multi-dimensional arrays (NDArray <https://www.blosc.org/python-blosc2/reference/ndarray_api.html>), metadata, serialization and other bells and whistles introduced in C-Blosc2.

Note: Python-Blosc2 is meant to be backward compatible with Python-Blosc data. That means that it can read data generated with Python-Blosc, but the opposite is not true (i.e. there is no forward compatibility).

SChunk: a 64-bit compressed store

A SChunk <https://www.blosc.org/python-blosc2/reference/schunk_api.html>_ is a simple data container that handles setting, expanding and getting data and metadata. Contrarily to chunks, a super-chunk can update and resize the data that it contains, supports user metadata, and it does not have the 2 GB storage limitation.

Additionally, you can convert a SChunk into a contiguous, serialized buffer (aka cframe <https://github.com/Blosc/c-blosc2/blob/main/README_CFRAME_FORMAT.rst>_) and vice-versa; as a bonus, the serialization/deserialization process also works with NumPy arrays and PyTorch/TensorFlow tensors at a blazing speed:

.. |compress| image:: https://github.com/Blosc/python-blosc2/blob/main/images/linspace-compress.png?raw=true :width: 100% :alt: Compression speed for different codecs

.. |decompress| image:: https://github.com/Blosc/python-blosc2/blob/main/images/linspace-decompress.png?raw=true :width: 100% :alt: Decompression speed for different codecs

+----------------+---------------+ | |compress| | |decompress| | +----------------+---------------+

while reaching excellent compression ratios:

.. image:: https://github.com/Blosc/python-blosc2/blob/main/images/pack-array-cratios.png?raw=true :width: 75% :align: center :alt: Compression ratio for different codecs

Also, if you are a Mac M1/M2 owner, make you a favor and use its native arm64 arch (yes, we are distributing Mac arm64 wheels too; you are welcome ;-):

.. |pack_arm| image:: https://github.com/Blosc/python-blosc2/blob/main/images/M1-i386-vs-arm64-pack.png?raw=true :width: 100% :alt: Compression speed for different codecs on Apple M1

.. |unpack_arm| image:: https://github.com/Blosc/python-blosc2/blob/main/images/M1-i386-vs-arm64-unpack.png?raw=true :width: 100% :alt: Decompression speed for different codecs on Apple M1

+------------+--------------+ | |pack_arm| | |unpack_arm| | +------------+--------------+

Read more about SChunk features in our blog entry at: https://www.blosc.org/posts/python-blosc2-improvements

NDArray: an N-Dimensional store

One of the latest and more exciting additions in Python-Blosc2 is the NDArray <https://www.blosc.org/python-blosc2/reference/ndarray_api.html>_ object. It can write and read n-dimensional datasets in an extremely efficient way thanks to a n-dim 2-level partitioning, allowing to slice and dice arbitrary large and compressed data in a more fine-grained way:

.. image:: https://github.com/Blosc/python-blosc2/blob/main/images/b2nd-2level-parts.png?raw=true :width: 75%

To wet you appetite, here it is how the NDArray object performs on getting slices orthogonal to the different axis of a 4-dim dataset:

.. image:: https://github.com/Blosc/python-blosc2/blob/main/images/Read-Partial-Slices-B2ND.png?raw=true :width: 75%

We have blogged about this: https://www.blosc.org/posts/blosc2-ndim-intro

We also have a ~2 min explanatory video on why slicing in a pineapple-style (aka double partition) is useful <https://www.youtube.com/watch?v=LvP9zxMGBng>_:

.. image:: https://github.com/Blosc/blogsite/blob/master/files/images/slicing-pineapple-style.png?raw=true :width: 50% :alt: Slicing a dataset in pineapple-style :target: https://www.youtube.com/watch?v=LvP9zxMGBng

Installing

Blosc is now offering Python wheels for the main OS (Win, Mac and Linux) and platforms. You can install binary packages from PyPi using pip:

.. code-block:: console

pip install blosc2

Documentation

The documentation is here:

https://blosc.org/python-blosc2/python-blosc2.html

Also, some examples are available on:

https://github.com/Blosc/python-blosc2/tree/main/examples

Building from sources

python-blosc2 comes with the C-Blosc2 sources with it and can be built in-place:

.. code-block:: console

git clone https://github.com/Blosc/python-blosc2/
cd python-blosc2
git submodule update --init --recursive
python -m pip install -r requirements-build.txt
python setup.py build_ext --inplace

That's all. You can proceed with testing section now.

Testing

After compiling, you can quickly check that the package is sane by running the tests:

.. code-block:: console

python -m pip install -r requirements-tests.txt
python -m pytest  (add -v for verbose mode)

Benchmarking

If curious, you may want to run a small benchmark that compares a plain NumPy array copy against compression through different compressors in your Blosc build:

.. code-block:: console

 PYTHONPATH=. python bench/pack_compress.py

License

The software is licenses under a 3-Clause BSD license. A copy of the python-blosc2 license can be found in LICENSE.txt <https://github.com/Blosc/python-blosc2/tree/main/LICENSE.txt>_.

Mailing list

Discussion about this module is welcome in the Blosc list:

blosc@googlegroups.com

https://groups.google.es/group/blosc

Twitter

Please follow @Blosc2 <https://twitter.com/Blosc2>_ to get informed about the latest developments.

Citing Blosc

You can cite our work on the different libraries under the Blosc umbrella as:

.. code-block:: console

@ONLINE{blosc, author = {{Blosc Development Team}}, title = "{A fast, compressed and persistent data store library}", year = {2009-2023}, note = {https://blosc.org} }


Enjoy!

FAQs


Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap
  • Changelog

Packages

npm

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc