This package provides encoding and decoding routines that enable the serialization and deserialization of numerical and array data types provided by numpy using the highly efficient msgpack format. Serialization of Python's native complex data types is also supported.
msgpack-numpy requires msgpack-python and numpy. If you have pip installed on your system, run
pip install msgpack-numpy
to install the package and all dependencies. You can also download the source tarball, unpack it, and run
python setup.py install
from within the source directory.
The easiest way to use msgpack-numpy is to call its monkey patching function after importing the Python msgpack package:
import msgpack
import msgpack_numpy as m
m.patch()
This will automatically force all msgpack serialization and deserialization routines (and other packages that use them) to become numpy-aware. Of course, one can also manually pass the encoder and decoder provided by msgpack-numpy to the msgpack routines:
import msgpack
import msgpack_numpy as m
import numpy as np
x = np.random.rand(5)
x_enc = msgpack.packb(x, default=m.encode)
x_rec = msgpack.unpackb(x_enc, object_hook=m.decode)
msgpack-numpy will try to use the binary (fast) extension in msgpack by default. If msgpack was not compiled with Cython (or if the MSGPACK_PUREPYTHON environment variable is set), it will fall back to using the slower pure Python msgpack implementation.
The primary design goal of msgpack-numpy is ensuring preservation of numerical data types during msgpack serialization and deserialization. Inclusion of type information in the serialized data necessarily incurs some storage overhead; if preservation of type information is not needed, one may be able to avoid some of this overhead by writing a custom encoder/decoder pair that produces more efficient serializations for those specific use cases.
Numpy arrays with a dtype of 'O' are serialized/deserialized using pickle as a fallback solution to enable msgpack-numpy to handle such arrays. As the additional overhead of pickle serialization negates one of the reasons to use msgpack, it may be advisable to either write a custom encoder/decoder to handle the specific use case efficiently or else not bother using msgpack-numpy.
Note that numpy arrays deserialized by msgpack-numpy are read-only and must be copied if they are to be modified.
The latest source code can be obtained from GitHub.
msgpack-numpy maintains compatibility with python versions 2.7 and 3.5+.
Install tox to support testing across multiple Python versions in your development environment. If you use conda to install Python, use tox-conda to automatically manage testing across all supported Python versions.
# Using a system python
pip install tox
# Additionally, using a conda-provided python
pip install tox tox-conda
Execute tests across supported python versions:
tox
See the included AUTHORS.md file for more information.
This software is licensed under the BSD License. See the included LICENSE.md file for more information.
FAQs
Numpy data serialization using msgpack
We found that msgpack-numpy demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.