Pure Python implementation of the XZ file format with random access support
Leveraging the lzma module for fast (de)compression
An XZ file can be composed of several streams and blocks. This allows for fast random access when reading, but Python's builtin `lzma` module does not support it (it would read all previous blocks for nothing).
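As a rough illustration of that limitation, the sketch below uses only the standard library. The payload and offsets are made up for the example; the point is that `seek` on an `lzma` file object works, but everything before the target offset still has to be decompressed and thrown away.

```python
import io
import lzma

# Illustrative payload: 10,000 lines of exactly 12 bytes each.
payload = b"".join(b"line %06d\n" % i for i in range(10_000))
compressed = lzma.compress(payload)

with lzma.open(io.BytesIO(compressed)) as fin:
    # The builtin module has to decompress (and discard) the 60,000 bytes
    # that precede the requested offset before it can return anything.
    fin.seek(60_000)
    chunk = fin.read(12)

print(chunk)
```

With a file of a few kilobytes the cost is invisible; it grows with the amount of data that precedes the requested position.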
| | lzma | lzmaffi | python-xz |
|---|---|---|---|
| module type | builtin | cffi (C extension) | pure Python |
| 📄 read | | | |
| random access | ❌ no¹ | ✔️ yes² | ✔️ yes² |
| several blocks | ✔️ yes | ✔️✔️ yes³ | ✔️✔️ yes³ |
| several streams | ✔️ yes | ✔️ yes | ✔️✔️ yes⁴ |
| stream padding | ❌ no⁵ | ✔️ yes | ✔️ yes |
| 📝 write | | | |
| `w` mode | ✔️ yes | ✔️ yes | ✔️ yes |
| `x` mode | ✔️ yes | ❌ no | ✔️ yes |
| `a` mode | ✔️ new stream | ✔️ new stream | ⏳ planned |
| `r+`/`w+`/… modes | ❌ no | ❌ no | ✔️ yes |
| several blocks | ❌ no | ❌ no | ✔️ yes |
| several streams | ❌ no⁶ | ❌ no⁶ | ✔️ yes |
| stream padding | ❌ no | ❌ no | ⏳ planned |

³ Block positions are available through the `block_boundaries` attribute.
⁴ Stream positions are available through the `stream_boundaries` attribute.

Install `python-xz` with pip:
$ python -m pip install python-xz
An unofficial package for conda is also available; see issue #5 for more information.
The API is similar to `lzma`: you can use either `xz.open` or `xz.XZFile`.
>>> with xz.open('example.xz') as fin:
... fin.read(18)
... fin.stream_boundaries # 2 streams
... fin.block_boundaries # 4 blocks in first stream, 2 blocks in second stream
... fin.seek(1000)
... fin.read(31)
...
b'Hello, world! \xf0\x9f\x91\x8b'
[0, 2000]
[0, 500, 1000, 1500, 2000, 3000]
1000
b'\xe2\x9c\xa8 Random access is fast! \xf0\x9f\x9a\x80'
Opening in text mode works as well, but notice that seek arguments as well as boundaries are still in bytes (just like with `lzma.open`).
>>> with xz.open('example.xz', 'rt') as fin:
... fin.read(15)
... fin.stream_boundaries
... fin.block_boundaries
... fin.seek(1000)
... fin.read(26)
...
'Hello, world! 👋'
[0, 2000]
[0, 500, 1000, 1500, 2000, 3000]
1000
'✨ Random access is fast! 🚀'
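The two examples above ask for different read sizes (18 and 31 in binary mode, 15 and 26 in text mode) because binary reads count bytes while text reads count characters. A minimal check with plain Python strings, independent of `python-xz`, shows where the difference comes from:

```python
greeting = "Hello, world! 👋"
sentence = "✨ Random access is fast! 🚀"

for text in (greeting, sentence):
    encoded = text.encode("utf-8")
    # Each emoji is a single character but takes 3 or 4 bytes in UTF-8.
    print(len(text), "characters,", len(encoded), "bytes")
```

This is why a position passed to `seek` in text mode should be computed from byte offsets, not from character counts.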
Writing is only supported from the end of the file. It is however possible to truncate the file first. Note that truncating is only supported on block boundaries.
>>> with xz.open('test.xz', 'w') as fout:
... fout.write(b'Hello, world!\n')
... fout.write(b'This sentence is still in the previous block\n')
... fout.change_block()
... fout.write(b'But this one is in its own!\n')
...
14
45
28
Advanced usage:

- The `r+`/`w+`/`x+` modes allow opening a file for both reading and writing at the same time; however, in the current implementation, a block with writing in progress is automatically closed when data is read from it.
- The `check`, `preset` and `filters` arguments to `xz.open` and `xz.XZFile` configure the default values for new streams and blocks.
- A new block is started with the `change_block` method (the `preset` and `filters` attributes can be changed beforehand to apply to the new block).
- A new stream is started with the `change_stream` method (the `check` attribute can be changed beforehand to apply to the new stream).

XZ files are made of a number of streams, and each stream is composed of a number of blocks. This can be seen with `xz --list`:
$ xz --list file.xz
Strms Blocks Compressed Uncompressed Ratio Check Filename
1 13 16.8 MiB 297.9 MiB 0.056 CRC64 file.xz
To read data from the middle of the 10th block, we decompress the 10th block from its start until we reach the middle (dropping that decompressed data), then return the decompressed data from that point.
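The lookup step can be sketched as follows. This is not the library's actual implementation: `locate` is a hypothetical helper, and it assumes a sorted list of block start offsets such as the one returned by `block_boundaries` in the usage examples.

```python
from bisect import bisect_right


def locate(block_boundaries, pos):
    """Return (block_index, block_start, bytes_to_skip) for an uncompressed offset.

    Hypothetical helper: block_boundaries is assumed to be a sorted list of
    the uncompressed start offsets of each block.
    """
    if pos < 0:
        raise ValueError("position must be non-negative")
    # Index of the last block whose start offset is <= pos.
    index = bisect_right(block_boundaries, pos) - 1
    if index < 0:
        raise ValueError("no block starts at or before this position")
    start = block_boundaries[index]
    # Only pos - start bytes need to be decompressed and discarded.
    return index, start, pos - start


boundaries = [0, 500, 1000, 1500, 2000, 3000]
print(locate(boundaries, 1250))
```

For offset 1250 the helper picks the block starting at 1000, so only 250 bytes are decompressed and dropped instead of 1250.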
Choosing the right block size is a tradeoff between seeking time during random access and compression ratio.
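One way to get a feel for this tradeoff is to compress the same data in independent chunks of various sizes. The sketch below is only an approximation: it uses the standard `lzma` module and produces one complete XZ stream per chunk instead of one block per chunk, so it overstates the per-block overhead. The sample data and the `compressed_size` helper are made up for the illustration.

```python
import lzma


def compressed_size(data, block_size):
    """Total size when every block_size chunk is compressed independently."""
    return sum(
        len(lzma.compress(data[start:start + block_size]))
        for start in range(0, len(data), block_size)
    )


# Illustrative, highly repetitive sample data.
data = b"".join(b"record %08d: the quick brown fox\n" % i for i in range(5_000))

for block_size in (8_192, 65_536, len(data)):
    print(block_size, compressed_size(data, block_size))
```

Smaller chunks mean less data to skip when seeking, but each chunk starts with an empty dictionary, so the total compressed size goes up.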
You can open the file for writing and use the `change_block` method to create several blocks.
Other tools can create XZ files with several blocks as well:
$ xz -T0 file # threading mode
$ xz --block-size 16M file # same size for all blocks
$ xz --block-list 16M,32M,8M,42M file # specific size for each block
$ pixz file
As a general rule, `python-xz` supports, and is tested against, every Python version that is both released and still officially supported (on both the CPython and PyPy implementations).
If you have other use cases or find issues with some Python versions, feel free to open a ticket!