zlib-state
Low-level interface to the zlib library that enables capturing the decoding state.
Install
From PyPI:
pip install zlib-state
From source:
pip install .
Tested on Ubuntu, macOS, and Windows with Python 3.7-3.12.
GzipStateFile
Wraps Decompressor as a buffered reader.
Based on my benchmarking, this is somewhat slower than Python's gzip module.
A typical usage pattern looks like:
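import zlib_state

TARGET_LINE = 5000

# keep_last_state=True makes the reader record the decoding state as it goes.
with zlib_state.GzipStateFile('testdata/frankenstein.txt.gz', keep_last_state=True) as f:
    for i, line in enumerate(f):
        if i == TARGET_LINE:
            # Most recently captured state and its position in the file.
            state, pos = f.last_state, f.last_state_pos
            break  # nothing more to capture; skip the rest of the file

# Reopen the file and resume from the saved state.
with zlib_state.GzipStateFile('testdata/frankenstein.txt.gz') as f:
    f.zseek(pos, state)
    remainder = f.read()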
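Since the point of capturing `state` and `pos` is to resume later, it can be handy to keep them on disk between runs. The following is a minimal sketch, not part of zlib-state: `save_checkpoint` and `load_checkpoint` are hypothetical helpers, and they assume the captured values can be pickled, which this README does not confirm.

```python
import pickle
from pathlib import Path


def save_checkpoint(path, state, pos):
    """Write a (state, pos) pair to disk (hypothetical helper)."""
    # Assumption: both values are picklable.
    Path(path).write_bytes(pickle.dumps((state, pos)))


def load_checkpoint(path):
    """Read back the (state, pos) pair written by save_checkpoint."""
    return pickle.loads(Path(path).read_bytes())
```

After `state, pos = load_checkpoint(path)`, the pair would be passed to `f.zseek(pos, state)` as in the usage pattern above. Only load checkpoint files you wrote yourself, since unpickling untrusted data is unsafe.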
Decompressor
Very basic decompression object that's picky and unforgiving.
Based on my benchmarking, this can iterate over gzip files faster than Python's gzip module.
A typical usage pattern looks like:
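import zlib_state

# wbits = 32 + 15: zlib auto-detects a gzip or zlib header, 32 KiB window.
decomp = zlib_state.Decompressor(32 + 15)
block_count = 0
with open('testdata/frankenstein.txt.gz', 'rb') as f:
    while not decomp.eof():
        needed_input = decomp.needs_input()
        if needed_input > 0:
            # The decompressor reports how many bytes of input it wants.
            decomp.feed_input(f.read(needed_input))
        next_chunk = decomp.read()
        if decomp.block_boundary():
            block_count += 1
            if block_count == 4:
                # Capture the state and compressed offset at this boundary.
                state = decomp.get_state()
                pos = decomp.total_in()
    print(f'{block_count} blocks processed')

    # Resume from the saved boundary while the file is still open.
    f.seek(pos)
    # wbits = -15: raw deflate, since there is no gzip header mid-stream.
    decomp = zlib_state.Decompressor(-15)
    decomp.set_state(*state)
    while not decomp.eof():
        needed_input = decomp.needs_input()
        if needed_input > 0:
            decomp.feed_input(f.read(needed_input))
        next_chunk = decomp.read()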
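To see why the first decompressor uses `32 + 15` and the resumed one uses `-15`, here is a rough analogue built only on Python's standard `zlib` module. It is a sketch of the idea, not how zlib-state works internally: it forces a byte-aligned block boundary with `Z_SYNC_FLUSH` and passes the previously decoded data as `zdict`, whereas zlib-state captures the state at boundaries that already exist in an arbitrary gzip file. The sample data and variable names are made up for illustration.

```python
import zlib

part1 = b"the quick brown fox " * 1000
part2 = b"jumps over the lazy dog " * 1000

# Write a gzip stream (wbits = 16 + 15) with a byte-aligned block boundary
# between the two parts. Z_SYNC_FLUSH keeps the history window intact, so
# part2 may still contain back-references into part1.
comp = zlib.compressobj(wbits=16 + 15)
head = comp.compress(part1) + comp.flush(zlib.Z_SYNC_FLUSH)
tail = comp.compress(part2) + comp.flush()
stream = head + tail
pos = len(head)  # compressed offset of the boundary

# Full decode from the start: 32 + 15 auto-detects the gzip header.
full = zlib.decompressobj(wbits=32 + 15)
assert full.decompress(stream) == part1 + part2

# Resume at the boundary: there is no header at that offset, so use raw
# deflate (-15) and supply the last 32 KiB of decoded output as the window.
window = part1[-32768:]
resumed = zlib.decompressobj(wbits=-15, zdict=window)
remainder = resumed.decompress(stream[pos:])
assert remainder == part2
```

The saved window plays the role of the captured state here: without the preceding 32 KiB of output, back-references in the resumed data cannot be resolved.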