HySnappy
Snappy decompression with WebAssembly.
A fast, minimal snappy decompression implementation in C built for WASM.
Snappy compression was released by Google in 2011 with the goal of very high speeds and reasonable compression.
Snappy is used in various applications.
For example, snappy is the default compression format for Apache Parquet files.
Usage
The snappyUncompress function expects two arguments: a typed array compressed, and an outputLength parameter.
The length is needed to know how much wasm memory to allocate.
For formats like parquet, this length will generally be known in advance.
To decompress a Uint8Array with known output length:
import { snappyUncompress } from 'hysnappy'
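// Raw snappy block: length preamble (0x0a = 10), then one 10-byte literal
const compressed = new Uint8Array([
  0x0a, 0x24, 0x68, 0x79, 0x70, 0x65, 0x72, 0x70, 0x61, 0x72, 0x61, 0x6d
])
const outputLength = 10
// output holds the 10 decompressed bytes, the ASCII text "hyperparam"
const output = snappyUncompress(compressed, outputLength)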
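When the output length is not known in advance, it can usually be recovered from the data itself: a raw snappy block starts with the uncompressed length stored as a little-endian varint. The following is a minimal sketch of reading that preamble. The helper name readUncompressedLength is hypothetical and is not part of the hysnappy API.

```javascript
// Hypothetical helper (not exported by hysnappy): read the varint
// length preamble found at the start of a raw snappy block.
function readUncompressedLength(compressed) {
  let length = 0
  // A 32-bit length occupies at most 5 varint bytes
  for (let i = 0; i < 5 && i < compressed.length; i++) {
    const byte = compressed[i]
    // The low 7 bits carry data, least significant group first
    length += (byte & 0x7f) * 2 ** (7 * i)
    // A clear high bit marks the last byte of the varint
    if ((byte & 0x80) === 0) return length
  }
  throw new Error('invalid or truncated snappy length preamble')
}

const example = new Uint8Array([
  0x0a, 0x24, 0x68, 0x79, 0x70, 0x65, 0x72, 0x70, 0x61, 0x72, 0x61, 0x6d
])
const exampleLength = readUncompressedLength(example) // 10
```

The result could then be passed as the outputLength argument of snappyUncompress.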
Hyparquet
Hysnappy was built specifically to accelerate the hyparquet parquet parsing library.
Hysnappy exports a loader function snappyUncompressor(), which loads the WASM module once and returns a pre-loaded version of the snappyUncompress function.
To use hysnappy with hyparquet:
import { parquetRead } from 'hyparquet'
import { snappyUncompressor } from 'hysnappy'
parquetRead({
  file,
  compressors: {
    SNAPPY: snappyUncompressor(),
  },
})
Development
The build uses clang without emscripten, in order to produce the smallest possible binary.
Run make to build from source. The build process consists of:
- Compile from snappy.c to hysnappy.wasm using clang.
- Encode hysnappy.wasm as base64 to hysnappy.wasm.base64.
- Insert base64 string into hysnappy.js for distribution.
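To illustrate the last two steps, here is a rough sketch of base64-encoding a binary and splicing it into a JavaScript source string. The real build is driven by make and may work differently. The function names and the __WASM_BASE64__ placeholder are assumptions made for this example only.

```javascript
// Hypothetical sketch of the encode and insert steps; not the project's build script.

// Encode raw bytes as a base64 string
function toBase64(bytes) {
  let binary = ''
  for (const byte of bytes) binary += String.fromCharCode(byte)
  return btoa(binary)
}

// Replace a placeholder in the JS template with the encoded binary.
// The placeholder name is an assumption, not taken from hysnappy.js.
function embedWasm(template, wasmBytes, placeholder = '__WASM_BASE64__') {
  return template.replace(placeholder, toBase64(wasmBytes))
}

// Smallest valid WASM binary: magic number plus version, no sections
const emptyModule = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00])
const distSource = embedWasm("const wasm64 = '__WASM_BASE64__'", emptyModule)
```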
WASM Loading
By keeping hysnappy.wasm under 4kb, we can include it directly in the hysnappy.js file and load the WASM blob synchronously, which is faster than loading a separate .wasm file. [web.dev]
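As an illustration of the synchronous approach, the sketch below decodes an inlined base64 string and instantiates it with the standard WebAssembly.Module and WebAssembly.Instance constructors. It is not hysnappy's actual loader: the function name instantiateSync is hypothetical, and the embedded bytes are an empty module used only so the example runs on its own.

```javascript
// Base64 of the smallest valid WASM binary (header only, no exports).
// A stand-in for the real embedded hysnappy.wasm string.
const wasmBase64 = 'AGFzbQEAAAA='

// Hypothetical loader: decode the inlined string and instantiate synchronously
function instantiateSync(base64, imports = {}) {
  // Decode base64 into raw bytes
  const bytes = Uint8Array.from(atob(base64), c => c.charCodeAt(0))
  // Both constructors block until compilation and instantiation finish,
  // so no fetch or promise is involved
  const wasmModule = new WebAssembly.Module(bytes)
  return new WebAssembly.Instance(wasmModule, imports)
}

const instance = instantiateSync(wasmBase64)
```

With a real module, the exported functions and memory would be available on instance.exports as soon as the call returns.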
References