msms-compression

A tool for compressing MS/MS data

Version 0.3.0 on PyPI

MS/MS Data Compression Package

Description

This Python package compresses tandem mass spectrometry (MS/MS) data efficiently. It is based on the MassComp algorithm, described in the following paper: https://doi.org/10.1186/s12859-019-2962-7

Version

0.2.0

Features

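  • Delta and Hex Encoding: Delta-encodes m/z values and hex-encodes intensities so that the data compresses more compactly.
  • Brotli Compression: Uses Brotli, a high-performance compression algorithm. In the comparison below, the default compressor reaches a higher compression ratio than its gzip variant (3.890 vs. 3.148), although the gzip variant compresses faster (0.023 vs. 0.053).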
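
As a rough illustration of the delta and hex encoding idea, the sketch below delta-encodes a sorted list of integers and writes each delta as hexadecimal. It is not the package's implementation: the function names and the choice to store m/z as integers (m/z multiplied by 10,000) are assumptions made for this example only.

```python
def delta_encode(values):
    """Return the first value followed by the successive differences."""
    if not values:
        return []
    deltas = [values[0]]
    for previous, current in zip(values, values[1:]):
        deltas.append(current - previous)
    return deltas


def delta_decode(deltas):
    """Rebuild the original values with a running sum over the deltas."""
    values = []
    total = 0
    for delta in deltas:
        total += delta
        values.append(total)
    return values


# Hypothetical m/z values stored as integers (m/z * 10_000), sorted ascending.
mz_scaled = [1000000, 1010000, 1020000, 1020500]

deltas = delta_encode(mz_scaled)
# Sorted input gives non-negative deltas, which are short when written in hex.
hex_tokens = [format(delta, "x") for delta in deltas]

print(deltas)      # [1000000, 10000, 10000, 500]
print(hex_tokens)  # ['f4240', '2710', '2710', '1f4']
```

Because the deltas of sorted values are small and often repeat, a general-purpose compressor tends to have more redundancy to work with afterwards.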

Installation

To install the MS/MS Data Compression package, run:

pip install msms-compression

Usage

The package includes the following main compressor classes:

  • SpectrumCompressorUrl: Utilizes URL-safe Base64 encoding.
  • SpectrumCompressor: Uses Base85 encoding.

Note: Before compression, the m/z values must be sorted in ascending order and contain only positive values.
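
To show the trade-off between the two text encodings named above, the following sketch encodes the same bytes with Python's standard-library URL-safe Base64 and Base85 functions. It does not use the package; the 24-byte payload is an arbitrary stand-in for compressed spectrum bytes.

```python
import base64
from urllib.parse import quote

# Arbitrary stand-in for compressed spectrum bytes (24 bytes).
payload = bytes(range(24))

b64_text = base64.urlsafe_b64encode(payload).decode("ascii")
b85_text = base64.b85encode(payload).decode("ascii")

# Base64 maps 3 bytes to 4 characters; Base85 maps 4 bytes to 5 characters.
print(len(b64_text), len(b85_text))  # 32 30

# URL-safe Base64 without padding needs no percent-encoding in a URL.
print(quote(b64_text, safe="") == b64_text)  # True

# Base85 output can contain characters that must be percent-encoded,
# so its length inside a URL may grow.
print(len(quote(b85_text, safe="")))
```

Base85 is the denser encoding, while URL-safe Base64 can be placed in a URL without further escaping, which matches the split between the two compressor classes.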

Example:

from msms_compression import SpectrumCompressorF32

# Sample data
mz_values, intensity_values = [100.0, 101.0, 102.0], [10.0, 20.0, 30.0]

# Initialize the compressor
compressor = SpectrumCompressorF32()

# Compress data
compressed_data = compressor.compress(mz_values, intensity_values)
print("Compressed Data:", compressed_data)

# Decompress data
decompressed_mz, decompressed_intensity = compressor.decompress(compressed_data)
assert decompressed_mz == mz_values
assert decompressed_intensity == intensity_values
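
Since the m/z values must be sorted in ascending order and be positive, unsorted peak lists need to be prepared before they are passed to a compressor. The helper below is a hypothetical sketch of one way to do that; prepare_peaks is not part of the package, and treating zero as invalid is this example's reading of "positive".

```python
def prepare_peaks(mz_values, intensity_values):
    """Sort peaks by m/z and reject non-positive m/z values.

    Hypothetical helper for illustration; not part of msms-compression.
    """
    if len(mz_values) != len(intensity_values):
        raise ValueError("m/z and intensity lists must have the same length")
    if any(mz <= 0 for mz in mz_values):
        raise ValueError("m/z values must be positive")

    # Sort the (m/z, intensity) pairs together so each peak stays intact.
    pairs = sorted(zip(mz_values, intensity_values), key=lambda pair: pair[0])
    sorted_mz = [mz for mz, _ in pairs]
    sorted_intensity = [intensity for _, intensity in pairs]
    return sorted_mz, sorted_intensity


# Unsorted sample data
mz, intensity = prepare_peaks([102.0, 100.0, 101.0], [30.0, 10.0, 20.0])
print(mz)         # [100.0, 101.0, 102.0]
print(intensity)  # [10.0, 20.0, 30.0]
```

The returned lists can then be passed to compress in the same way as mz_values and intensity_values in the example above.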

Compression Strategy Comparison

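| Strategy | Compression Ratio | Compression Ratio Rank | URL Compression Ratio | URL Compression Ratio Rank | Compression Time | Compression Time Rank | Decompression Time | Decompression Time Rank |
|---|---|---|---|---|---|---|---|---|
| SpectrumCompressorLossy | 5.952 | 1 | 5.023 | 2 | 0.030 | 5 | 0.008 | 4 |
| SpectrumCompressorUrlLossy | 5.579 | 2 | 6.926 | 1 | 0.030 | 4 | 0.007 | 1 |
| SpectrumCompressor | 3.890 | 3 | 3.299 | 6 | 0.053 | 7 | 0.010 | 6 |
| SpectrumCompressorUrl | 3.646 | 4 | 4.528 | 3 | 0.051 | 6 | 0.008 | 3 |
| SpectrumCompressorGzip | 3.148 | 5 | 2.658 | 7 | 0.023 | 2 | 0.009 | 5 |
| SpectrumCompressorUrlGzip | 2.951 | 6 | 3.665 | 4 | 0.022 | 1 | 0.007 | 2 |
| SpectrumCompressorUrlLzstring | 2.800 | 7 | 3.418 | 5 | 0.026 | 3 | 0.097 | 7 |
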
| scan | strategy | original_size | compressed_size | url_encoded_size | compression_ratio | url_compression_ratio | compressed_time | decompressed_time |
|---|---|---|---|---|---|---|---|---|
| 0 | SpectrumCompressor | 56124 | 14428 | 21139 | 3.88993623509842 | 3.299068073229576 | 0.053049564361572266 | 0.009985208511352539 |
| 0 | SpectrumCompressorUrl | 56124 | 15392 | 15401 | 3.646309771309771 | 4.528212453736771 | 0.051015615463256836 | 0.00789642333984375 |
| 0 | SpectrumCompressorGzip | 56124 | 17829 | 26236 | 3.1479050984351336 | 2.6581414849824667 | 0.02299976348876953 | 0.009003639221191406 |
| 0 | SpectrumCompressorUrlGzip | 56124 | 19020 | 19029 | 2.950788643533123 | 3.664879920121919 | 0.021996021270751953 | 0.007005453109741211 |
| 0 | SpectrumCompressorUrlLzstring | 56124 | 20041 | 20402 | 2.8004590589291953 | 3.418243309479463 | 0.026098012924194336 | 0.09739089012145996 |
| 0 | SpectrumCompressorLossy | 56124 | 9429 | 13884 | 5.952274896595609 | 5.022976087582829 | 0.030114173889160156 | 0.007976055145263672 |
| 0 | SpectrumCompressorUrlLossy | 56124 | 10060 | 10069 | 5.578926441351888 | 6.9261098420895815 | 0.030014991760253906 | 0.006910562515258789 |
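
The compression ratios in the scan 0 figures are consistent with original_size divided by compressed_size. The short sketch below recomputes them from the sizes listed above and ranks the strategies; the variable and function names are chosen for this example only.

```python
# (strategy, original_size, compressed_size) copied from the scan 0 figures.
rows = [
    ("SpectrumCompressor", 56124, 14428),
    ("SpectrumCompressorUrl", 56124, 15392),
    ("SpectrumCompressorGzip", 56124, 17829),
    ("SpectrumCompressorUrlGzip", 56124, 19020),
    ("SpectrumCompressorUrlLzstring", 56124, 20041),
    ("SpectrumCompressorLossy", 56124, 9429),
    ("SpectrumCompressorUrlLossy", 56124, 10060),
]


def compression_ratio(original_size, compressed_size):
    """Ratio of original size to compressed size; higher is better."""
    return original_size / compressed_size


ratios = {name: compression_ratio(orig, comp) for name, orig, comp in rows}

# Rank strategies from the highest ratio to the lowest.
ranked = sorted(ratios, key=ratios.get, reverse=True)
for rank, name in enumerate(ranked, start=1):
    print(f"{rank}. {name}: {ratios[name]:.3f}")
```

The URL compression ratios do not equal original_size divided by url_encoded_size (for SpectrumCompressor, 56124 / 21139 is about 2.655, not 3.299), so they appear to be measured against a different baseline that the figures do not list.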

The lossy method encodes each intensity value as a two-character hexadecimal string, which allows 256 distinct levels. Intensities are therefore quantized and cannot be recovered exactly, which is what reduces the data size. The m/z values, by contrast, are compressed losslessly with delta encoding and keep their exact values.
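
The two-character hexadecimal idea can be sketched as follows. This is an illustration only: it assumes a linear scale relative to the largest intensity, which may differ from the scaling the package actually uses, and the function names are hypothetical.

```python
def quantize_intensities(intensities):
    """Map each intensity to one of 256 levels and emit two hex characters each.

    Assumes linear scaling against the maximum intensity (an assumption of
    this sketch, not a documented behavior of the package).
    """
    max_intensity = max(intensities)
    if max_intensity <= 0:
        raise ValueError("at least one intensity must be positive")
    codes = [round(255 * value / max_intensity) for value in intensities]
    return "".join(format(code, "02x") for code in codes), max_intensity


def dequantize_intensities(hex_string, max_intensity):
    """Recover approximate intensities from the two-character hex codes."""
    codes = [int(hex_string[k:k + 2], 16) for k in range(0, len(hex_string), 2)]
    return [code * max_intensity / 255 for code in codes]


original = [10.0, 20.0, 30.0, 12.3]
encoded, max_intensity = quantize_intensities(original)
restored = dequantize_intensities(encoded, max_intensity)

print(encoded)   # 8 hex characters for 4 intensities
print(restored)  # close to, but not always equal to, the original values
```

Under this linear scheme each restored value differs from the original by at most max_intensity / 510, which is the price paid for storing every intensity in two characters.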
