oxli
oxli is a powerful Rust library with a simple Python interface for counting k-mers
in genomic sequencing data.
Use oxli to bring fast k-mer counting and comparison operations to your Python projects.
This library is written on top of the sourmash Rust library, and the underlying
code for dealing with sequence data is well tested.
Installation
Quick setup
oxli is available on conda-forge for Linux, Mac OS X, and Windows for Python versions 3.10, 3.11, and 3.12:
conda install oxli
This will install the oxli library for Python.
For developers
You can also try building oxli yourself and using it in development mode:
mamba env create -f environment.yml -n oxli
pip install -e '.[test]'
Getting Started
See the oxli Wiki for documentation on the Python API.
Basic Usage
Initialise a new KmerCountTable
from oxli import KmerCountTable
kct = KmerCountTable(ksize=4)
Adding k-mer counts.
kct.count("AAAA")
>>> 1
kct.count("AAAA")
>>> 2
kct.count("TTTT")
>>> 3
kct.consume("GGGGGGGGGG")
Lookup counts by k-mer.
kct.get('GGGG')
>>> 7
kct.get('AAAA')
>>> 3
Extracting k-mers from files.
import screed
counts = KmerCountTable(ksize=21)
for record in screed.open('doc/example.fa'):
    counts.consume(record.sequence)
>>> 349910
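To make the record-by-record loop concrete, here is a small dependency-free sketch of what iterating over FASTA records involves. It is an illustration only: `read_fasta` and `kmers` are hypothetical helpers, not part of oxli or screed, and the parser handles only simple, well-formed FASTA text.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from simple FASTA-formatted text."""
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            name = line[1:]
            chunks = []
        elif name is not None:
            # Sequence lines may be wrapped, so collect them until the next header.
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def kmers(sequence: str, ksize: int) -> Iterator[str]:
    """Yield each window of width ksize along the sequence."""
    for start in range(len(sequence) - ksize + 1):
        yield sequence[start:start + ksize]


# In-memory FASTA text stands in for a file on disk.
example = io.StringIO(">seq1\nACGTAC\nGT\n>seq2\nTTTT\n")
for name, sequence in read_fasta(example):
    print(name, len(list(kmers(sequence, 4))))
# seq1 5
# seq2 1
```

In practice each yielded sequence would be passed to `counts.consume(...)` as in the loop above; screed additionally handles compressed input and FASTQ, which this sketch does not.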
What's the history here?
First, oxli is channeling khmer, a package written by @ctb and many
others. You shouldn't be too surprised to see useful
functionality from khmer making an appearance in oxli.
The khmer package was useful for inspecting large collections of
k-mers, but was hard to maintain and evolve.
In ~2016 @ctb's lab more or less switched over to developing
sourmash, which was initially built on a similar tech stack to khmer
(Python & C++).
At some point, @luizirber rewrote the sourmash C++ code into Rust.
This forced @ctb to learn Rust to maintain sourmash.
@ctb then decided he liked Rust an awful lot, and missed some of the
khmer functionality.
And, voila! oxli was born.
Authors
Author: C. Titus Brown (@ctb), ctbrown@ucdavis.edu
with miscellaneous features by @Adamtaranto