This is a Python (Cython) wrapper for the BBHash codebase for building minimal perfect hash functions.
Right now, this is supporting k-mer-based hashing needs from spacegraphcats, using hash values generated (mostly) by murmurhash, e.g. from khmer's Nodetable and sourmash hashing. As such, I am focused on building MPHF for 64-bit hashes and am wrapping only that bit of the interface; the rest should be ~straightforward (hah!).
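For readers new to the term: a minimal perfect hash function maps n distinct keys onto the integers 0 through n-1 with no collisions and no gaps. The checker below is a small sketch of that property. The name `is_minimal_perfect` and the toy mapping are illustrative only and are not part of this package.

```python
def is_minimal_perfect(lookup, keys):
    """Return True if lookup maps the keys one-to-one onto 0..len(keys)-1."""
    keys = list(keys)
    slots = sorted(lookup(k) for k in keys)
    return slots == list(range(len(keys)))


# A hand-built toy mapping standing in for a real MPHF (hypothetical data).
toy = {10: 2, 20: 0, 50: 3, 80: 1}

print(is_minimal_perfect(toy.get, toy))          # bijection onto 0..3
print(is_minimal_perfect(lambda k: k % 3, toy))  # collides, so not perfect
```

The same check should apply to a real MPHF by passing its lookup method and the hashes it was built from.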
I've also added a Python-accessible "values table", BBHashTable, in the bbhash_table module. This is a table that supports a dictionary-like feature where you can associate a hash with a value, and then query the table with the hash to retrieve the value. The only tricky bit here is that unlike the bbhash module, this table supports queries with hashes that are not in the MPHF; such queries return None.
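Why is this tricky? An MPHF on its own cannot reliably tell a key it was built from apart from one it has never seen: an unknown key may still land on a valid slot. One common way to handle this is to store the original hash next to each slot and compare on lookup. The sketch below illustrates that idea in pure Python. `ToyMPHF` and `ToyValueTable` are hypothetical stand-ins, not the actual BBHashTable implementation, and the toy MPHF always returns a slot for unknown keys, which is an assumption made for illustration.

```python
class ToyMPHF:
    """Stand-in for a minimal perfect hash function (NOT BBHash itself)."""

    def __init__(self, keys):
        self._slots = {k: i for i, k in enumerate(keys)}
        self._n = len(self._slots)

    def lookup(self, key):
        # Known keys get their unique slot. Unknown keys get some valid
        # slot anyway, mimicking an MPHF that cannot reject foreign keys.
        if key in self._slots:
            return self._slots[key]
        return key % self._n


class ToyValueTable:
    """Dictionary-like table that tolerates queries for unknown hashes."""

    def __init__(self, keys):
        keys = list(dict.fromkeys(keys))  # drop duplicates, keep order
        self._mphf = ToyMPHF(keys)
        self._slot_to_key = [None] * len(keys)
        self._slot_to_value = [None] * len(keys)
        for k in keys:
            self._slot_to_key[self._mphf.lookup(k)] = k

    def __setitem__(self, key, value):
        slot = self._mphf.lookup(key)
        if self._slot_to_key[slot] != key:
            raise KeyError(key)  # key was not part of the MPHF
        self._slot_to_value[slot] = value

    def __getitem__(self, key):
        slot = self._mphf.lookup(key)
        if self._slot_to_key[slot] != key:
            return None  # slot belongs to a different hash
        return self._slot_to_value[slot]


toy_table = ToyValueTable([10, 20, 50, 80])
toy_table[10] = "a"
print(toy_table[10], toy_table[30])
```

The cost of this approach is one extra stored hash per slot, which is the price of being able to answer "not present".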
I would like to be able to use generic Python iterators in the PyMPHF construction. Right now there is a round of memory-inefficient copying of hashes, which is bad when you have a lot of k-mers!
I would like to be able to save to/load from strings, not just files.
I also need to investigate thread safety.
import bbhash
# some collection of 64-bit (or smaller) hashes
uint_hashes = [10, 20, 50, 80]
num_threads = 1 # hopefully self-explanatory :)
gamma = 1.0 # internal gamma parameter for BBHash
mph = bbhash.PyMPHF(uint_hashes, len(uint_hashes), num_threads, gamma)
for val in uint_hashes:
    print('{} now hashes to {}'.format(val, mph.lookup(val)))
# can also use 'mph.save(filename)' and 'mph = bbhash.load_mphf(filename)'.
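Since saving and loading currently work with filenames only, one possible workaround for getting the data as bytes is a round trip through a temporary file. The helpers below are a sketch under that assumption. `save_to_bytes` and `load_from_bytes` are hypothetical names, not part of bbhash; they only expect a callable that takes a filename.

```python
import os
import tempfile


def save_to_bytes(save):
    """Call save(filename) on a temporary file and return the file's bytes."""
    fd, path = tempfile.mkstemp(suffix=".mphf")
    os.close(fd)  # the save callable reopens the file by name
    try:
        save(path)
        with open(path, "rb") as fp:
            return fp.read()
    finally:
        os.remove(path)


def load_from_bytes(data, load):
    """Write data to a temporary file and return load(filename)."""
    fd, path = tempfile.mkstemp(suffix=".mphf")
    try:
        with os.fdopen(fd, "wb") as fp:
            fp.write(data)
        return load(path)
    finally:
        os.remove(path)


# Intended use with the objects from the example above (not run here):
#   blob = save_to_bytes(mph.save)
#   mph2 = load_from_bytes(blob, bbhash.load_mphf)
```

This still touches the disk, so it does not remove the need for native string support; it only hides the file handling.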
import random
from collections import defaultdict
from bbhash_table import BBHashTable
all_hashes = [ random.randint(100, 2**32) for i in range(200) ]
half_hashes = all_hashes[:100]
table = BBHashTable()
# hash the first 100 of the hashes
table.initialize(half_hashes)
# store associated values
for hashval, value in zip(half_hashes, [ 1, 2, 3, 4, 5 ] * 20):
    table[hashval] = value
# retrieve & count for all (which will include hashes not in MPHF)
d = defaultdict(int)
for hashval in all_hashes:
    value = table[hashval]
    d[value] += 1
assert d[1] == 20
assert d[None] == 100
The last for loop can be done quickly, in Cython, using
d = table.get_unique_values(all_hashes)
Motivation: the table is a useful way to (just for one hypothetical example :) store a mapping from k-mers to compact De Bruijn graph node IDs. (We use this in several places in spacegraphcats!)
CTB Oct 2020