vtrie
Trie structure supporting approximate string matching (substitutions only) for
Python (2.x and 3.x).
.. NOTE::
Only ascii strings are supported. Hence, Python 3 users should use
'b' prefix to insert strings into the trie.
Installation
::
pip install vtrie
Features
Usage
Create a trie::
>>> from vtrie import Trie
>>> t = Trie()
Add strings to the trie. Currently, only ascii strings are supported::
>>> t[b"Hello"] = 123
>>> t[b"World"] = {"my": "dict"}
>>> t[b"foo"] = None
Check if "Hello" is in the trie::
>>> b"Hello" in t
True
Show all inserted strings sharing the same prefix::
>>> t[b"foo"] = 0
>>> t[b"foobar"] = 1
>>> t[b"fooo"] = 2
>>> t[b"hello"] = 3
>>> list(t.suffixes(b"fo"))
[('o', 0), ('obar', 1), ('oo', 2)]
Search for keys that differ by less than a given number of substitutions from
the provided key. The results are tuples with first the Hamming distance
between the given key and the found key, then the found key and its value::
>>> t[b"hello world"] = 0
>>> t[b"*ello world"] = "a"
>>> t[b"*ell* world"] = "b"
>>> t[b"*ell* w*rld"] = "c"
>>> t[b"hell* w*rl*"] = "d"
>>> list(t.neighbors(b"hello world", maxhd = 1))
[(1, '*ello world', 'a')]
>>> list(t.neighbors(b"hello world", maxhd = 2))
[(1, '*ello world', 'a'), (2, '*ell* world', 'b')]
>>> print("\n".join(map(str,list(t.neighbors(b"hello world", 3)))))
(3, 'hell* w*rl*', 'd')
(1, '*ello world', 'a')
(2, '*ell* world', 'b')
(3, '*ell* w*rld', 'c')
Search for all keys of a certain length that are within a certain Hamming of
each other. The results are tuples with first the Hamming distance between the
found keys, then the first key and its value, and then the second key and
its value::
>>> print("\n".join(map(str,list(t.pairs(keylen = 11, maxhd = 2)))))
(1, 'hello world', 0, '*ello world', 'a')
(2, 'hello world', 0, '*ell* world', 'b')
(2, 'hell* w*rl*', 'd', '*ell* w*rld', 'c')
(1, '*ello world', 'a', '*ell* world', 'b')
(2, '*ello world', 'a', '*ell* w*rld', 'c')
(1, '*ell* world', 'b', '*ell* w*rld', 'c')