fuzzyset - A fuzzy string set for JavaScript.
fuzzyset is a data structure that performs something akin to full-text search
against data to find likely misspellings and approximate string matches.
Note that this is a JavaScript port of a Python library.
Usage
The usage is simple. Just add a string to the set, and ask for it later by using .get:
const a = FuzzySet();
a.add("michael axiak");
a.get("micael asiak");
The result will be an array of [score, matched_value] arrays.
The score is between 0 and 1, with 1 being a perfect match.
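As a rough sketch of how that result shape might be consumed: the helper name bestMatch is hypothetical (it is not part of the library), and the sample array is hand-written to show the shape only, with made-up scores rather than real library output.

```javascript
// Hypothetical helper: pick the highest-scoring pair from a .get() result.
// `results` is either null (no match) or an array of [score, matched_value] arrays.
function bestMatch(results) {
  if (!results || results.length === 0) {
    return null;
  }
  let best = results[0];
  for (const pair of results) {
    if (pair[0] > best[0]) {
      best = pair;
    }
  }
  return { score: best[0], value: best[1] };
}

// Hand-written sample in the documented shape; the scores are made up.
const sample = [[0.5, "michael jackson"], [0.9, "michael axiak"]];
const top = bestMatch(sample);
console.log(top.value, top.score);
```

Scanning for the maximum avoids relying on any particular ordering of the returned pairs.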
Construction Arguments
array
: An array of strings to initialize the data structure with.
useLevenshtein
: Whether or not to use the Levenshtein distance to determine the match scoring. Default: true
gramSizeLower
: The lower bound of gram sizes to use, inclusive (see Theory of operation). Default: 2
gramSizeUpper
: The upper bound of gram sizes to use, inclusive (see Theory of operation). Default: 3
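To give a feel for what these options control, here is a rough sketch of character n-gram extraction over an inclusive size range, plus a Levenshtein-based similarity score. The function names (grams, levenshtein, similarity) and the normalization (lowercasing only, no padding, distance divided by the longer length) are assumptions made for illustration; the library's internals may differ.

```javascript
// Illustrative only: split a string into character n-grams for every size
// from lower to upper, inclusive (mirroring gramSizeLower / gramSizeUpper).
function grams(value, lower = 2, upper = 3) {
  const text = value.toLowerCase();
  const out = [];
  for (let size = lower; size <= upper; size++) {
    for (let i = 0; i + size <= text.length; i++) {
      out.push(text.slice(i, i + size));
    }
  }
  return out;
}

// Classic dynamic-programming Levenshtein edit distance, keeping one row at a time.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumed normalization: 1 means identical strings, 0 means every position differs.
function similarity(a, b) {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}

console.log(grams("axiak"));
console.log(similarity("michael axiak", "micael asiak"));
```

With the default bounds of 2 and 3, each string contributes both its bigrams and its trigrams; widening the range produces more grams per string.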
Methods
get(value, [default], [minScore=.33])
: tries to match a string to entries with a score of at least minScore (defaulted to .33), otherwise returns null or default if it is given.
add(value)
: adds a value to the set, returning false if it is already in the set.
length()
: returns the number of items in the set.
isEmpty()
: returns true if the set is empty.
values()
: returns an array of the values in the set.
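A minimal sketch of calling these methods from application code. The wrapper name matchedValues and the 0.8 threshold are hypothetical choices, and the standIn object exists only so the snippet runs on its own: it matches exact strings and is not how the library scores candidates. In real use you would pass a FuzzySet instance instead.

```javascript
// Hypothetical wrapper around the documented get(value, [default], [minScore]).
// `set` is assumed to be a FuzzySet instance (or anything with the same methods).
function matchedValues(set, query, minScore = 0.8) {
  if (set.isEmpty()) {
    return [];
  }
  // Passing null as the default keeps "no match" easy to detect.
  const results = set.get(query, null, minScore);
  // Each entry is a [score, matched_value] array; keep only the values.
  return results === null ? [] : results.map(([, value]) => value);
}

// Stand-in for demonstration only: exact-match lookup, ignores minScore.
const standIn = {
  items: ["michael axiak"],
  isEmpty() {
    return this.items.length === 0;
  },
  get(value, fallback = null, minScore = 0.33) {
    return this.items.includes(value) ? [[1, value]] : fallback;
  },
};

console.log(matchedValues(standIn, "michael axiak"));
console.log(matchedValues(standIn, "someone else"));
```

Raising minScore above the default of .33 trades recall for fewer false positives, which may suit lookups where a wrong match is worse than no match.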
Interactive Documentation
To play with the library or see how it works internally, check out the amazing interactive documentation.
Install
Include this in your page:
<script type="text/javascript" src="/path/to/fuzzyset.js"></script>
or:
npm install fuzzyset.js
(the .js is important). In a CommonJS environment, the fuzzyset.js module exports the FuzzySet function.
License
This package is licensed under the Prosperity Public License 3.0.
That means that this package is free to use for non-commercial projects — personal projects, public benefit projects, research, education, etc. (see the license for full details). If your project is commercial (even for internal use at your company), you have 30 days to try this package for free before you have to pay a one-time licensing fee of $49 (~one hour of a junior developer's time).
You can purchase a commercial license instantly here.
Why this license scheme? Since I quit tech to become a therapist, my income is much lower (due to the unjust costs of mental health care in the US, but don't get me started). I'm asking for paid licenses for Fuzzyset.js to support all the free work I've done on this project over the past 8 years (!) and so I can live a sustainable life in service of my therapy clients. If you're a small operation that would like to use Fuzzyset.js but can't swing the license, please reach out to me and we can work something out.