New Case Study:See how Anthropic automated 95% of dependency reviews with Socket.Learn More →

wuzzy

Package Overview

Dependencies

Advanced tools

Install Socket

Detect and block malicious and high-risk dependencies

Install

wuzzy

library for simularity identification

0.1.8
latest
Source
npm

Version published: 4 years ago

Maintainers: 1

Created: 11 years ago

Source

Overview

Wuzzy was created to provide a smattering of some similarity identification stuff. Several simularity identification algorithm implementations are provided, including:

Jaccard similarity coefficient
Tanimoto coefficient
Pearson correlation
N-gram edit distance
Levenshtein distance
Jaro-Winkler distance

Fuzzy wuzzy was a bear, fuzzy wuzzy had no hair, fuzzy wuzzy wasn't very fuzzy, was he? Well, if you aren't sure maybe this library can help! :)

Installing

Wuzzy can be installed via npm (npm install wuzzy).

Examples

Some examples of using Wuzzy can be found in the real-wuzzy repository.

Methods

All bad jokes aside, below is a listing of the available functions. Have fun!

jarowinkler(a, b, t)

Computes the jaro-winkler distance for two given arrays.

NOTE: this implementation is based on the one found in the Lucene Java library.

Examples:

wuzzy.jarowinkler(
        ['D', 'W', 'A', 'Y', 'N', 'E'],
        ['D', 'U', 'A', 'N', 'E']
    );
    // -> 0.840

wuzzy.jarowinkler(
        'DWAYNE',
        'DUANE'
    );
    // -> 0.840

Params:

String|Array a - the first string/array to compare
String|Array b - the second string/array to compare
Number t - the threshold for adding

Return:

Number returns the jaro-winkler distance for

levenshtein(a, b, w)

Calculates the levenshtein distance for the two provided arrays and returns the normalized distance.

Examples:

wuzzy.levenshtein(
        ['D', 'W', 'A', 'Y', 'N', 'E'],
        ['D', 'U', 'A', 'N', 'E']
    );
    // -> 0.66666667

    or

wuzzy.levenshtein(
        'DWAYNE',
        'DUANE'
    );
    // -> 0.66666667

Params:

String|Array a - the first string/array to compare
String|Array b - the second string/array to compare
Object w - (optional) a set of key/value pairs

Return:

Number returns the levenshtein distance for

ngram(a, b, ng)

Computes the n-gram edit distance for any n (defaults to 2).

NOTE: this implementation is based on the one found in the Lucene Java library.

Examples:

wuzzy.ngram(
        ['D', 'W', 'A', 'Y', 'N', 'E'],
        ['D', 'U', 'A', 'N', 'E']
    );
    // -> 0.583

    or

wuzzy.ngram(
        'DWAYNE',
        'DUANE'
    );
    // -> 0.583

Params:

String|Array a - the first string/array to compare
String|Array b - the second string/array to compare
Number ng - (optional) the n-gram size to work with (defaults to 2)

Return:

Number returns the ngram distance for

pearson(a, b)

Calculates a pearson correlation score for two given objects (compares values of similar keys).

Examples:

wuzzy.pearson(
        {a: 2.5, b: 3.5, c: 3.0, d: 3.5, e: 2.5, f: 3.0},
        {a: 3.0, b: 3.5, c: 1.5, d: 5.0, e: 3.5, f: 3.0, g: 5.0}
    );
    // -> 0.396

    or

wuzzy.pearson(
        {a: 2.5, b: 1},
        {o: 3.5, e: 6.0}
    );
    // -> 1.0

Params:

Object a - the first object to compare
Object b - the second object to compare

Return:

Number returns the pearson correlation for

jaccard(a, b)

Calculates the jaccard index for the two provided arrays.

Examples:

wuzzy.jaccard(
        ['a', 'b', 'c', 'd', 'e', 'f'],
        ['a', 'e', 'f']
    );
    // -> 0.5

    or

wuzzy.jaccard(
        'abcdef',
        'aef'
    );
    // -> 0.5

    or 

wuzzy.jaccard(
        ['abe', 'babe', 'cabe', 'dabe', 'eabe', 'fabe'],
        ['babe']
    );
    // -> 0.16666667

Params:

String|Array a - the first string/array to compare
String|Array b - the second string/array to compare

Return:

Number returns the jaccard index for

tanimoto(a, b)

Calculates the tanimoto distance (weighted jaccard index).

Examples:

wuzzy.tanimoto(
        ['a', 'b', 'c', 'd', 'd', 'e', 'f', 'f'],
        ['a', 'e', 'f']
    );
    // -> 0.375

    or

wuzzy.tanimoto(
        'abcddeff',
        'aef'
    );
    // -> 0.375

    or 

wuzzy.tanimoto(
        ['abe', 'babe', 'cabe', 'dabe', 'eabe', 'fabe', 'fabe'],
        ['babe']
    );
    // -> 0.14285714

Params:

String|Array a - the first string/array to compare
String|Array b - the second string/array to compare

Return:

Number returns the tanimoto distance for

Keywords

FAQs

What is wuzzy?

Is wuzzy well maintained?

Package last updated on 26 Jun 2021

Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

wuzzy

Overview

Installing

Examples

Methods

jarowinkler(a, b, t)

Examples:

Params:

Return:

levenshtein(a, b, w)

Examples:

Params:

Return:

ngram(a, b, ng)

Examples:

Params:

Return:

pearson(a, b)

Examples:

Params:

Return:

jaccard(a, b)

Examples:

Params:

Return:

tanimoto(a, b)

Examples:

Params:

Return:

Keywords

Related posts

Bybit Hack Puts Crypto Losses at $1.6B, Surpassing All of Last Year in Just Two Months

OpenSSF Launches Open Source Project Security Baseline to Strengthen Software Supply Chain