hermetrics

Javascript version of hermetrics.py

1.0.10
JavaScript library for distance and similarity metrics, translated from hermetrics.py.

License: MIT

Content

  • Installation
  • Usage
  • Metrics
    • Levenshtein
    • Hamming (work in progress)
    • OSA (work in progress)
    • Damerau-Levenshtein (work in progress)
    • Jaccard (work in progress)
    • Dice (work in progress)
    • Jaro (work in progress)
    • Jaro-Winkler (work in progress)
    • Metric comparator (work in progress)

Installation

From npm

$ npm i hermetrics --save

Usage

Require the package and import the desired class:

const { Levenshtein } = require('hermetrics');

const levenshtein = new Levenshtein();

levenshtein.distance('start', 'end');
levenshtein.maxDistance('start', 'end');

Using custom operation costs:

const { Levenshtein } = require('hermetrics');

const levenshtein = new Levenshtein();

const opts = {
  insertionCost: 3,
  substitutionCost: 2,
  deletionCost: 5
};

levenshtein.distance('start', 'end', opts);
levenshtein.maxDistance('start', 'end', opts);

Metrics

Overview

Hermetrics is a library designed for use in experimentation with string metrics. The library features a base class Metric which is highly configurable and can be used to implement custom metrics.

Several common string metrics are already implemented on top of Metric to compute the distance between two strings. Some edit distance metrics, such as Levenshtein, can be parametrized with a different cost for each edit operation, although they have only been thoroughly tested with all costs equal to 1. The implemented metrics can also be used to compare any iterable, not only strings, but more tests are needed.

A metric has three main methods: distance, normalizeDistance and similarity. In general, distance computes the absolute distance between two strings, normalizeDistance scales that distance to a particular range, usually between 0 and 1, and similarity is normally defined as 1 - normalizeDistance.

The normalization of the distance can be customized by overriding the auxiliary methods used to compute it: maxDistance, minDistance and normalize.
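As a rough illustration of how these pieces relate, here is a minimal sketch that assumes normalization is a linear rescaling between the minimum and maximum distance. The function names, signatures and the handling of an empty range are assumptions made for this example and may differ from the library's actual methods.

```javascript
// Hypothetical helpers: names and signatures are assumptions, not the library's API.

// Linearly rescale x from the range [low, high] to [0, 1].
function normalizeSketch(x, low = 0, high = 1) {
  if (high <= low) {
    return 0; // degenerate range: nothing to scale against
  }
  return (x - low) / (high - low);
}

// Normalized distance computed from a raw distance and its bounds.
function normalizedDistanceSketch(distance, minDistance, maxDistance) {
  return normalizeSketch(distance, minDistance, maxDistance);
}

// Similarity defined as 1 minus the normalized distance.
function similaritySketch(distance, minDistance, maxDistance) {
  return 1 - normalizedDistanceSketch(distance, minDistance, maxDistance);
}

// Example: a raw distance of 1 with bounds 0 and 4.
console.log(normalizedDistanceSketch(1, 0, 4)); // 0.25
console.log(similaritySketch(1, 0, 4));         // 0.75
```

Overriding any one of the three pieces (the lower bound, the upper bound, or the scaling function) changes the normalized distance and the similarity without touching the raw distance.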

Metric class

Metric is a base class that can receive a metric name as an argument. It contains six functions that serve as the methods of the metric being implemented.

Default methods

Description of default methods for the Metric class.

In general a method of a metric receives three parameters:

  • source: The source string or iterable to compare.
  • target: The target string or iterable to compare.
  • costs: An optional object that contains custom insertion, deletion and substitution costs. Each cost defaults to 1.

| Method | Description |
| --- | --- |
| distance | Computes the total cost of transforming the source string into the target string. The default method returns 0 if the strings are equal and 1 otherwise. |
| maxDistance | Returns the maximum value of the distance between source and target given a specific cost for edit operations. The default method returns 0 if both source and target are empty and 1 otherwise. |
| minDistance | work in progress |
| normalize | work in progress |
| normalized distance | work in progress |
| similarity | work in progress |
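
The two documented defaults can be sketched as follows. This is only an illustration of the behaviour described above: BaseMetricSketch and sameSequence are hypothetical names, the default name 'Generic' is an assumption, and the real Metric class may be structured differently.

```javascript
// A minimal sketch of the documented default behaviour.
// BaseMetricSketch is a hypothetical name, not the library's Metric class.
class BaseMetricSketch {
  constructor(name = 'Generic') {
    this.name = name;
  }

  // Returns 0 when source and target are equal, 1 otherwise.
  // The costs argument is accepted but ignored by this default.
  distance(source, target, costs = {}) {
    return sameSequence(source, target) ? 0 : 1;
  }

  // Returns 0 when both inputs are empty, 1 otherwise.
  maxDistance(source, target, costs = {}) {
    return source.length === 0 && target.length === 0 ? 0 : 1;
  }
}

// Element-wise equality that works for strings and arrays alike.
function sameSequence(a, b) {
  const x = [...a];
  const y = [...b];
  return x.length === y.length && x.every((item, i) => item === y[i]);
}

const sketch = new BaseMetricSketch('demo');
console.log(sketch.distance('start', 'start')); // 0
console.log(sketch.distance('start', 'end'));   // 1
console.log(sketch.maxDistance('', ''));        // 0
```

A concrete metric would then extend such a base class and override distance and maxDistance with its own logic.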

Levenshtein metric

Levenshtein distance is usually known as "the" edit distance. It is defined as the minimum number of edit operations (deletion, insertion and substitution) needed to transform the source string into the target string. The distance is computed with the dynamic programming approach that builds the full matrix. Optimizations for time and space complexity exist, but they are not implemented here.
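
For readers unfamiliar with the full-matrix approach, the following standalone sketch shows one way to write it. It is not the library's source code: levenshteinSketch is a hypothetical function, and the option name insertionCost is assumed by analogy with deletionCost and substitutionCost from the usage example.

```javascript
// Hypothetical standalone function, not the library's implementation.
// The insertionCost option name is an assumption.
function levenshteinSketch(source, target, costs = {}) {
  const { insertionCost = 1, deletionCost = 1, substitutionCost = 1 } = costs;
  const s = [...source]; // spreading lets strings and arrays be handled alike
  const t = [...target];
  const rows = s.length + 1;
  const cols = t.length + 1;

  // matrix[i][j] = cost of turning the first i items of s into the first j items of t.
  const matrix = Array.from({ length: rows }, () => new Array(cols).fill(0));

  // First column: delete every item of s. First row: insert every item of t.
  for (let i = 1; i < rows; i++) matrix[i][0] = i * deletionCost;
  for (let j = 1; j < cols; j++) matrix[0][j] = j * insertionCost;

  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      const sameItem = s[i - 1] === t[j - 1];
      matrix[i][j] = Math.min(
        matrix[i - 1][j] + deletionCost,
        matrix[i][j - 1] + insertionCost,
        matrix[i - 1][j - 1] + (sameItem ? 0 : substitutionCost)
      );
    }
  }
  return matrix[rows - 1][cols - 1];
}

console.log(levenshteinSketch('kitten', 'sitting')); // 3
console.log(levenshteinSketch('start', 'end'));      // 5
```

The full matrix uses memory proportional to the product of the two lengths. Keeping only two rows at a time is one of the usual space optimizations the paragraph above alludes to.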

Package last updated on 04 Mar 2020
