string-similarity
Finds degree of similarity between strings, based on Dice's Coefficient, which is mostly better than Levenshtein distance.
The string-similarity npm package provides functions to measure the similarity between two strings and to find the best match for a target string among a set of candidates. It calculates a similarity score based on Dice's Coefficient and can be used in applications such as fuzzy matching, search optimization, and data deduplication.
Comparing two strings for similarity
This feature allows you to compare two strings and get a similarity score between 0 and 1, where 1 means the strings are identical.
const stringSimilarity = require('string-similarity');
const similarity = stringSimilarity.compareTwoStrings('string1', 'string2');
Finding the best match in an array of strings
This feature allows you to compare a target string against an array of strings and find the one that is most similar to the target. It returns an object with the best match and ratings for all strings.
const stringSimilarity = require('string-similarity');
const matches = stringSimilarity.findBestMatch('string', ['string1', 'string2', 'string3']);
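The best-match search described here can be pictured as scoring every candidate and keeping the highest score. The following is a minimal sketch of that idea, not the package's own code: pickBestMatch and sharedPrefixScore are hypothetical names, and the scorer is a deliberately simple stand-in that could be swapped for any function returning a value between 0 and 1.

```javascript
// Hypothetical sketch: neither function below is exported by string-similarity.

// Toy stand-in scorer: fraction of the longer string covered by the common prefix.
function sharedPrefixScore(a, b) {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 1; // two empty strings are identical
  let i = 0;
  while (i < a.length && i < b.length && a[i] === b[i]) i++;
  return i / longest;
}

// Scores mainString against every target and tracks the highest rating.
// Assumes targetStrings is a non-empty array.
function pickBestMatch(mainString, targetStrings, score) {
  const ratings = targetStrings.map((target) => ({
    target,
    rating: score(mainString, target),
  }));
  let bestMatchIndex = 0;
  for (let i = 1; i < ratings.length; i++) {
    if (ratings[i].rating > ratings[bestMatchIndex].rating) bestMatchIndex = i;
  }
  return { ratings, bestMatch: ratings[bestMatchIndex], bestMatchIndex };
}

const result = pickBestMatch("string", ["strong", "string1", "other"], sharedPrefixScore);
console.log(result.bestMatch.target); // "string1"
```

Passing the scorer in as an argument keeps the search loop independent of how similarity is measured.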
The levenshtein package calculates the Levenshtein distance between two strings, a measure of how many edits separate two sequences. It focuses on edit distance rather than on a similarity score.
Fuzzyset.js is a fuzzy string set for JavaScript. It uses Levenshtein distance to compute the difference between strings and is useful for making fuzzy string matching more efficient by using a set data structure.
Natural is a general natural language facility for Node.js. It includes a variety of string comparison algorithms, including Jaro-Winkler and Levenshtein distance, and provides more comprehensive natural language processing features beyond string comparison.
Finds degree of similarity between two strings, based on Dice's Coefficient, which is mostly better than Levenshtein distance.
Install using:
npm install string-similarity --save
In your code:
var stringSimilarity = require("string-similarity");
var similarity = stringSimilarity.compareTwoStrings("healed", "sealed");
var matches = stringSimilarity.findBestMatch("healed", [
"edward",
"sealed",
"theatre",
]);
Include <script src="//unpkg.com/string-similarity/umd/string-similarity.min.js"></script> to get the latest version.
Or <script src="//unpkg.com/string-similarity@4.0.1/umd/string-similarity.min.js"></script> to get a specific version (4.0.1 in this case).
This exposes a global variable called stringSimilarity, which you can start using.
<script>
stringSimilarity.compareTwoStrings('what!', 'who?');
</script>
(The package is exposed as UMD, so you can consume it as such)
The package contains two methods:
compareTwoStrings(string1, string2)
Returns a fraction between 0 and 1, which indicates the degree of similarity between the two strings. 0 indicates completely different strings, 1 indicates identical strings. The comparison is case-sensitive.
The order of the two arguments does not make a difference.
Returns (number): A fraction from 0 to 1, both inclusive. A higher number indicates more similarity.
stringSimilarity.compareTwoStrings("healed", "sealed");
// → 0.8
stringSimilarity.compareTwoStrings(
  "Olive-green table for sale, in extremely good condition.",
  "For sale: table in very good condition, olive green in colour."
);
// → 0.6060606060606061
stringSimilarity.compareTwoStrings(
  "Olive-green table for sale, in extremely good condition.",
  "For sale: green Subaru Impreza, 210,000 miles"
);
// → 0.2558139534883721
stringSimilarity.compareTwoStrings(
  "Olive-green table for sale, in extremely good condition.",
  "Wanted: mountain bike with at least 21 gears."
);
// → 0.1411764705882353
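To make the scores more concrete, here is a minimal sketch of Dice's Coefficient computed over adjacent character pairs (bigrams): twice the number of shared pairs, divided by the total number of pairs in both strings. The names bigrams and diceCoefficient are hypothetical, and the package's real implementation may differ in details such as whitespace handling, so this sketch is not guaranteed to reproduce every rating listed here.

```javascript
// Illustrative sketch only; these helpers are not part of string-similarity.

// Counts each adjacent character pair in the text.
function bigrams(text) {
  const counts = new Map();
  for (let i = 0; i < text.length - 1; i++) {
    const pair = text.substring(i, i + 2);
    counts.set(pair, (counts.get(pair) || 0) + 1);
  }
  return counts;
}

function diceCoefficient(first, second) {
  if (first === second) return 1; // identical strings
  if (first.length < 2 || second.length < 2) return 0; // no bigrams to compare

  const firstPairs = bigrams(first);
  let shared = 0;
  for (let i = 0; i < second.length - 1; i++) {
    const pair = second.substring(i, i + 2);
    const remaining = firstPairs.get(pair) || 0;
    if (remaining > 0) {
      // Consume the pair so repeated bigrams are not counted twice.
      firstPairs.set(pair, remaining - 1);
      shared++;
    }
  }
  // 2 * shared pairs / (pairs in first + pairs in second)
  return (2 * shared) / (first.length - 1 + (second.length - 1));
}

console.log(diceCoefficient("healed", "sealed")); // 0.8
```

"healed" and "sealed" each contain five bigrams and share four of them (ea, al, le, ed), which gives 2 * 4 / 10 = 0.8.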
findBestMatch(mainString, targetStrings)
Compares mainString against each string in targetStrings.
Returns (Object): An object with a ratings property, which gives a similarity rating for each target string; a bestMatch property, which specifies which target string was most similar to the main string; and a bestMatchIndex property, which specifies the index of the bestMatch in the targetStrings array.
stringSimilarity.findBestMatch('Olive-green table for sale, in extremely good condition.', [
  'For sale: green Subaru Impreza, 210,000 miles',
  'For sale: table in very good condition, olive green in colour.',
  'Wanted: mountain bike with at least 21 gears.'
]);
// →
{
  ratings: [
    { target: 'For sale: green Subaru Impreza, 210,000 miles',
      rating: 0.2558139534883721 },
    { target: 'For sale: table in very good condition, olive green in colour.',
      rating: 0.6060606060606061 },
    { target: 'Wanted: mountain bike with at least 21 gears.',
      rating: 0.1411764705882353 }
  ],
  bestMatch:
    { target: 'For sale: table in very good condition, olive green in colour.',
      rating: 0.6060606060606061 },
  bestMatchIndex: 1
}
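Because ratings keeps the order of the supplied targetStrings, a caller who wants every candidate ranked, or only those above a cutoff, has to sort and filter the array. A small sketch of that step follows, using the result object from the example as literal data. rankMatches and its minimumRating parameter are hypothetical and not part of the package.

```javascript
// Result object copied from the findBestMatch example output.
const matches = {
  ratings: [
    { target: 'For sale: green Subaru Impreza, 210,000 miles', rating: 0.2558139534883721 },
    { target: 'For sale: table in very good condition, olive green in colour.', rating: 0.6060606060606061 },
    { target: 'Wanted: mountain bike with at least 21 gears.', rating: 0.1411764705882353 }
  ],
  bestMatch: { target: 'For sale: table in very good condition, olive green in colour.', rating: 0.6060606060606061 },
  bestMatchIndex: 1
};

// Hypothetical helper: keeps ratings at or above minimumRating, highest first.
// filter() returns a new array, so the original ratings order is left untouched.
function rankMatches(result, minimumRating = 0) {
  return result.ratings
    .filter((entry) => entry.rating >= minimumRating)
    .sort((a, b) => b.rating - a.rating);
}

const ranked = rankMatches(matches, 0.2);
console.log(ranked.map((entry) => entry.target));
```

With a cutoff of 0.2, the mountain bike listing is dropped and the table listing is ranked ahead of the Subaru listing.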
Release notes
Performance improvement for compareTwoStrings(..): now O(n) instead of O(n^2)
Added bestMatchIndex to the results for findBestMatch(..) to point to the best match in the supplied targetStrings array
Used substring instead of substr
FAQs
The npm package string-similarity receives a total of 1,717,662 weekly downloads. As such, string-similarity is classified as popular.
We found that string-similarity has an unhealthy version release cadence and level of project activity, because the last version was released a year ago. It has 1 open source maintainer collaborating on the project.