synonym-optimizer
Gives a score to a string depending on the variety of the synonyms used.
For instance, compare "The coffee is good. I love that coffee" with "The coffee is good. I love that beverage". The second alternative is better because a synonym is used for coffee, so this module gives it a better score.
The lower the score, the better.
Fully supported languages are French, German, English, Italian and Spanish.
What it does / How it works:
tokenizes the text with wink-tokenizer
stems the words with snowball-stemmer for the fully supported languages (for all other languages: no stemming)
Designed primarily to test the output of an NLG (Natural Language Generation) system.
The stemmer is not perfect. For instance in Italian, cameriere and cameriera have the same stem (camerier), while camerieri and cameriera have a different one (camer and camerier).
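To make the pipeline concrete, here is a minimal, self-contained sketch of the same idea: tokenize, drop stopwords, stem, then penalize repeated stems. Everything in it is an assumption made for illustration: toyTokenize, toyStem, TOY_STOPWORDS and toyScore are hypothetical stand-ins, not the module's code, and the scoring rule is deliberately naive, so the numbers it prints do not match the scores returned by synonym-optimizer.

```javascript
// Illustrative only: a toy version of the pipeline, NOT the module's actual code.
// The tokenizer, stopword list, stemmer and scoring rule are simplified stand-ins.
const TOY_STOPWORDS = new Set(['the', 'is', 'i', 'that', 'a', 'an']);

// Stand-in for wink-tokenizer: lowercase the text and keep runs of letters.
function toyTokenize(text) {
  return text.toLowerCase().match(/\p{L}+/gu) || [];
}

// Stand-in for snowball-stemmer: strip a trailing "s" from longer words.
function toyStem(word) {
  return word.length > 3 && word.endsWith('s') ? word.slice(0, -1) : word;
}

// Hypothetical scoring rule: count every occurrence of a stem after its first.
function toyScore(text) {
  const seen = new Set();
  let repetitions = 0;
  for (const token of toyTokenize(text)) {
    if (TOY_STOPWORDS.has(token)) continue;
    const stem = toyStem(token);
    if (seen.has(stem)) repetitions += 1;
    seen.add(stem);
  }
  return repetitions;
}

console.log(toyScore('The coffee is good. I love that coffee.'));   // 1
console.log(toyScore('The coffee is good. I love that beverage.')); // 0
```

As in the real module, the lower value marks the better alternative; only the ordering is meant to carry over, not the absolute values.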
npm install synonym-optimizer
const synOptimizer = require('synonym-optimizer');

const alts = [
  'The coffee is good. I love that coffee.',
  'The coffee is good. I love that beverage.'
];

/*
Expected output:
The coffee is good. I love that coffee.: 0.5
The coffee is good. I love that beverage.: 0
*/
alts.forEach((alt) => {
  // Optional arguments (stopwords and identicals) are left as null.
  const score = synOptimizer.scoreAlternative('en_US', alt, null, null, null, null);
  console.log(`${alt}: ${score}`);
});
The main function is scoreAlternative. It takes a string and returns its score. Arguments are:
lang (string, mandatory): the language. Fully supported languages are fr_FR, en_US, de_DE, it_IT and es_ES; for any other language (for instance nl_NL) stemming is disabled and stopwords are not removed
alternative (string, mandatory): the string to score
stopWordsToAdd (string[], optional): list of stopwords to add to the standard stopwords list
stopWordsToRemove (string[], optional): list of stopwords to remove from the standard stopwords list
stopWordsOverride (string[], optional): replaces the standard stopwords list
identicals (string[][], optional): list of words that should be considered as being identical, for instance [ ['phone', 'cellphone', 'smartphone'] ]
You can also use the getBest function. Most arguments are exactly the same, but instead of alternative, use alternatives (string[]). The returned number is not the score, but simply the index of the best alternative.
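The "index of the best alternative" behaviour can be sketched as a lowest-score selection. This is only an illustration under stated assumptions: pickBestIndex and the stub scorer are hypothetical names, the scorer is injected so the snippet runs without the module installed, and keeping the first alternative on a tie is an assumption rather than documented behaviour. With the real module, the call would presumably be synOptimizer.getBest('en_US', alts, null, null, null, null), following the argument order described above.

```javascript
// Sketch of a "lowest score wins" selection. scoreFn is a hypothetical
// stand-in for scoreAlternative so the snippet runs without the module.
function pickBestIndex(alternatives, scoreFn) {
  let bestIndex = 0;
  let bestScore = Infinity;
  alternatives.forEach((alt, index) => {
    const score = scoreFn(alt);
    // Strict comparison: on a tie, the earlier alternative is kept (assumption).
    if (score < bestScore) {
      bestScore = score;
      bestIndex = index;
    }
  });
  return bestIndex;
}

// Stub scorer mirroring the scores shown in the usage example.
const knownScores = new Map([
  ['The coffee is good. I love that coffee.', 0.5],
  ['The coffee is good. I love that beverage.', 0],
]);
const bestIndex = pickBestIndex([...knownScores.keys()], (alt) => knownScores.get(alt));
console.log(bestIndex); // 1
```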
The tokenizer is wink-tokenizer. It works with many languages (English, French, German, Hindi, Sanskrit, Marathi, etc.) but not with languages such as Japanese or Chinese, so the module will not work properly with them.
To support another language:
a stopwords list (stopwords-*)
a stemmer from the snowball-stemmer collection (or plug another stemmer)
another tokenizer if wink-tokenizer does not work
Dependencies:
wink-tokenizer to tokenize sentences in multiple languages (MIT).
stopwords-en/de/fr/it/es for standard stopwords lists per language (MIT).
snowball-stemmer to stem words per language (MIT).
FAQs
Finds the text which has the least number of repetitions
The npm package synonym-optimizer receives a total of 223 weekly downloads. As such, synonym-optimizer popularity was classified as not popular.
We found that synonym-optimizer demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 0 open source maintainers collaborating on the project.