synonym-optimizer
Gives a score to a string depending on the variety of the synonyms used.
For instance, compare "The coffee is good. I love that coffee." with "The coffee is good. I love that beverage." The second alternative is better because a synonym is used for "coffee", so this module gives it a better score.
The lower the score, the better.
Fully supported languages are French, German, English, Italian, and Spanish.
What it does / How it works:
- wink-tokenizer tokenizes the text
- snowball-stemmer stems words in the fully supported languages (for all other languages: no stemming)

Designed primarily to test the output of an NLG (Natural Language Generation) system.
The stemmer is not perfect. For instance in Italian, cameriere and cameriera have the same stem (camerier), while camerieri and cameriera have a different one (camer and camerier).
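The general idea (tokenize, drop stopwords, penalize repeated words) can be illustrated with a toy scorer. This is a minimal sketch and not the algorithm used by synonym-optimizer: the function name toyScore, the tiny stopword list, and the "count each repeat as 1" rule are all assumptions made for illustration, and there is no stemming.

```javascript
// Toy illustration only: NOT the scoring used by synonym-optimizer.
// The stopword list and the scoring rule are assumptions for this sketch.
const TOY_STOPWORDS = new Set(['the', 'is', 'i', 'that', 'a', 'an', 'of']);

function toyScore(text) {
  // 1. Tokenize: lowercase the text and keep runs of ASCII letters.
  const tokens = text.toLowerCase().match(/[a-z]+/g) || [];

  // 2. Drop stopwords so that words like "the" do not count as repetitions.
  const words = tokens.filter((word) => !TOY_STOPWORDS.has(word));

  // 3. Count every occurrence of a word beyond its first one.
  const seen = new Set();
  let repetitions = 0;
  for (const word of words) {
    if (seen.has(word)) {
      repetitions += 1;
    } else {
      seen.add(word);
    }
  }
  return repetitions; // lower is better
}

console.log(toyScore('The coffee is good. I love that coffee.'));   // 1
console.log(toyScore('The coffee is good. I love that beverage.')); // 0
```

The numbers differ from the package's real scores; only the ordering (the text with the synonym scores lower) is meant to carry over.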
npm install synonym-optimizer
const synOptimizer = require('synonym-optimizer');

const alts = [
  'The coffee is good. I love that coffee.',
  'The coffee is good. I love that beverage.',
];

// Score each alternative; the lower the score, the better.
alts.forEach((alt) => {
  const score = synOptimizer.scoreAlternative('en_US', alt, null, null, null, null);
  console.log(`${alt}: ${score}`);
});

/*
Output:
The coffee is good. I love that coffee.: 0.5
The coffee is good. I love that beverage.: 0
*/
The main function is scoreAlternative. It takes a string and returns its score. Arguments are:
- lang (string, mandatory): the language.
  - fully supported: fr_FR, en_US, de_DE, it_IT and es_ES
  - for any other language (for instance nl_NL), stemming is disabled and stopwords are not removed
- alternative (string, mandatory): the string to score
- stopWordsToAdd (string[], optional): list of stopwords to add to the standard stopwords list
- stopWordsToRemove (string[], optional): list of stopwords to remove from the standard stopwords list
- stopWordsOverride (string[], optional): replaces the standard stopwords list
- identicals (string[][], optional): list of words that should be considered identical, for instance [ ['phone', 'cellphone', 'smartphone'] ]

You can also use the getBest function. Most arguments are exactly the same, but instead of alternative, use alternatives (string[]). The output is not the score but the index of the best alternative.
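To make the "index of the best alternative" output concrete, here is a minimal sketch of the selection step. pickBest and countRepeats are hypothetical helpers written for this example, not part of synonym-optimizer; countRepeats is a stand-in scorer so that the snippet runs without the package installed. Returning the first index on a tie is also an assumption.

```javascript
// Hypothetical helper, not part of synonym-optimizer: returns the index of
// the alternative with the lowest score (first one wins on a tie).
function pickBest(alternatives, scoreFn) {
  let bestIndex = -1;
  let bestScore = Infinity;
  alternatives.forEach((alt, index) => {
    const score = scoreFn(alt);
    if (score < bestScore) {
      bestScore = score;
      bestIndex = index;
    }
  });
  return bestIndex;
}

// Stand-in scorer (assumption for this sketch): number of repeated words.
function countRepeats(text) {
  const words = text.toLowerCase().match(/[a-z]+/g) || [];
  return words.length - new Set(words).size;
}

const candidates = [
  'The coffee is good. I love that coffee.',
  'The coffee is good. I love that beverage.',
];

const best = pickBest(candidates, countRepeats);
console.log(best);             // 1
console.log(candidates[best]); // The coffee is good. I love that beverage.
```

With the package installed, the corresponding call would presumably be synOptimizer.getBest('en_US', candidates, null, null, null, null), assuming the argument order matches the list above.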
The tokenizer is wink-tokenizer. It works with many languages (English, French, German, Hindi, Sanskrit, Marathi, etc.) but not with East Asian languages, so the module will not work properly with Japanese, Chinese, etc.
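One common reason word tokenizers struggle with Chinese or Japanese is that these languages do not put spaces between words. The sketch below uses a naive whitespace split as a stand-in to show the effect; it is not wink-tokenizer, and naiveTokenize is a hypothetical name used only here.

```javascript
// Naive whitespace tokenizer, used only to illustrate the problem.
// This is NOT wink-tokenizer.
function naiveTokenize(text) {
  return text.trim().split(/\s+/).filter((token) => token.length > 0);
}

// English: spaces separate the words, so four tokens come out.
console.log(naiveTokenize('I love that coffee').length); // 4

// Chinese: no spaces between words, so the whole sentence is one token
// and repeated words inside it cannot be detected.
console.log(naiveTokenize('我喜欢那杯咖啡').length); // 1
```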
- stopwords-* packages for the stopwords lists
- a stemmer from the snowball-stemmer collection (or plug another stemmer)
- another tokenizer when wink-tokenizer does not work

The build writes the stopwords as AsciiDoc in the rosaenlg-doc module.
- wink-tokenizer to tokenize sentences in multiple languages (MIT).
- stopwords-en/de/fr/it/es for standard stopwords lists per language (MIT).
- snowball-stemmer to stem words per language (MIT).
FAQs
Finds the text which has the least number of repetitions
The npm package synonym-optimizer receives a total of 237 weekly downloads. As such, synonym-optimizer popularity was classified as not popular.
We found that synonym-optimizer demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 0 open source maintainers collaborating on the project.