synonym-optimizer
Gives a score to a string depending on the variety of the synonyms used.
For instance, let's compare "The coffee is good. I love that coffee." with "The coffee is good. I love that beverage." The second alternative is better because a synonym is used for coffee. This module will give a better score to the second alternative.
The lower the score, the better.
Fully supported languages are French, German, English, Italian and Spanish.
What it does / How it works:
words are extracted with the wink-tokenizer tokenizer
words are stemmed with snowball-stemmer for the fully supported languages (for all other languages: no stemming)

Designed primarily to test the output of an NLG (Natural Language Generation) system.
The stemmer is not perfect. For instance in Italian, cameriere and cameriera have the same stem (camerier), while camerieri and cameriera have a different one (camer and camerier).
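To make the idea of scoring repetitions concrete, here is a toy sketch in plain JavaScript. It is not the module's actual algorithm: the stopword list, the regex tokenizer and the scoring rule (one point per repeated non-stopword) are all assumptions made for illustration, and no stemming is performed.

```javascript
// Toy illustration only: NOT the scoring used by synonym-optimizer.
// The stopword list and the scoring rule below are hypothetical.
const TOY_STOPWORDS = new Set(['the', 'is', 'i', 'that', 'a', 'an']);

// Split on anything that is not a letter (the real module uses wink-tokenizer).
function toyTokenize(text) {
  return text.toLowerCase().split(/[^a-z]+/).filter((w) => w.length > 0);
}

// Add one point each time a non-stopword has already been seen.
function toyScore(text) {
  const seen = new Map();
  let repetitions = 0;
  for (const word of toyTokenize(text)) {
    if (TOY_STOPWORDS.has(word)) continue;
    const count = seen.get(word) || 0;
    if (count > 0) repetitions += 1;
    seen.set(word, count + 1);
  }
  return repetitions;
}

console.log(toyScore('The coffee is good. I love that coffee.'));   // 1
console.log(toyScore('The coffee is good. I love that beverage.')); // 0
```

As in the real module, the lower score marks the better alternative; the absolute values here are not comparable to the scores the package returns.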
npm install synonym-optimizer
const synOptimizer = require('synonym-optimizer');

const alts = [
  'The coffee is good. I love that coffee.',
  'The coffee is good. I love that beverage.'
];

// Score each alternative; the lower the score, the better.
alts.forEach((alt) => {
  const score = synOptimizer.scoreAlternative('en_US', alt, null, null, null, null);
  console.log(`${alt}: ${score}`);
});

/*
Output:
The coffee is good. I love that coffee.: 0.5
The coffee is good. I love that beverage.: 0
*/
The main function is scoreAlternative. It takes a string and returns its score. Arguments are:
lang (string, mandatory): the language. Fully supported values are fr_FR, en_US, de_DE, it_IT and es_ES; with any other language (for instance nl_NL) stemming is disabled and stopwords are not removed
alternative (string, mandatory): the string to score
stopWordsToAdd (string[], optional): list of stopwords to add to the standard stopwords list
stopWordsToRemove (string[], optional): list of stopwords to remove from the standard stopwords list
stopWordsOverride (string[], optional): replaces the standard stopword list
identicals (string[][], optional): list of words that should be considered as being identical, for instance [ ['phone', 'cellphone', 'smartphone'] ]

You can also use the getBest function. Most arguments are exactly the same, but instead of alternative, use alternatives (string[]). The output number will not be the score, but simply the index of the best alternative.
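The following sketch shows one way the optional arguments and the "index of the best alternative" behaviour could fit together. It is an illustration under stated assumptions, not the package's code: the helper names resolveStopwords, canonicalize and indexOfBest are hypothetical, and so is the rule that an override takes precedence over additions and removals.

```javascript
// Sketch under assumptions: helper names and precedence rule are hypothetical.

// Build the effective stopword list: an override replaces the standard list,
// otherwise additions are appended and removals are filtered out.
function resolveStopwords(standard, toAdd, toRemove, override) {
  if (override) return [...override];
  const removed = new Set(toRemove || []);
  return [...standard, ...(toAdd || [])].filter((w) => !removed.has(w));
}

// Map a word to the first word of its "identicals" group, if it has one.
function canonicalize(word, identicals) {
  for (const group of identicals || []) {
    if (group.includes(word)) return group[0];
  }
  return word;
}

// Return the index of the alternative with the lowest score.
// Assumes a non-empty array; the first alternative wins on ties.
function indexOfBest(alternatives, scoreFn) {
  let best = 0;
  let bestScore = Infinity;
  alternatives.forEach((alt, i) => {
    const score = scoreFn(alt);
    if (score < bestScore) {
      bestScore = score;
      best = i;
    }
  });
  return best;
}

console.log(resolveStopwords(['the', 'is'], ['coffee'], ['is'], null)); // [ 'the', 'coffee' ]
console.log(canonicalize('cellphone', [['phone', 'cellphone', 'smartphone']])); // phone
```

With the real module, scoreFn could wrap the documented call, for example (alt) => synOptimizer.scoreAlternative('en_US', alt, null, null, null, null), although getBest already does the selection for you.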
The tokenizer is wink-tokenizer. It works with many languages (English, French, German, Hindi, Sanskrit, Marathi etc.) but not with Asian languages such as Japanese or Chinese, so the module will not work properly with them.
To support another language:
add a stopwords list from a stopwords-* package
find a stemmer in the snowball-stemmer collection (or plug another stemmer)
use another tokenizer when wink-tokenizer does not work

Dependencies:
wink-tokenizer to tokenize sentences in multiple languages (MIT).
stopwords-en/de/fr/it/es for standard stopwords lists per language (MIT).
snowball-stemmer to stem words per language (MIT).