
lex-helpers
Functions for calculating Open Vocabulary lexical statistics.
See the lexFrequencyPipeline API documentation below. Also see index.test.js for a recreation of the WWBP example.
// import your lexicon data as a JSON object (i.e., Record<string, number>)
import lexicon from "./lexicon.json" with { type: "json" }; // example
// import the lexFrequencyPipeline function from this module
import { lexFrequencyPipeline } from "lex-helpers";
// define your intercept
const intercept = 10.523; // example
// create your custom pipeline
const pipeline = lexFrequencyPipeline(lexicon, intercept);
// get some tokens
const doc1Tokens = ["the", "cat", "sat", "on", "the", "mat"]; // example
const doc2Tokens = ["the", "dog", "sat", "on", "the", "hat"]; // example
// run the pipeline
const doc1result = pipeline(doc1Tokens); // {number}
const doc2result = pipeline(doc2Tokens); // {number}
Corrects IEEE 754 floating-point errors using toFixed.
Example:
import { correctFloat } from "lex-helpers";
const initial = 0.1 + 0.2; // 0.30000000000000004
const corrected = correctFloat(initial); // 0.3
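The idea behind a toFixed-based correction can be sketched as follows. This is a minimal illustration, not the library's source: the name correctFloatSketch and the default of 10 decimal places are assumptions made for the example.

```javascript
// Hypothetical helper (not part of lex-helpers): round to a fixed number of
// decimal places with toFixed, then convert the resulting string back to a number.
function correctFloatSketch(value, digits = 10) {
  return Number(value.toFixed(digits));
}

console.log(0.1 + 0.2); // 0.30000000000000004
console.log(correctFloatSketch(0.1 + 0.2)); // 0.3
```

The trade-off is that anything beyond the chosen number of decimal places is discarded, so the precision should be picked to suit the data.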
Get the frequencies of tokens in a corpus.
Example:
import { getFrequencies } from "lex-helpers";
const tokens = ["the", "cat", "sat", "on", "the", "mat"];
const frequencies = getFrequencies(tokens); // Map<{ the: 2, cat: 1, sat: 1, on: 1, mat: 1 }>
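Token counting of this kind might look like the sketch below. It assumes a Map result, as the example comment above suggests; the name countTokens is hypothetical and this is not the library's own code.

```javascript
// Hypothetical helper (not part of lex-helpers): count how often each token occurs.
function countTokens(tokens) {
  const counts = new Map();
  for (const token of tokens) {
    // Start from 0 the first time a token is seen, then add one.
    counts.set(token, (counts.get(token) ?? 0) + 1);
  }
  return counts;
}

const exampleCounts = countTokens(["the", "cat", "sat", "on", "the", "mat"]);
console.log(exampleCounts.get("the")); // 2
console.log(exampleCounts.size); // 5
```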
Get the weighted relative frequencies of tokens in a corpus.
Example:
import { getWeightedRelativeFrequencies } from "lex-helpers";
const lexicon = { the: -93, cat: 100, sat: 50, on: -10, mat: 5 };
const frequencies = { the: 2, cat: 1, sat: 1, on: 1, mat: 1 };
const weightedRelativeFrequencies = getWeightedRelativeFrequencies(
  lexicon,
  frequencies,
); // IterableIterator<[string, number]>
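The source does not define "weighted relative frequency". One common reading, assumed here, is (token count / total token count) × lexicon weight. The sketch below follows that assumption only; the name weightedRelativeFrequenciesSketch is hypothetical, and the library's actual calculation may differ.

```javascript
// Hypothetical helper (not part of lex-helpers).
// ASSUMPTION: weighted relative frequency = (count / total count) * lexicon weight.
function* weightedRelativeFrequenciesSketch(lexicon, frequencies) {
  const total = Object.values(frequencies).reduce((sum, n) => sum + n, 0);
  if (total === 0) return; // nothing to weight in an empty document
  for (const [token, count] of Object.entries(frequencies)) {
    // Tokens that do not appear in the lexicon are skipped.
    if (Object.prototype.hasOwnProperty.call(lexicon, token)) {
      yield [token, (count / total) * lexicon[token]];
    }
  }
}

const sketchLexicon = { the: -93, cat: 100, sat: 50, on: -10, mat: 5 };
const sketchFrequencies = { the: 2, cat: 1, sat: 1, on: 1, mat: 1 };
for (const [token, value] of weightedRelativeFrequenciesSketch(sketchLexicon, sketchFrequencies)) {
  console.log(token, value);
}
```

A generator is used so that the result is an iterable iterator of [token, value] pairs, matching the shape indicated in the example comment above.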
Get the final value of a token in a corpus.
Example:
import { getLexiconValue } from "lex-helpers";
const weightedRelativeFrequencies = {
  the: 0.4,
  cat: 0.1,
  sat: 0.1,
  on: 0.1,
  mat: 0.1,
};
const intercept = 10.523;
const lexiconValue = getLexiconValue(weightedRelativeFrequencies, intercept); // {number}
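The source does not spell out how the final value is derived. A usual form for weighted lexica, assumed here, is the intercept plus the sum of the weighted relative frequencies. The sketch below shows that assumption; lexiconValueSketch is a hypothetical name and not the library's implementation.

```javascript
// Hypothetical helper (not part of lex-helpers).
// ASSUMPTION: final value = intercept + sum of all weighted relative frequencies.
function lexiconValueSketch(weightedRelativeFrequencies, intercept) {
  let total = intercept;
  for (const value of Object.values(weightedRelativeFrequencies)) {
    total += value;
  }
  return total; // uncorrected: may carry floating-point noise
}

const sketchValue = lexiconValueSketch(
  { the: 0.4, cat: 0.1, sat: 0.1, on: 0.1, mat: 0.1 },
  10.523,
);
console.log(sketchValue);
```

The raw sum can carry floating-point noise, which is presumably why the pipeline described below finishes with correctFloat.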
Create a custom pipeline for calculating lexical statistics.
N.B. This is provided for simple use cases. For larger datasets it is recommended that you create your own pipeline using the functions provided in this module.
The pipeline is: getFrequencies -> getWeightedRelativeFrequencies -> getLexiconValue -> correctFloat
Example:
import { lexFrequencyPipeline } from "lex-helpers";
const lexicon = { the: 0.2, cat: 0.1, sat: 0.1, on: 0.1, mat: 0.1 };
const intercept = 10.523;
const pipeline = lexFrequencyPipeline(lexicon, intercept);
const tokens = ["the", "cat", "sat", "on", "the", "mat"];
const result = pipeline(tokens); // 12.523
The same as lexFrequencyPipeline, but getFrequencies simply counts a token as present (1) or absent (0) instead of using the token's frequency.
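The difference between frequency counting and presence counting can be illustrated with a small sketch. The name presenceSketch is hypothetical, and the Map return type is an assumption carried over from the getFrequencies example.

```javascript
// Hypothetical helper (not part of lex-helpers): record each distinct token
// as present (1). Tokens that never occur are simply missing, i.e. absent (0).
function presenceSketch(tokens) {
  const presence = new Map();
  for (const token of new Set(tokens)) {
    presence.set(token, 1);
  }
  return presence;
}

const examplePresence = presenceSketch(["the", "cat", "sat", "on", "the", "mat"]);
console.log(examplePresence.get("the")); // 1, even though "the" occurs twice
console.log(examplePresence.get("dog") ?? 0); // 0
```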
Sum the value properties of an array of objects and correct the result for floating-point error.
Example:
import { sumValues } from "lex-helpers";
const values = [{ value: 0.1 }, { value: 0.2 }, { value: 0.3 }];
const sum = sumValues(values); // 0.6
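To show why the correction step matters when summing, here is a minimal sketch. The name sumValuesSketch and the 10-decimal-place rounding are assumptions for illustration, not the library's code.

```javascript
// Hypothetical helper (not part of lex-helpers): add up each item's `value`
// property, then round away floating-point noise with toFixed.
function sumValuesSketch(items, digits = 10) {
  const raw = items.reduce((sum, item) => sum + item.value, 0);
  return Number(raw.toFixed(digits));
}

console.log(0.1 + 0.2 + 0.3); // 0.6000000000000001
console.log(sumValuesSketch([{ value: 0.1 }, { value: 0.2 }, { value: 0.3 }])); // 0.6
```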
(C) 2017-24 P. Hughes. All rights reserved.
Released under the MIT licence.