nesity-statistics
This package provides a set of functions for statistical analysis.
It focuses on comparing datasets acquired from benchmarks.
yarn add nesity-statistics
The compare function can be used to assess whether two datasets differ from each other and, if so, what the effect size is.
It uses the t-test to assess whether the two datasets differ from each other, and Cohen's d to assess the effect size.
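As a rough illustration of the two statistics named above, the sketch below computes a t statistic and Cohen's d for two samples. It is not the package's implementation: the helper names (`mean`, `variance`, `welchT`, `cohensD`) are hypothetical, and the choice of Welch's unequal-variance form of the t-test and of the pooled standard deviation for Cohen's d are assumptions, since the text does not say which variants are used. Turning the t statistic into a p-value additionally requires the t-distribution, which is left out here.

```typescript
// Illustrative helpers only; these are NOT the package's internals.
const mean = (xs: number[]): number =>
  xs.reduce((sum, x) => sum + x, 0) / xs.length

// Sample variance (divides by n - 1).
const variance = (xs: number[]): number => {
  const m = mean(xs)
  return xs.reduce((sum, x) => sum + (x - m) ** 2, 0) / (xs.length - 1)
}

// Welch's t statistic: difference of means divided by its standard error.
const welchT = (a: number[], b: number[]): number =>
  (mean(a) - mean(b)) /
  Math.sqrt(variance(a) / a.length + variance(b) / b.length)

// Cohen's d: difference of means in units of the pooled standard deviation.
const cohensD = (a: number[], b: number[]): number => {
  const pooledVariance =
    ((a.length - 1) * variance(a) + (b.length - 1) * variance(b)) /
    (a.length + b.length - 2)
  return (mean(a) - mean(b)) / Math.sqrt(pooledVariance)
}

// Made-up benchmark timings, purely for demonstration.
const baseline = [10, 12, 11, 13, 9]
const candidate = [14, 15, 13, 16, 17]
console.log(welchT(baseline, candidate), cohensD(baseline, candidate))
```

For these two made-up samples the t statistic is -4 and d is about -2.53, which would conventionally be read as a very large effect.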
Special features:
For best results ensure that:
import { compare } from 'nesity-statistics'
const result = compare({
  data1: [
    /*...*/
  ],
  data2: [
    /*...*/
  ],
  confidenceLevel: 0.95, // use alpha = 0.05 for the t-test
  minimalModalitySize: 4, // reject all modalities with fewer than 4 samples
  denoisingAndModalitySplittingOptions: {
    // see the section on splitting modalities for the differences between 'quantiles' and 'kde' splitting
    type: 'quantiles',
  },
})
Two functions are provided to split a dataset's modalities:
splitMultimodalDistributionWithQuantiles - a very fast function that uses large jumps in upper quantiles to find the splits in the dataset's modalities.
splitMultimodalDistributionWithKDE - a computationally intensive function that uses Kernel Density Estimation to find the splits in the dataset's modalities.
The quality of the results is similar in most cases, but the splitMultimodalDistributionWithQuantiles function is much faster.
In certain cases KDE gives better results, but it is not recommended unless you have a good reason to use it.
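To give a feel for what KDE-based splitting involves, here is a minimal sketch that estimates the density with a Gaussian kernel on a fixed grid and cuts the data at local minima of that estimate. This is an assumption about the general technique, not the code behind splitMultimodalDistributionWithKDE: the names `kdeAt` and `splitAtDensityMinima`, the fixed `bandwidth` argument and the `gridSize` default are all hypothetical.

```typescript
// Gaussian kernel density estimate at point x, using a fixed bandwidth.
const kdeAt = (data: number[], x: number, bandwidth: number): number => {
  const norm = data.length * bandwidth * Math.sqrt(2 * Math.PI)
  const total = data.reduce(
    (sum, v) => sum + Math.exp(-0.5 * ((x - v) / bandwidth) ** 2),
    0,
  )
  return total / norm
}

// Hypothetical helper: split the data at local minima of the estimated density.
const splitAtDensityMinima = (
  data: number[],
  bandwidth: number,
  gridSize = 200,
): number[][] => {
  const sorted = [...data].sort((a, b) => a - b)
  const lo = sorted[0]
  const hi = sorted[sorted.length - 1]
  const step = (hi - lo) / (gridSize - 1)
  const grid = Array.from({ length: gridSize }, (_, i) => lo + i * step)
  const density = grid.map((x) => kdeAt(sorted, x, bandwidth))

  // An interior grid point is a cut when the density stops falling there.
  // Using "<=" on the right side yields a single cut on a two-point plateau.
  const cuts = grid.filter(
    (_, i) =>
      i > 0 &&
      i < gridSize - 1 &&
      density[i] < density[i - 1] &&
      density[i] <= density[i + 1],
  )

  // Each value goes to the group indexed by the number of cuts below it.
  const groups: number[][] = [[], ...cuts.map((): number[] => [])]
  for (const value of sorted) {
    groups[cuts.filter((cut) => cut < value).length].push(value)
  }
  return groups.filter((group) => group.length > 0)
}

console.log(splitAtDensityMinima([11, 2, 10, 1, 12, 3], 1))
```

The cost is visible in the structure: every grid point requires a pass over the whole dataset, which is consistent with the KDE variant being described as computationally intensive. The result also depends heavily on the bandwidth, which a real implementation would likely choose from the data rather than take as a constant.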
The matchModalities function can be used to match the modalities of two datasets by their means.
For example, given the following datasets:
const data1 = [
  [1, 2, 3],
  [4, 5, 6],
]
const data2 = [
  [2, 3, 4],
  [5, 6, 7],
  [8, 9, 10],
]
const expectedResult = [
  // modality 1:
  [
    // data from 1
    [1, 2, 3],
    // data from 2
    [2, 3, 4],
  ],
  // modality 2:
  [
    [4, 5, 6],
    [5, 6, 7],
  ],
  // modality 3:
  [undefined, [8, 9, 10]],
]
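One way to picture how the example above arrives at its result is a greedy pairing by mean, sketched here. This is only an illustration under simplifying assumptions: `matchByMeans`, `meanOf` and the type names are hypothetical, and the sketch looks at mean distance alone, ignoring sample sizes and any weighting, so it should not be read as the actual matchModalities algorithm.

```typescript
type Modality = number[]
type MatchedPair = [Modality | undefined, Modality | undefined]

const meanOf = (xs: number[]): number =>
  xs.reduce((sum, x) => sum + x, 0) / xs.length

// Hypothetical greedy matcher: pairs each modality of `left` with the closest
// unused modality of `right` by mean; leftovers are paired with undefined.
const matchByMeans = (left: Modality[], right: Modality[]): MatchedPair[] => {
  const unused = [...right]
  const pairs = left.map((modality): MatchedPair => {
    if (unused.length === 0) return [modality, undefined]
    const target = meanOf(modality)
    let bestIndex = 0
    for (let i = 1; i < unused.length; i++) {
      const candidateDistance = Math.abs(meanOf(unused[i]) - target)
      const bestDistance = Math.abs(meanOf(unused[bestIndex]) - target)
      if (candidateDistance < bestDistance) bestIndex = i
    }
    // Remove the chosen modality so it cannot be matched twice.
    const [match] = unused.splice(bestIndex, 1)
    return [modality, match]
  })
  const leftovers = unused.map((modality): MatchedPair => [undefined, modality])
  return [...pairs, ...leftovers]
}

const matched = matchByMeans(
  [
    [1, 2, 3],
    [4, 5, 6],
  ],
  [
    [2, 3, 4],
    [5, 6, 7],
    [8, 9, 10],
  ],
)
console.log(matched)
```

With the datasets from the example, the means 2 and 5 pair with 3 and 6, and the modality with mean 9 is left without a partner, giving the same shape as expectedResult. A greedy pass like this depends on the order of the left-hand modalities, which is one reason a real implementation might score all candidate pairs first.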
The function prioritizes modalities with a similar number of samples and tries to match them by their means. You can adjust how the modalities are prioritized using the following parameters:
meanDistanceWeight - the weight of the mean distance in the comparison
sizeRatioWeight - the weight of the sample size ratio in the comparison
prioritizeSizeRatioAbove - prioritize modalities with a sample size ratio above this value
The optimize function is a generic utility for finding the "best" combination of input parameter and output result. It evaluates an objective function repeatedly with different input parameters and compares the results of these evaluations with one another to determine the best combination. The optimization is performed in a series of iterations, with each iteration producing a set of results. Every iteration result is compared with every other iteration result, and the results are sorted by their comparison rank.
It returns an array of optimization results sorted by their comparison rank. Each result is an array with the iteration argument, the iteration result, the comparison metadata, the comparison rank and a done flag.
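The ranking idea described above can be sketched in a few lines. The version below is a deliberately simplified stand-in, not the package's optimize: `rankByPairwiseComparison` is a hypothetical name, its `compare` callback returns a plain number (negative when the left result is better) instead of a tuple with metadata, the rank is assumed to be the number of pairwise wins, and invalid-result markers and the done flag are left out.

```typescript
type Ranked<A, R> = [A, R, number]

// Hypothetical, simplified illustration of ranking by pairwise comparison.
const rankByPairwiseComparison = <A, R>(options: {
  iterations: number
  getNextIterationArgument: (iteration: number) => A
  iterate: (argument: A) => R
  // negative: left is better, positive: right is better, 0: tie
  compare: (left: R, right: R) => number
}): Ranked<A, R>[] => {
  // Evaluate the objective once per iteration.
  const runs = Array.from({ length: options.iterations }, (_, i) => {
    const argument = options.getNextIterationArgument(i)
    return { argument, result: options.iterate(argument) }
  })
  return runs
    .map((run, i): Ranked<A, R> => {
      // Assumed rank: how many of the other results this one beats.
      const wins = runs.filter(
        (other, j) => j !== i && options.compare(run.result, other.result) < 0,
      ).length
      return [run.argument, run.result, wins]
    })
    .sort((a, b) => b[2] - a[2])
}

// Toy objective: find the x in 0, 0.1, ..., 1.0 that minimises (x - 0.3)^2.
const ranked = rankByPairwiseComparison<{ x: number }, number>({
  iterations: 11,
  getNextIterationArgument: (iteration) => ({ x: iteration * 0.1 }),
  iterate: ({ x }) => (x - 0.3) ** 2,
  compare: (left, right) => left - right,
})
const [bestArgument, bestValue, bestRank] = ranked[0]
console.log(bestArgument, bestValue, bestRank)
```

Comparing every result with every other one costs a number of comparisons that grows quadratically with the iteration count, which is worth keeping in mind when the comparison itself is expensive.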
One example usage of the optimize function could be to find the parameters to pass to a noise reduction function, which result in the best noise reduction in a 2-sample dataset.
Let's say we want to find the parameters that result in the lowest standard deviation:
import { optimize } from 'nesity-statistics'
// reduceNoise, stdev, INVALID_LEFT and INVALID_RIGHT are assumed to be in scope;
// this example does not show where they are defined.
const data = [
  [
    /* set 1: ... */
  ],
  [
    /* set 2: ... */
  ],
]
// each iteration yields the two denoised sets
const iterate = (parameters: { x: number }): number[][] => {
  return reduceNoise(data, parameters)
}
const getNextIterationArgument = (iteration: number) => {
  return {
    x: iteration * 0.1,
  }
}
const compare = (
  [a1, a2]: number[][],
  [b1, b2]: number[][],
): [number, { diff: number }] | INVALID_LEFT | INVALID_RIGHT => {
  // we should have at least 2 values for the result to be considered valid:
  if (a1.length < 2 || a2.length < 2) {
    return INVALID_LEFT
  }
  if (b1.length < 2 || b2.length < 2) {
    return INVALID_RIGHT
  }
  const diffA = stdev(a1) - stdev(a2)
  const diffB = stdev(b1) - stdev(b2)
  const comparisonDiff = Math.abs(diffA) - Math.abs(diffB)
  // the 2nd element of the return value is the comparison metadata
  // if the comparison is expensive,
  // it could be used to keep a reference to any data about the "winning" result,
  // used to perform the comparison
  return [comparisonDiff, { diff: comparisonDiff <= 0 ? diffA : diffB }]
}
const sortedResults = optimize({
  iterate,
  iterations: 100,
  getNextIterationArgument,
  compare,
})
// the first entry is the best-ranked result
const best = sortedResults[0]
const [bestParameters, bestOutput, bestMeta] = best
FAQs
We found that nesity-statistics has an unhealthy version release cadence and level of project activity: the last version was released a year ago. It has 1 open source maintainer collaborating on the project.