ALS Statistics
ALS Statistics is a modular JS toolkit for statistical work. It’s designed to be:


- Numerically verified — this release matches Python (NumPy/SciPy) reference outputs across modules and passes the deterministic Golden Test Suite on Node.js 20.x, all within the published EPS tolerances. Reproduce with `node goldens/test.js` or `npm test`.
- Easy to use — like `Math` for small one-liners.
- Composable — for multi-step analyses (filter → group → compare → summarize).
- Runtime-agnostic — the same API in Node.js and in the browser.
- Data-model light — works with plain arrays (`number[]`) and small helpers like `Column` and `Table`.
- Browser-ready — no native dependencies; ships as ESM or via the included UMD bundle.
Think of it as a “batteries-included” stats toolbox rather than a full data-frame ecosystem. If you know SPSS: ALS gives you many of the common procedures (correlations, t-tests, ANOVA, reliability, basic clustering, regression) with code-first ergonomics. If you know NumPy/SciPy: ALS focuses on analytics primitives and convenience wrappers (no heavy data containers, no plotting).
Why the rewrite?
The v1 architecture had grown too complex (intertwined modules, heavy abstractions), which made adding features and maintaining consistency difficult.
v2 was rebuilt from scratch with a simpler core (plain arrays + lightweight Column/Table), clear module boundaries, and predictable numerics—so new analytical tools can be added quickly without increasing complexity.
Key ideas
- Plain data in / plain results out. Most functions take `{ [name]: number[] }` or `number[]` and return simple objects (e.g. `{ r, t, df, p }`).
- Two modes of use:
  - One-liners via descriptive helpers (`mean`, `stdDev`, percentiles…).
  - Structured analyzers for correlations, mean comparisons, regressions, clustering, etc.
- Table utilities. Sort, filter, split by group, compute derived columns, and feed the result to an analyzer.
Installation
npm i als-statistics
Usage in browser
<script type="module" src="/node_modules/als-statistics/lib/index.js"></script>
or
<script src="/node_modules/als-statistics/statistics.js"></script>
or
<script type="module">
import Statistics from '/node_modules/als-statistics/lib/index.js'
</script>
Node.js
// ESM
import { Analyze, Stats, Table, Column } from 'als-statistics';
// CommonJS
const { Analyze, Stats, Table, Column } = require('als-statistics/statistics.cjs');
const { CDF, CompareMeans, Correlate, Clustering, Regression } = Analyze;
const { constants, t, f, phi } = CDF;
const { IndependentTTest, OneWayAnova, PairedTTest, OneSampleTTest } = CompareMeans;
const { CronbachAlpha, Pearson, Spearman, Kendall } = Correlate;
const { Dbscan, Hdbscan, computeDistances } = Clustering;
const { LinearRegression, LogisticRegression } = Regression;
const {
sum, mean, median, mode, min, max,
variance, varianceSample, stdDev, stdDevSample, cv, range, iqr, mad,
percentile, q1, q3, p10, p90,
zScore, zScores, zScoresSorted, outliersZScore, outliersIQR,
weightedMean, confidenceInterval, slope, regressionSlope,
spectralPowerDensityArray, spectralPowerDensityMetric,
sorted, ma, sumOfSquares, flatness, skewness, kurtosis,
skewnessSample, kurtosisSample, geometricMean, harmonicMean,
noiseStability, frequencies, relativeFrequencies,
relativeDispersion, normalizedValues, xValues,
recode,
} = Stats;
The package is modular — import only what you use.
Quick starts
1) Use it like Math (one-liners)
import { Stats } from 'als-statistics';
const X = [10, 12, 13, 9, 14];
const mu = Stats.mean(X);
const sd = Stats.stdDevSample(X);
const p90 = Stats.p90(X);
console.log({ mu, sd, p90 });
You can also access many metrics via Column:
import { Column } from 'als-statistics';
const col = new Column([10, 12, 13, 9, 14], 'Score');
const { mean, stdDev, median, frequencies, flatness } = col;
2) Quick analysis: correlation in one line
import { Analyze } from 'als-statistics';
const data = {
gender: [0, 1, 0, 1, 1, 0],
score: [62, 75, 70, 81, 64, 78],
};
const pearson = new Analyze.Correlate(data).pearson('gender', 'score');
const { r, t, df, p } = pearson;
console.log({ r, t, df, p });
3) Compare means: Welch t-test (unequal variances)
import { Analyze } from 'als-statistics';
const data = {
men: [62, 75, 70, 81, 64],
women: [78, 73, 69, 71, 74, 77],
};
const test = new Analyze.CompareMeans(data).independentWelch('men', 'women');
console.log({ t: test.t, df: test.df, p: test.p });
4) One-way ANOVA (classic & Welch)
import { Analyze } from 'als-statistics';
const { CompareMeans } = Analyze;
const data = {
A: [10, 11, 9, 10],
B: [10, 30, -10, 50, -20],
C: [12, 13, 12, 11, 14],
};
const classic = new CompareMeans(data).anova();
const welch = new CompareMeans(data).anovaWelch();
console.log({
classic: { F: classic.F, df1: classic.dfBetween, df2: classic.dfWithin, p: classic.p },
welch: { F: welch.F, df1: welch.dfBetween, df2: welch.dfWithin, p: welch.p },
});
5) Table-first workflow (filter → split → analyze)
import { Table, Analyze } from 'als-statistics';
const { CompareMeans } = Analyze;
const t = new Table(
{ gender: [0,1,0,1,1,0], age: [21,22,20,23,19,22], score: [62,75,70,81,64,78] },
{ name: 'Survey' }
);
t.filterRowsBy('age', a => a >= 21);
// split the filtered table by gender, then run a Welch t-test on the groups
// (you can also pass plain arrays directly: new CompareMeans({ men: [...], women: [...] }))
const groups = t.splitBy('gender');
const test = new CompareMeans(groups).independentWelch('0', '1');
Managing data (Tables and Columns)
This section explains how data flows through Columns, Tables and Statistics: validation rules, caching, safe updates, and the most common operations you’ll use before running analytics.
Notes & pitfalls
- Always mutate via the API. Use `Column` mutators or the `values` setter; avoid direct array mutation to keep caches correct.
- Invalids. `Column.invalid` stores the indices of rejected values; descriptives and analyses ignore them.
- Mutability. Most `Table` methods are in-place and return `this`. Prefer `clone()` when you need a safe branch.
- Alignment. If you disable alignment and keep ragged columns, be mindful when exporting rows or running analyses that expect equal lengths (see the sketch below).
- HTML output. `htmlTable()` is for quick previews; for full reports, prefer exporting rows and rendering via your own templates.
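For the alignment pitfall, a minimal sketch (data values are illustrative):
import { Table } from 'als-statistics';
// ragged columns kept on purpose
const t = new Table({ a: [1, 2, 3], b: [10, 20] }, { alignColumns: false });
// exports and analyses that expect equal lengths may misbehave here;
// align explicitly before handing the table off:
t.alignColumns(); // truncates every column to the shortest length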
Column
Quick API snapshot
import Statistics, { Table, Column } from 'als-statistics';
Column.key(name, ...parts)
col.name
col.labels?
col.invalid
col.values
col.n
col.$(key, compute)
col.onchange(fn)
col.addValue(value, index?)
col.deleteValue(index)
col.clone(name?)
col.insertAt(index, ...items)
col.setAt(index, item)
col.removeAt(index, deleteCount=1)
col.splice(start, deleteCount, ...items)
col.push(...items)
col.sum, col.mean, col.median, col.mode
col.variance, col.varianceSample, col.stdDev, col.stdDevSample, col.cv, col.range, col.iqr, col.mad
col.percentile(p), col.q1, col.q3, col.p10, col.p90
col.zScore(v), col.zScores(), col.zScoresSorted(), col.outliersZScore(z=3), col.outliersIQR()
col.weightedMean(weights), col.confidenceInterval, col.slope, col.regressionSlope(customX)
col.spectralPowerDensityArray, col.spectralPowerDensityMetric
How it works (principles)
- Validation-first. Columns accept only finite numbers. Any non-finite input (`NaN`, `±Infinity`, non-number) is rejected or tracked via `col.invalid`, and excluded from descriptive metrics.
- Cached results. Many results are cached (e.g., `col.mean`, `col.stdDev`). To keep caches correct, you must not mutate the underlying array directly. Instead, either:
  - assign a new array via the validated setter: `col.values = [...newNumbers]`, or
  - use the provided mutators (`setAt`, `splice`, `push`, …).
  These paths automatically invalidate caches and fire `onchange` events.
- Alignment in tables. By default, a `Table` aligns columns to a common length (truncating to the shortest column). You can change this behavior with constructor options (e.g., `alignColumns: false`, `minK`) or call `t.alignColumns()` explicitly.
- In-place transforms. Most `Table` methods mutate. Chain them freely, or use `clone()` to keep the original around.
Creating and validating
import { Column } from 'als-statistics';
const scores = new Column([10, 12, 13, 9, 14], 'Score');
scores.values = [11, 11, 10, 12, 15];
scores.values = [11, 12, NaN, 10, 9, Infinity];
console.log(scores.invalid);
console.log(scores.mean);
Do not mutate scores.values in place (e.g., scores.values[0] = 999), as caches won’t know about it. Use setAt(...) instead.
Safe mutations (cache-aware)
scores.push(10, 11);
scores.insertAt(1, 99);
scores.setAt(0, 12);
scores.deleteValue(2);
scores.splice(3, 1, 50, 51);
All of these invalidate caches and emit onchange:
scores.onchange((col, prev, meta) => {
console.log('column changed:', meta.type)
});
Caching your own computations
// cache a custom computation under a key; the cached value is
// invalidated automatically when the column changes
const kurt = scores.$('kurtosis', () => scores.kurtosis);
Descriptives on Column
Every descriptive method available in Stats exists on Column too and always respects validation/caching:
console.log({
mean: scores.mean,
sd : scores.stdDevSample,
q1 : scores.q1,
p90 : scores.p90,
outliersZ: scores.outliersZScore(3)
});
Table
Quick API snapshot
import { Table } from 'als-statistics';
const t = new Table(data?, { name?, minK?, alignColumns? })
t.n
t.k
t.columns
t.colNames
t.colValues
t.json
t.addColumn(name, values, labels?) -> Column
t.deleteColumn(name) -> this
t.addRow(row, index?) -> this
t.addRows(rows, index?) -> this
t.deleteRow(index) -> this
t.alignColumns() -> this
t.recode(colName, mapper, newColName?) -> void
t.compute(fn, name) -> Column
t.filterRows(indexes) -> this
t.filterRowsBy(colName, predicate) -> this
t.sortBy(colName, asc=true) -> this
t.clone(name?, colFilter=[]) -> Table
t.splitBy(colName, labels?) -> Statistics
t.transpose(colNames=[]) -> Table
t.where(rowPredicate) -> number[]
t.rows(withKeys=true) -> object[] | any[][]
t.htmlTable(colFilter=[], options?) -> string
t.descriptive(...metricNames) -> Object{}
t.correlate(...colFilter) -> Correlate
t.compareMeans(...colFilter) -> CompareMeans
t.dbscan(colFilter, options?) -> Dbscan
t.hdbscan(colFilter, options?) -> Hdbscan
t.regression(yName, xNames, type='linear'|'logistic') -> Regression
t.linear(yName, xNames)
t.logistic(yName, xNames)
Tip: operations on Table are mutable by default (they change the same instance). Use t.clone(...) to branch a copy for “what-if” scenarios.
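For example, a small sketch of branching a copy before a destructive filter (data illustrative):
import { Table } from 'als-statistics';
const base = new Table({ score: [70, 81, 64, 78] }, { name: 'base' });
// clone() branches a copy; filtering the clone leaves `base` intact
const whatIf = base.clone('what-if').filterRowsBy('score', s => s >= 70);
console.log(base.n, whatIf.n);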
Constructing and alignment
import { Table } from 'als-statistics';
const t = new Table(
// 'age' has 5 values while the others have 6; with alignColumns: true
// the table truncates every column to the shortest length (n = 5)
{ gender: [0,1,0,1,1,0], age: [21,22,20,23,19], score: [62,75,70,81,64,78] },
{ name: 'Survey', alignColumns: true, minK: 2 }
);
console.log(t.n, t.k, t.colNames);
const scoreCol = t.columns['score'];
console.log(scoreCol.mean);
Rows & columns (synchronization)
t.addColumn('bmi', [22.1, 24.0, 23.7, 25.3, 21.8]);
t.deleteColumn('age');
t.addRow({ gender: 0, score: 71, bmi: 23.1 });
t.addRows([
{ gender: 1, score: 68, bmi: 24.2 },
{ gender: 0, score: 77, bmi: 22.7 }
]);
t.deleteRow(0);
t.alignColumns();
Data shaping
t.recode('gender', g => (g === 0 ? 'F' : 'M'), 'genderLabel');
t.compute(row => row.score / (row.bmi ?? 1), 'scorePerBmi');
t.filterRowsBy('score', s => s >= 70);
t.sortBy('score', false);
const adultIdx = t.where(row => row.bmi >= 22 && row.bmi <= 25);
const rowsAsObjects = t.rows(true);
const rowsAsArrays = t.rows(false);
const html = t.htmlTable(['genderLabel','score','bmi']);
Split & analyze
const groups = t.splitBy('genderLabel');
import { Analyze } from 'als-statistics';
const test = new Analyze.CompareMeans(groups).independentWelch('F', 'M');
console.log({ t: test.t, df: test.df, p: test.p });
const corr = t.correlate('score','bmi').pearson();
console.log({ r: corr.r, p: corr.p });
Transpose and clone
const t2 = t.transpose(['score','bmi']);
const tClone = t.clone('scenario: filtered', ['score','bmi']);
Statistics (multi-table manager)
Statistics is a lightweight coordinator for multiple Tables. It lets you:
- register tables (`addTable`),
- compute the union of available column names (`colNames`),
- combine the same columns from different tables into a new `Table` (`columns(...)`),
- remove tables (`deleteTable`),
- and access the module namespace (static): `Statistics.Table`, `Statistics.Stats`, `Statistics.Analyze`, `Statistics.Column`.
It’s especially handy for before/after designs, or when you split one table by a factor and then want to analyze the resulting groups together.
API
new Statistics(name?: string)
statistics.addTable(obj: Record<string, number[]>, options?: { name?: string, minK?: number, alignColumns?: boolean }): Table
statistics.deleteTable(tableName: string): void
statistics.colNames: string[]
statistics.columns(name: string, ...colFilter: (string|RegExp)[]): Table
Statistics.Table
Statistics.Stats
Statistics.Analyze
Statistics.Column
Column selection (colFilter)
columns(name, ...colFilter) uses the same filtering helper as Table:
- pass exact names: `columns('X', 'score')`
- pass a regex: `columns('X', /^score|age$/)`
- exclude by prefixing with `-`: `columns('X', 'score', '-score_z')`
Examples
1) Before/After (paired)
import Statistics from 'als-statistics';
const { CompareMeans } = Statistics.Analyze;
const S = new Statistics('A/B');
S.addTable({ score: [62, 71, 69, 73, 75] }, { name: 'before' });
S.addTable({ score: [70, 76, 70, 78, 79] }, { name: 'after' });
const merged = S.columns('Scores', 'score');
const paired = merged.compareMeans('before_score', 'after_score').paired();
console.log({ t: paired.t, df: paired.df, p: paired.p });
2) Split → Combine → Independent Welch
import { Table } from 'als-statistics';
import Statistics from 'als-statistics';
const { CompareMeans } = Statistics.Analyze;
const t = new Table(
{ group: [0,1,0,1,0,1], score: [62,75,70,81,64,78] },
{ name: 'Survey' }
);
const S = t.splitBy('group', { 0: 'control', 1: 'treat' });
const merged = S.columns('scored', 'score');
const test = merged.compareMeans('control_score','treat_score').independentWelch();
console.log({ t: test.t, df: test.df, p: test.p });
3) Cross-table correlation
// `S` and the merged column names come from the Before/After example above
const merged = S.columns('ab', 'score');
const corr = merged.correlate('before_score','after_score').pearson();
console.log({ r: corr.r, p: corr.p });
Scenarios
A) Before/After (pre→post) in separate tables
import Statistics from 'als-statistics';
const S = new Statistics();
S.addTable(preTable, { name: 'pre' });
S.addTable(postTable, { name: 'post' });
const merged = S.columns('merged', 'score');
const cm = merged.compareMeans('pre_score','post_score').paired();
console.log({ t: cm.t, df: cm.df, p: cm.p });
B) Split → Combine workflow
const S = new Statistics();
const raw = S.addTable(rawTable, { name: 'raw' });
// derive a grouping label, then split by it (splitBy returns a Statistics)
raw.recode('group', g => (g === 'A' ? 'control' : 'treat'), 'arm');
const groups = raw.splitBy('arm');
const merged = groups.columns('combined', 'score');
const res = merged.compareMeans('control_score','treat_score').independentWelch();
How‑to recipes
- Compute a cross-table correlation between `before_score` and `after_score`:
const merged = S.columns('ab', 'score');
merged.correlate('before_score','after_score').pearson();
- Build a summary sheet for multiple tables (mean, sd, n):
const rows = S.colNames.map(col => {
const t = S.columns('tmp', col);
// look up the merged column and read its cached descriptives
const c = t.columns[`${col}_0`];
return { col, mean: c.mean, sd: c.stdDevSample, n: c.n };
});
Practical patterns
A. Pipeline “sort → split → test”
import { Table } from 'als-statistics';
import { Analyze } from 'als-statistics';
const { CompareMeans } = Analyze;
const t = new Table(data).sortBy('score', false);
const top = t.clone('Top').filterRows([...Array(100).keys()]); // keep the first 100 rows after sorting
const groups = top.splitBy('gender');
const cm = new CompareMeans(groups);
const res = cm.independentWelch('0','1');
console.log(res.p < 0.05 ? 'Significant' : 'NS');
B. Correlations with filters
import { Table } from 'als-statistics';
const t = new Table(data);
t.filterRowsBy('age', a => a >= 25 && a <= 40);
const corr = t.correlate('height','weight').pearson();
console.log(corr.r, corr.p);
C. Quick reliability check
import { Analyze } from 'als-statistics';
const items = { Q1: [...], Q2: [...], Q3: [...], Q4: [...] };
const alpha = new Analyze.Correlate.CronbachAlpha(items);
console.log(alpha.alpha, alpha.htmlTable);
D. Minimal regression report
import { Analyze } from 'als-statistics';
const reg = new Analyze.Regression(dataset, { yName: 'y', xNames: ['x1','x2'], type: 'linear' });
reg.steps[0].calculate();
console.log(reg.steps[0].result);
Analyze · CDF
Cumulative distribution functions used by other tests.
Exports
CDF.regularizedIncompleteBeta(x, a, b): number – Regularized incomplete beta Iₓ(a,b). Clamps to [0,1] when x≤0 or x≥1.
CDF.t(x, df): number – CDF of the Student t distribution. df must be positive.
CDF.f(x, df1, df2): number – CDF of the F distribution. df1, df2 must be positive.
CDF.phi(x): number – Standard normal CDF Φ(x). Returns 0/1 for large negative/positive tails and supports ±Infinity.
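A minimal usage sketch of the exports above:
import { Analyze } from 'als-statistics';
const { t, f, phi } = Analyze.CDF;
console.log(phi(0)); // 0.5: standard normal CDF at 0
console.log(t(2.0, 10)); // Student t CDF at x = 2 with df = 10
console.log(f(4.0, 2, 12)); // F CDF at x = 4 with df1 = 2, df2 = 12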
Analyze · Compare Means
Entry-point wrapper CompareMeans for mean-comparison tests (t-tests, ANOVA).
Class: CompareMeans
Constructor
new CompareMeans(data: Record<string, number[]>)
data – object mapping group name → numeric array.
Methods
paired(...colNames): PairedTTest – paired t-test on two named columns; trims to equal length.
independent(...colNames): IndependentTTest – two-sample Student t-test (pooled variance).
independentWelch(...colNames): IndependentTTest – two-sample Welch t-test.
anova(...colNames): OneWayAnova – one‑way ANOVA (pooled/“classic”).
anovaWelch(...colNames): OneWayAnova – Welch’s one‑way ANOVA.
oneSample(colName?, mu0=0): OneSampleTTest – one-sample t‑test for a single column (defaults to the first key if colName omitted).
All methods accept optional column names. If omitted, the test uses all keys from the constructor data.
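For example, omitting the column names (a sketch with illustrative data):
import { Analyze } from 'als-statistics';
const { CompareMeans } = Analyze;
// no column names passed: the test uses all keys from the constructor data
const test = new CompareMeans({
  A: [10, 11, 9, 10],
  B: [12, 13, 12, 11],
}).independentWelch();
console.log({ t: test.t, df: test.df, p: test.p });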
One‑Way ANOVA
Classic (pooled) and Welch’s ANOVA.
Class: OneWayAnova (returned by CompareMeans.anova / anovaWelch)
Constructor
new OneWayAnova(data: Record<string, number[]>, welch=false)
data – group → values.
- Set `welch=true` for Welch’s ANOVA.
Public fields
F: number
dfBetween: number
dfWithin: number
p: number – right‑tail p‑value via F CDF.
k: number – number of groups.
msw: number – mean square within.
Independent Samples T‑Test
Two-sample t‑test. Supports Student (pooled) and Welch variants.
Class: IndependentTTest (returned by CompareMeans.independent / independentWelch)
Constructor
new IndependentTTest({ g1: number[], g2: number[] }, welch=false)
- Set `welch=true` for Welch’s unequal-variance t‑test.
Public fields
t: number – t statistic.
df: number – degrees of freedom (Welch uses Satterthwaite).
p: number – two‑sided p‑value (getter).
F: number – ANOVA-equivalent t² (getter).
leveneF: number – Levene’s F for equality of variances.
leveneDf1: number, leveneDf2: number, leveneP: number – Levene’s test details.
k: number – number of groups (always 2 here).
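A short sketch reading Levene’s test alongside the t statistic (data illustrative):
import { Analyze } from 'als-statistics';
const { CompareMeans } = Analyze;
const test = new CompareMeans({
  g1: [62, 75, 70, 81, 64],
  g2: [78, 73, 69, 71, 74, 77],
}).independent();
// check equality of variances before trusting the pooled variant
console.log({ leveneF: test.leveneF, leveneP: test.leveneP });
console.log({ t: test.t, df: test.df, p: test.p });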
One-Sample T‑Test
Class: OneSampleTTest (returned by CompareMeans.oneSample)
Constructor
new OneSampleTTest({ X: number[] }, mu0=0)
Public fields
t: number – t statistic.
df: number – n - 1.
p: number – two‑sided p‑value.
sd: number – sample standard deviation.
se: number – standard error sd / sqrt(n).
mu0: number – hypothesized mean.
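A usage sketch via CompareMeans.oneSample (data and mu0 are illustrative):
import { Analyze } from 'als-statistics';
const { CompareMeans } = Analyze;
// H0: mean(score) = 70
const one = new CompareMeans({ score: [62, 75, 70, 81, 64] }).oneSample('score', 70);
console.log({ t: one.t, df: one.df, p: one.p, se: one.se });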
Paired T‑Test
Paired (dependent) samples t‑test.
Class: PairedTTest (returned by CompareMeans.paired)
Constructor
new PairedTTest({ A: number[], B: number[] })
- Requires at least two arrays; internally trims to the same length.
Public fields
t: number – t statistic.
df: number – degrees of freedom (n - 1).
p: number – two‑sided p‑value (Student t).
meanDelta: number – mean of pairwise differences.
sdDelta: number – sample SD of differences.
n: number – number of paired observations.
diffs: number[] – raw pairwise deltas (A[i] − B[i]).
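A usage sketch via CompareMeans.paired (data illustrative):
import { Analyze } from 'als-statistics';
const { CompareMeans } = Analyze;
const pt = new CompareMeans({
  before: [62, 71, 69, 73, 75],
  after: [70, 76, 70, 78, 79],
}).paired();
console.log({ t: pt.t, df: pt.df, p: pt.p, meanDelta: pt.meanDelta });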
Correlate — practical usage
Two columns vs matrix
import { Analyze } from 'als-statistics';
const { Correlate } = Analyze;
const one = new Correlate({ X: [1,2,3], Y: [2,4,9] }).pearson('X', 'Y');
console.log(one.r, one.t, one.df, one.p);
const all = new Correlate({ A:[...], B:[...], C:[...] }).pearson();
console.log(Object.keys(all));
console.log(all['A|B'].r, all['A|B'].p);
Population vs sample (Pearson)
- `pearson()` — uses population covariance in the r formula.
- `pearsonSample()` — uses sample covariance.
- Both provide a two-sided `p` via the t-distribution with `df = n - 2`.
const p1 = new Correlate(data).pearson();
const p2 = new Correlate(data).pearsonSample();
Spearman & Kendall (ties handled)
const s = new Correlate({ X:[...], Y:[...] }).spearman('X','Y');
const k = new Correlate({ X:[...], Y:[...] }).kendall('X','Y');
console.log(s.r, s.p, k.tau, k.p);
Two-sided helpers: .spearmanTwoSided() and .kendallTwoSided().
Reliability — Cronbach’s alpha
import { CronbachAlpha } from 'als-statistics/analyze/correlate/cronbach-alpha.js';
import { Analyze } from 'als-statistics';
const { Correlate } = Analyze;
const items = { Q1:[...], Q2:[...], Q3:[...] };
const alpha = new CronbachAlpha(items);
console.log(alpha.alpha);
console.log(alpha.ifItemsDeleted);
console.log(alpha.htmlTable);
Notes:
- Correlate methods auto-trim vectors to the shortest length where needed (e.g., Spearman).
- Pairwise matrices return a plain object of test instances keyed as `'A|B'`.
Cronbach’s Alpha
Class: CronbachAlpha
Constructor
new CronbachAlpha(data: Record<string, number[]>)
- Requires ≥ 2 parallel scales/items of equal length.
Public fields
alpha: number – reliability estimate.
sumOfVariances: number – sum of item variances (sample).
sumColumnVariance: number – variance of summed score across rows.
bessel: number – correction factor k/(k-1).
ifItemsDeleted: Record<string, number> – getter recomputed lazily.
htmlTable: string – formatted summary table (getter).
Kendall Rank Correlation (τ)
Class: Kendall
Constructor
new Kendall({ X: number[], Y: number[] }, twoSided=true)
Public fields
tau: number
z: number – normal approximation for significance
p: number – p‑value (two‑sided by default)
t: number – alias of z (for consistency with other tests)
df: number – Infinity (normal approximation)
Pearson Correlation
Class: Pearson
Constructor
new Pearson({ X: number[], Y: number[] }, population=false)
- When `population=true`, the covariance uses the population denominator.
Public fields
covariance: number
df: number – n - 2
r: number – correlation coefficient in [-1, 1]
t: number – test statistic
p: number – two‑sided p‑value
Spearman Rank Correlation
Class: Spearman
Constructor
new Spearman({ X: number[], Y: number[] })
Public fields
r: number – Spearman’s rho
t: number – t‑approximation of significance
p: number – two‑sided p‑value
n: number – number of paired observations (shorter input is trimmed)
Analyze · Clustering
Density-based clustering over columns using precomputed distances between series.
Class: Dbscan
Constructor
new Dbscan(data: Record<string, number[]>, options?: { eps?: number, minPts?: number, metric?: 'mad' })
eps (default 0.4), minPts (default 3), metric (default 'mad').
Public fields
metric: string
eps: number · minPts: number
labels: number[] – 0 unvisited, -1 noise, 1.. cluster id per column.
clusters: Array<{ id:number, columns:string[] }> – built by buildClusters.
distances: number[][] – symmetric distance matrix.
- Core methods (invoked by the constructor): `findNeighbors(i)`, `expandCluster(i, clusterId)`, `run()`.
Class: Hdbscan
Constructor
new Hdbscan(data: Record<string, number[]>, options?: { metric?: 'mad', minClusterSize?: number })
minClusterSize defaults to 2.
Public fields
metric: string, minClusterSize: number
labels: number[] – final labels per column.
clusters: Array<{ id:number, columns:string[] }>
mreachDistances: number[][] – mutual reachability distances.
mst: Array<[i,j,weight]> – minimum spanning tree.
hierarchy: Array<{ clusterId, lambdaBirth, lambdaDeath, points, size, children }>
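A minimal sketch for both clusterers. Columns are the objects being clustered; the exact labels depend on the data and metric:
import { Analyze } from 'als-statistics';
const { Dbscan, Hdbscan } = Analyze.Clustering;
const series = {
  s1: [1, 2, 3, 4, 5],
  s2: [1.1, 2.1, 3.0, 4.1, 5.1], // close to s1
  s3: [9, 1, 7, 2, 8], // dissimilar
};
const db = new Dbscan(series, { eps: 0.4, minPts: 2 });
console.log(db.labels, db.clusters);
const hdb = new Hdbscan(series, { minClusterSize: 2 });
console.log(hdb.labels, hdb.clusters);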
Regression — practical usage
The Regression wrapper builds a sequence of models (steps). Start with a baseline, then call next([...]) to add more predictors. Interaction terms are supported via the 'X*Z' notation.
new Regression(data, { yName: string, xNames?: string[], type?: 'linear'|'logistic' })
reg.next(newPredictors: string[]): this
reg.steps: Array<RegressionBase>
reg.results: Array<Record<string, any>>
reg.htmlTables: string
A) Linear — baseline, then moderator (interaction)
import { Analyze } from 'als-statistics';
const { Regression } = Analyze;
const data = { X:[1,2,3,4,5], Z:[0,1,0,1,0], Y:[2,3,6,7,10] };
const reg = new Regression(data, { yName:'Y', xNames:['X'], type:'linear' });
reg.next(['Z', 'X*Z']);
const step0 = reg.steps[0].result;
const step1 = reg.steps[1].result;
console.log(step1.Variable.includes('X*Z'));
B) “Mediator‐like” step (add M, compare steps)
There’s no built-in mediation test (Sobel/bootstrapping).
However, you can model a putative mediator by adding it as a predictor on the next step and comparing coefficients/R².
const data = { X:[1,2,3,4,5,6], M:[2,4,5,7,7,9], Y:[3,5,7,9,10,13] };
const reg = new Regression(data, { yName:'Y', xNames:['X'], type:'linear' });
reg.next(['M']);
console.log(reg.steps[0].r2, reg.steps[1].r2);
console.log(reg.steps[1].result.Variable.includes('M'));
C) Logistic — classification with accuracy
const data = { X:[0,1,2,3,4], Y:[0,0,0,1,1] };
const logit = new Regression(data, { yName:'Y', xNames:['X'], type:'logistic' });
const s0 = logit.steps[0];
console.log(s0.result.Accuracy);
console.log(s0.predict(s0.X));
console.log(s0.predictProba(s0.X));
Notes & tips
- If you omit `xNames`, the wrapper uses all columns except `yName` as predictors.
- `next([...])` creates a clone of the previous step’s columns and (if a name contains `'*'`) generates the interaction term by multiplying the two source predictors element-wise.
- Linear steps expose `StdError[]` and `pValue[]`. Logistic steps expose `Accuracy`.
- The wrapper and cores are deterministic for the same inputs.
Linear Regression (Core)
Class: Regression.LinearRegression
Constructor
new Regression.LinearRegression(table: Record<string, number[]>, yName: string, xNames: string[], stepIndex: number)
Public fields / getters
coefficients: number[] – [Intercept, β1, …].
y: number[], X: number[][], yHat: number[]
residuals: number[]
r2: number
standardErrors: number[]
pValues: number[]
n: number, k: number (obs & parameters)
result: { step, n, Variable, Coefficient, StdError, pValue }
htmlTable: string
Methods
calculate(): this
predict(X: number[][]): number[]
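A sketch of driving the core directly (data illustrative; the Regression wrapper normally does this for you):
import { Analyze } from 'als-statistics';
const { LinearRegression } = Analyze.Regression;
const lr = new LinearRegression(
  { X: [1, 2, 3, 4, 5], Y: [2.1, 3.9, 6.2, 7.8, 10.1] },
  'Y', ['X'], 0
).calculate();
console.log(lr.coefficients); // [Intercept, β1]
console.log(lr.r2);
console.log(lr.predict([[6]])); // prediction for X = 6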
Logistic Regression (Core)
Class: Regression.LogisticRegression
Constructor
new Regression.LogisticRegression(table: Record<string, number[]>, yName: string, xNames: string[], stepIndex: number, learningRate=0.01, epochs=1000)
Public fields / getters
coefficients: number[] – [Intercept, β1, …]
y: number[], X: number[][], yHat: number[] (predicted classes)
accuracy: number
n: number, k: number
result: { step, n, Variable, Coefficient, Accuracy }
htmlTable: string
Methods
calculate(): this
predictProba(X: number[][]): number[] – probabilities via sigmoid.
predict(X: number[][], threshold=0.5): number[] – hard labels.
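A matching sketch for the logistic core (data, learning rate and epochs are illustrative):
import { Analyze } from 'als-statistics';
const { LogisticRegression } = Analyze.Regression;
const clf = new LogisticRegression(
  { X: [0, 1, 2, 3, 4], Y: [0, 0, 0, 1, 1] },
  'Y', ['X'], 0, 0.1, 2000 // learningRate, epochs
).calculate();
console.log(clf.accuracy);
console.log(clf.predictProba([[2.5]])); // probability of class 1
console.log(clf.predict([[2.5]], 0.5)); // hard label at threshold 0.5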
Analyze — overview & patterns
This section ties together the shortcuts across Table, Statistics, and Analyze.
From Table to analysis
import { Table } from 'als-statistics';
import { Analyze } from 'als-statistics';
const t = new Table(data, { name: 'Survey' });
const r1 = t.correlate('height','weight').pearson();
const rAll = t.correlate('height','weight','age').pearson();
const w = t.compareMeans('groupA','groupB').independentWelch();
const a1 = t.compareMeans('A','B','C').anova();
const aW = t.compareMeans('A','B','C').anovaWelch();
const lin = t.linear('score', ['age','hours']);
const log = t.logistic('passed', ['score']);
Split → Combine (Statistics) → Analyze
const S = t.splitBy('group', { 0:'ctrl', 1:'treat' });
const merged = S.columns('byGroup', 'score');
const test = merged.compareMeans('ctrl_score','treat_score').independentWelch();
Keep mutations API-only (addRow, setAt, splice, values=). Avoid in-place array edits to preserve caches and consistent results.
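A compact do/don’t sketch of the same rule:
// don't: a direct array edit bypasses validation and cache invalidation
t.columns['score'].values[0] = 999;
// do: use a cache-aware mutator instead
t.columns['score'].setAt(0, 999);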
Descriptive Statistics
Static utility functions used across the library. These are also mixed into Column (arity‑1 functions as getters; others as methods).
Selected functions
sum(values) · mean(values) · median(values) · mode(values)
variance(values) · varianceSample(values) · stdDev(values) · stdDevSample(values) · cv(values)
min(values) · max(values) · range(values) · iqr(values) · mad(values)
zScore(values, v) · zScores(values) · zScoresSorted(values)
percentile(values, p) · q1(values) · q3(values) · p10(values) · p90(values)
weightedMean(values, weights)
confidenceInterval({ mean, stdDevSample, values })
outliersZScore(values, z=3) · outliersIQR(values)
slope({ values }) · regressionSlope({ X, Y })
spectralPowerDensityArray(values) · spectralPowerDensityMetric(values)
Refer to JSDoc in code for exact parameter objects where applicable.
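For instance, a sketch of the parameter-object forms above (shapes taken from the signatures listed here; check the JSDoc for details):
import { Stats } from 'als-statistics';
const values = [10, 12, 13, 9, 14];
console.log(Stats.weightedMean(values, [1, 1, 2, 1, 1]));
console.log(Stats.confidenceInterval({
  mean: Stats.mean(values),
  stdDevSample: Stats.stdDevSample(values),
  values,
}));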
Utils
General helper utilities.
Functions
htmlTable(rows, headers, options?) → string – render a simple HTML table (escapes content; supports firstColHeader, fixed decimals, transposition).
round(value, fixed=8) → number | string – numeric rounding with fixed decimals.
range(start=0, end, step=1) → number[] – numeric range.
filterKeys(keys: string[], filters: (string|number|RegExp)[]): string[] – include names, regex filter, and '-name' exclusions.
Counter – simple name counter with getName(name?).
EPS (Golden Test Tolerances)
| Area | Key | EPS | Checks |
| --- | --- | --- | --- |
| Descriptives | stat | 1e-6 | Means, medians, quantiles, variance |
| Z-scores | z | 1e-9 | Summary mean/std of z |
| Regression | reg | 1e-5 | Coefficients, metrics |
| CDF | cdf | 1e-9 | CDF/PPF checks |
| Correlations | r | 1e-7 | Pearson/Spearman/Kendall |
| Degrees of freedom | df | 1e-6 | Welch df (float) |
| ANOVA F | anovaF | 1e-6 | |
| Flatness | flatness | 1e-12 | GM/AM stability |
| SPD Flatness | spd | 1e-12 | GM/AM on SPD |
| p-values | p | 1e-6 | |
Change these in goldens/settings.js if needed.
How‑to
- Split a table by a grouping column and compare groups (Welch):
// splitBy returns a Statistics; merge the groups' score columns and test
const S = raw.splitBy('group', { A: 'A', B: 'B' });
const merged = S.columns('byGroup', 'score');
const t = merged.compareMeans('A_score','B_score').independentWelch();
- Detect z‑outliers and keep sorted indices:
const { zScores, indexes } = Stats.zScoresSorted({ values });
const top3 = indexes.slice(-3);
- Compute the spectral flatness of a spectrum:
const spd = Stats.spectralPowerDensityArray({ values: magnitudes });
const flat = Stats.spectralPowerDensityMetric({ spectralPowerDensityArray: spd, values: magnitudes });
Changelog [2.1.0] - 2025-09-04
Breaking change: als-statistics v2 is a ground-up rewrite with no backward compatibility with v1.x.
If you rely on v1: pin your dependency to the latest 1.x release.
npm i als-statistics@^1
Changed
Stats.harmonicMean(...) — inputs ≤ 0 are now clamped to ε (1e-12) before computation (aligns with Python goldens), preventing NaN/division-by-zero surprises.
Stats.zScores({ values }, sample = false) — added a second parameter:
sample = false (default): population std (ddof = 0) — backward-compatible.
sample = true: sample std (ddof = 1) — matches NumPy/SciPy z-scores and golden summaries.
Stats.flatness({ values }) — now returns 0 when the arithmetic mean is 0 (previously NaN), making all-zero vectors well-defined.
Fixed
Stats.mad(...) — corrected median absolute deviation for edge cases.
Tests
- Added golden cross-checks against Python (NumPy/SciPy) and HDBSCAN labels; all pass within documented EPS tolerances.
Notes: Default behavior remains the same for zScores (ddof=0) unless sample=true is provided. If your code relied on NaN from flatness/harmonicMean for zero/negative inputs, update downstream checks accordingly.
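A small before/after sketch of the zScores flag (signature as given in the entry above):
import { Stats } from 'als-statistics';
const values = [1, 2, 3, 4];
const zPop = Stats.zScores({ values }); // default: population std (ddof = 0)
const zSample = Stats.zScores({ values }, true); // sample std (ddof = 1)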