What is @jest/diff-sequences?
@jest/diff-sequences is a utility package used to efficiently compare sequences, such as arrays or strings, to find differences between them. It is particularly useful in testing scenarios where you need to assert the differences between expected and actual outputs.
What are @jest/diff-sequences's main functionalities?
Diffing Sequences
This feature allows you to compare two sequences and find the differences between them. The code sample demonstrates how to use the diffSequences function to compare two arrays and output the differences and commonalities.
const diffSequences = require('@jest/diff-sequences');
const a = [1, 2, 3, 4, 5];
const b = [1, 2, 4, 5, 6];
const result = [];
diffSequences(a.length, b.length, (indexA, indexB) => a[indexA] === b[indexB], (nCommon, aStart, bStart) => {
result.push({ type: 'common', aStart, bStart, nCommon });
}, (nCommon, aStart, bStart) => {
result.push({ type: 'different', aStart, bStart, nCommon });
});
console.log(result);
Other packages similar to @jest/diff-sequences
diff
The 'diff' package is a popular library for comparing sequences, such as strings or arrays, and generating a diff output. It provides a more comprehensive set of features for diffing, including support for different diff algorithms and output formats. Compared to @jest/diff-sequences, 'diff' is more feature-rich and versatile, but may be more complex to use for simple diffing tasks.
fast-diff
The 'fast-diff' package is a lightweight and fast library for comparing strings. It is optimized for performance and is suitable for real-time applications where speed is crucial. While it is limited to string comparisons, it offers a simpler and faster alternative to @jest/diff-sequences for string-specific diffing tasks.
diff-sequences
Compare items in two sequences to find a longest common subsequence.
The items not in common are the items to delete or insert in a shortest edit script.
To maximize flexibility and minimize memory, you write callback functions as configuration:
Input function isCommon(aIndex, bIndex)
compares items at indexes in the sequences and returns a truthy/falsey value. This package might call your function more than once for some pairs of indexes.
- Because your function encapsulates comparison, this package can compare items according to
===
operator, Object.is
method, or other criterion.
- Because your function encapsulates sequences, this package can find differences in arrays, strings, or other data.
Output function foundSubsequence(nCommon, aCommon, bCommon)
receives the number of adjacent items and starting indexes of each common subsequence. If sequences do not have common items, then this package does not call your function.
If N is the sum of lengths of sequences and L is length of a longest common subsequence, then D = N – 2L is the number of differences in the corresponding shortest edit script.
An O(ND) Difference Algorithm and Its Variations by Eugene W. Myers is fast when sequences have few differences.
This package implements the linear space variation with optimizations so it is fast even when sequences have many differences.
Usage
To add this package as a dependency of a project, do either of the following:
npm install diff-sequences
yarn add diff-sequences
To use diff
as the name of the default export from this package, do either of the following:
var diff = require('@jest/diff-sequences').default; // CommonJS modules
import diff from '@jest/diff-sequences'; // ECMAScript modules
Call diff
with the lengths of sequences and your callback functions:
const a = ['a', 'b', 'c', 'a', 'b', 'b', 'a'];
const b = ['c', 'b', 'a', 'b', 'a', 'c'];
function isCommon(aIndex, bIndex) {
return a[aIndex] === b[bIndex];
}
function foundSubsequence(nCommon, aCommon, bCommon) {
}
diff(a.length, b.length, isCommon, foundSubsequence);
Example of longest common subsequence
Some sequences (for example, a
and b
in the example of usage) have more than one longest common subsequence.
This package finds the following common items:
a[2] === b[0] | 'c' | foundSubsequence(1, 2, 0) |
a[4] === b[1] | 'b' | foundSubsequence(1, 4, 1) |
a[5] === b[3] && a[6] === b[4] | 'b', 'a' | foundSubsequence(2, 5, 3) |
The “edit graph” analogy in the Myers paper shows the following common items:
a[2] === b[0] | 'c' |
a[3] === b[2] && a[4] === b[3] | 'a', 'b' |
a[6] === b[4] | 'a' |
Various packages which implement the Myers algorithm will always agree on the length of a longest common subsequence, but might sometimes disagree on which items are in it.
Example of callback functions to count common items
function countCommonItems(a, b) {
let n = 0;
function isCommon(aIndex, bIndex) {
return a[aIndex] === b[bIndex];
}
function foundSubsequence(nCommon) {
n += nCommon;
}
diff(a.length, b.length, isCommon, foundSubsequence);
return n;
}
const commonLength = countCommonItems(
['a', 'b', 'c', 'a', 'b', 'b', 'a'],
['c', 'b', 'a', 'b', 'a', 'c'],
);
in common | commonLength | 4 |
to delete from a | a.length - commonLength | 3 |
to insert from b | b.length - commonLength | 2 |
If the length difference b.length - a.length
is:
- negative: its absolute value is the minimum number of items to delete from
a
- positive: it is the minimum number of items to insert from
b
- zero: there is an equal number of items to delete from
a
and insert from b
- non-zero: there is an equal number of additional items to delete from
a
and insert from b
In this example, 6 - 7
is:
- negative:
1
is the minimum number of items to delete from a
- non-zero:
2
is the number of additional items to delete from a
and insert from b
Example of callback functions to find common items
const findCommonItems = (a, b) => {
const array = [];
diff(
a.length,
b.length,
(aIndex, bIndex) => Object.is(a[aIndex], b[bIndex]),
(nCommon, aCommon) => {
for (; nCommon !== 0; nCommon -= 1, aCommon += 1) {
array.push(a[aCommon]);
}
},
);
return array;
};
const commonItems = findCommonItems(
['a', 'b', 'c', 'a', 'b', 'b', 'a'],
['c', 'b', 'a', 'b', 'a', 'c'],
);
Example of callback functions to diff index intervals
Instead of slicing array-like objects, you can adjust indexes in your callback functions.
const diffIndexIntervals = (a, aStart, aEnd, b, bStart, bEnd) => {
diff(
aEnd - aStart,
bEnd - bStart,
(aIndex, bIndex) => Object.is(a[aStart + aIndex], b[bStart + bIndex]),
(nCommon, aCommon, bCommon) => {
},
);
};
Example of callback functions to emulate diff command
Linux or Unix has a diff
command to compare files line by line. Its output is a shortest edit script:
- change adjacent lines from the first file to lines from the second file
- delete lines from the first file
- append or insert lines from the second file
const getRange = (start, end) =>
start + 1 === end ? `${start + 1}` : `${start + 1},${end}`;
const pushDelIns = (aLines, aIndex, aEnd, bLines, bIndex, bEnd, array) => {
const deleteLines = aIndex !== aEnd;
const insertLines = bIndex !== bEnd;
const changeLines = deleteLines && insertLines;
if (changeLines) {
array.push(`${getRange(aIndex, aEnd)}c${getRange(bIndex, bEnd)}`);
} else if (deleteLines) {
array.push(`${getRange(aIndex, aEnd)}d${String(bIndex)}`);
} else if (insertLines) {
array.push(`${String(aIndex)}a${getRange(bIndex, bEnd)}`);
} else {
return;
}
for (; aIndex !== aEnd; aIndex += 1) {
array.push(`< ${aLines[aIndex]}`);
}
if (changeLines) {
array.push('---');
}
for (; bIndex !== bEnd; bIndex += 1) {
array.push(`> ${bLines[bIndex]}`);
}
};
const findShortestEditScript = (a, b) => {
const aLines = a.split('\n');
const bLines = b.split('\n');
const aLength = aLines.length;
const bLength = bLines.length;
const isCommon = (aIndex, bIndex) => aLines[aIndex] === bLines[bIndex];
let aIndex = 0;
let bIndex = 0;
const array = [];
const foundSubsequence = (nCommon, aCommon, bCommon) => {
pushDelIns(aLines, aIndex, aCommon, bLines, bIndex, bCommon, array);
aIndex = aCommon + nCommon;
bIndex = bCommon + nCommon;
};
diff(aLength, bLength, isCommon, foundSubsequence);
pushDelIns(aLines, aIndex, aLength, bLines, bIndex, bLength, array);
return array.length === 0 ? '' : `${array.join('\n')}\n`;
};
Example of callback functions to format diff lines
Here is simplified code to format changed and unchanged lines in expected and received values after a test fails in Jest:
const formatDiffLines = (a, b) => {
const aLinesUn = format(a, {indent: 0 }).split('\n');
const bLinesUn = format(b, {indent: 0 }).split('\n');
const aLinesIn = format(a, {indent: 2 }).split('\n');
const bLinesIn = format(b, {indent: 2 }).split('\n');
const aLength = aLinesIn.length;
const bLength = bLinesIn.length;
const isCommon = (aIndex, bIndex) => aLinesUn[aIndex] === bLinesUn[bIndex];
let aIndex = 0;
let bIndex = 0;
const array = [];
const foundSubsequence = (nCommon, aCommon, bCommon) => {
for (; aIndex !== aCommon; aIndex += 1) {
array.push(`-·${aLinesIn[aIndex]}`);
}
for (; bIndex !== bCommon; bIndex += 1) {
array.push(`+·${bLinesIn[bIndex]}`);
}
for (; nCommon !== 0; nCommon -= 1, aIndex += 1, bIndex += 1) {
array.push(`··${bLinesIn[bIndex]}`);
}
};
diff(aLength, bLength, isCommon, foundSubsequence);
for (; aIndex !== aLength; aIndex += 1) {
array.push(`-·${aLinesIn[aIndex]}`);
}
for (; bIndex !== bLength; bIndex += 1) {
array.push(`+·${bLinesIn[bIndex]}`);
}
return array;
};
const expected = {
searching: '',
sorting: {
ascending: true,
fieldKey: 'what',
},
};
const received = {
searching: '',
sorting: [
{
descending: false,
fieldKey: 'what',
},
],
};
const diffLines = formatDiffLines(expected, received);
If N is the sum of lengths of sequences and L is length of a longest common subsequence, then N – L is length of an array of diff lines. In this example, N is 7 + 9, L is 5, and N – L is 11.
0 | '··Object {' | 0 | 0 |
1 | '····"searching": "",' | 1 | 1 |
2 | '-···"sorting": Object {' | 2 | |
3 | '-·····"ascending": true,' | 3 | |
4 | '+·····"sorting": Array [' | | 2 |
5 | '+·······Object {' | | 3 |
6 | '+·········"descending": false,' | | 4 |
7 | '··········"fieldKey": "what",' | 4 | 5 |
8 | '········},' | 5 | 6 |
9 | '+·····],' | | 7 |
10 | '··}' | 6 | 8 |
Example of callback functions to find diff items
Here is simplified code to find changed and unchanged substrings within adjacent changed lines in expected and received values after a test fails in Jest:
const findDiffItems = (a, b) => {
const isCommon = (aIndex, bIndex) => a[aIndex] === b[bIndex];
let aIndex = 0;
let bIndex = 0;
const array = [];
const foundSubsequence = (nCommon, aCommon, bCommon) => {
if (aIndex !== aCommon) {
array.push([-1, a.slice(aIndex, aCommon)]);
}
if (bIndex !== bCommon) {
array.push([1, b.slice(bIndex, bCommon)]);
}
aIndex = aCommon + nCommon;
bIndex = bCommon + nCommon;
array.push([0, a.slice(aCommon, aIndex)]);
};
diff(a.length, b.length, isCommon, foundSubsequence);
if (aIndex !== a.length) {
array.push([-1, a.slice(aIndex)]);
}
if (bIndex !== b.length) {
array.push([1, b.slice(bIndex)]);
}
return array;
};
const expectedDeleted = ['"sorting": Object {', '"ascending": true,'].join(
'\n',
);
const receivedInserted = [
'"sorting": Array [',
'Object {',
'"descending": false,',
].join('\n');
const diffItems = findDiffItems(expectedDeleted, receivedInserted);
0 | 0 | '"sorting": ' |
1 | 1 | 'Array [\n' |
2 | 0 | 'Object {\n"' |
3 | -1 | 'a' |
4 | 1 | 'de' |
5 | 0 | 'scending": ' |
6 | -1 | 'tru' |
7 | 1 | 'fals' |
8 | 0 | 'e,' |
The length difference b.length - a.length
is equal to the sum of diffItems[i][0]
values times diffItems[i][1]
lengths. In this example, the difference 48 - 38
is equal to the sum 10
.
in common | 0 | 11 + 10 + 11 + 2 | 0 |
to delete from a | –1 | 1 + 3 | -4 |
to insert from b | 1 | 8 + 2 + 4 | 14 |
Instead of formatting the changed substrings with escape codes for colors in the foundSubsequence
function to save memory, this example spends memory to gain flexibility before formatting, so a separate heuristic algorithm might modify the generic array of diff items to show changes more clearly:
For expected and received strings of serialized data, the result of finding changed lines, and then finding changed substrings within adjacent changed lines (as in the preceding two examples) sometimes displays the changes in a more intuitive way than the result of finding changed substrings, and then splitting them into changed and unchanged lines.