# JSum

Consistent checksum calculation of JSON objects.
## Quick start
```js
const JSum = require('jsum')

const obj1 = {foo: [{c: 1}, {d: 2, e: 3}], bar: {a: 2, b: undefined}}
const obj2 = {bar: {b: undefined, a: 2}, foo: [{c: 1}, {e: 3, d: 2}]}

console.log(JSum.digest(obj1, 'SHA256', 'hex')) // both calls print
console.log(JSum.digest(obj2, 'SHA256', 'hex')) // the same checksum
```
## Why this module?
My main goal was to create `Etag`s from JSON objects. The most intuitive approach would have been something like:

```js
const crypto = require('crypto')

function checksum (obj) {
  return crypto.createHash('MD5').update(JSON.stringify(obj)).digest('hex')
}
```
However, this approach yields two different checksums for semantically identical JSON objects:

```js
console.log(checksum({"a": 1, "b": 2}))
console.log(checksum({"b": 2, "a": 1}))
```
`JSum`, on the other hand, makes sure that semantically identical JSON objects always get the same checksum! Moreover, it offers a considerable speed advantage over some other viable modules*:
| Module | Time (ms) to hash a 181 MB JSON file (from memory) |
| --- | --- |
| json-hash | 81537 |
| json-stable-stringify | 12134 |
| JSum | 9656 |
| json-checksum | FATAL ERROR: [...] - process out of memory |
\*NOTE: The measurements above are not from formal benchmarking. A huge random JSON file (181 MB) was used as the base for benchmarking. Each of the listed modules was used to create a SHA256 hash of that file. Time was measured with Node's built-in `console.time()` and `console.timeEnd()` methods.
## I don't want this :-(
Fair enough! Just copy this (check the license first!) into your own code and hash as you will:
```js
function serialize (obj) {
  if (Array.isArray(obj)) {
    return JSON.stringify(obj.map(i => serialize(i)))
  } else if (typeof obj === 'object' && obj !== null) {
    return Object.keys(obj)
      .sort()
      .map(k => `${k}:${serialize(obj[k])}`)
      .join('|')
  }
  return obj
}
```