What is spark-md5?
The spark-md5 npm package is a fast and lightweight library for generating MD5 hashes in JavaScript. It is particularly useful for hashing large files or data streams in the browser or in Node.js environments.
What are spark-md5's main functionalities?
Hashing a String
This feature allows you to generate an MD5 hash from a simple string input. It is useful for hashing small pieces of data quickly.
const SparkMD5 = require('spark-md5');
const hash = SparkMD5.hash('Hello, world!');
console.log(hash); // Outputs the MD5 hash of the string
Hashing an ArrayBuffer
This feature allows you to generate an MD5 hash from an ArrayBuffer, which is useful for hashing binary data or files.
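const SparkMD5 = require('spark-md5');
// TextEncoder#encode returns a Uint8Array view, so extract the underlying ArrayBuffer.
const bytes = new TextEncoder().encode('Hello, world!');
const buffer = bytes.buffer.slice(bytes.byteOffset, bytes.byteOffset + bytes.byteLength);
const hash = SparkMD5.ArrayBuffer.hash(buffer);
console.log(hash); // Outputs the MD5 hash of the ArrayBuffer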
Incremental Hashing
This feature allows you to generate an MD5 hash incrementally, which is useful for hashing large data streams or files in chunks.
const SparkMD5 = require('spark-md5');
const spark = new SparkMD5();
spark.append('Hello, ');
spark.append('world!');
const hash = spark.end();
console.log(hash); // Outputs the MD5 hash of the concatenated string
Other packages similar to spark-md5
crypto-js
Crypto-js is a widely-used library that provides a variety of cryptographic algorithms, including MD5, SHA-1, SHA-256, and more. It is more comprehensive than spark-md5, offering a broader range of hashing and encryption functionalities.
md5
The md5 package is a simple and straightforward library for generating MD5 hashes. It is similar to spark-md5 in terms of functionality but does not offer incremental hashing capabilities.
hash.js
Hash.js is a library that supports multiple hashing algorithms, including SHA-1, SHA-256, and SHA-512, but it does not provide MD5. It covers a wider range of hash functions than spark-md5, so it is an option when MD5 itself is not a requirement.
SparkMD5
SparkMD5 is a fast implementation of the MD5 algorithm.
This script is based on the JKM md5 library, which is the fastest algorithm around. It is most suitable for browser usage, because the native
Node.js implementation might be faster.
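For comparison, the native route in Node.js is the built-in crypto module. The following is a minimal sketch of that alternative, not part of SparkMD5; the helper name md5Hex is hypothetical.

```javascript
const crypto = require('crypto');

// Hypothetical helper: MD5 of one or more string chunks using Node's native crypto.
function md5Hex(chunks) {
  const hash = crypto.createHash('md5');
  for (const chunk of chunks) {
    // Each update() call feeds more data, much like an incremental append.
    hash.update(chunk, 'utf8');
  }
  return hash.digest('hex');
}

console.log(md5Hex(['Hi', ' there']) === md5Hex(['Hi there'])); // true
```

Feeding the data in pieces gives the same digest as hashing the concatenated string, which is the same property the incremental SparkMD5 API relies on.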
NOTE: Please disable Firebug while performing the test!
Firebug consumes a lot of memory and CPU and slows the test by a great margin.
Demo
Install
npm install --save spark-md5
Improvements over the JKM md5 library
- Strings are converted to utf8, like most server side algorithms
- Fix computation for large amounts of data (overflow)
- Incremental md5 (see below)
- Support for array buffers (typed arrays)
- Functionality wrapped in a closure, to avoid global assignments
- Object oriented library
- CommonJS (it can be used in node) and AMD integration
- Code passed through JSHint and JSCS
Incremental md5 performs a lot better for hashing large amounts of data, such as
files. One could read files in chunks, using FileReader and Blob slices, and append
each chunk for md5 hashing while keeping memory usage low. See the example below.
Usage
Normal usage
var hexHash = SparkMD5.hash('Hi there');        // hex hash
var rawHash = SparkMD5.hash('Hi there', true);  // OR raw hash (binary string)
Incremental usage
var spark = new SparkMD5();
spark.append('Hi');
spark.append(' there');
var hexHash = spark.end();      // hex hash
var rawHash = spark.end(true);  // OR raw hash (binary string)
Hash a file incrementally
NOTE: If you test the code below using the file:// protocol in Chrome, you must start the browser with the --allow-file-access-from-files argument.
Please see: http://code.google.com/p/chromium/issues/detail?id=60889
document.getElementById('file').addEventListener('change', function () {
    var blobSlice = File.prototype.slice || File.prototype.mozSlice || File.prototype.webkitSlice,
        file = this.files[0],
        chunkSize = 2097152,                        // Read in chunks of 2MB
        chunks = Math.ceil(file.size / chunkSize),
        currentChunk = 0,
        spark = new SparkMD5.ArrayBuffer(),
        fileReader = new FileReader();

    fileReader.onload = function (e) {
        console.log('read chunk nr', currentChunk + 1, 'of', chunks);
        spark.append(e.target.result);              // Append array buffer
        currentChunk++;

        if (currentChunk < chunks) {
            loadNext();
        } else {
            console.log('finished loading');
            console.info('computed hash', spark.end());  // Compute hash
        }
    };

    fileReader.onerror = function () {
        console.warn('oops, something went wrong.');
    };

    function loadNext() {
        // Clamp the last chunk to the end of the file
        var start = currentChunk * chunkSize,
            end = ((start + chunkSize) >= file.size) ? file.size : start + chunkSize;

        fileReader.readAsArrayBuffer(blobSlice.call(file, start, end));
    }

    loadNext();
});
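The chunk arithmetic in loadNext() can be checked in isolation. The sketch below is only an illustration; the helper name chunkRanges is hypothetical and not part of SparkMD5. It lists the byte ranges a chunked read would visit for a given file size.

```javascript
// Hypothetical helper: list the [start, end) byte ranges that a chunked read visits.
function chunkRanges(fileSize, chunkSize) {
  const ranges = [];
  for (let start = 0; start < fileSize; start += chunkSize) {
    // Clamp the last chunk to the end of the file, as loadNext() does.
    ranges.push([start, Math.min(start + chunkSize, fileSize)]);
  }
  return ranges;
}

// A 5 MB file read in 2 MB chunks yields three ranges, the last one shorter.
const ranges = chunkRanges(5 * 1024 * 1024, 2097152);
console.log(ranges.length);             // 3
console.log(ranges[ranges.length - 1]); // the final range ends at 5242880
```

The number of ranges equals Math.ceil(fileSize / chunkSize), which matches the chunks variable in the example above.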
You can see some more examples in the test folder.
Documentation
SparkMD5 class
SparkMD5#append(str)
Appends a string, encoding it to UTF8 if necessary.
SparkMD5#appendBinary(str)
Appends a binary string (e.g.: string returned from the deprecated readAsBinaryString).
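The difference between append and appendBinary comes down to how characters map to bytes. As a sketch of the idea, using only Node's Buffer and crypto rather than SparkMD5 itself (the helper names are hypothetical): UTF-8 encoding can turn one character into several bytes, whereas a binary string carries exactly one byte per character.

```javascript
const crypto = require('crypto');

// Hypothetical helpers for illustration; neither is part of SparkMD5.
// Hash the UTF-8 bytes of a string.
function md5OfUtf8(str) {
  return crypto.createHash('md5').update(Buffer.from(str, 'utf8')).digest('hex');
}

// Hash a binary string, where each character code (0-255) is one byte.
function md5OfBinaryString(str) {
  return crypto.createHash('md5').update(Buffer.from(str, 'latin1')).digest('hex');
}

console.log(Buffer.from('\u00e9', 'utf8').length);   // 2 (bytes 0xC3 0xA9)
console.log(Buffer.from('\u00e9', 'latin1').length); // 1 (byte 0xE9)
console.log(md5OfUtf8('abc') === md5OfBinaryString('abc'));       // true: pure ASCII
console.log(md5OfUtf8('\u00e9') === md5OfBinaryString('\u00e9')); // false: bytes differ
```

For pure ASCII input the two interpretations agree, so the distinction only matters once the data contains characters above 0x7F.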
SparkMD5#end(raw)
Finishes the computation of the md5, returning the hex result.
If raw is true, the result as a binary string will be returned instead.
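The raw form is a string with one character per digest byte (16 characters for MD5), while the hex form spells each byte as two hex digits. A minimal sketch of the relationship, using a hypothetical helper that is not part of SparkMD5:

```javascript
// Hypothetical helper: convert a raw binary-string digest to lowercase hex.
function binaryStringToHex(raw) {
  let hex = '';
  for (let i = 0; i < raw.length; i++) {
    // Each character holds one byte (0-255); print it as two hex digits.
    hex += raw.charCodeAt(i).toString(16).padStart(2, '0');
  }
  return hex;
}

console.log(binaryStringToHex('\x90\x01\x50\x98')); // 90015098
```

The raw form is handy when the digest must be passed on as bytes, for example before Base64 encoding, while the hex form is easier to log and compare.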
SparkMD5#reset()
Resets the internal state of the computation.
SparkMD5#getState()
Returns an object representing the internal computation state.
You can pass this state to setState(). This feature is useful to resume an incremental md5.
SparkMD5#setState(state)
Sets the internal computation state. See: getState().
SparkMD5#destroy()
Releases memory used by the incremental buffer and other additional resources.
SparkMD5.hash(str, raw)
Hashes a string directly, returning the hex result.
If raw is true, the result as a binary string will be returned instead.
Note that this function is static.
SparkMD5.hashBinary(str, raw)
Hashes a binary string directly (e.g.: string returned from the deprecated readAsBinaryString), returning the hex result.
If raw is true, the result as a binary string will be returned instead.
Note that this function is static.
SparkMD5.ArrayBuffer class
SparkMD5.ArrayBuffer#append(arr)
Appends an array buffer.
SparkMD5.ArrayBuffer#end(raw)
Finishes the computation of the md5, returning the hex result.
If raw is true, the result as a binary string will be returned instead.
SparkMD5.ArrayBuffer#reset()
Resets the internal state of the computation.
SparkMD5.ArrayBuffer#destroy()
Releases memory used by the incremental buffer and other additional resources.
SparkMD5.ArrayBuffer#getState()
Returns an object representing the internal computation state.
You can pass this state to setState(). This feature is useful to resume an incremental md5.
SparkMD5.ArrayBuffer#setState(state)
Sets the internal computation state. See: getState().
SparkMD5.ArrayBuffer.hash(arr, raw)
Hashes an array buffer directly, returning the hex result.
If raw is true, the result as a binary string will be returned instead.
Note that this function is static.
License
The project is dual licensed, with WTF2 as the master license and MIT as the alternative license.
The reason to have two licenses is that some entities refuse to use the master license (WTF2) due to
bad language. If that's also your case, you can choose the alternative license.
Credits
Joseph Myers