Persistent Merkle Tree

A binary merkle tree implemented as a persistent data structure.
Example
import {LeafNode, BranchNode} from "@chainsafe/persistent-merkle-tree";
const leaf = LeafNode.fromRoot(Buffer.alloc(32, 0xaa));
const otherLeaf = LeafNode.fromRoot(Buffer.alloc(32, 0xbb));
const branch = new BranchNode(leaf, otherLeaf);
const r: Uint8Array = branch.root;
branch.isLeaf() === false;
leaf.isLeaf() === true;
import {zeroNode} from "@chainsafe/persistent-merkle-tree";
const zero0 = zeroNode(0);
const zero1 = zeroNode(1);
const zero1 = zeroNode(2);
import {Tree} from "@chainsafe/persistent-merkle-tree";
const tree = new Tree(zeroNode(10));
const rootNode: Node = tree.rootNode;
const rr: Uint8Array = tree.root;
const gindex = BigInt(...);
const n: Node = tree.getNode(gindex);
const rrr: Uint8Array = tree.getRoot(gindex);
const subtree: Tree = tree.getSubtree(gindex);
const proof: Uint8Array[] = tree.getSingleProof(gindex);
import {ProofType} from "@chainsafe/persistent-merkle-tree";
const gindices: BigInt[] = [...];
const proof: Proof = tree.getProof({
type: ProofType.treeOffset,
gindices,
});
const partialTree: Tree = Tree.createFromProof(proof);
const unknownGindex: BigInt = ...;
gindices.includes(unknownGindex)
partialTree.getRoot(unknownGindex)
Motivation
When dealing with large datasets, it is very expensive to merkleize them in their entirety. In cases where large datasets are remerkleized often between updates and additions, using ephemeral structures for intermediate hashes results in significant duplicated work, as many intermediate hashes will be recomputed and thrown away on each merkleization. In these cases, maintaining structures for the entire tree, intermediate nodes included, can mitigate these issues and allow for additional usecases (eg: proof generation). This implementation also uses the known immutability of nodes to share data between common subtrees across different versions of the data.
Features
Intermediate nodes with cached, lazily computed hashes
The tree is represented as a linked tree of Node
s, currently either BranchNode
s or LeafNode
s.
A BranchNode
has a left
and right
child Node
, and a root
, 32 byte Uint8Array
.
A LeafNode
has a root
.
The root
of a Node
is not computed until requested, and cached thereafter.
Shared data betwen common subtrees
Any update to a tree (either to a leaf or intermediate node) is performed as a rebinding that yields a new, updated tree that maximally shares data between versions. Garbage collection allows memory from unused nodes to be eventually reclaimed.
Mutable wrapper for the persistent core
A Tree
object wraps Node
and provides an API for tree navigation and transparent rebinding on updates.
Navigation by gindex or (depth, index)
Many tree methods allow navigation with a gindex. A gindex (or generalized index) describes a path through the tree, starting from the root and nagivating downwards.
1
/ \
2 3
/ \ / \
4 5 6 7
It can also be interpreted as a bitstring, starting with "1", then appending "0" for each navigation left, or "1" for each navigation right.
1
/ \
10 11
/ \ / \
100 101 110 111
Alternatively, several tree methods, with names ending with AtDepth
, allow navigation by (depth, index). Depth and index navigation works by first navigating down levels into the tree from the top, starting at 0 (depth), and indexing nodes from the left, starting at 0 (index).
0 <- depth 0
/ \
0 1 <- depth 1
/ \ / \
0 1 2 3 <- depth 2
Memory efficiency
As an implementation detail, nodes hold their values as a HashObject
(a javascript object with h0
, ... h7
uint32 number
values) internally, rather than as a 32 byte Uint8Array
. Memory benchmarking shows that this results in a ~3x reduction in memory over Uint8Array
. This optimization allows this library to practically scale to trees with millions of nodes.
Navigation efficiency
In performance-critical applications performing many reads and writes to trees, being smart with tree navigation is crucial. This library correctly provides tree navigation methods that handle several important optimized cases: multi-node get and set, and get-then-set operations.
See also:
https://github.com/protolambda/remerkleable
Audit
This repo was audited by Least Authority as part of this security audit, released 2020-03-23. Commit 8b5ad7
verified in the report.
License
Apache-2.0