Automerge
Automerge is a library of data structures for building collaborative applications in JavaScript.
A common approach to building JavaScript apps involves keeping the state of your application in
model objects, such as a JSON document. For example, imagine you are developing a task-tracking app
in which each task is represented by a card. In vanilla JavaScript you might write the following:
var doc = {cards: []}
doc.cards.push({title: 'Reticulate splines', done: false})
doc.cards[0].done = true
Automerge is used in a similar way, but the big difference is that it supports automatic syncing
and merging:
- You can have a copy of the application state locally on several devices (which may belong to the
same user, or to different users). Each user can independently update the application state on
their local device, even while offline, and save the state to local disk.
(Similar to git, which allows you to edit files and commit changes offline.)
- When a network connection is available, Automerge figures out which changes need to be synced from
one device to another, and brings them into the same state.
(Similar to git, which lets you push your own changes, and pull changes from other developers,
when you are online.)
- If the state was changed concurrently on different devices, Automerge automatically merges the
changes together cleanly, so that everybody ends up in the same state, and no changes are lost.
(Different from git: no merge conflicts to resolve!)
Features and Design Principles
- Network-agnostic. Automerge is a pure data structure library that does not care about what kind of
network you use: client/server, peer-to-peer, Bluetooth, carrier pigeon, whatever, anything goes.
Bindings to particular networking technologies are handled by separate libraries. For example,
MPL uses Automerge in a peer-to-peer model over WebRTC, and Hypermerge is a peer-to-peer
networking layer that uses Hypercore, part of the Dat project.
- Immutable state. An Automerge object is an immutable snapshot of the application state at one
point in time. Whenever you make a change, or merge in a change that came from the network, you
get back a new state object reflecting that change. This fact makes Automerge compatible with the
functional reactive programming style of Redux and
Elm, for example. Internally, Automerge is built upon Facebook's Immutable.js, but the Automerge
API uses regular JavaScript objects (using Object.freeze to prevent accidental mutation).
- Automatic merging. Automerge is a so-called Conflict-Free Replicated Data Type (CRDT), which
allows concurrent changes on different devices to be merged automatically without requiring any
central server. It is based on academic research on JSON CRDTs, but
the details of the algorithm in Automerge are different from the JSON CRDT paper, and we are
planning to publish more detail about it in the future.
- Fairly portable. We're not yet making an effort to support old platforms, but we have tested
Automerge in Node.js, Chrome, Firefox, Safari, MS Edge, and Electron.
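The "Immutable state" principle above can be illustrated in plain JavaScript, without Automerge. The following is a minimal sketch, assuming a hypothetical addCard helper. It only mimics the idea of frozen snapshots and is not how Automerge works internally.

```javascript
// A frozen snapshot, standing in for a document state (plain JavaScript only).
const snapshot = Object.freeze({ cards: Object.freeze([]) })

// Hypothetical helper: returns a NEW frozen snapshot with one more card,
// leaving the previous snapshot untouched.
function addCard(state, title) {
  const card = Object.freeze({ title, done: false })
  return Object.freeze({ cards: Object.freeze([...state.cards, card]) })
}

const next = addCard(snapshot, 'Reticulate splines')

// Array.prototype.push throws a TypeError when called on a frozen array.
let mutationRejected = false
try {
  snapshot.cards.push({ title: 'oops' })
} catch (err) {
  mutationRejected = err instanceof TypeError
}

console.log(snapshot.cards.length, next.cards.length, mutationRejected)
```

Because the old snapshot is never modified, code that still holds a reference to it (for example a UI component comparing old and new state) keeps seeing a consistent value.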
Setup
If you're using npm, run npm install automerge. If you're using yarn, run yarn add automerge.
Then you can import it with require('automerge') as in the example below.
Otherwise, clone this repository, and then you can use the following commands:
- yarn install: installs dependencies.
- yarn test: runs the test suite in Node.
- yarn run browsertest: runs the test suite in web browsers.
- yarn build: creates a bundled JS file dist/automerge.js for web browsers.
It includes the dependencies and is set up so that you can load it through a script tag.
Example Usage
The following code samples give a quick overview of how to use Automerge.
For examples of real-life applications built upon Automerge, check out:
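const Automerge = require('automerge')

// doc1 is the application state on device 1; a second device is simulated further down.
let doc1 = Automerge.init()

// Documents are immutable: every change returns a new document, so reassign doc1.
doc1 = Automerge.change(doc1, 'Initialize card list', doc => {
  doc.cards = []
})
doc1 = Automerge.change(doc1, 'Add card', doc => {
  doc.cards.push({title: 'Rewrite everything in Clojure', done: false})
})
// insertAt(0, ...) puts the new card at the front of the list.
doc1 = Automerge.change(doc1, 'Add another card', doc => {
  doc.cards.insertAt(0, {title: 'Rewrite everything in Haskell', done: false})
})

// Simulate device 2: start from an empty document and merge in the changes from doc1.
let doc2 = Automerge.init()
doc2 = Automerge.merge(doc2, doc1)

// The two devices now make concurrent changes.
doc1 = Automerge.change(doc1, 'Mark card as done', doc => {
  doc.cards[0].done = true
})
doc2 = Automerge.change(doc2, 'Delete card', doc => {
  delete doc.cards[1]
})

// Merging combines both edits: the Haskell card is done and the Clojure card is gone.
let finalDoc = Automerge.merge(doc1, doc2)

// List each change message together with the number of cards after that change.
Automerge.getHistory(finalDoc)
  .map(state => [state.change.message, state.snapshot.cards.length])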
Documentation
Automerge document lifecycle
Automerge.init(actorId) creates a new, empty Automerge document.
You can optionally pass in an actorId, which is a string that uniquely identifies the current
node; if you omit actorId, a random UUID is generated.
If you pass in your own actorId, you must ensure that there can never be two different processes
with the same actor ID. Even if you have two different processes running on the same machine, they
must have distinct actor IDs. Unless you know what you are doing, it is recommended that you stick
with the default, and let actorId be auto-generated.
Automerge.save(doc) serializes the state of Automerge document doc to a string, which you can
write to disk. The string contains an encoding of the full change history of the document
(a bit like a git repository).
Automerge.load(string, actorId) unserializes an Automerge document from a string that was
produced by Automerge.save(). The actorId argument is optional, and allows you to specify
a string that uniquely identifies the current node, like with Automerge.init(). Unless you know
what you are doing, it is recommended that you omit the actorId argument.
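As a rough illustration of the lifecycle above, the following sketch persists a document to disk and restores it on startup. The helper names saveToDisk and loadFromDisk are hypothetical, and the library is passed in as a parameter (in an application you would pass require('automerge')); only init(), save() and load() as described above are assumed.

```javascript
const fs = require('fs')

// Hypothetical helper: write the serialized document (a string) to disk.
function saveToDisk(Automerge, doc, filePath) {
  fs.writeFileSync(filePath, Automerge.save(doc), 'utf8')
}

// Hypothetical helper: load a previously saved document, or start with an
// empty one if nothing has been saved yet. The actorId argument is omitted,
// so a random one is generated, as recommended above.
function loadFromDisk(Automerge, filePath) {
  if (!fs.existsSync(filePath)) return Automerge.init()
  return Automerge.load(fs.readFileSync(filePath, 'utf8'))
}
```

A typical use would be to call loadFromDisk once at startup and saveToDisk after each Automerge.change(), although how often to save is an application decision.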
Manipulating and inspecting state
Automerge.change(doc, message, callback) enables you to modify an Automerge document doc.
The doc object is not modified directly, since it is immutable; instead, Automerge.change()
returns an updated copy of the document. The callback function is called with a mutable copy of
doc, as shown below. The message argument allows you to attach an arbitrary string to the
change, which is not interpreted by Automerge, but saved as part of the change history. The message
argument is optional; if you want to omit it, you can simply call Automerge.change(doc, callback).
Within the callback you can use standard JavaScript object manipulation operations to change the
document:
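newDoc = Automerge.change(currentDoc, doc => {
  doc.property = 'value'        // assign a value to a property
  doc['property'] = 'value'     // equivalent to the previous line
  delete doc['property']        // remove a property
  doc.stringValue = 'value'     // all JSON primitive datatypes are supported
  doc.numberValue = 1
  doc.boolValue = true
  doc.nullValue = null
  doc.nestedObject = {}         // create a nested object
  doc.nestedObject.property = 'value'
  doc.otherObject = {key: 'value', number: 42}   // assign an object literal
})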
The top-level Automerge document is always an object (i.e. a mapping from properties to values).
You can use arrays (lists) by assigning a JavaScript array object to a property within a document.
Then you can use most of the standard Array functions to manipulate the array:
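newDoc = Automerge.change(currentDoc, doc => {
  doc.list = []                           // start with an empty list
  doc.list.push(2, 3)                     // append items to the end
  doc.list.unshift(0, 1)                  // prepend items to the start
  doc.list[3] = Math.PI                   // overwrite the item at index 3
  for (let i = 0; i < doc.list.length; i++) doc.list[i] *= 2   // update every item
  doc.list.insertAt(1, 'hello', 'world')  // insert items at index 1
  doc.list.deleteAt(5)                    // delete the item at index 5
  doc.list.splice(2, 2, 'automerge')      // replace two items, starting at index 2, with one
  doc.list[4] = {key: 'value'}            // objects can be nested inside lists
})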
The newDoc returned by Automerge.change() is a regular JavaScript object containing all the
edits you made in the callback. Any parts of the document that you didn't change are carried over
unmodified. The only special things about it are:
- All objects in the document are made immutable using Object.freeze(),
to ensure you don't accidentally modify them outside of an Automerge.change() callback.
- Every object has a unique ID, which you can get by passing the object to the
Automerge.getObjectId() function. This ID is used by Automerge to track which object is which.
- Objects also have information about conflicts, which is used when several users make changes
to the same property concurrently (see below). You can get conflicts using the
Automerge.getConflicts() function.
Counters
If you have a numeric value that is only ever changed by adding or subtracting (e.g. counting how
many times the user has done a particular thing), it is recommended that you use the
Automerge.Counter datatype instead of a plain number. You set it up like this:
state = Automerge.change(state, doc => {
  doc.buttonClicks = new Automerge.Counter()
})
To get the current counter value, use doc.buttonClicks.value.
Whenever you want to increase or decrease the counter value, you can use the .increment()
or .decrement() method:
state = Automerge.change(state, doc => {
  doc.buttonClicks.increment()  // add 1 to the counter value
  doc.buttonClicks.increment(4) // add 4 to the counter value
  doc.buttonClicks.decrement(3) // subtract 3 from the counter value
})
Using the Automerge.Counter datatype is safer than changing a number value using the ++ or
+= 1 operators: if several users concurrently change an Automerge.Counter value, all the
changes are added up as you'd expect, whereas concurrent ++ or += 1 operations will result
in conflicts that need to be resolved (see "Conflicting changes" below).
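To see why the distinction matters, here is a toy model in plain JavaScript. It is only an illustration of the reasoning, not Automerge's implementation; mergeCounter and the variable names are made up for this sketch.

```javascript
// Both devices start from the same value and go offline.
const base = 10

// Plain numbers: `x += 1` is recorded as "x was set to 11" on each device.
// After syncing there are two concurrent assignments to the same property,
// and whichever one is picked, one of the two increments is lost.
const plainWrites = [base + 1, base + 1]
const plainResult = plainWrites[0]

// Counter-style: each device records the amount it added or subtracted,
// and merging adds up all recorded amounts.
function mergeCounter(start, deltas) {
  return deltas.reduce((sum, delta) => sum + delta, start)
}
const counterResult = mergeCounter(base, [+1, +1])

console.log(plainResult, counterResult) // prints "11 12"
```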
Another note: in relational databases it is common to use an auto-incrementing counter to generate
primary keys for rows in a table, but this is not safe in Automerge, since several users may end
up generating the same counter value! See the section "Relational table support" below for
implementing a relational-like table with a primary key.
Text editing support
Automerge.Text provides experimental support for collaborative text editing.
Under the hood, text is represented as a list of characters, which is edited by inserting or
deleting individual characters. Compared to using a regular JavaScript array,
Automerge.Text offers better performance.
(Side note: technically, text should be represented as a list of
Unicode grapheme clusters.
What the user thinks of as a "character" may actually be a series of several Unicode code points,
including accents, diacritics, and other combining marks. A grapheme cluster is the smallest
editable unit of text: that is, the thing that gets deleted if you press the delete key once, or the
thing that the cursor skips over if you press the right-arrow key once. Emoji make a good test case,
since many emoji consist of a sequence of several Unicode code points; for example, a skin tone
modifier is a separate code point that follows the emoji it modifies.)
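The difference between code units, code points, and grapheme clusters can be checked in plain JavaScript. This sketch assumes a runtime that provides Intl.Segmenter (for example a recent Node.js release); it is independent of Automerge.

```javascript
// Thumbs-up emoji followed by a skin tone modifier.
const thumbsUp = '\u{1F44D}\u{1F3FD}'
// Letter "e" followed by a combining acute accent.
const accented = 'e\u0301'

const segmenter = new Intl.Segmenter('en', { granularity: 'grapheme' })

function countGraphemes(text) {
  return [...segmenter.segment(text)].length
}

console.log(thumbsUp.length)          // 4 UTF-16 code units
console.log([...thumbsUp].length)     // 2 Unicode code points
console.log(countGraphemes(thumbsUp)) // 1 grapheme cluster
console.log(accented.length)          // 2 code units
console.log(countGraphemes(accented)) // 1 grapheme cluster
```

An editor that inserts and deletes by grapheme cluster would treat each of these strings as a single editable unit.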
You can create a Text object inside a change callback.
Then you can use insertAt() and deleteAt() to insert and delete characters (same API as for
list modifications, shown above):
newDoc = Automerge.change(currentDoc, doc => {
  doc.text = new Automerge.Text()
  doc.text.insertAt(0, 'h', 'e', 'l', 'l', 'o')  // text is now 'hello'
  doc.text.deleteAt(0)                           // text is now 'ello'
  doc.text.insertAt(0, 'H')                      // text is now 'Hello'
})
To inspect a text object and render it, you can use the following methods
(outside of a change callback):
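newDoc.text.length   // returns 5, the number of characters
newDoc.text.get(0)   // returns 'H', the character at index 0
newDoc.text.join('') // returns 'Hello', the concatenation of all characters
for (let char of newDoc.text) console.log(char) // iterates over all characters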
Relational table support
Automerge.Table provides a collection datatype that is similar to a table in a relational
database. It is intended for a set of objects (rows) that have the same properties (columns
in a relational table). Unlike a list, the objects have no order. You can scan over the objects
in a table, or look up individual objects by their primary key. An Automerge document can contain
as many tables as you want.
Each object is assigned a primary key (a unique ID) by Automerge. When you want to reference one
object from another, it is important that you use this Automerge-generated ID; do not generate
your own IDs.
You can create new tables and insert rows like this:
let database = Automerge.change(Automerge.init(), doc => {
  doc.authors = new Automerge.Table(['surname', 'forename'])
  doc.publications = new Automerge.Table(['type', 'authors', 'title', 'publisher',
                                          'edition', 'year'])
  const martinID = doc.authors.add({surname: 'Kleppmann', forename: 'Martin'})
  const ddia = doc.publications.add({
    type: 'book',
    authors: [martinID],
    title: 'Designing Data-Intensive Applications',
    publisher: "O'Reilly Media",
    year: 2017
  })
})
You can read the contents of a table like this:
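database.publications.rows    // array of row objects
database.publications.ids     // array of row IDs (primary keys)
database.publications.byId('29f6cd15-61ff-460d-b7fb-39a5594f32d5')  // look up a row by primary key
database.publications.count   // number of rows in the table
database.publications.filter(pub => pub.title.startsWith('Designing'))  // rows matching a condition
database.publications.map(pub => pub.publisher)  // one value per row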
Note that currently the Automerge.Table type does not enforce a schema; the list of columns is
given because it is useful metadata, but it doesn't actually change how rows are stored. It's
possible to have row objects that don't have values for all columns (e.g. in the example above,
the "edition" property is not set).
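Since rows reference each other by their Automerge-generated IDs, reading related data means looking those IDs up again. The following is a small sketch of such a lookup, assuming the document layout from the example above (an authors table and publication rows whose authors property is a list of author IDs). The helpers authorsOf and cite are hypothetical and rely only on byId() as described above.

```javascript
// Hypothetical helper: resolve the author IDs stored in a publication row
// into the corresponding rows of the authors table.
function authorsOf(database, publication) {
  return publication.authors.map(id => database.authors.byId(id))
}

// Hypothetical helper: format a publication together with its authors' names.
function cite(database, publication) {
  const names = authorsOf(database, publication)
    .map(author => `${author.forename} ${author.surname}`)
  return `${names.join(', ')}: ${publication.title} (${publication.year})`
}
```

With the database from the example above, cite(database, database.publications.rows[0]) would be one way to render the book entry.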
Sending and receiving changes
The Automerge library itself is agnostic to the network layer — that is, you can use whatever
communication mechanism you like to get changes from one node to another. There are currently
a few options, with more under development:
- Use Automerge.getChanges() and Automerge.applyChanges() to manually capture changes on one
node and apply them on another.
- Use Automerge.Connection, an implementation of a protocol that syncs up two nodes by determining
missing changes and sending them to each other.
- Use MPL, which runs the Automerge.Connection protocol over WebRTC.
The getChanges()/applyChanges() API works as follows:
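// On one node (network is a placeholder for your own transport layer)
newDoc = Automerge.change(currentDoc, doc => {
  // make arbitrary changes to the document
})
let changes = Automerge.getChanges(currentDoc, newDoc)
network.broadcast(JSON.stringify(changes))

// On another node
let changes = JSON.parse(network.receive())
newDoc = Automerge.applyChanges(currentDoc, changes)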
Note that Automerge.getChanges(oldDoc, newDoc) takes two documents as arguments: an old state
and a new state. It then returns a list of all the changes that were made in newDoc since
oldDoc. If you want a list of all the changes ever made in newDoc, you can call
Automerge.getChanges(Automerge.init(), newDoc).
The counterpart, Automerge.applyChanges(oldDoc, changes), applies the list of changes to the
given document, and returns a new document with those changes applied. Automerge guarantees that
whenever any two documents have applied the same set of changes (even if the changes were
applied in a different order), those two documents are equal. That property is called
convergence, and it is the essence of what Automerge is all about.
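The idea of convergence can be shown with a deliberately tiny model. The sketch below is NOT Automerge's algorithm; it is a toy in which every change carries a sequence number and an actor ID, and a deterministic rule decides which change wins for a key. All names (applyChange, applyAll, seq, actor) are made up for this illustration.

```javascript
// Toy model: a change sets one key and is tagged with a sequence number
// and the ID of the actor that made it.
function applyChange(state, change) {
  const current = state[change.key]
  // Deterministic rule: the change with the higher (seq, actor) pair wins,
  // so the order in which changes arrive does not affect the result.
  const wins = !current ||
    change.seq > current.seq ||
    (change.seq === current.seq && change.actor > current.actor)
  return wins ? { ...state, [change.key]: change } : state
}

function applyAll(changes) {
  return changes.reduce(applyChange, {})
}

const changes = [
  { key: 'x', value: 1, seq: 1, actor: 'actor-a' },
  { key: 'x', value: 2, seq: 1, actor: 'actor-b' },
  { key: 'y', value: 'hi', seq: 2, actor: 'actor-a' }
]

// Two nodes receive the same set of changes in opposite orders.
const nodeOne = applyAll(changes)
const nodeTwo = applyAll([...changes].reverse())

console.log(nodeOne.x.value === nodeTwo.x.value) // prints "true"
```

Both nodes end up with the same state because the winner depends only on the changes themselves, never on arrival order.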
Automerge.merge(doc1, doc2) is a related function that is useful for testing. It looks for any
changes that appear in doc2 but not in doc1, and applies them to doc1, returning an updated
version of doc1. This function requires that doc1 and doc2 have different actor IDs (that is,
they originated from different calls to Automerge.init()). See the Example Usage section above
for an example using Automerge.merge().
Conflicting changes
Automerge allows different nodes to independently make arbitrary changes to their respective copies
of a document. In most cases, those changes can be combined without any trouble. For example, if
users modify two different objects, or two different properties in the same object, then it is
straightforward to combine those changes.
If users concurrently insert or delete items in a list (or characters in a text document), Automerge
preserves all the insertions and deletions. If two users concurrently insert at the same position,
Automerge will arbitrarily place one of the insertions first and the other second, while ensuring
that the final order is the same on all nodes.
The only case Automerge cannot handle automatically, because there is no well-defined resolution,
is when users concurrently update the same property in the same object (or, similarly, the same
index in the same list). In this case, Automerge arbitrarily picks one of the concurrently written
values as the "winner":
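// Two devices independently assign different values to the same property.
let doc1 = Automerge.change(Automerge.init(), doc => { doc.x = 1 })
let doc2 = Automerge.change(Automerge.init(), doc => { doc.x = 2 })
doc1 = Automerge.merge(doc1, doc2)
doc2 = Automerge.merge(doc2, doc1)
// doc1.x is now either 1 or 2 (the choice is arbitrary), and doc2.x has the same value.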
Although only one of the concurrently written values shows up in the object, the other values are
not lost. They are merely relegated to a conflicts object:
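doc1  // holds the winning value, for example {x: 2}
doc2  // holds the same winning value as doc1
Automerge.getConflicts(doc1, 'x')  // the losing value, keyed by the actor ID that wrote it
Automerge.getConflicts(doc2, 'x')  // the same conflict information as for doc1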
Here, the conflicts object contains the property x, which matches the name of the property
on which the concurrent assignments happened. The nested key 0506162a-ac6e-4567-bc16-a12618b71940
is the actor ID that performed the assignment, and the associated value is the value it assigned
to the property x. You might use the information in the conflicts object to show the conflict
in the user interface.
The next time you assign to a conflicting property, the conflict is automatically considered to
be resolved, and the conflict disappears from the object returned by Automerge.getConflicts().
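One way to surface a conflict in the user interface is to list the current value next to the values that lost. The sketch below assumes that the conflict information for a property is a plain object mapping actor IDs to the values they wrote, as described above; the helper conflictChoices and the shape of its return value are hypothetical.

```javascript
// Hypothetical helper: build a list of candidate values for one property,
// starting with the value currently shown in the document.
// Assumption: `conflicts` maps actor ID -> value written by that actor,
// and is undefined or empty when there is no conflict.
function conflictChoices(currentValue, conflicts) {
  const choices = [{ actor: null, value: currentValue, current: true }]
  for (const [actor, value] of Object.entries(conflicts || {})) {
    choices.push({ actor, value, current: false })
  }
  return choices
}

// Example with hand-written data in the assumed shape.
const choices = conflictChoices(2, { '0506162a-ac6e-4567-bc16-a12618b71940': 1 })
console.log(choices.map(choice => choice.value)) // prints [ 2, 1 ]
```

When the user picks one of the choices, assigning it inside an Automerge.change() callback resolves the conflict, as noted above.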
Undo and Redo
Automerge makes it easy to support an undo/redo feature in your application. Note that undo is
a somewhat tricky concept in a collaborative application! Here, "undo" is taken as meaning "what the
user expects to happen when they hit Ctrl-Z/Cmd-Z". In particular, the undo feature undoes the most
recent change by the local user; it cannot currently be used to revert changes made by other
users.
Moreover, undo is not the same as jumping back to a previous version of a document; see
the next section on how to examine document history. Undo works by
applying the inverse operation of the local user's most recent change, and redo works by applying
the inverse of the inverse. Both undo and redo create new changes, so from other users' point of
view, an undo or redo looks the same as any other kind of change.
To check whether undo is currently available, use the function Automerge.canUndo(doc). It returns
true if the local user has made any changes since the document was created or loaded. You can then
call Automerge.undo(doc) to perform an undo. The functions canRedo() and redo() do the
inverse:
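let doc = Automerge.change(Automerge.init(), doc => { doc.birds = [] })
doc = Automerge.change(doc, doc => { doc.birds.push('blackbird') })
doc = Automerge.change(doc, doc => { doc.birds.push('robin') })
// now doc is {birds: ['blackbird', 'robin']}
Automerge.canUndo(doc)    // returns true
doc = Automerge.undo(doc) // now doc is {birds: ['blackbird']}
doc = Automerge.undo(doc) // now doc is {birds: []}
doc = Automerge.redo(doc) // now doc is {birds: ['blackbird']}
doc = Automerge.redo(doc) // now doc is {birds: ['blackbird', 'robin']}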
You can pass an optional message as the second argument to Automerge.undo(doc, message) and
Automerge.redo(doc, message). This string is used as a "commit message" that describes the
undo/redo change, and it appears in the change history (see next section).
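The statement that undo applies an inverse operation, and that undo and redo both create new changes, can be illustrated with a toy editor in plain JavaScript. This is a simplified model for intuition only, not Automerge's implementation; createEditor, invert and the change shape are invented for this sketch.

```javascript
// Toy model: a change records the value of one key before and after the
// edit, so that it can be inverted.
function invert(change) {
  return { key: change.key, before: change.after, after: change.before }
}

function createEditor() {
  let state = {}
  const log = []        // every change ever applied, including undos and redos
  const undoStack = []  // local changes that can still be undone
  const redoStack = []  // undo changes that can still be redone

  function commit(change) {
    state = { ...state, [change.key]: change.after }
    log.push(change)
  }

  return {
    getState: () => state,
    getLog: () => log,
    canUndo: () => undoStack.length > 0,
    set(key, value) {
      const change = { key, before: state[key], after: value }
      commit(change)
      undoStack.push(change)
      redoStack.length = 0  // in this toy, a fresh edit discards the redo history
    },
    undo() {
      if (undoStack.length === 0) return
      const inverse = invert(undoStack.pop())
      commit(inverse)
      redoStack.push(inverse)
    },
    redo() {
      if (redoStack.length === 0) return
      const inverse = invert(redoStack.pop())  // the inverse of the inverse
      commit(inverse)
      undoStack.push(inverse)
    }
  }
}

const editor = createEditor()
editor.set('bird', 'blackbird')
editor.set('bird', 'robin')
editor.undo()
const afterUndo = editor.getState().bird
editor.redo()
const afterRedo = editor.getState().bird
console.log(afterUndo, afterRedo, editor.getLog().length) // prints "blackbird robin 4"
```

Note that the log grows with every undo and redo: other users would see them as ordinary changes, which matches the description above.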
Examining document history
An Automerge document internally saves a complete history of all the changes that were ever made
to it. This enables a nice feature: looking at the document state at past points in time, a.k.a.
time travel!
Automerge.getHistory(doc) returns a list of all edits made to a document. Each edit is an object
with two properties: change is the internal representation of the change (in the same form as
Automerge.getChanges() returns), and snapshot is the state of the document at the moment just
after that change had been applied.
Automerge.getHistory(doc2)
Within the change object, the property message is set to the free-form "commit message" that
was passed in as the second argument to Automerge.change() (if any). The rest of the change object
is specific to Automerge implementation details, and normally shouldn't need to be interpreted.
If you want to find out what actually changed in a particular edit, rather than inspecting the
change object, it is better to use Automerge.diff(oldDoc, newDoc). This function returns a list
of edits that were made in document newDoc since its prior version oldDoc. You can pass in
snapshots returned by Automerge.getHistory() in order to determine differences between historic
versions.
The data returned by Automerge.diff() has the following form:
let history = Automerge.getHistory(doc2)
Automerge.diff(history[2].snapshot, doc2)
In the objects returned by Automerge.diff(), obj indicates the object ID of the object being
edited (the same as returned by Automerge.getObjectId()), and type indicates whether that object
is a map, list, or text.
The available values for action depend on the type of object. For type: 'map', the possible
actions are:
- action: 'set': Then the property key is the name of the property being updated. If the value
assigned to the property is a primitive (string, number, boolean, null), then value contains
that value. If the assigned value is an object (map, list, or text), then value contains the
ID of that object, and additionally the property link: true is set. Moreover, if this
assignment caused conflicts, then the conflicting values are additionally contained in a
conflicts property.
- action: 'remove': Then the property key is the name of the property being removed.
For type: 'list' and type: 'text', the possible actions are:
- action: 'insert': Then the property index contains the list index at which a new element is
being inserted, and value contains the value inserted there. If the inserted value is an
object, the value property contains its ID, and the property link: true is set.
- action: 'set': Then the property index contains the list index to which a new value is being
assigned, and value contains that value. If the assigned value is an object, the value
property contains its ID, and the property link: true is set.
- action: 'remove': Then the property index contains the list index that is being removed from
the list.
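The field descriptions above are enough to write a small interpreter for diff entries. The following sketch turns one entry into a human-readable string; describeEdit is a hypothetical helper, and the sample entries at the end are hand-written to match the fields described above rather than copied from real Automerge.diff() output.

```javascript
// Hypothetical helper: describe one diff entry using only the fields
// documented above (obj, type, action, key, index, value, link).
function describeEdit(edit) {
  // When link is true, value holds the ID of a nested object.
  const shown = edit.link ? `object ${edit.value}` : JSON.stringify(edit.value)
  if (edit.type === 'map') {
    if (edit.action === 'set') return `set ${edit.key} to ${shown}`
    if (edit.action === 'remove') return `removed ${edit.key}`
  } else if (edit.type === 'list' || edit.type === 'text') {
    if (edit.action === 'insert') return `inserted ${shown} at index ${edit.index}`
    if (edit.action === 'set') return `set index ${edit.index} to ${shown}`
    if (edit.action === 'remove') return `removed index ${edit.index}`
  }
  return `unrecognized edit on ${edit.obj}`
}

// Hand-written sample entries (object IDs are placeholders).
const sampleEdits = [
  { obj: 'obj-1', type: 'map', action: 'set', key: 'title', value: 'Hello' },
  { obj: 'obj-2', type: 'list', action: 'insert', index: 0, value: 'obj-3', link: true },
  { obj: 'obj-2', type: 'list', action: 'remove', index: 1 }
]
const descriptions = sampleEdits.map(describeEdit)
console.log(descriptions.join('\n'))
```

In an application, the same function could be mapped over the list returned by Automerge.diff(oldDoc, newDoc) to build a change summary.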
Caveats
The project currently has a number of limitations that you should be aware of:
- No integrity checking: if a buggy (or malicious) device makes corrupted edits, it can cause
the application state on other devices to become corrupted or go out of sync.
- No security: there is currently no encryption, authentication, or access control.
- Small number of collaborators: Automerge is designed for small-group collaborations. While there
is no hard limit on the number of devices that can update a document, performance will degrade
if you go beyond, say, 100 devices or so.
- ...and more, see the open issues.
Meta
Copyright 2017, Ink & Switch LLC, and University of Cambridge.
Released under the terms of the MIT license (see LICENSE).
Created by
Martin Kleppmann,
Orion Henry,
Peter van Hardenberg,
Roshan Choxi, and
Adam Wiggins.