@fluidframework/sequence
Every item in a SharedSegmentSequence is at a specific position starting at 0, kind of like an array. However, it differs from an array in that the positions can move as local and remote collaborators make modifications to the sequence. There are a number of different sequence types:
- SharedString for storing and collaborating on a sequence of text
- SharedNumberSequence for storing and collaborating on a sequence of numbers
- SharedObjectSequence for storing and collaborating on a sequence of JSON serializable objects
As the name suggests, a SharedSegmentSequence, or sequence for short, is made of segments. Segments are the leaf nodes of the tree data structure that enables collaboration and backs the sequence. Segments may be split and merged as modifications are made to the sequence. Every segment has a length between 1 and the length of the sequence. The length of the sequence is the combined length of all its segments.
When talking about positions in a sequence we use the terms near and far. The nearest position in a sequence is 0, and the farthest position is its length. When comparing two positions, the nearer position is closer to 0 and the farther position is closer to the length.
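To make the relationship between segments and sequence length concrete, here is a minimal sketch. It is a toy model and not the Fluid implementation: `ToySegment`, `sequenceLength`, and `splitAt` are hypothetical names, and a flat array stands in for the real tree.

```typescript
// Toy model only: a flat array of text segments stands in for the real tree.
interface ToySegment {
    text: string;
}

// The sequence length is the combined length of all segments.
function sequenceLength(segments: ToySegment[]): number {
    return segments.reduce((total, seg) => total + seg.text.length, 0);
}

// Split the segment containing `pos` so that a segment boundary falls at `pos`.
// Splitting changes the number of segments but never the sequence length.
function splitAt(segments: ToySegment[], pos: number): ToySegment[] {
    const result: ToySegment[] = [];
    let start = 0; // position of the current segment's first item
    for (const seg of segments) {
        const offset = pos - start;
        if (offset > 0 && offset < seg.text.length) {
            result.push({ text: seg.text.slice(0, offset) });
            result.push({ text: seg.text.slice(offset) });
        } else {
            result.push(seg);
        }
        start += seg.text.length;
    }
    return result;
}

const beforeSplit: ToySegment[] = [{ text: "hello" }, { text: " world" }];
const afterSplit = splitAt(beforeSplit, 2);
// afterSplit holds "he", "llo", " world"; the total length is still 11.
```

The point of the sketch is the invariant: positions address items, not segments, so splitting or merging segments is invisible to callers that only work with positions.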
Using a Sequence
Sequences support three basic operations: insert, remove, and annotate.
Insert operations on the sequence take a single position argument along with the content. This position is inclusive and can be any position in the sequence, including 0 and the length of the sequence.
// Assuming sharedString starts out empty:
sharedString.insertText(0, "hi"); // content: "hi"
sharedString.insertText(sharedString.getLength(), "!"); // content: "hi!"
sharedString.insertText(2, " world"); // content: "hi world!"
Remove operations take a start and an end position. The start position is similar to the insert’s position, in that it can be any position in the sequence and is inclusive. However, unlike insert, the start position cannot be the length of the sequence, as nothing exists there yet. The end position is exclusive and must be greater than the start, so it can be any value from 1 to the length of the sequence.
// Continuing from the content "hi world!":
sharedString.removeRange(0, 3); // content: "world!"
sharedString.removeRange(0, sharedString.getLength()); // content: ""
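The position rules for insert and remove can be summarized in a small sketch. This is a toy model over a plain string, not the SharedString implementation; `toyInsert` and `toyRemove` are hypothetical helpers that only encode the bounds described above.

```typescript
// Toy model: a plain string stands in for the sequence content.
function toyInsert(content: string, pos: number, text: string): string {
    // Insert accepts any position from 0 up to and including the length.
    if (pos < 0 || pos > content.length) {
        throw new RangeError(`insert position ${pos} is out of range`);
    }
    return content.slice(0, pos) + text + content.slice(pos);
}

function toyRemove(content: string, start: number, end: number): string {
    // Start is inclusive and must be less than the length;
    // end is exclusive, greater than start, and at most the length.
    if (start < 0 || start >= content.length || end <= start || end > content.length) {
        throw new RangeError(`remove range [${start}, ${end}) is out of range`);
    }
    return content.slice(0, start) + content.slice(end);
}

let current = "";
current = toyInsert(current, 0, "hi");              // "hi"
current = toyInsert(current, current.length, "!");  // "hi!"
current = toyInsert(current, 2, " world");          // "hi world!"
const afterRemove = toyRemove(current, 0, 3);       // "world!"
const emptied = toyRemove(afterRemove, 0, afterRemove.length); // ""
```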
Annotate operations can add or remove map-like properties to or from content of the sequence. They can store any JSON serializable data and behave similarly to a shared map. Annotate takes start and end positions, which work the same way as the start and end of the remove operation, along with a map-like properties object. Each key of the provided properties object will be set on each position of the specified range. Setting a property key to null will remove that property from the positions in the range.
// Assuming sharedString holds at least six characters and no properties are set yet:
let props1 = sharedString.getPropertiesAtPosition(1);
let props5 = sharedString.getPropertiesAtPosition(5);
// Neither position has a weight or decoration property.
sharedString.annotateRange(0, 2, { weight: 5 });
props1 = sharedString.getPropertiesAtPosition(1);
props5 = sharedString.getPropertiesAtPosition(5);
// props1 now has weight: 5; props5 is unchanged because position 5 is outside [0, 2).
sharedString.annotateRange(0, sharedString.getLength(), { decoration: "underline" });
props1 = sharedString.getPropertiesAtPosition(1);
props5 = sharedString.getPropertiesAtPosition(5);
// props1 has weight: 5 and decoration: "underline"; props5 has decoration: "underline".
sharedString.annotateRange(0, sharedString.getLength(), { weight: null });
props1 = sharedString.getPropertiesAtPosition(1);
props5 = sharedString.getPropertiesAtPosition(5);
// weight is removed everywhere; props1 and props5 both keep decoration: "underline".
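The annotate semantics, including removal via null, can be sketched with one property object per position. This is a toy model under that simplifying assumption; `ToyProps` and `toyAnnotate` are hypothetical names and do not reflect how the sequence stores properties internally.

```typescript
type ToyProps = Record<string, unknown>;

// Toy model: apply `props` to every position in [start, end).
function toyAnnotate(perPosition: ToyProps[], start: number, end: number, props: ToyProps): void {
    for (let pos = start; pos < end; pos++) {
        for (const key of Object.keys(props)) {
            if (props[key] === null) {
                delete perPosition[pos][key]; // null removes the key
            } else {
                perPosition[pos][key] = props[key];
            }
        }
    }
}

// Six positions, none with properties yet.
const perPosition: ToyProps[] = [];
for (let i = 0; i < 6; i++) {
    perPosition.push({});
}

toyAnnotate(perPosition, 0, 2, { weight: 5 });
toyAnnotate(perPosition, 0, perPosition.length, { decoration: "underline" });
toyAnnotate(perPosition, 0, perPosition.length, { weight: null });
// perPosition[1] and perPosition[5] both end up as { decoration: "underline" }.
```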
Whenever an operation is performed on a sequence, a sequenceDelta event will be raised. This event provides the ranges affected by the operation, the type of the operation, and the properties that were changed by the operation.
How Collaboration Works
Like other data structures, sequences are eventually consistent, which means all collaborators will end up in the same final state. However, the intermediate states seen by one collaborator may not be seen by the others. These intermediate states occur when two or more collaborators modify the same position in the sequence, which results in a conflict.
The basic strategy for insert conflict resolution in the sequence is to merge far. This strategy depends on a fundamental property of the Fluid Framework, which is guaranteed ordering. So, if two or more collaborators perform an operation on a sequence, the operations will be given an ordering and all clients will see those operations in the same order. What this means for the merge far strategy for resolving conflicting inserts is that the first operation will be placed in the conflicting position when it is received. When the next insert with the same position arrives and is applied, it will be placed at the specified position, and the position of the previous insert's content will be increased by the length of the incoming content, pushing it farther towards the length of the sequence. This is what we call merging far.
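The final state produced by merging far can be sketched as follows. This is a toy model that only covers concurrent inserts at the same position, applied in their sequenced order; `ToyInsertOp` and `applySequenced` are hypothetical names, and the real merge logic handles many more cases.

```typescript
interface ToyInsertOp {
    client: string;
    pos: number;
    text: string;
}

// Apply conflicting same-position inserts in the order they were sequenced.
function applySequenced(base: string, ops: ToyInsertOp[]): string {
    let content = base;
    for (const op of ops) {
        // Each insert lands at the conflicting position; content from earlier
        // inserts at that position is pushed farther by op.text.length.
        content = content.slice(0, op.pos) + op.text + content.slice(op.pos);
    }
    return content;
}

// Clients A and B both insert at position 2; A's op is ordered first.
const sequencedOps: ToyInsertOp[] = [
    { client: "A", pos: 2, text: "XX" },
    { client: "B", pos: 2, text: "YY" },
];
const mergedContent = applySequenced("abcd", sequencedOps);
// mergedContent is "abYYXXcd": B's text sits at position 2 and A's text moved to position 4.
```

Because every client applies the same ops in the same order, every client arrives at the same merged content.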
Like insert, the strategies for remove and annotate also rely on guaranteed ordering. For remove and annotate, only content visible to the collaborator creating the operation will be modified; any content ordered after the operation won’t be.
For remove, this means an insert and a remove never happen at the same time: they are given an order, and all collaborators see the operations in that order. We also detect overlapping removes made by different collaborators. The resolution here is straightforward: the content is removed.
As mentioned above, annotate operations behave like operations on Shared Maps. The merge strategy here is last one wins: if two collaborators set the same key in an annotate's properties, the operation that gets ordered last determines the value.
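A last-one-wins merge for annotates on the same range can be sketched like this. It is a toy model: `SequencedAnnotate`, its `seq` field, and `resolveAnnotates` are hypothetical, and removal via null is left out to keep the example short.

```typescript
type AnnotateProps = Record<string, unknown>;

interface SequencedAnnotate {
    seq: number;      // order assigned by the service (hypothetical field)
    client: string;
    props: AnnotateProps;
}

// Resolve conflicting annotates: for each key, the op ordered last wins.
function resolveAnnotates(ops: SequencedAnnotate[]): AnnotateProps {
    const ordered = ops.slice().sort((a, b) => a.seq - b.seq);
    const result: AnnotateProps = {};
    for (const op of ordered) {
        for (const key of Object.keys(op.props)) {
            result[key] = op.props[key]; // a later op overwrites an earlier one
        }
    }
    return result;
}

const resolvedProps = resolveAnnotates([
    { seq: 2, client: "B", props: { color: "blue" } },
    { seq: 1, client: "A", props: { color: "red", weight: 5 } },
]);
// resolvedProps is { color: "blue", weight: 5 }: B's color wins, A's weight is untouched.
```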
Shared String
The Shared String is a specialized data structure for handling collaborative text. It is based on a more general Sequence data structure but has additional features that make working with text easier.
In addition to text, a Shared String can also contain markers. Markers can be used to store metadata at positions within the text, like the details of an image or Fluid object that should be rendered with the text.
Both markers and text are stored as segments in the Shared String. Text segments will be split and merged when modifications are made to the Shared String and will therefore have variable length matching the length of the text content they contain. Marker segments are never split or merged, and always have a length of 1.
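The length rule for text and marker segments can be illustrated with a small discriminated union. This is a toy model; `ToyStringSegment`, `segmentLength`, and `totalLength` are hypothetical names, and the marker properties shown are made up for the example.

```typescript
// Toy model of the two segment kinds in a Shared String.
type ToyStringSegment =
    | { kind: "text"; text: string }
    | { kind: "marker"; props: Record<string, unknown> };

// Text segments are as long as their text; markers always have length 1.
function segmentLength(seg: ToyStringSegment): number {
    return seg.kind === "text" ? seg.text.length : 1;
}

function totalLength(segs: ToyStringSegment[]): number {
    let total = 0;
    for (const seg of segs) {
        total += segmentLength(seg);
    }
    return total;
}

const toyDoc: ToyStringSegment[] = [
    { kind: "text", text: "See " },
    { kind: "marker", props: { type: "image", alt: "diagram" } },
    { kind: "text", text: " above" },
];
const toyDocLength = totalLength(toyDoc); // 4 + 1 + 6 = 11
```

Since a marker occupies exactly one position, position 4 in this toy document addresses the marker and position 5 addresses the first character after it.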
Examples
- Rich Text Editor Implementations
- Integrations with Open Source Rich Text Editors
- Plain Text Editor Implementations
Sparse Matrix
The Sparse Matrix is a specialized data structure for efficiently handling collaborative tabular data. It works in a similar fashion to raster scanning. When a row is inserted, it is inserted with the maximum possible number of columns, 16,385. This makes it easy to find any cell in the Sparse Matrix, as it will exist at Row * MaxCol + Col. To store this efficiently, the Sparse Matrix doesn't materialize cells that don't have data; this is where Sparse comes from.
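The Row * MaxCol + Col mapping can be written out as a pair of helper functions. This is a sketch, not the Sparse Matrix API: `MAX_COLS`, `cellPosition`, and `positionToCell` are hypothetical names, and the constant simply reuses the column count quoted above.

```typescript
// Assumption: the per-row column count quoted in the text above.
const MAX_COLS = 16385;

// Map a (row, col) pair to its position in the underlying sequence.
function cellPosition(row: number, col: number): number {
    if (row < 0 || col < 0 || col >= MAX_COLS) {
        throw new RangeError(`cell (${row}, ${col}) is out of range`);
    }
    return row * MAX_COLS + col;
}

// Inverse mapping: recover (row, col) from a sequence position.
function positionToCell(pos: number): { row: number; col: number } {
    return { row: Math.floor(pos / MAX_COLS), col: pos % MAX_COLS };
}

const examplePosition = cellPosition(1, 4); // 1 * 16385 + 4 = 16389
const exampleCell = positionToCell(examplePosition); // { row: 1, col: 4 }
```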
Just like any other sequence, the Sparse Matrix is made of segments. The segment types are RunSegments and PaddingSegments. RunSegments contain the data for cells that have data, and PaddingSegments fill the spaces that have no data. PaddingSegments just contain how long they are, and this is how the Sparse Matrix efficiently stores all the rows with the max number of columns. For instance, if we had a Matrix with 2 rows, and each row only contained data in a couple of columns, its serialized form would look something like this:
[
    {
        "items": ["Value in row 0 cell 0", "Value in row 0 cell 1"],
        "length": 2
    },
    {
        "length": 16383
    },
    {
        "items": ["Value in row 1 cell 0"],
        "length": 1
    },
    {
        "length": 3
    },
    {
        "items": ["Value in row 1 cell 4"],
        "length": 1
    },
    {
        "length": 16380
    }
]
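Reading a cell back out of such a list of run and padding segments can be sketched as a single walk over the segments. This is a toy model over the serialized shape shown above, not the Sparse Matrix implementation; `ToyMatrixSegment` and `getCell` are hypothetical names.

```typescript
// A run segment carries `items`; a padding segment carries only `length`.
interface ToyMatrixSegment {
    items?: string[];
    length: number;
}

const serializedRows: ToyMatrixSegment[] = [
    { items: ["Value in row 0 cell 0", "Value in row 0 cell 1"], length: 2 },
    { length: 16383 },
    { items: ["Value in row 1 cell 0"], length: 1 },
    { length: 3 },
    { items: ["Value in row 1 cell 4"], length: 1 },
    { length: 16380 },
];

// Walk the segments until the one containing the cell's position is found.
function getCell(
    segments: ToyMatrixSegment[],
    row: number,
    col: number,
    maxCols: number,
): string | undefined {
    let remaining = row * maxCols + col;
    for (const seg of segments) {
        if (remaining < seg.length) {
            // Padding segments have no items, so empty cells read as undefined.
            return seg.items ? seg.items[remaining] : undefined;
        }
        remaining -= seg.length;
    }
    return undefined; // position lies beyond the stored rows
}

const filledCell = getCell(serializedRows, 1, 4, 16385); // "Value in row 1 cell 4"
const emptyCell = getCell(serializedRows, 1, 2, 16385);  // undefined (inside padding)
```

The six segment lengths add up to 32,770, which is two full rows of 16,385 columns, even though only four values are actually stored.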
SharedObjectSequence and SharedNumberSequence
SharedObjectSequence and SharedNumberSequence are very similar distributed data structures. The only difference is the type of content they support. SharedNumberSequence only supports numbers as content, while SharedObjectSequence supports any JSON serializable object. Both DDSes support inserting, removing, and annotating content. Each piece of content -- that is, each number or object -- will occupy a single position in the sequence. The length of the sequence is the count of content items in the sequence.
An important note is that, unlike an array, positions are not guaranteed to remain constant. The position of an item can change as content is added to or removed from the sequence. To track or pass a reference to a specific piece of content within the sequence, you should find its segment via segment = s.getContainingSegment(position) and then use pos = s.getPosition(segment) to get its current position in the sequence.
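Why holding a reference is more robust than holding a position can be seen in a toy model. Here a plain array and `indexOf` stand in for the sequence and for the getContainingSegment / getPosition pair; `ToyItem` and the variable names are hypothetical, and none of this is the Fluid API.

```typescript
interface ToyItem {
    value: number;
}

const toyItems: ToyItem[] = [{ value: 10 }, { value: 20 }, { value: 30 }];

// Keep a reference to the item itself instead of remembering its position.
const trackedItem = toyItems[1];
const positionBefore = toyItems.indexOf(trackedItem); // 1

// Another collaborator inserts at position 0, shifting everything farther.
toyItems.splice(0, 0, { value: 5 });

// The remembered position 1 now points at a different item,
// but the reference still resolves to the item's current position.
const positionAfter = toyItems.indexOf(trackedItem); // 2
const staleLookup = toyItems[positionBefore];        // the item with value 10
```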