datocms-contentful-to-structured-text
This package contains utilities to convert Contentful Rich Text to a DatoCMS Structured Text dast (DatoCMS Abstract Syntax Tree) document.
Please refer to the dast format docs to learn more about the syntax tree format and the available nodes.
Usage
The main utility in this package is richTextToStructuredText, which takes a Rich Text JSON document and transforms it into a valid dast document.
richTextToStructuredText returns a Promise that resolves with a Structured Text document.
import { richTextToStructuredText } from 'datocms-contentful-to-structured-text';

const richText = {
  nodeType: 'document',
  data: {},
  content: [
    {
      nodeType: 'heading-1',
      content: [
        {
          nodeType: 'text',
          value: 'Lorem ipsum dolor sit amet',
          marks: [],
          data: {},
        },
      ],
      data: {},
    },
  ],
};

richTextToStructuredText(richText).then((structuredText) => {
  console.log(structuredText);
});
Validate dast documents
dast is a strict format for DatoCMS' Structured Text fields and follows a different pattern from the Contentful Rich Text structure.
The datocms-structured-text-utils package provides a validate utility that checks whether Structured Text content is compatible with DatoCMS' Structured Text field.
import { validate } from 'datocms-structured-text-utils';

richTextToStructuredText(richText).then((structuredText) => {
  const { valid, message } = validate(structuredText);
  if (!valid) {
    throw new Error(message);
  }
});
We recommend validating every dast document to avoid errors later when creating records.
Advanced Usage
Options
All the *ToStructuredText utils accept an optional options object as the second argument:
type Options = Partial<{
  handlers: Record<string, CreateNodeFunction>,
  allowedBlocks: Array<
    BlockquoteType | CodeType | HeadingType | LinkType | ListType,
  >,
  allowedMarks: Mark[],
}>;
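As a rough illustration of the shape above, an options object that restricts the output could look like the following. The values assume the dast node type names (heading, list, link) and mark names (strong, emphasis) from the dast format; which ones to allow is purely illustrative.

```javascript
// Hypothetical options: keep only headings, lists and links as blocks,
// and only bold/italic marks. How nodes outside these lists are treated
// is up to the library and is not shown here.
const options = {
  allowedBlocks: ['heading', 'list', 'link'],
  allowedMarks: ['strong', 'emphasis'],
};

// Passed as the second argument, for example:
// richTextToStructuredText(richText, options).then((dast) => { ... });
```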
Transforming Nodes
The utils in this library traverse a Contentful Rich Text tree and transform supported nodes to dast nodes. The transformation is done by working on a Contentful Rich Text node with a handler (async) function.
Handlers are associated with Contentful Rich Text nodes by nodeType and look as follows:
import { visitChildren } from 'datocms-contentful-to-structured-text';

async function p(createDastNode, contentfulNode, context) {
  return createDastNode('paragraph', {
    children: await visitChildren(createDastNode, contentfulNode, context),
  });
}
Handlers can return a promise that resolves to a dast node, to an array of dast nodes, or to undefined to skip the current node.
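To make the three return shapes concrete, here is a minimal sketch of how a visitor might normalise handler results into a flat list of children. The collectChildren helper and sampleHandler are hypothetical and not part of the package; the library's own visitChildren may work differently.

```javascript
// Hypothetical helper: runs a handler on each child and flattens the results.
async function collectChildren(children, handler) {
  const result = [];
  for (const child of children) {
    const output = await handler(child);
    if (output === undefined) continue; // handler chose to skip this node
    if (Array.isArray(output)) {
      result.push(...output); // handler produced several dast nodes
    } else {
      result.push(output); // handler produced a single dast node
    }
  }
  return result;
}

// Illustrative handler that exercises all three cases.
async function sampleHandler(node) {
  if (node.nodeType === 'hr') return undefined;
  if (node.nodeType === 'text') {
    // One span per line of text
    return node.value.split('\n').map((value) => ({ type: 'span', value }));
  }
  return { type: 'paragraph', children: [] };
}
```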
To ensure that a valid dast is generated, the default handlers also check that the current contentfulNode is a valid dast node for its parent and, if not, they ignore the current node and continue visiting its children.
Information about the parent dast node name is available in context.parentNodeType.
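The parent check described above can be sketched roughly as follows. The allowedChildren table, the createNode and visitChildren stand-ins, and the heading handler are simplified assumptions for illustration; the package's real default handlers and the full dast rules are more complete.

```javascript
// Assumed, abbreviated table of which dast children each parent accepts.
const allowedChildren = {
  root: ['paragraph', 'heading', 'list', 'blockquote', 'code', 'block'],
  listItem: ['paragraph', 'list'],
};

// Stand-in for the library's createNode: builds a plain dast node object.
const createNode = (type, props) => ({ type, ...props });

// Stand-in for visitChildren: only converts text children to spans.
async function visitChildren(createNode, node, context) {
  return (node.content || [])
    .filter((child) => child.nodeType === 'text')
    .map((child) => createNode('span', { value: child.value }));
}

// Emit a heading only where the parent allows one; otherwise ignore the
// heading itself and keep visiting its children.
async function heading(createNode, node, context) {
  const allowed = allowedChildren[context.parentNodeType] || [];
  if (!allowed.includes('heading')) {
    return visitChildren(createNode, node, context);
  }
  const children = await visitChildren(createNode, node, {
    ...context,
    parentNodeType: 'heading',
  });
  return createNode('heading', { level: 1, children });
}
```

Called with parentNodeType set to root, this sketch returns a heading node; with listItem it returns only the converted children.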
Please take a look at the default handlers implementation for examples.
The default handlers are available on context.defaultHandlers.
Context
Every handler receives a context object containing the following information:
export interface Context {
  parentNodeType: NodeType;
  parentNode: ContentfulNode;
  handlers: Record<string, Handler<unknown>>;
  defaultHandlers: Record<string, Handler<unknown>>;
  marks?: Mark[];
  allowedBlocks: Array<
    BlockquoteType | CodeType | HeadingType | LinkType | ListType,
  >;
  allowedMarks: Mark[];
}
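Because the context exposes defaultHandlers, a custom handler can in principle delegate to the default one and adjust its output. The sketch below assumes that defaultHandlers is keyed by Contentful nodeType and that handlers take (createNode, node, context); the stubContext and createNode used here are invented so that the snippet runs on its own.

```javascript
// Custom handler: let the default handler convert the node, then demote
// level-1 headings to level 2 (for instance when h1 is reserved for the title).
async function demotedHeading(createNode, node, context) {
  const fallback = context.defaultHandlers[node.nodeType];
  if (!fallback) return undefined; // nothing to delegate to: skip the node
  const result = await fallback(createNode, node, context);
  if (result && !Array.isArray(result) && result.type === 'heading' && result.level === 1) {
    return { ...result, level: 2 };
  }
  return result;
}

// Invented stand-ins, not the library's real implementations.
const createNode = (type, props) => ({ type, ...props });
const stubContext = {
  parentNodeType: 'root',
  defaultHandlers: {
    'heading-1': async (createNode, node) =>
      createNode('heading', {
        level: 1,
        children: node.content.map((child) =>
          createNode('span', { value: child.value }),
        ),
      }),
  },
};
```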
Custom Handlers
It is possible to register custom handlers and override the default behaviour via options, using the makeHandler function.
For example, to create a custom handler for the Contentful text element, pass makeHandler a guard clause that narrows the node to the correct type, followed by the handler itself:
import { makeHandler } from 'datocms-contentful-to-structured-text';
import type { Text } from '@contentful/rich-text-types';

const customTextHandler = makeHandler(
  // Guard clause: only handle Contentful text nodes
  (node): node is Text => node.nodeType === 'text',
  // Handler: emit the text twice, as two span nodes
  async (node) => {
    return [
      { type: 'span', value: node.value },
      { type: 'span', value: node.value },
    ];
  },
);

richTextToStructuredText(richText, {
  handlers: [
    customTextHandler,
  ],
}).then((structuredText) => {
  console.log(structuredText);
});
import { paragraphHandler } from './customHandlers';

richTextToStructuredText(richText, {
  handlers: {
    paragraph: paragraphHandler,
  },
}).then((structuredText) => {
  console.log(structuredText);
});
It is highly encouraged to validate the dast when using custom handlers, because handlers are responsible for dictating valid parent-children relationships and therefore for generating a tree that is compliant with DatoCMS Structured Text.
Preprocessing Rich Text
Because of the strictness of the dast spec, it is possible that some elements might be lost during transformation.
To improve the final result, you might want to modify the Rich Text tree before it is transformed to dast.
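A minimal sketch of such a preprocessing step, assuming you want to drop whitespace-only paragraphs before conversion. The dropEmptyParagraphs and isEmptyParagraph helpers are hypothetical and not part of the package; they return a cleaned copy and leave the input tree untouched.

```javascript
// Returns a copy of a Rich Text node with whitespace-only paragraphs removed.
function dropEmptyParagraphs(node) {
  if (!Array.isArray(node.content)) return node;
  const content = node.content
    .filter((child) => !isEmptyParagraph(child))
    .map((child) => dropEmptyParagraphs(child));
  return { ...node, content };
}

// A paragraph counts as empty when all of its children are blank text nodes.
function isEmptyParagraph(node) {
  return (
    node.nodeType === 'paragraph' &&
    (node.content || []).every(
      (child) => child.nodeType === 'text' && child.value.trim() === '',
    )
  );
}
```

The cleaned tree can then be passed to richTextToStructuredText in place of the original.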
Examples
Split a node that contains an image.
In dast, images can only be presented as Block nodes, but blocks are not allowed inside ListItem nodes (unordered-list/ordered-list). In this example we will split the original unordered-list into one list, the lifted-up image block, and another list.
import { liftAssets } from 'datocms-contentful-to-structured-text';

const richTextWithAssets = {
  nodeType: 'document',
  data: {},
  content: [
    {
      nodeType: 'unordered-list',
      content: [
        {
          nodeType: 'list-item',
          content: [
            {
              nodeType: 'paragraph',
              content: [
                {
                  nodeType: 'text',
                  value: 'text',
                  marks: [],
                  data: {},
                },
              ],
              data: {},
            },
            {
              content: [],
              data: {
                target: {
                  sys: {
                    id: 'zzz',
                    linkType: 'Asset',
                    type: 'Link',
                  },
                },
              },
              nodeType: 'embedded-asset-block',
            },
            {
              nodeType: 'paragraph',
              content: [
                {
                  nodeType: 'text',
                  value: 'text',
                  marks: [],
                  data: {},
                },
              ],
              data: {},
            },
          ],
          data: {},
        },
      ],
      data: {},
    },
  ],
};

liftAssets(richTextWithAssets);

const handlers = {
  'embedded-asset-block': async (createNode, node, context) => {
    const item = '123';
    return createNode('block', {
      item,
    });
  },
};

const dast = await richTextToStructuredText(richTextWithAssets, { handlers });
The liftAssets function transforms the richText tree and moves the embedded-asset-block to the root, splitting the list in two parts.
function liftAssets(richText) {
  // Depth-first traversal: children are visited before the callback runs on the node itself
  const visit = (node, cb, index = 0, parents = []) => {
    if (node.content && node.content.length > 0) {
      node.content.forEach((child, index) => {
        visit(child, cb, index, [...parents, node]);
      });
    }
    cb(node, index, parents);
  };

  const liftedImages = new WeakSet();

  visit(richText, (node, index, parents) => {
    // Only handle asset blocks that are not yet lifted and not already at the root
    if (
      !node ||
      node.nodeType !== 'embedded-asset-block' ||
      liftedImages.has(node) ||
      parents.length === 1
    ) {
      return;
    }

    // Detach the asset from its current parent
    const imgParent = parents[parents.length - 1];
    imgParent.content.splice(index, 1);

    let i = parents.length;
    let splitChildrenIndex = index;
    // Reassigned on every iteration, so it must be declared with let
    let contentAfterSplitPoint = [];

    // Walk up the ancestors, splitting each one where the asset used to be
    while (--i > 0) {
      const parent = parents[i];
      const parentsParent = parents[i - 1];

      contentAfterSplitPoint = parent.content.splice(splitChildrenIndex);
      splitChildrenIndex = parentsParent.content.indexOf(parent);

      let nodeInserted = false;
      // Directly under the root: insert the asset right after the first half
      if (i === 1) {
        splitChildrenIndex += 1;
        parentsParent.content.splice(splitChildrenIndex, 0, node);
        liftedImages.add(node);
        nodeInserted = true;
      }

      splitChildrenIndex += 1;

      // Re-attach the content that followed the asset as a sibling copy of the parent
      if (contentAfterSplitPoint.length > 0) {
        parentsParent.content.splice(splitChildrenIndex, 0, {
          ...parent,
          content: contentAfterSplitPoint,
        });
      }

      // Remove the original parent if the split left it empty
      if (parent.content.length === 0) {
        splitChildrenIndex -= 1;
        parentsParent.content.splice(
          nodeInserted ? splitChildrenIndex - 1 : splitChildrenIndex,
          1,
        );
      }
    }
  });
}
License
MIT