unified
unified recently changed its interface. These changes have yet to
bubble through to other processors, so not all examples work yet.
unified is an interface for processing text using syntax trees.
It’s what powers remark, retext, and
others, but it also allows for processing between multiple syntaxes.
Installation
npm:
npm install unified
unified is also available as an AMD, CommonJS, and globals module,
uncompressed and compressed.
Usage
var unified = require('unified');
var markdown = require('remark-parse');
var lint = require('remark-lint');
var html = require('remark-html');
process.stdin
.pipe(unified())
.use(markdown)
.use(lint)
.use(html)
.pipe(process.stdout);
Table of Contents
- Description
- API
  - processor()
  - processor.use(plugin[, options])
  - processor.parse(file|value[, options])
  - processor.stringify(node[, file|value][, options])
  - processor.run(node[, file|value][, done])
  - processor.process(file|value[, options][, done])
  - processor.write(chunk[, encoding][, callback])
  - processor.end()
  - processor.pipe(stream[, options])
  - processor.data(key[, value])
  - processor.abstract()
- License
Description
unified is an interface for processing text using syntax trees.
Syntax trees are a representation understandable to programs.
Those programs, called plug-ins, take these trees and
modify them, amongst other things. To get to the syntax tree from
input text, there’s a parser, and, to get from that
back to text, there’s a compiler. This is the
process of a processor.
┌──────────────┐
┌─ │ Transformers │ ─┐
▲ └──────────────┘ ▼
└───────┐ ┌───────┘
│ │
┌───────┐ ┌──────┐ ┌────────┐
│ Input │ ── Parser ─▶ │ Tree │ ─ Compiler ▶ │ Output │
└───────┘ └──────┘ └────────┘
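To make the diagram concrete, here is a minimal sketch of the three
phases in plain JavaScript. It is a toy, not unified’s implementation:
the names parse, shout, compile, and processText, and the line node
type, are all made up for illustration.

```javascript
// Hypothetical toy pipeline; not unified's implementation.

// Parser: text -> tree (one `line` node per line of input).
function parse(text) {
  return {
    type: 'root',
    children: text.split('\n').map(function (value) {
      return {type: 'line', value: value};
    })
  };
}

// Transformer: modifies the tree in place.
function shout(tree) {
  tree.children.forEach(function (node) {
    node.value = node.value.toUpperCase();
  });
}

// Compiler: tree -> text.
function compile(tree) {
  return tree.children.map(function (node) {
    return node.value;
  }).join('\n');
}

// Input -> Parser -> Tree -> Transformers -> Tree -> Compiler -> Output.
function processText(text, transformers) {
  var tree = parse(text);
  transformers.forEach(function (transformer) {
    transformer(tree);
  });
  return compile(tree);
}

console.log(processText('alpha\nbravo', [shout]));
```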
Processors
Every processor implements another processor. To create a new
processor, invoke another processor. This creates a new processor
which is configured to function the same as its ancestor. But, when
the descendant processor is configured in the future, that
configuration does not change the ancestral processor.
Often, when processors are exposed from a library (for example,
unified itself), they should not be modified directly, as that
would change their behaviour for all users. Those processors are
abstract, and they should be made concrete before
they are used, by invoking them.
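The ancestor and descendant relation can be sketched as follows. This
is an illustration only: createProcessor and the plugins() getter are
hypothetical and not part of unified, and strings stand in for
plug-ins.

```javascript
// Hypothetical sketch: `createProcessor` is not part of unified.
function createProcessor(inherited) {
  // Copy the inherited plug-ins so changes stay local to this processor.
  var plugins = (inherited || []).slice();

  // Invoking a processor creates a descendant with the same configuration.
  function processor() {
    return createProcessor(plugins);
  }

  processor.use = function (plugin) {
    plugins.push(plugin);
    return processor;
  };

  // Expose a copy of the configured plug-ins for inspection.
  processor.plugins = function () {
    return plugins.slice();
  };

  return processor;
}

var ancestor = createProcessor().use('alpha');
var descendant = ancestor().use('bravo');

console.log(ancestor.plugins()); // [ 'alpha' ]
console.log(descendant.plugins()); // [ 'alpha', 'bravo' ]
```

Configuring the descendant leaves the ancestor untouched because each
processor holds its own copy of the plug-in list.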
Node
The syntax trees used in unified are Unist nodes, which are plain
JavaScript objects with a type property. The semantics of those types
are defined by other projects.
There are several utilities for working with these nodes.
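As a rough illustration, a tree of such nodes and a small walker could
look like this. The node types (root, heading, text) and the visit
helper are assumptions made for this sketch, not something defined by
unified.

```javascript
// A small tree in the shape described above: plain objects with a `type`.
// The node types (`root`, `heading`, `text`) are illustrative only.
var tree = {
  type: 'root',
  children: [
    {type: 'heading', children: [{type: 'text', value: 'Alpha'}]},
    {type: 'text', value: 'Bravo'}
  ]
};

// Hypothetical helper: call `visitor` for each node of the given type,
// walking parents before their children.
function visit(node, type, visitor) {
  if (node.type === type) {
    visitor(node);
  }
  (node.children || []).forEach(function (child) {
    visit(child, type, visitor);
  });
}

var values = [];
visit(tree, 'text', function (node) {
  values.push(node.value);
});
console.log(values); // [ 'Alpha', 'Bravo' ]
```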
List of Processors
The following projects process different syntax trees. They parse
text to their respective syntax tree, and they compile their syntax
trees back to text. These processors can be used as-is, or their
parser and compilers can be mixed and matched with other plug-ins
to allow processing between different syntaxes.
File
When processing documents, metadata is often gathered about that
document. VFile is a virtual file format which stores
data, and handles metadata for unified and its plug-ins.
There are several utilities for working with these
files.
Configuration
To configure a processor, invoke its use method, supply it a plug-in,
and optionally settings.
Streaming
unified provides a streaming interface which enables it to plug
into transformations outside of itself. An example, which reads
markdown as input, adds a table of contents, and writes it out, would
be as follows:
var unified = require('unified');
var markdown = require('remark-parse');
var stringify = require('remark-stringify');
var toc = require('remark-toc');
process.stdin
.pipe(unified())
.use(markdown)
.use(toc)
.use(stringify)
.pipe(process.stdout);
Which when given on stdin(4):
# Alpha
## Table of Content
## Bravo
Yields, on stdout(4):
# Alpha
## Table of Content
* [Bravo](#bravo)
## Bravo
Programming interface
Next to streaming, there’s also a programming interface, which gives
access to processing metadata (such as lint messages), and supports
passing multiple files through:
var unified = require('unified');
var markdown = require('remark-parse');
var lint = require('remark-lint');
var html = require('remark-html');
var remark2retext = require('remark-retext');
var english = require('retext-english');
var equality = require('retext-equality');
var report = require('vfile-reporter');
unified()
.use(markdown)
.use(lint)
.use(remark2retext, unified().use(english).use(equality))
.use(html)
.process('## Hey guys', function (err, file) {
console.log(report(file));
console.log(file.contents);
});
Which yields:
<stdin>
1:1-1:12 warning First heading level should be `1` first-heading-level
1:8-1:12 warning `guys` may be insensitive, use `people`, `persons`, `folks` instead
⚠ 2 warnings
<h2>Hey guys</h2>
Bridge
unified bridges transform the syntax tree from one flavour to
another. Then, they apply another processor’s transformations on
that tree. After that, if possible, they mutate the origin tree based
on changes made to the destination tree. Finally, they continue
running the origin process.
See unified-bridge for more information.
API
processor()
Object describing how to process text.
Returns
Function — A new concrete processor which is configured to function
the same as its ancestor. But, when the descendant processor is
configured in the future, that configuration does not change the
ancestral processor.
Example
The following example shows how a new processor can be created (from
the remark processor) and linked to stdin(4) and stdout(4).
var remark = require('remark');
process.stdin.pipe(remark()).pipe(process.stdout);
processor.use(plugin[, options])
Configure the processor to use a plug-in, and configure
that plug-in with optional options.
Signatures
- processor.use(plugin[, options]);
- processor.use(plugins[, options]);
- processor.use(list);
- processor.use(matrix).
Parameters
- plugin (Plugin);
- options (*, optional) — Configuration for plugin;
- plugins (Array.<Function>) — List of plugins;
- list (Array) — plugin and options in an array;
- matrix (Array) — Array where each entry is a list.
Returns
processor — The processor on which use is invoked.
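One way to picture the four signatures is as different spellings of
the same thing: a series of plugin and options pairs. The sketch below
flattens each form into such pairs. The normalise helper is
hypothetical and not taken from unified’s source; alpha and bravo are
placeholder plug-ins.

```javascript
// Hypothetical helper, not unified's code: flatten any `use` signature
// into [plugin, options] pairs.
function normalise(value, options) {
  // `plugin[, options]`
  if (typeof value === 'function') {
    return [[value, options]];
  }

  // `list`: a plugin followed by its (non-function) options.
  if (
    value.length === 2 &&
    typeof value[0] === 'function' &&
    typeof value[1] !== 'function'
  ) {
    return [[value[0], value[1]]];
  }

  // `plugins` or `matrix`: handle each entry in turn.
  return value.reduce(function (pairs, entry) {
    return pairs.concat(normalise(entry, options));
  }, []);
}

// Placeholder plug-ins.
function alpha() {}
function bravo() {}

console.log(normalise(alpha, {x: 1}).length); // 1
console.log(normalise([alpha, bravo], {x: 1}).length); // 2
console.log(normalise([alpha, {x: 1}]).length); // 1
console.log(normalise([[alpha, {x: 1}], [bravo, {y: 2}]]).length); // 2
```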
Plugin
A unified plugin changes the way the applied-on processor works,
in the following ways:
- It modifies the processor: such as changing the parser, the
  compiler, or linking the processor to other processors;
- It transforms the syntax tree representation of a file;
- It modifies metadata of a file.
Plug-ins are a concept which materialise as attachers.
function attacher(processor[, options])
An attacher is the thing passed to use. It configures the processor
and in turn can receive options.
Attachers can configure processors, such as by interacting with parsers
and compilers, linking them to other processors, or specifying how the
syntax tree is handled.
Parameters
- processor (processor) — Context on which it’s used;
- options (*, optional) — Configuration.
Returns
transformer — Optional.
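A minimal sketch of an attacher which returns a transformer is shown
below. The suffix option and the text node type are invented for the
example, and the attacher is invoked by hand here where a processor
would normally do so.

```javascript
// Hypothetical plug-in: the attacher receives the processor and options,
// and returns a transformer.
function attacher(processor, options) {
  var suffix = (options || {}).suffix || '!';

  // Transformer: append `suffix` to every `text` node in the tree.
  return function transformer(node) {
    (node.children || []).forEach(function (child) {
      transformer(child);
    });

    if (node.type === 'text') {
      node.value += suffix;
    }
  };
}

// Stand-in for what a processor would do with the plug-in.
var transformer = attacher(null, {suffix: '?'});
var tree = {type: 'root', children: [{type: 'text', value: 'Alpha'}]};

transformer(tree);
console.log(tree.children[0].value); // Alpha?
```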
function transformer(node, file[, next])
Transformers modify the syntax tree or metadata of a file.
A transformer is a (generator) function which is invoked each time
a file is passed through the transform phase. If an error occurs
(either because it’s thrown, returned, rejected, or passed to next),
the process stops.
Parameters
- node (Node);
- file (VFile);
- next (Function, optional).
Returns
- Error — Can be returned to stop the process;
- Node — Can be returned, in which case further transformations and
  the stringify phase are performed on the new tree;
- Promise — If a promise is returned, the function is asynchronous,
  and must be resolved (optionally with a Node) or rejected
  (optionally with an Error).
function next(err[, tree[, file]])
If the signature of a transformer includes next (third argument),
the function may finish asynchronously, and must invoke next().
Parameters
- err (Error, optional) — Stop the process;
- node (Node, optional) — New syntax tree;
- file (VFile, optional) — New virtual file.
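The different ways a transformer can finish (throwing, returning an
Error or a Node, returning a promise, or invoking next) could be
handled by a runner along these lines. This is a sketch under
assumptions: runTransformer is hypothetical, generator functions are
left out, and unified’s actual handling may differ.

```javascript
// Hypothetical runner, not unified's implementation: invoke one
// transformer and report its outcome to `done(err, node, file)` once.
function runTransformer(transformer, node, file, done) {
  var finished = false;
  var result;

  function next(err, newNode, newFile) {
    if (finished) {
      return;
    }
    finished = true;
    done(err || null, newNode || node, newFile || file);
  }

  try {
    result = transformer(node, file, next);
  } catch (err) {
    return next(err);
  }

  if (transformer.length >= 3) {
    // The signature includes `next`: the transformer must invoke it.
    return;
  }

  if (result && typeof result.then === 'function') {
    result.then(function (newNode) {
      next(null, newNode);
    }, function (err) {
      next(err || new Error('Transformer rejected'));
    });
  } else if (result instanceof Error) {
    next(result);
  } else {
    next(null, result);
  }
}

var tree = {type: 'root', children: []};

runTransformer(function () {
  return {type: 'root', children: [], replaced: true};
}, tree, null, function (err, node) {
  console.log(err, node.replaced); // null true
});
```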
processor.parse(file|value[, options])
Parse text to a syntax tree.
Parameters
- file (VFile);
- value (string) — String representation of a file;
- options (Object, optional) — Configuration given to the parser.
Returns
Node — Syntax tree representation of input.
processor.Parser
A constructor handling the parsing of text to a syntax tree.
It’s instantiated by the parse phase in the process with a VFile,
settings, and the processor.
The instance must expose a parse method which is invoked without
arguments, and must return a syntax tree representation of the VFile.
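A parser following that contract might be sketched as below. The
separator setting and the line node type are invented for this
example, and a plain object with a contents property stands in for a
VFile.

```javascript
// Hypothetical parser. `file` stands in for a VFile; only its
// `contents` property is read. `processor` is accepted but unused here.
function Parser(file, settings, processor) {
  this.file = file;
  this.settings = settings || {};
}

// Invoked without arguments; returns a syntax tree for the file.
Parser.prototype.parse = function () {
  var separator = this.settings.separator || '\n';

  return {
    type: 'root',
    children: String(this.file.contents)
      .split(separator)
      .map(function (value) {
        return {type: 'line', value: value};
      })
  };
};

var tree = new Parser({contents: 'alpha\nbravo'}).parse();
console.log(tree.children.length); // 2
```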
processor.stringify(node[, file|value][, options])
Compile a syntax tree to text.
Parameters
- node (Node);
- file (VFile, optional);
- value (string, optional) — String representation of a file;
- options (Object, optional) — Configuration given to the compiler.
Returns
string — String representation of the syntax tree.
processor.Compiler
A constructor handling the compilation of a syntax tree to text.
It’s instantiated by the stringify phase in the process with a VFile,
settings, and the processor.
The instance must expose a compile method which is invoked with the
syntax tree, and must return a string representation of that syntax
tree.
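A compiler following that contract might be sketched as below. The
separator setting and the line node type are invented for this
example, and null stands in for the VFile, which this toy does not
read.

```javascript
// Hypothetical compiler. `file` and `processor` are accepted to match
// the described contract, but this toy only uses `settings`.
function Compiler(file, settings, processor) {
  this.file = file;
  this.settings = settings || {};
}

// Invoked with the syntax tree; returns its string representation.
Compiler.prototype.compile = function (tree) {
  var separator = this.settings.separator || '\n';

  return tree.children.map(function (node) {
    return node.value;
  }).join(separator);
};

var doc = new Compiler(null, {separator: ', '}).compile({
  type: 'root',
  children: [
    {type: 'line', value: 'alpha'},
    {type: 'line', value: 'bravo'}
  ]
});
console.log(doc); // alpha, bravo
```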
processor.run(node[, file|value][, done])
Transform a syntax tree by applying plug-ins to it.
If asynchronous plug-ins are configured, an error is thrown if done
is not supplied.
Parameters
- node (Node);
- file (VFile, optional);
- value (string, optional) — String representation of a file;
- done (Function, optional).
Returns
Node — The given syntax tree.
function done(err[, node, file])
Invoked when transformation is complete. Either invoked with an
error, or a syntax tree and a file.
Parameters
- err (Error) — Fatal error;
- node (Node);
- file (VFile).
processor.process(file|value[, options][, done])
Process the given representation of a file as configured on the
processor. The process invokes parse, run, and stringify internally.
If asynchronous plug-ins are configured, an error is thrown if done
is not supplied.
Parameters
- file (VFile);
- value (string) — String representation of a file;
- options (Object, optional) — Configuration for both the parser
  and compiler;
- done (Function, optional).
Returns
VFile — Virtual file with modified contents.
function done(err, file)
Invoked when the process is complete. Invoked with a fatal error, if
any, and the VFile.
Parameters
- err (Error, optional) — Fatal error;
- file (VFile).
processor.write(chunk[, encoding][, callback])
Note: Although the interface is compatible with streams,
all data is currently buffered and passed through in one go.
This might be changed later.
Write data to the in-memory buffer.
Parameters
- chunk (Buffer or string);
- encoding (string, defaults to utf8);
- callback (Function) — Invoked on successful write.
Returns
boolean — Whether the write was successful (currently, always true).
processor.end()
Signal the writing is complete. Passes all arguments to a final write,
and starts the process (using, when available, options given to pipe).
Events
- data (string) — When the process was successful, triggered with the
  compiled file;
- error (Error) — When the process was unsuccessful, triggered with
  the fatal error;
- warning (VFileMessage) — Each message created by the plug-ins in the
  process is triggered and separately passed.
Returns
boolean — Whether the write was successful (currently, always true).
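The buffering behaviour of write and end can be pictured with a small
stand-in built on Node’s EventEmitter. This is a sketch, not unified’s
code: createBufferedStream is hypothetical, a plain function replaces
the whole process, and the warning event is left out.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical stand-in for a processor's stream interface (not
// unified's code): chunks are buffered, and `transform` runs on `end`.
function createBufferedStream(transform) {
  var stream = new EventEmitter();
  var chunks = [];

  stream.write = function (chunk, encoding, callback) {
    if (typeof encoding === 'function') {
      callback = encoding;
      encoding = null;
    }

    chunks.push(
      Buffer.isBuffer(chunk) ?
        chunk.toString(encoding || 'utf8') :
        String(chunk)
    );

    if (callback) {
      callback();
    }

    return true;
  };

  stream.end = function () {
    var result;

    // Pass all arguments to a final write.
    if (arguments.length) {
      stream.write.apply(stream, arguments);
    }

    try {
      result = transform(chunks.join(''));
    } catch (err) {
      stream.emit('error', err);
      return true;
    }

    stream.emit('data', result);
    return true;
  };

  return stream;
}

var stream = createBufferedStream(function (value) {
  return value.toUpperCase();
});

stream.on('data', function (value) {
  console.log(value); // ALPHA BRAVO
});

stream.write('alpha ');
stream.end('bravo');
```

Nothing is emitted until end is invoked, which matches the note that
all data is passed through in one go.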
processor.pipe(stream[, options])
Note: This does not pass all processed data (e.g., from loose
process() calls) to the destination stream. There’s one process
created internally especially for streams. Only data piped into
the processor is piped out.
Pipe data streamed into the processor, processed, to the destination
stream. Optionally also set the configuration for how the data
is processed. Calls Stream#pipe with the given arguments under the
hood.
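Parameters
- stream (WritableStream) — Destination stream;
- options (optional) — Configuration for how the data is processed.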
Returns
WritableStream — The given stream.
processor.data(key[, value])
Get or set information in an in-memory key-value store accessible to
all phases of the process. An example is a list of HTML elements
which are self-closing (i.e., do not need a closing tag), which is
needed when parsing, transforming, and compiling HTML.
Parameters
- key (string) — Identifier;
- value (*, optional) — Value to set. Omit if getting key.
Returns
- processor — If setting, the processor on which data is invoked;
- * — If getting, the value at key.
Example
The following example shows how to get and set information:
var unified = require('unified');
console.log(unified().data('alpha', 'bravo').data('alpha'))
Yields:
bravo
processor.abstract()
Turn a processor into an abstract processor. Abstract processors
are meant to be extended, and not to be configured or processed
directly (as concrete processors are).
Once a processor is abstract, it cannot be made concrete again.
But, a new concrete processor functioning just like it can be
created by invoking the processor.
Returns
Processor — The processor on which abstract is invoked.
Example
The following example, index.js, shows how remark prevents extensions
to itself:
var unified = require('unified');
var parse = require('remark-parse');
var stringify = require('remark-stringify');
module.exports = unified().use(parse).use(stringify).abstract();
The below example, a.js, shows how that processor can be used to
create a command line interface which reformats markdown passed on
stdin(4) and outputs it on stdout(4).
var remark = require('remark');
process.stdin.pipe(remark()).pipe(process.stdout);
The below example, b.js, shows a similar looking example which
operates on the abstract remark interface. If this behaviour were
allowed, it would result in unexpected behaviour, so an error is
thrown. This is invalid:
var remark = require('remark');
process.stdin.pipe(remark).pipe(process.stdout);
Yields:
~/index.js:118
throw new Error(
^
Error: Cannot pipe into abstract processor.
To make the processor concrete, invoke it: use `processor()` instead of `processor`.
at assertConcrete (~/index.js:118:13)
at Function.<anonymous> (~/index.js:135:7)
...
at Object.<anonymous> (~/b.js:76:15)
...
License
MIT © Titus Wormer