Small CommonMark compliant markdown parser with positional info and concrete
tokens.
Intro
micromark is a long awaited markdown parser.
It uses a state machine to parse the entirety of markdown into tokens.
It’s the smallest 100% CommonMark compliant markdown parser in JavaScript.
It’ll replace the internals of remark-parse, the most popular markdown parser.
Its interface is optimized to compile to HTML, but its parts can be used
to generate syntax trees or compile to other output formats too.
It’s in open beta; work remains on integration in remark, performance, CSTs,
and docs.
Checklist
Install
npm:
npm install micromark
Use
var micromark = require('micromark')
console.log(micromark('## Hello, *world*!'))
Yields:
<h2>Hello, <em>world</em>!</h2>
Or (streaming interface):
var fs = require('fs')
var micromark = require('micromark/stream')
fs.createReadStream('example.md').pipe(micromark()).pipe(process.stdout)
Or (extensions, in this case micromark-extension-gfm):
var micromark = require('micromark')
var gfmSyntax = require('micromark-extension-gfm')
var gfmHtml = require('micromark-extension-gfm/html')
var doc = '* [x] contact@example.com ~~strikethrough~~'
var result = micromark(doc, {
  extensions: [gfmSyntax()],
  htmlExtensions: [gfmHtml]
})
console.log(result)
Yields:
<ul>
<li><input checked="" disabled="" type="checkbox"> <a href="mailto:contact@example.com">contact@example.com</a> <del>strikethrough</del></li>
</ul>
Or use remark, which will soon include micromark and is pretty
stable.
API
Note that there are more APIs than listed here currently.
Those are considered to be in progress.
micromark(doc[, encoding][, options])
Compile markdown to HTML.
Parameters
doc
Markdown to parse (string or Buffer).
encoding
Character encoding to use to decode doc when it’s a Buffer (string, default:
'utf8').
options.defaultLineEnding
Value to use for line endings not in doc (string, default: first line ending
or '\n').
Generally, micromark copies line endings ('\r', '\n', '\r\n') in the markdown
document over to the compiled HTML.
In some cases, such as > a, CommonMark requires that extra line endings are
added: <blockquote>\n<p>a</p>\n</blockquote>.
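The default described above (first line ending in the document, else '\n') can be sketched roughly as follows. This is only an illustration of the rule: `pickLineEnding` is a hypothetical helper, not a function micromark exposes, and micromark's internals may work differently.

```javascript
// Illustrative only: `pickLineEnding` is a hypothetical helper, not part of
// micromark's API.
function pickLineEnding(doc, defaultLineEnding) {
  // An explicitly passed option wins.
  if (defaultLineEnding) return defaultLineEnding

  // Otherwise reuse the first line ending found in the document. `\r\n` is
  // listed first so it is matched as one line ending, not as a lone `\r`.
  var match = /\r\n|\r|\n/.exec(doc)

  // Fall back to `\n` when the document has no line ending at all.
  return match ? match[0] : '\n'
}

// `> a` contains no line ending, so the extra ones CommonMark requires
// around the block quote would be `\n`.
var eol = pickLineEnding('> a')
console.log(JSON.stringify('<blockquote>' + eol + '<p>a</p>' + eol + '</blockquote>'))
```

With a document that uses '\r\n', the same helper would return '\r\n', so added line endings match the ones already present.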
options.allowDangerousHtml
Whether to allow embedded HTML (boolean, default: false).
options.allowDangerousProtocol
Whether to allow potentially dangerous protocols in links and images (boolean,
default: false).
URLs relative to the current protocol are always allowed (such as image.jpg).
For links, the allowed protocols are http, https, irc, ircs, mailto, and xmpp.
For images, the allowed protocols are http and https.
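To make the allow-list above concrete, here is a minimal sketch of such a check. It is not micromark's actual implementation: `isAllowedUrl`, `linkProtocols`, and `imageProtocols` are hypothetical names, and the rule for spotting relative URLs (no colon before the first `/`, `?`, or `#`) is a simplifying assumption.

```javascript
// Illustrative only: these names are hypothetical and this is not
// micromark's actual implementation.
var linkProtocols = /^(https?|ircs?|mailto|xmpp)$/i
var imageProtocols = /^https?$/i

function isAllowedUrl(url, protocols) {
  var colon = url.indexOf(':')
  // Position of the first `/`, `?`, or `#`, or -1 if there is none.
  var delimiter = url.search(/[\/?#]/)

  // No colon, or a colon only after a path, query, or fragment delimiter:
  // the URL has no protocol of its own, so it is relative and allowed.
  if (colon < 0 || (delimiter > -1 && delimiter < colon)) return true

  // Otherwise everything before the first colon must be an allowed protocol.
  return protocols.test(url.slice(0, colon))
}

console.log(isAllowedUrl('image.jpg', imageProtocols))
console.log(isAllowedUrl('mailto:contact@example.com', linkProtocols))
console.log(isAllowedUrl('javascript:alert(1)', linkProtocols))
```

Under these assumptions the first two calls report an allowed URL and the third does not; `mailto:` would likewise be rejected for images.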
options.extensions
Array of syntax extensions (Array.<SyntaxExtension>, default: []).
options.htmlExtensions
Array of HTML extensions (Array.<HtmlExtension>, default: []).
Returns
Compiled HTML (string).
createStream(options?)
Streaming version of micromark.
Compiles markdown to HTML.
options are the same as the buffering API above.
Available at require('micromark/stream').
Extensions
There are two types of extensions for micromark: SyntaxExtension and
HtmlExtension.
They can be passed in extensions or htmlExtensions, respectively.
SyntaxExtension
A syntax extension is an object whose fields are the names of tokenizers:
content (a block of, well, content: definitions and paragraphs), document
(containers such as block quotes and lists), flow (block constructs such as
ATX and setext headings, HTML, indented and fenced code, thematic breaks),
string (things that work in a few places such as destinations, fenced code
info, etc: character escapes and character references), or text (rich inline
text: autolinks, character escapes and character references, code, hard
breaks, HTML, images, links, emphasis, strong).
The values at such fields are objects whose keys are character codes, mapping
to constructs.
The built-in constructs are an extension.
See it and the existing extensions for inspiration.
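As a rough picture of that shape, the sketch below builds a syntax extension with one construct in the text tokenizer. The names `tokenizeTilde`, `tildeSyntax`, and the `tilde` token are made up for illustration, and the `tokenize(effects, ok, nok)` form of a construct is an assumption based on how existing extensions look; this section does not document it.

```javascript
// Hypothetical construct that matches a single `~` (character code 126).
// The `tokenize(effects, ok, nok)` shape is an assumption taken from how
// existing extensions look; it is not documented in this section.
function tokenizeTilde(effects, ok, nok) {
  return start

  function start(code) {
    // Not a tilde: hand the character to the failure state.
    if (code !== 126) return nok(code)

    effects.enter('tilde') // `tilde` is a made-up token name.
    effects.consume(code)
    effects.exit('tilde')
    return ok
  }
}

// Field: name of a tokenizer. Key: character code. Value: construct.
var tildeSyntax = {
  text: {126: {tokenize: tokenizeTilde}}
}
```

If that assumption holds, such an object would be passed in options.extensions; a matching HTML extension would still be needed to turn the token into output.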
HtmlExtension
An HTML extension is an object whose fields are either enter or exit
(reflecting whether a token is entered or exited).
The values at such fields are objects mapping names of tokens to handlers.
See the existing extensions for inspiration.
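A minimal sketch of that shape follows. The `highlight` token and the `highlightHtml` name are hypothetical. That handlers are called with a compiler context as `this`, and that the context has a `tag` method, are assumptions drawn from how existing extensions look, not something this section specifies.

```javascript
// Hypothetical HTML extension for a made-up `highlight` token.
// Assumption: handlers run with a compiler context as `this`, and that
// context exposes `tag` to write raw HTML.
var highlightHtml = {
  // Called when a `highlight` token is entered.
  enter: {
    highlight: function () {
      this.tag('<mark>')
    }
  },
  // Called when a `highlight` token is exited.
  exit: {
    highlight: function () {
      this.tag('</mark>')
    }
  }
}
```

Such an object would be passed in options.htmlExtensions, next to a syntax extension that produces the token.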
List of extensions
Version
The open beta of micromark starts at version 2.0.0 (there was a different
package published on npm as micromark before).
micromark will adhere to semver at 3.0.0.
Use tilde ranges for now: "micromark": "~2.9.2".
Security
It’s safe to compile markdown to HTML if it neither includes embedded HTML nor
uses dangerous protocols in links (such as javascript: or data:).
micromark is also safe by default when embedded HTML or dangerous protocols
are used, as it encodes or drops them.
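To show what encoding means here, the sketch below escapes the characters that would otherwise let embedded HTML run in a browser. `encodeHtml` and `characterReferences` are hypothetical names for illustration; this is not micromark's own code, and the exact set of characters it encodes may differ.

```javascript
// Illustrative only: `encodeHtml` is a hypothetical helper, not micromark's
// own code.
var characterReferences = {
  '"': '&quot;',
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;'
}

function encodeHtml(value) {
  // Replace each unsafe character with its character reference.
  return value.replace(/["&<>]/g, function (character) {
    return characterReferences[character]
  })
}

console.log(encodeHtml('<script>alert(1)</script>'))
// Prints: &lt;script&gt;alert(1)&lt;/script&gt;
```

Encoded this way, the markup is displayed as text instead of being interpreted by the browser.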
Turning on the allowDangerousHtml or allowDangerousProtocol options for
user-provided markdown opens you up to cross-site scripting (XSS) attacks.
For more information on markdown sanitation, see
improper-markup-sanitization.md by @chalker.
See security.md in micromark/.github for how to submit a security report.
Contribute
See contributing.md in micromark/.github for ways to get started.
See support.md for ways to get help.
This project has a code of conduct.
By interacting with this repository, organisation, or community you agree to
abide by its terms.
Support this effort and give back by sponsoring on OpenCollective!
License
MIT © Titus Wormer