What is rehype?
rehype is an HTML processor built on the unified framework. It parses HTML into a syntax tree, transforms that tree with plugins, and serializes the result back to HTML. Because it is plugin-based, it can be used for tasks such as sanitizing HTML, extracting content, and restructuring markup.
What are rehype's main functionalities?
Parsing HTML
Rehype can parse HTML strings into a syntax tree, which can then be manipulated or analyzed.
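// rehype is ESM only and has no default export.
import {rehype} from 'rehype';
const html = '<h1>Hello, world!</h1>';
// parse() is synchronous and returns the root node of the syntax tree.
const tree = rehype().parse(html);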
Transforming HTML
Rehype allows you to transform HTML content by manipulating the syntax tree. You can use plugins or custom functions to modify the tree.
// rehype is ESM only and has no default export.
import {rehype} from 'rehype';
const html = '<h1>Hello, world!</h1>';
rehype()
  .use(() => (tree) => {
    // Transform the tree in place, or return a new tree.
  })
  .process(html)
  .then((file) => {
    console.log(String(file));
  });
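As a rough illustration of what such a custom function might do, the sketch below walks a syntax tree by hand and adds a class to every h1 element. The names visitElements, addHeadingClass, and rehypeHeadingClass are hypothetical and not part of rehype, and the tree is built manually so the example runs without any packages installed.

```javascript
// Hypothetical helper: call `callback` for every element node in a hast tree.
function visitElements(node, callback) {
  if (node.type === 'element') callback(node);
  for (const child of node.children || []) visitElements(child, callback);
}

// Hypothetical transformer: add the class `title` to every <h1>.
// hast stores classes as an array of strings under `properties.className`.
function addHeadingClass(tree) {
  visitElements(tree, (element) => {
    if (element.tagName !== 'h1') return;
    element.properties = element.properties || {};
    const classes = Array.isArray(element.properties.className)
      ? element.properties.className
      : [];
    if (!classes.includes('title')) {
      element.properties.className = [...classes, 'title'];
    }
  });
}

// A unified plugin is a function that returns a transformer.
function rehypeHeadingClass() {
  return addHeadingClass;
}

// Hand-built tree for <h1>Hello, world!</h1><p>Body text</p>.
const tree = {
  type: 'root',
  children: [
    {type: 'element', tagName: 'h1', properties: {}, children: [{type: 'text', value: 'Hello, world!'}]},
    {type: 'element', tagName: 'p', properties: {}, children: [{type: 'text', value: 'Body text'}]}
  ]
};

addHeadingClass(tree);
console.log(JSON.stringify(tree.children[0].properties)); // {"className":["title"]}
```

With rehype installed, a plugin shaped like this could presumably be attached with rehype().use(rehypeHeadingClass). For real projects, a maintained tree walker such as unist-util-visit is likely a better choice than a hand-written one.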
Stringifying HTML
Rehype can convert a syntax tree back into an HTML string, allowing you to output the transformed content.
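// rehype is ESM only and has no default export.
import {rehype} from 'rehype';
const tree = { type: 'root', children: [{ type: 'element', tagName: 'h1', properties: {}, children: [{ type: 'text', value: 'Hello, world!' }] }] };
// stringify() is synchronous and returns the serialized HTML as a string.
console.log(rehype().stringify(tree));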
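To make the idea of serializing a tree more concrete, here is a deliberately small toy serializer. It is not how rehype-stringify is implemented: the function names are hypothetical, and it assumes a tree that contains only root, element, and text nodes with string-valued properties whose names already match their attribute names.

```javascript
// Toy serializer for a small subset of hast (root, element, text).
// Not rehype-stringify: no void elements, comments, doctypes, or
// property-name mapping (for example className to class).

function escapeText(value) {
  return value.replace(/&/g, '&amp;').replace(/</g, '&lt;');
}

function escapeAttribute(value) {
  return value.replace(/&/g, '&amp;').replace(/"/g, '&quot;');
}

function toHtml(node) {
  if (node.type === 'text') return escapeText(node.value);

  const children = (node.children || []).map(toHtml).join('');
  if (node.type === 'root') return children;

  if (node.type === 'element') {
    // Only string-valued properties are emitted in this sketch.
    const attributes = Object.entries(node.properties || {})
      .filter(([, value]) => typeof value === 'string')
      .map(([name, value]) => ` ${name}="${escapeAttribute(value)}"`)
      .join('');
    return `<${node.tagName}${attributes}>${children}</${node.tagName}>`;
  }

  return ''; // Unsupported node types are dropped.
}

const tree = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'h1',
      properties: {id: 'greeting'},
      children: [{type: 'text', value: 'Hello, world!'}]
    }
  ]
};

console.log(toHtml(tree)); // <h1 id="greeting">Hello, world!</h1>
```

The real serializer follows the HTML standard and handles many cases this sketch ignores, so the toy version is only useful for seeing the overall shape of the tree-to-string step.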
Sanitizing HTML
Rehype can be used to sanitize HTML content, removing potentially dangerous elements and attributes.
// Both packages are ESM only; rehype-sanitize has a default export.
import {rehype} from 'rehype';
import rehypeSanitize from 'rehype-sanitize';
const html = '<script>alert("Hello, world!")</script><p>Safe content</p>';
rehype()
  .use(rehypeSanitize)
  .process(html)
  .then((file) => {
    console.log(String(file));
  });
Other packages similar to rehype
htmlparser2
Htmlparser2 is a fast and forgiving HTML/XML parser. It is similar to rehype in that it can parse HTML into a tree structure, but it is more focused on speed and less on extensibility and transformations.
jsdom
Jsdom is a JavaScript implementation of the DOM and HTML standards. It allows you to manipulate HTML content in a way similar to how you would in a browser. Unlike rehype, jsdom provides a full DOM API, making it more suitable for complex manipulations and simulations of browser environments.
cheerio
Cheerio is a fast, flexible, and lean implementation of core jQuery designed specifically for the server. It allows you to parse and manipulate HTML using a jQuery-like syntax. While rehype focuses on transformations and extensibility, cheerio is more about providing a familiar API for DOM manipulation.
rehype
unified processor to add support for parsing from HTML and serializing
to HTML.
What is this?
This package is a unified processor with support for parsing HTML as input
and serializing HTML as output by using unified with rehype-parse and
rehype-stringify.
See the monorepo readme for info on what the rehype ecosystem is.
When should I use this?
You can use this package when you want to use unified, have HTML as input, and
want HTML as output.
This package is a shortcut for
unified().use(rehypeParse).use(rehypeStringify).
When the input isn’t HTML (meaning you don’t need rehype-parse) or the
output is not HTML (you don’t need rehype-stringify), it’s recommended to
use unified directly.
When you’re in a browser, trust your content, don’t need positional info on
nodes or formatting options, and value a smaller bundle size, you can use
rehype-dom instead.
When you want to inspect and format HTML files in a project on the command
line, you can use rehype-cli.
Install
This package is ESM only.
In Node.js (version 16+), install with npm:
npm install rehype
In Deno with esm.sh:
import {rehype} from 'https://esm.sh/rehype@13'
In browsers with esm.sh:
<script type="module">
  import {rehype} from 'https://esm.sh/rehype@13?bundle'
</script>
Use
Say we have the following module example.js:
import {rehype} from 'rehype'
import rehypeFormat from 'rehype-format'
const file = await rehype().use(rehypeFormat).process(`<!doctype html>
<html lang=en>
<head>
<title>Hi!</title>
</head>
<body>
<h1>Hello!</h1>
</body></html>`)
console.error(String(file))
…running that with node example.js yields:
<!doctype html>
<html lang="en">
  <head>
    <title>Hi!</title>
  </head>
  <body>
    <h1>Hello!</h1>
  </body>
</html>
API
This package exports the identifier rehype.
There is no default export.
rehype()
Create a new unified processor that already uses rehype-parse and
rehype-stringify.
You can add more plugins with use.
See unified for more information.
Examples
Example: passing options to rehype-parse, rehype-stringify
When you use rehype-parse or rehype-stringify manually you can pass options
directly to them with use.
Because both plugins are already used in rehype, that’s not possible.
To define options for them, you can instead pass options to data:
import {rehype} from 'rehype'
import {reporter} from 'vfile-reporter'
const file = await rehype()
  .data('settings', {
    emitParseErrors: true,
    fragment: true,
    preferUnquoted: true
  })
  .process('<div title="a" title="b"></div>')
console.error(reporter(file))
console.log(String(file))
…yields:
1:21-1:21 warning Unexpected duplicate attribute duplicate-attribute hast-util-from-html
⚠ 1 warning
<div title=a></div>
Syntax
HTML is parsed and serialized according to WHATWG HTML (the living standard),
which is also followed by all browsers.
Syntax tree
The syntax tree format used in rehype is hast.
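As a small illustration of the tree shape, the sketch below builds a hast tree by hand and collects its text. hast defines the node types root, element, text, comment, and doctype. The helper name textContent is hypothetical, and the tree is written manually rather than produced by the parser, so it only approximates what parsing real HTML would yield (for example, it carries no positional info).

```javascript
// Hypothetical helper: concatenate the values of all text nodes in a hast tree.
// Comment and doctype nodes have no children and contribute nothing.
function textContent(node) {
  if (node.type === 'text') return node.value;
  return (node.children || []).map(textContent).join('');
}

// Hand-built tree for:
// <!doctype html><!-- greeting --><p>Hello, <em>world</em>!</p>
const tree = {
  type: 'root',
  children: [
    {type: 'doctype'},
    {type: 'comment', value: ' greeting '},
    {
      type: 'element',
      tagName: 'p',
      properties: {},
      children: [
        {type: 'text', value: 'Hello, '},
        {type: 'element', tagName: 'em', properties: {}, children: [{type: 'text', value: 'world'}]},
        {type: 'text', value: '!'}
      ]
    }
  ]
};

console.log(textContent(tree)); // Hello, world!
```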
Types
This package is fully typed with TypeScript.
It exports no additional types.
Compatibility
Projects maintained by the unified collective are compatible with maintained
versions of Node.js.
When we cut a new major release, we drop support for unmaintained versions of
Node.
This means we try to keep the current release line, rehype@^13, compatible
with Node.js 16.
Security
As rehype works on HTML, and improper use of HTML can open you up to a
cross-site scripting (XSS) attack, use of rehype can also be unsafe.
Use rehype-sanitize to make the tree safe.
Use of rehype plugins could also open you up to other attacks.
Carefully assess each plugin and the risks involved in using them.
For info on how to submit a report, see our security policy.
Contribute
See contributing.md in rehypejs/.github for ways to get started.
See support.md for ways to get help.
This project has a code of conduct.
By interacting with this repository, organization, or community you agree to
abide by its terms.
Support this effort and give back by sponsoring on OpenCollective!
License
MIT © Titus Wormer