Rehype is a powerful HTML processor built on the unified framework. It allows you to parse, transform, and stringify HTML content. It is highly extensible and can be used for a variety of tasks such as sanitizing HTML, extracting content, and transforming HTML structures.
Parsing HTML
Rehype can parse HTML strings into a syntax tree, which can then be manipulated or analyzed.
// rehype is ESM only and has no default export, so use a named import.
import {rehype} from 'rehype';
const html = '<h1>Hello, world!</h1>';
const tree = rehype().parse(html);
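To make the idea of a syntax tree concrete, here is a minimal sketch that works on a hand-written tree rather than on real parser output. The tree is simplified (an actual parse result also carries positional info and, for whole documents, wrapping elements), and `collectText` is a hypothetical helper, not part of rehype.

```javascript
// Hand-written, simplified tree; real parser output contains more detail.
const tree = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'h1',
      properties: {},
      children: [{type: 'text', value: 'Hello, world!'}]
    }
  ]
};

// Hypothetical helper: concatenate the value of every text node, depth first.
function collectText(node) {
  if (node.type === 'text') return node.value;
  if (!Array.isArray(node.children)) return '';
  return node.children.map(collectText).join('');
}

console.log(collectText(tree)); // prints "Hello, world!"
```

Because the tree is plain data, analysis like this needs nothing beyond ordinary object traversal.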
Transforming HTML
Rehype allows you to transform HTML content by manipulating the syntax tree. You can use plugins or custom functions to modify the tree.
// rehype is ESM only and has no default export, so use a named import.
import {rehype} from 'rehype';
const html = '<h1>Hello, world!</h1>';
rehype()
  .use(() => (tree) => {
    // Transform the tree
  })
  .process(html)
  .then((file) => {
    console.log(String(file));
  });
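As an illustration of what such a custom function might do, the sketch below defines a plugin with the same `() => (tree) => {...}` shape and runs it directly on a hand-built tree, so it works without installing anything. The name `addHeadingClass` and its default class name are hypothetical.

```javascript
// Hypothetical plugin: returns a transformer that adds a class to every <h1>.
function addHeadingClass(className = 'title') {
  return (tree) => {
    const visit = (node) => {
      if (node.type === 'element' && node.tagName === 'h1') {
        node.properties = node.properties || {};
        // Class names are stored as an array of strings on the node.
        const existing = node.properties.className || [];
        node.properties.className = [...existing, className];
      }
      (node.children || []).forEach(visit);
    };
    visit(tree);
  };
}

// Hand-built tree standing in for parser output.
const tree = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'h1',
      properties: {},
      children: [{type: 'text', value: 'Hello, world!'}]
    }
  ]
};

addHeadingClass()(tree);
console.log(tree.children[0].properties.className); // prints [ 'title' ]
```

In a real pipeline, the same function would be passed to `use`, and the transformer would receive the parsed tree instead of a hand-built one.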
Stringifying HTML
Rehype can convert a syntax tree back into an HTML string, allowing you to output the transformed content.
// rehype is ESM only and has no default export, so use a named import.
import {rehype} from 'rehype';
const tree = { type: 'root', children: [{ type: 'element', tagName: 'h1', properties: {}, children: [{ type: 'text', value: 'Hello, world!' }] }] };
const html = rehype().stringify(tree);
Sanitizing HTML
Rehype can be used to sanitize HTML content, removing potentially dangerous elements and attributes.
// Both packages are ESM only; rehype has a named export, rehype-sanitize a default export.
import {rehype} from 'rehype';
import rehypeSanitize from 'rehype-sanitize';
const html = '<script>alert("Hello, world!")</script><p>Safe content</p>';
rehype()
  .use(rehypeSanitize)
  .process(html)
  .then((file) => {
    console.log(String(file));
  });
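To show what "removing potentially dangerous elements and attributes" means at the tree level, here is a deliberately naive allow-list sketch over a hand-built tree. It is not how rehype-sanitize is implemented and is not safe for production use; `naiveSanitize`, `sanitizeChildren`, and both allow-lists are hypothetical names chosen for illustration.

```javascript
// Naive illustration only. Use rehype-sanitize for real content.
const allowedTags = new Set(['h1', 'p', 'em', 'strong']);
// No URL-bearing properties are allowed: this sketch does not vet protocols.
const allowedProps = new Set(['title']);

function sanitizeChildren(node) {
  return (node.children || []).map(naiveSanitize).filter(Boolean);
}

// Returns a cleaned copy of the node, or null when the node is dropped.
function naiveSanitize(node) {
  if (node.type === 'root') {
    return {type: 'root', children: sanitizeChildren(node)};
  }
  if (node.type === 'text') {
    return {type: 'text', value: node.value};
  }
  if (node.type === 'element' && allowedTags.has(node.tagName)) {
    const properties = {};
    for (const [key, value] of Object.entries(node.properties || {})) {
      if (allowedProps.has(key)) properties[key] = value;
    }
    return {type: 'element', tagName: node.tagName, properties, children: sanitizeChildren(node)};
  }
  return null; // disallowed elements (with their content), comments, unknown nodes
}

const dirty = {
  type: 'root',
  children: [
    {type: 'element', tagName: 'script', properties: {}, children: [{type: 'text', value: 'alert("Hello, world!")'}]},
    {type: 'element', tagName: 'p', properties: {onClick: 'steal()', title: 'ok'}, children: [{type: 'text', value: 'Safe content'}]}
  ]
};

const clean = naiveSanitize(dirty);
console.log(JSON.stringify(clean));
```

The sketch builds a new tree instead of mutating the input, which keeps the untrusted tree available for logging or comparison.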
Htmlparser2 is a fast and forgiving HTML/XML parser. It is similar to rehype in that it can parse HTML into a tree structure, but it is more focused on speed and less on extensibility and transformations.
Jsdom is a JavaScript implementation of the DOM and HTML standards. It allows you to manipulate HTML content in a way similar to how you would in a browser. Unlike rehype, jsdom provides a full DOM API, making it more suitable for complex manipulations and simulations of browser environments.
Cheerio is a fast, flexible, and lean implementation of core jQuery designed specifically for the server. It allows you to parse and manipulate HTML using a jQuery-like syntax. While rehype focuses on transformations and extensibility, cheerio is more about providing a familiar API for DOM manipulation.
unified processor to add support for parsing from HTML and serializing to HTML.
This package is a unified processor with support for parsing HTML as input
and serializing HTML as output by using unified with rehype-parse and
rehype-stringify.
See the monorepo readme for info on what the rehype ecosystem is.
You can use this package when you want to use unified, have HTML as input, and
want HTML as output.
This package is a shortcut for
unified().use(rehypeParse).use(rehypeStringify).
When the input isn’t HTML (meaning you don’t need rehype-parse) or the
output is not HTML (you don’t need rehype-stringify), it’s recommended to
use unified directly.
When you’re in a browser, trust your content, don’t need positional info on
nodes or formatting options, and value a smaller bundle size, you can use
rehype-dom instead.
When you want to inspect and format HTML files in a project on the command
line, you can use rehype-cli.
This package is ESM only. In Node.js (version 16+), install with npm:
npm install rehype
In Deno with esm.sh:
import {rehype} from 'https://esm.sh/rehype@13'
In browsers with esm.sh:
<script type="module">
import {rehype} from 'https://esm.sh/rehype@13?bundle'
</script>
Say we have the following module example.js:
import {rehype} from 'rehype'
import rehypeFormat from 'rehype-format'
const file = await rehype().use(rehypeFormat).process(`<!doctype html>
<html lang=en>
<head>
<title>Hi!</title>
</head>
<body>
<h1>Hello!</h1>
</body></html>`)
console.error(String(file))
…running that with node example.js yields:
<!doctype html>
<html lang="en">
  <head>
    <title>Hi!</title>
  </head>
  <body>
    <h1>Hello!</h1>
  </body>
</html>
This package exports the identifier rehype.
There is no default export.
rehype()
Create a new unified processor that already uses rehype-parse and
rehype-stringify.
You can add more plugins with use.
See unified for more information.
Example: passing options to rehype-parse, rehype-stringify
When you use rehype-parse or rehype-stringify manually you can pass options
directly to them with use.
Because both plugins are already used in rehype, that’s not possible.
To define options for them, you can instead pass options to data:
import {rehype} from 'rehype'
import {reporter} from 'vfile-reporter'
const file = await rehype()
  .data('settings', {
    emitParseErrors: true,
    fragment: true,
    preferUnquoted: true
  })
  .process('<div title="a" title="b"></div>')
console.error(reporter(file))
console.log(String(file))
…yields:
1:21-1:21 warning Unexpected duplicate attribute duplicate-attribute hast-util-from-html
⚠ 1 warning
<div title=a></div>
HTML is parsed and serialized according to WHATWG HTML (the living standard), which is also followed by all browsers.
The syntax tree format used in rehype is hast.
This package is fully typed with TypeScript. It exports no additional types.
Projects maintained by the unified collective are compatible with maintained versions of Node.js.
When we cut a new major release, we drop support for unmaintained versions of
Node.js.
This means we try to keep the current release line, rehype@^13, compatible
with Node.js 16.
As rehype works on HTML, and improper use of HTML can open you up to a
cross-site scripting (XSS) attack, use of rehype can also be unsafe.
Use rehype-sanitize to make the tree safe.
Use of rehype plugins could also open you up to other attacks. Carefully assess each plugin and the risks involved in using them.
For info on how to submit a report, see our security policy.
See contributing.md in rehypejs/.github for ways to get started.
See support.md for ways to get help.
This project has a code of conduct. By interacting with this repository, organization, or community you agree to abide by its terms.
Support this effort and give back by sponsoring on OpenCollective!
Sponsors: Vercel, Motif, HashiCorp, GitBook, Gatsby, Netlify, Coinbase, ThemeIsle, Expo, Boost Note, Markdown Space, Holloway, You?
FAQs
HTML processor powered by plugins part of the unified collective
We found that rehype demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 3 open source maintainers collaborating on the project.