What is hast-util-raw?
The hast-util-raw package is a utility for working with HAST (Hypertext Abstract Syntax Tree) trees. It parses an existing tree again and turns any raw HTML embedded in it into regular HAST nodes, so the structure of the document can be manipulated and analyzed uniformly. This is useful when HTML content has to be preprocessed or cleaned programmatically, for example when raw HTML is mixed into a tree that was produced from Markdown.
What are hast-util-raw's main functionalities?
Parsing HTML to HAST
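This feature parses HTML that is embedded in a HAST tree as `raw` nodes. The `raw` function takes a HAST tree that may contain `raw` nodes and returns a new HAST tree in which that HTML has been parsed into regular HAST nodes. A plain string child is a text node, not a `raw` node, so it is kept as text rather than parsed as HTML. This is useful for integrating unescaped HTML strings into a HAST-based workflow.
const raw = require('hast-util-raw');
const h = require('hastscript');
// Embed the HTML as a `raw` node; a bare string child would become a text node.
const tree = h('div', [h('span', 'Hello'), {type: 'raw', value: '<strong>world!</strong>'}]);
// `result` is a new tree in which the raw node has been parsed into a `strong` element.
const result = raw(tree);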
Transforming HAST with embedded raw HTML
This demonstrates how `hast-util-raw` can transform a HAST tree that includes a 'raw' node containing HTML into a fully parsed HAST structure. This is particularly useful for scenarios where raw HTML is mixed with HAST content and a uniform HAST structure is needed for further processing.
const raw = require('hast-util-raw');
const u = require('unist-builder');
const tree = u('root', [u('element', {tagName: 'div'}, [u('text', 'Some text'), u('raw', '<span>More text</span>')])]);
const result = raw(tree);
Other packages similar to hast-util-raw
rehype-parse
`rehype-parse` also produces HAST from HTML, but it works on different input: it is a rehype plugin that parses an HTML string into a new tree, whereas `hast-util-raw` takes an existing HAST tree and parses it again. Choose `rehype-parse` when the input is an HTML document or fragment and the result feeds into other rehype plugins.
hast-util-raw
hast utility to parse the tree again, now supporting embedded `raw` nodes.

One of the reasons to do this is for “malformed” syntax trees: for example, say
there’s an `h1` element in a `p` element, this utility will make them siblings.

Another reason to do this is if raw HTML/XML is embedded in a syntax tree, which
can occur when coming from Markdown using mdast-util-to-hast.

If you’re working with remark and/or remark-rehype, use rehype-raw instead.
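To make the idea of an embedded `raw` node concrete, here is a minimal sketch of what such a tree can look like as plain objects. The tree is hand-written for illustration and `collectRaw` is a hypothetical helper, not part of hast-util-raw; it only lists the unparsed strings that the utility would parse.

```javascript
// Hand-written hast tree (an assumption for illustration, not library output).
var tree = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'p',
      properties: {},
      children: [
        {type: 'text', value: 'Hello, '},
        // A raw node carries unparsed markup as a plain string.
        {type: 'raw', value: '<em>world</em>'}
      ]
    },
    {type: 'raw', value: '<!-- a comment -->'}
  ]
}

// Hypothetical helper: collect the `value` of every raw node, depth-first.
function collectRaw(node, found) {
  found = found || []
  if (node.type === 'raw') found.push(node.value)
  var children = node.children || []
  for (var i = 0; i < children.length; i++) collectRaw(children[i], found)
  return found
}

console.log(collectRaw(tree))
```

Each string collected this way is markup that stays opaque to other hast utilities until the tree has been parsed again.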
Install
npm:
npm install hast-util-raw
Use
var h = require('hastscript')
var raw = require('hast-util-raw')
var tree = h('div', [h('h1', ['Foo ', h('h2', 'Bar'), ' Baz'])])
var clean = raw(tree)
console.log(clean)
Yields:
{ type: 'element',
  tagName: 'div',
  properties: {},
  children:
   [ { type: 'element',
       tagName: 'h1',
       properties: {},
       children: [Object] },
     { type: 'element',
       tagName: 'h2',
       properties: {},
       children: [Object] },
     { type: 'text', value: ' Baz' } ] }
API
raw(tree[, file])
Given a hast tree and an optional vfile (for positional info), return a new
parsed-again hast tree.
Security
Use of hast-util-raw can open you up to a cross-site scripting (XSS) attack as
`raw` nodes are unsafe.
The following example shows how a raw node is used to inject a script that runs
when loaded in a browser.
raw(u('root', [u('raw', '<script>alert(1)</script>')]))
Yields:
<script>alert(1)</script>
Do not use this utility in combination with user input, or use
hast-util-sanitize.
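As one conservative illustration of keeping user input away from this utility, the sketch below drops every `raw` node from an untrusted tree before any further processing. `stripRaw` is a hypothetical helper written for this example and is not part of hast-util-raw or hast-util-sanitize. It only removes `raw` nodes and does nothing about unsafe elements or attributes that are already parsed, so it does not replace proper sanitizing.

```javascript
// Hypothetical helper: return a copy of the tree with all raw nodes removed.
// The input tree is left untouched.
function stripRaw(node) {
  var copy = Object.assign({}, node)
  if (Array.isArray(node.children)) {
    copy.children = node.children
      .filter(function (child) {
        return child.type !== 'raw'
      })
      .map(stripRaw)
  }
  return copy
}

// Hand-written tree standing in for untrusted input (an assumption).
var untrusted = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'p',
      properties: {},
      children: [
        {type: 'text', value: 'Hi'},
        {type: 'raw', value: '<img src=x onerror=alert(1)>'}
      ]
    },
    {type: 'raw', value: '<script>alert(1)</script>'}
  ]
}

var stripped = stripRaw(untrusted)
console.log(JSON.stringify(stripped, null, 2))
```

Dropping the nodes loses their content entirely; when the embedded HTML has to be kept, parsing it and then sanitizing the resulting tree is the route the text above recommends.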
Related
Contribute
See contributing.md in syntax-tree/.github for ways to get started.
See support.md for ways to get help.

This project has a code of conduct.
By interacting with this repository, organization, or community you agree to
abide by its terms.
License
MIT © Titus Wormer