What is hast-util-from-parse5?
The `hast-util-from-parse5` package is a utility that converts a Parse5 AST (Abstract Syntax Tree) to a HAST (Hypertext Abstract Syntax Tree). This is particularly useful for working with HTML in a structured way, allowing for transformations, analysis, and manipulation of HTML content.
What are hast-util-from-parse5's main functionalities?
Convert Parse5 AST to HAST
This feature allows you to convert a Parse5 AST to a HAST. The code sample demonstrates parsing an HTML string into a Parse5 AST and then converting it to a HAST.
const parse5 = require('parse5');
const fromParse5 = require('hast-util-from-parse5');
// A complete document: parse5 returns a document node for it.
const html = '<!doctype html><html><head><title>Example</title></head><body><p>Hello, world!</p></body></html>';
const parse5Ast = parse5.parse(html);
// Convert the parse5 document into a hast root node.
const hast = fromParse5(parse5Ast);
console.log(JSON.stringify(hast, null, 2));
Convert Parse5 Fragment to HAST
This feature allows you to convert a Parse5 fragment to a HAST. The code sample demonstrates parsing an HTML fragment into a Parse5 fragment and then converting it to a HAST.
const parse5 = require('parse5');
const fromParse5 = require('hast-util-from-parse5');
// A fragment: no doctype and no implied html, head, or body elements.
const htmlFragment = '<p>Hello, world!</p>';
const parse5Fragment = parse5.parseFragment(htmlFragment);
// Convert the parse5 fragment into a hast root node.
const hastFragment = fromParse5(parse5Fragment);
console.log(JSON.stringify(hastFragment, null, 2));
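Both conversions above map parse5 nodes (default tree adapter) onto hast nodes. The following is only a rough sketch of that mapping, not the package's implementation: `toHastSketch` is a hypothetical name, the input is a hand-written stand-in for a parsed fragment, and attributes are copied verbatim, whereas the real package does more (for example, positional information).

```javascript
// Simplified sketch of the parse5 -> hast node mapping.
// `toHastSketch` is a hypothetical helper, not part of hast-util-from-parse5.
function toHastSketch(node) {
  switch (node.nodeName) {
    case '#document':
    case '#document-fragment':
      return {type: 'root', children: node.childNodes.map(toHastSketch)};
    case '#documentType':
      return {type: 'doctype', name: node.name};
    case '#text':
      return {type: 'text', value: node.value};
    case '#comment':
      return {type: 'comment', value: node.data};
    default: {
      // Assumption: attribute names are copied as-is into `properties`.
      const properties = {};
      for (const attr of node.attrs) properties[attr.name] = attr.value;
      return {
        type: 'element',
        tagName: node.tagName,
        properties,
        children: node.childNodes.map(toHastSketch)
      };
    }
  }
}

// Hand-written stand-in for the fragment parse5 would build from '<p id="x">Hello</p>'.
const sketchInput = {
  nodeName: '#document-fragment',
  childNodes: [
    {
      nodeName: 'p',
      tagName: 'p',
      attrs: [{name: 'id', value: 'x'}],
      childNodes: [{nodeName: '#text', value: 'Hello'}]
    }
  ]
};

const sketchTree = toHastSketch(sketchInput);
console.log(JSON.stringify(sketchTree, null, 2));
```

The sketch needs no dependencies, so it can be run on its own to see the shape of a hast tree before installing the package.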
Other packages similar to hast-util-from-parse5
rehype-parse
The `rehype-parse` package is a unified plugin that parses HTML into a HAST in one step, so you do not have to handle an intermediate Parse5 AST yourself. It is part of the unified collective and is often used in conjunction with other rehype plugins for processing HTML.
htmlparser2
The `htmlparser2` package is a fast and forgiving HTML/XML parser. It can be used to parse HTML into a DOM-like structure, which can then be manipulated or converted to other formats. While it does not directly convert to HAST, it provides a similar parsing capability.
parse5
The `parse5` package is a comprehensive HTML parsing library that produces a Parse5 AST. While it does not convert to HAST directly, it is often used in conjunction with `hast-util-from-parse5` to achieve this conversion.
hast-util-from-parse5
Transform Parse5’s AST to HAST.
Installation
npm:
npm install hast-util-from-parse5
Usage
Say we have the following file, example.html:
<!doctype html><title>Hello!</title><h1 id="world">World!<!--after-->
And our script, example.js, looks as follows:
var vfile = require('to-vfile')
var parse5 = require('parse5')
var inspect = require('unist-util-inspect')
var fromParse5 = require('hast-util-from-parse5')
var doc = vfile.readSync('example.html')
var ast = parse5.parse(String(doc), {sourceCodeLocationInfo: true})
var hast = fromParse5(ast, doc)
console.log(inspect(hast))
Now, running node example yields:
root[2] (1:1-2:1, 0-70) [data={"quirksMode":false}]
├─ doctype (1:1-1:16, 0-15) [name="html"]
└─ element[2] [tagName="html"]
   ├─ element[1] [tagName="head"]
   │  └─ element[1] (1:16-1:37, 15-36) [tagName="title"]
   │     └─ text: "Hello!" (1:23-1:29, 22-28)
   └─ element[1] [tagName="body"]
      └─ element[3] (1:37-2:1, 36-70) [tagName="h1"][properties={"id":"world"}]
         ├─ text: "World!" (1:52-1:58, 51-57)
         ├─ comment: "after" (1:58-1:70, 57-69)
         └─ text: "\n" (1:70-2:1, 69-70)
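In the tree above, the html, head, and body elements carry no position: they do not appear in example.html and were added by the parser. A small sketch of how such nodes could be detected follows. The tree is a trimmed, hand-written copy of the output above, and `withoutPosition` is a hypothetical helper, not part of the package.

```javascript
// Trimmed, hand-written copy of the tree printed above (offsets only).
const tree = {
  type: 'root',
  position: {start: {offset: 0}, end: {offset: 70}},
  children: [
    {type: 'doctype', name: 'html', position: {start: {offset: 0}, end: {offset: 15}}},
    {
      type: 'element',
      tagName: 'html',
      children: [
        {
          type: 'element',
          tagName: 'head',
          children: [
            {
              type: 'element',
              tagName: 'title',
              position: {start: {offset: 15}, end: {offset: 36}},
              children: []
            }
          ]
        },
        {
          type: 'element',
          tagName: 'body',
          children: [
            {
              type: 'element',
              tagName: 'h1',
              position: {start: {offset: 36}, end: {offset: 70}},
              children: []
            }
          ]
        }
      ]
    }
  ]
}

// Hypothetical helper: depth-first walk collecting elements that lack `position`.
function withoutPosition(node, found = []) {
  if (node.type === 'element' && !node.position) found.push(node.tagName)
  for (const child of node.children || []) withoutPosition(child, found)
  return found
}

const implied = withoutPosition(tree)
console.log(implied) // [ 'html', 'head', 'body' ]
```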
API
fromParse5(ast[, options])
Transform an ASTNode to a HAST Node.
options
If options is a VFile, it’s treated as {file: options}.
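The shorthand described above can be pictured with the sketch below. It is an illustration only: `normalizeOptions` is a hypothetical helper, and detecting a VFile by its `messages` array is an assumption, not necessarily how the package decides.

```javascript
// Hypothetical helper illustrating the `options` shorthand; not the package's code.
function normalizeOptions(options) {
  if (!options) return {file: undefined, verbose: false}
  // Assumption: an object carrying a `messages` array is treated as a VFile.
  const settings = Array.isArray(options.messages) ? {file: options} : options
  return {file: settings.file, verbose: Boolean(settings.verbose)}
}

// Plain object standing in for a real VFile.
const fakeFile = {path: 'example.html', messages: []}

const fromShorthand = normalizeOptions(fakeFile)
const fromObject = normalizeOptions({file: fakeFile, verbose: true})

console.log(fromShorthand.file === fakeFile) // true
console.log(fromObject.file === fakeFile) // true
console.log(fromObject.verbose) // true
```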
options.file
Virtual file, used to add positional information to HAST nodes.
If given, the file should have the original HTML source as its contents.
options.verbose
Whether to add positional information about starting tags, closing tags, and attributes to elements (boolean, default: false). Note: not used without file.
For the following HTML:
<img src="http://example.com/fav.ico" alt="foo" title="bar">
The verbose info would look as follows:
{
  type: 'element',
  tagName: 'img',
  properties: {
    src: 'http://example.com/fav.ico',
    alt: 'foo',
    title: 'bar'
  },
  children: [],
  data: {
    position: {
      opening: {
        start: {line: 1, column: 1, offset: 0},
        end: {line: 1, column: 61, offset: 60}
      },
      closing: null,
      properties: {
        src: {
          start: {line: 1, column: 6, offset: 5},
          end: {line: 1, column: 38, offset: 37}
        },
        alt: {
          start: {line: 1, column: 39, offset: 38},
          end: {line: 1, column: 48, offset: 47}
        },
        title: {
          start: {line: 1, column: 49, offset: 48},
          end: {line: 1, column: 60, offset: 59}
        }
      }
    }
  },
  position: {
    start: {line: 1, column: 1, offset: 0},
    end: {line: 1, column: 61, offset: 60}
  }
}
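One possible use of the verbose info is to recover the exact source text of an attribute. The sketch below assumes the offsets shown above; `attributeSource` is a hypothetical helper, and the node is a trimmed, hand-written copy of the verbose output rather than a value produced by the package.

```javascript
// The same HTML as above, plus a trimmed copy of its verbose node (offsets only).
const source = '<img src="http://example.com/fav.ico" alt="foo" title="bar">'
const img = {
  type: 'element',
  tagName: 'img',
  data: {
    position: {
      properties: {
        src: {start: {offset: 5}, end: {offset: 37}},
        alt: {start: {offset: 38}, end: {offset: 47}},
        title: {start: {offset: 48}, end: {offset: 59}}
      }
    }
  }
}

// Hypothetical helper: slice the original source using an attribute's offsets.
function attributeSource(html, node, name) {
  const info = node.data && node.data.position && node.data.position.properties
  const position = info && info[name]
  return position
    ? html.slice(position.start.offset, position.end.offset)
    : undefined
}

console.log(attributeSource(source, img, 'alt')) // alt="foo"
console.log(attributeSource(source, img, 'title')) // title="bar"
```

Because `end.offset` points just past the last character, it can be passed directly as the exclusive end index of `slice`.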
Contribute
See contributing.md in syntax-tree/hast for ways to get started.
This organisation has a Code of Conduct. By interacting with this
repository, organisation, or community you agree to abide by its terms.
License
MIT © Titus Wormer