What is posthtml-parser?
The posthtml-parser npm package is a tool used to parse HTML into an Abstract Syntax Tree (AST). This allows developers to manipulate HTML content programmatically, making it easier to perform tasks such as transforming HTML structures, extracting specific elements, and integrating with other tools in the PostHTML ecosystem.
What are posthtml-parser's main functionalities?
Parsing HTML to AST
This feature allows you to parse a string of HTML into an Abstract Syntax Tree (AST). The AST can then be manipulated programmatically.
const parse = require('posthtml-parser');
const html = '<div class="example">Hello World</div>';
const ast = parse(html);
console.log(ast);
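Once the tree exists, it can be handled as plain arrays and objects. The following is a minimal sketch, assuming the output has the shape shown in the package README (strings for text, objects with `tag`, `attrs`, and `content` for elements). The tree is written by hand so the snippet runs without the package installed, and `walk` is a hypothetical helper, not part of posthtml-parser.

```javascript
// Hand-written tree in the documented PostHTMLTree shape (an assumption
// about what the parser returns for the <div> example).
const ast = [
  {
    tag: 'div',
    attrs: { class: 'example' },
    content: ['Hello World']
  }
];

// Hypothetical helper: visit every node depth-first.
function walk(nodes, visit) {
  for (const node of nodes) {
    visit(node);
    // Only element nodes (objects) can carry a `content` array.
    if (node && typeof node === 'object' && Array.isArray(node.content)) {
      walk(node.content, visit);
    }
  }
}

const tags = [];
const texts = [];
walk(ast, node => {
  if (typeof node === 'string') {
    texts.push(node); // text node
  } else {
    tags.push(node.tag); // element node
  }
});

console.log(tags, texts);
```

Telling text nodes from element nodes with `typeof` is usually all a simple extraction task needs.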
Handling HTML fragments
This feature allows you to parse HTML fragments and control parsing options such as whether to convert tag names to lowercase.
const parse = require('posthtml-parser');
const fragment = '<span>Fragment</span>';
const ast = parse(fragment, { lowerCaseTags: false });
console.log(ast);
Integration with PostHTML plugins
This example shows a PostHTML plugin transforming HTML content. PostHTML parses its input with posthtml-parser by default, so the parser does not need to be required or called directly. The plugin receives the parsed tree and changes all <div> tags to <section> tags.
const posthtml = require('posthtml');
const html = '<div class="example">Hello World</div>';
posthtml()
  .use(tree => {
    // match() walks the tree and calls the callback for every <div> node
    tree.match({ tag: 'div' }, node => {
      node.tag = 'section';
      return node;
    });
  })
  .process(html)
  .then(result => console.log(result.html));
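To make the transformation itself concrete, here is a rough sketch of the same rename done by hand on a plain tree, without PostHTML. `renameTag` is a hypothetical helper written for illustration. It is not how `tree.match` is implemented, and the tree below is hand-written in the README's node shape.

```javascript
// Hypothetical stand-in for the plugin body: rename matching tags in place.
function renameTag(nodes, from, to) {
  for (const node of nodes) {
    // Strings are text nodes and have no tag to rename.
    if (typeof node !== 'object' || node === null) continue;
    if (node.tag === from) node.tag = to;
    if (Array.isArray(node.content)) renameTag(node.content, from, to);
  }
  return nodes;
}

// Hand-written tree with a nested <div> to show that the rename recurses.
const tree = [
  {
    tag: 'div',
    attrs: { class: 'example' },
    content: ['Hello ', { tag: 'div', content: ['World'] }]
  }
];

renameTag(tree, 'div', 'section');
console.log(JSON.stringify(tree, null, 2));
```

Attributes and text are untouched. Only the `tag` property changes, which is why this kind of edit is simpler on the tree than on the raw HTML string.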
Other packages similar to posthtml-parser
htmlparser2
htmlparser2 is a fast and forgiving HTML/XML parser, and posthtml-parser is itself built on top of it. htmlparser2 exposes a lower-level, event-based interface that supports streaming and handles large documents efficiently, whereas posthtml-parser returns a ready-made PostHTMLTree.
parse5
parse5 is a highly compliant HTML parser that closely follows the WHATWG HTML specification. It is similar to posthtml-parser in its ability to parse HTML into an AST, but it is known for its strict adherence to web standards and comprehensive support for HTML5 features.
cheerio
cheerio is a fast, flexible, and lean implementation of core jQuery designed specifically for the server. It parses HTML and XML into a DOM-like structure, allowing for jQuery-like manipulation of the document. While it offers similar parsing capabilities, it is more focused on providing a familiar API for DOM manipulation.
posthtml-parser
Parse HTML/XML to PostHTMLTree
Install
NPM install
$ npm install posthtml-parser
Usage
Input HTML
<a class="animals" href="#">
<span class="animals__cat" style="background: url(cat.png)">Cat</span>
</a>
var parser = require('posthtml-parser');
var fs = require('fs');
var html = fs.readFileSync('path/to/input.html').toString();
console.log(parser(html));
Result PostHTMLTree
[{
    tag: 'a',
    attrs: {
        class: 'animals',
        href: '#'
    },
    content: [
        '\n ',
        {
            tag: 'span',
            attrs: {
                class: 'animals__cat',
                style: 'background: url(cat.png)'
            },
            content: ['Cat']
        },
        '\n'
    ]
}]
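Because the tree is plain data, it can be turned back into markup with a short recursive function. The sketch below is a simplified, hypothetical serializer, not the posthtml-render implementation. It assumes every element has a closing tag and that attribute values need no escaping.

```javascript
// Simplified, hypothetical serializer for a PostHTMLTree-shaped array.
// Assumptions: no void elements, no attribute escaping, no directives.
function render(nodes) {
  return nodes
    .map(node => {
      if (typeof node === 'string') return node; // text node, emitted as-is
      const attrs = Object.entries(node.attrs || {})
        .map(([name, value]) => ` ${name}="${value}"`)
        .join('');
      const inner = render(node.content || []);
      return `<${node.tag}${attrs}>${inner}</${node.tag}>`;
    })
    .join('');
}

// The result tree from above, written out by hand.
const tree = [{
  tag: 'a',
  attrs: { class: 'animals', href: '#' },
  content: [
    '\n ',
    {
      tag: 'span',
      attrs: { class: 'animals__cat', style: 'background: url(cat.png)' },
      content: ['Cat']
    },
    '\n'
  ]
}];

const html = render(tree);
console.log(html);
```

The whitespace strings in `content` are what preserve the original line breaks, so dropping them during a transform changes the formatting of the output.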
License
MIT