Comparing version 0.6.1 to 0.8.1
{ | ||
"name": "parse5", | ||
"description": "Fast full-featured HTML parser for Node. Based on WHATWG HTML5 specification.", | ||
"version": "0.6.1", | ||
"author": "Ivan Nikulin (ifaaan@gmail.com, https://github.com/inikulin)", | ||
"keywords": [ | ||
"html", | ||
"parser", | ||
"html5", | ||
"WHATWG", | ||
"specification", | ||
"fast" | ||
], | ||
"repository": { | ||
"type": "git", | ||
"url": "git://github.com/inikulin/parse5.git" | ||
}, | ||
"main": "./lib/parser.js", | ||
"devDependencies": { | ||
"nodeunit": "0.8.0" | ||
}, | ||
"licenses": [ | ||
{ | ||
"type": "MIT", | ||
"url": "https://raw.github.com/inikulin/parse5/master/LICENSE" | ||
} | ||
] | ||
"name": "parse5", | ||
"description": "Fast full-featured HTML parser for Node. Based on WHATWG HTML5 specification.", | ||
"version": "0.8.1", | ||
"author": "Ivan Nikulin (ifaaan@gmail.com, https://github.com/inikulin)", | ||
"keywords": [ | ||
"html", | ||
"parser", | ||
"html5", | ||
"WHATWG", | ||
"specification", | ||
"fast", | ||
"html parser", | ||
"html5 parser", | ||
"htmlparser", | ||
"parse5", | ||
"serializer", | ||
"html serializer", | ||
"htmlserializer" | ||
], | ||
"repository": { | ||
"type": "git", | ||
"url": "git://github.com/inikulin/parse5.git" | ||
}, | ||
"main": "./index.js", | ||
"devDependencies": { | ||
"nodeunit": "0.8.0" | ||
}, | ||
"licenses": [ | ||
{ | ||
"type": "MIT", | ||
"url": "https://raw.github.com/inikulin/parse5/master/LICENSE" | ||
} | ||
] | ||
} |
130
README.md
@@ -1,9 +0,8 @@ | ||
parse5 | ||
====== | ||
Fast full-featured HTML parser for Node. Based on WHATWG HTML5 specification. | ||
To build [TestCafé](http://testcafe.devexpress.com/) we needed fast and ready for production HTML parser for node.js, which will parse HTML as a modern browser's parser. | ||
![logo](https://raw.github.com/inikulin/parse5/master/logo.png) | ||
Fast full-featured HTML parsing/serialization toolset for Node. Based on WHATWG HTML5 specification. | ||
To build [TestCafé](http://testcafe.devexpress.com/) we needed a fast, production-ready HTML parser that parses HTML the way a modern browser's parser does. | ||
Existing solutions were either too slow or their output was too inaccurate. So, this is how parse5 was born. | ||
Install | ||
------- | ||
##Install | ||
``` | ||
@@ -13,4 +12,4 @@ $ npm install parse5 | ||
Usage and API | ||
------------- | ||
##Simple usage | ||
```js | ||
@@ -30,4 +29,3 @@ var Parser = require('parse5').Parser; | ||
Is it fast? | ||
----------- | ||
##Is it fast? | ||
Check out [this benchmark](https://github.com/inikulin/node-html-parser-bench). | ||
@@ -46,4 +44,95 @@ | ||
Testing | ||
------- | ||
##API reference | ||
###Enum: TreeAdapters | ||
Provides built-in tree adapters which can be passed as an optional argument to the `Parser` and `TreeSerializer` constructors. | ||
####• TreeAdapters.default | ||
Default tree format for parse5. | ||
####• TreeAdapters.htmlparser2 | ||
Quite popular [htmlparser2](https://github.com/fb55/htmlparser2) tree format (e.g. used in [cheerio](https://github.com/MatthewMueller/cheerio) and [jsdom](https://github.com/tmpvar/jsdom)). | ||
--------------------------------------- | ||
###Class: Parser | ||
Provides HTML parsing functionality. | ||
####• Parser.ctor([treeAdapter]) | ||
Creates a new reusable instance of the `Parser`. The optional `treeAdapter` argument specifies the resulting tree format. If `treeAdapter` is not specified, the `default` tree adapter is used. | ||
*Example:* | ||
```js | ||
var parse5 = require('parse5'); | ||
//Instantiate new parser with default tree adapter | ||
var parser1 = new parse5.Parser(); | ||
//Instantiate new parser with htmlparser2 tree adapter | ||
var parser2 = new parse5.Parser(parse5.TreeAdapters.htmlparser2); | ||
``` | ||
####• Parser.parse(html) | ||
Parses the specified `html` string. Returns the `document` node. | ||
*Example:* | ||
```js | ||
var document = parser.parse('<!DOCTYPE html><html><head></head><body>Hi there!</body></html>'); | ||
``` | ||
####• Parser.parseFragment(htmlFragment, [contextElement]) | ||
Parses the given `htmlFragment`. Returns a `documentFragment` node. The optional `contextElement` argument specifies the element in whose context the fragment is parsed. If `contextElement` is not specified, a `<div>` element is used. | ||
*Example:* | ||
```js | ||
var documentFragment = parser.parseFragment('<table></table>'); | ||
//Parse html fragment in context of the parsed <table> element | ||
var trFragment = parser.parseFragment('<tr><td>Shake it, baby</td></tr>', documentFragment.childNodes[0]); | ||
``` | ||
--------------------------------------- | ||
###Class: TreeSerializer | ||
Provides tree-to-HTML serialization functionality. | ||
####• TreeSerializer.ctor([treeAdapter]) | ||
Creates a new reusable instance of the `TreeSerializer`. The optional `treeAdapter` argument specifies the input tree format. If `treeAdapter` is not specified, the `default` tree adapter is used. | ||
*Example:* | ||
```js | ||
var parse5 = require('parse5'); | ||
//Instantiate new serializer with default tree adapter | ||
var serializer1 = new parse5.TreeSerializer(); | ||
//Instantiate new serializer with htmlparser2 tree adapter | ||
var serializer2 = new parse5.TreeSerializer(parse5.TreeAdapters.htmlparser2); | ||
``` | ||
####• TreeSerializer.serialize(node) | ||
Serializes the given `node`. Returns an HTML string. | ||
*Example:* | ||
```js | ||
var document = parser.parse('<!DOCTYPE html><html><head></head><body>Hi there!</body></html>'); | ||
//Serialize document | ||
var html = serializer.serialize(document); | ||
//Serialize <body> element content | ||
var bodyInnerHtml = serializer.serialize(document.childNodes[1].childNodes[1]); //childNodes[0] is the doctype node | ||
``` | ||
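Index-based access such as `childNodes[1]` depends on whether a doctype node precedes `<html>`. As a rough sketch, an element can instead be looked up by tag name. This assumes element nodes in the default tree format expose `tagName` and `childNodes`; the `findElement` helper and the hand-built `sampleDocument` are hypothetical and only stand in for a parsed tree so the snippet runs without parse5.

```js
// Hypothetical helper: depth-first search for the first element with the given tag name.
// Assumes element nodes carry `tagName` and `childNodes` (an array); this shape is an
// assumption about the default tree format, not something the README states.
function findElement(node, tagName) {
    if (node.tagName === tagName)
        return node;

    var children = node.childNodes || [];

    for (var i = 0; i < children.length; i++) {
        var found = findElement(children[i], tagName);

        if (found)
            return found;
    }

    return null;
}

// Hand-built stand-in for a parsed document: doctype node first, then <html>.
var sampleDocument = {
    nodeName: '#document',
    childNodes: [
        { nodeName: '#documentType', name: 'html' },
        {
            nodeName: 'html',
            tagName: 'html',
            childNodes: [
                { nodeName: 'head', tagName: 'head', childNodes: [] },
                { nodeName: 'body', tagName: 'body', childNodes: [] }
            ]
        }
    ]
};

var body = findElement(sampleDocument, 'body');
var missingElement = findElement(sampleDocument, 'table');
```

With a real tree, the node returned by such a lookup could then be passed to `serializer.serialize()`.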
--------------------------------------- | ||
##Testing | ||
Test data is adopted from the [html5lib project](https://github.com/html5lib). The parser is covered by more than 8000 test cases. | ||
@@ -55,4 +144,4 @@ To run tests: | ||
Custom tree adapter | ||
------------------- | ||
##Custom tree adapter | ||
You can create a custom tree adapter so parse5 can work with your own DOM-tree implementation. | ||
@@ -72,11 +161,14 @@ Just pass your adapter implementation to the parser's constructor as an argument: | ||
Sample implementation can be found [here](https://github.com/inikulin/parse5/blob/master/lib/default_tree_adapter.js). | ||
Sample implementation can be found [here](https://github.com/inikulin/parse5/blob/master/lib/tree_adapters/default.js). | ||
The custom tree adapter should implement all methods exposed via `exports` in the sample implementation. | ||
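One way to check that requirement before handing an adapter to the parser is to compare its methods against the sample implementation. The sketch below is only an illustration: `findMissingMethods` is a hypothetical helper, and the method names on the stand-in objects are placeholders, not the actual list exported by the sample adapter.

```js
// Hypothetical helper: returns the names of functions that `reference` exports
// but `candidate` does not implement.
function findMissingMethods(reference, candidate) {
    return Object.keys(reference).filter(function (name) {
        return typeof reference[name] === 'function' &&
               typeof candidate[name] !== 'function';
    });
}

// Stand-ins for illustration only. In practice `reference` would be the sample
// adapter module; the method names below are assumed placeholders.
var referenceAdapter = {
    createDocument: function () {},
    createElement: function () {},
    appendChild: function () {}
};

var myAdapter = {
    createDocument: function () { return { children: [] }; },
    appendChild: function (parent, node) { parent.children.push(node); }
};

var missing = findMissingMethods(referenceAdapter, myAdapter);

console.log(missing); // [ 'createElement' ]
```

An empty result would suggest the adapter covers the same method names as the reference, though it says nothing about whether each method behaves correctly.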
Questions or suggestions? | ||
------------------------- | ||
##Questions or suggestions? | ||
If you have any questions, please feel free to create an issue [here on github](https://github.com/inikulin/parse5/issues). | ||
Author | ||
------ | ||
##Author | ||
[Ivan Nikulin](https://github.com/inikulin) (ifaaan@gmail.com) | ||
[![Bitdeli Badge](https://d2weczhvl823v0.cloudfront.net/inikulin/parse5/trend.png)](https://bitdeli.com/free "Bitdeli Badge") | ||
License Policy Violation
License: This package is not allowed per your license policy. Review the package's license to ensure compliance.
Found 1 instance in 1 package
Major refactor
Supply chain risk: Package has recently undergone a major refactor. It may be unstable or indicate significant internal changes. Use caution when updating to versions that include significant changes.
Found 1 instance in 1 package
License Policy Violation
License: This package is not allowed per your license policy. Review the package's license to ensure compliance.
Found 1 instance in 1 package
Filesystem access
Supply chain risk: Accesses the file system, and could potentially read sensitive data.
Found 1 instance in 1 package