html-dom-parser
Comparing version 0.2.2 to 0.2.3
@@ -1,5 +0,40 @@ | ||
# Change Log
# Changelog
All notable changes to this project will be documented in this file. See [standard-version](https://github.com/conventional-changelog/standard-version) for commit guidelines.
### [0.2.3](https://github.com/remarkablemark/html-dom-parser/compare/v0.2.2...v0.2.3) (2019-11-04)
### Bug Fixes
* **lib:** improve head and body regex in `domparser.js` ([457bb58](https://github.com/remarkablemark/html-dom-parser/commit/457bb58)), closes [#18](https://github.com/remarkablemark/html-dom-parser/issues/18)
### Build System
* **package:** save commitlint, husky, and lint-staged to devDeps ([3b0ce91](https://github.com/remarkablemark/html-dom-parser/commit/3b0ce91))
* **package:** update `eslint` and install `prettier` and plugin ([b7a6b81](https://github.com/remarkablemark/html-dom-parser/commit/b7a6b81))
* **package:** update `webpack` and save `webpack-cli` ([908e56d](https://github.com/remarkablemark/html-dom-parser/commit/908e56d))
* **package:** update dependencies and devDependencies ([a9016be](https://github.com/remarkablemark/html-dom-parser/commit/a9016be))
### Tests
* **server:** remove skipped test ([a4c1057](https://github.com/remarkablemark/html-dom-parser/commit/a4c1057))
* refactor tests to ES6 ([d5255a5](https://github.com/remarkablemark/html-dom-parser/commit/d5255a5))
* **cases:** add empty string test case to `html.js` ([25d7e8a](https://github.com/remarkablemark/html-dom-parser/commit/25d7e8a))
* **cases:** add more special test cases to `html.js` ([6fdf2ea](https://github.com/remarkablemark/html-dom-parser/commit/6fdf2ea))
* **cases:** refactor test cases and move html data to its own file ([e4fcb09](https://github.com/remarkablemark/html-dom-parser/commit/e4fcb09))
* **cases:** remove unnecessary try/catch wrapper to fix lint error ([ca8175e](https://github.com/remarkablemark/html-dom-parser/commit/ca8175e))
* **cases:** skip html test cases that PhantomJS does not support ([d095d29](https://github.com/remarkablemark/html-dom-parser/commit/d095d29))
* **cases:** update `complex.html` ([1418775](https://github.com/remarkablemark/html-dom-parser/commit/1418775))
* **client:** add tests for client parser that will be run by karma ([a0c58aa](https://github.com/remarkablemark/html-dom-parser/commit/a0c58aa))
* **helpers:** create `index.js` which exports helpers ([a9255d5](https://github.com/remarkablemark/html-dom-parser/commit/a9255d5))
* **helpers:** move helper that tests for errors to separate file ([f2e6312](https://github.com/remarkablemark/html-dom-parser/commit/f2e6312))
* **helpers:** refactor and move `runTests` to its own file ([8e30784](https://github.com/remarkablemark/html-dom-parser/commit/8e30784))
* **server:** add tests that spy and mock htmlparser2 and domhandler ([61075a1](https://github.com/remarkablemark/html-dom-parser/commit/61075a1))
* **server:** move `html-to-dom-server.js` to `server` directory ([3684dac](https://github.com/remarkablemark/html-dom-parser/commit/3684dac))
## [0.2.2](https://github.com/remarkablemark/html-dom-parser/compare/v0.2.1...v0.2.2) (2019-06-07)
@@ -6,0 +41,0 @@ |
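The 0.2.3 bug fix in the changelog replaces the closing-tag checks in `domparser.js` (`/<\/head>/i`, `/<\/body>/i`) with opening-tag checks (`/<head.*>/i`, `/<body.*>/i`), as the bundle diff for this release shows. The following is a minimal sketch of how the two pairs behave on a couple of inputs. The sample strings and the `detect` helper are illustrative assumptions and are not taken from issue #18.

```javascript
// Regexes copied from the 0.2.2 and 0.2.3 bundles in this diff.
const OLD_HEAD_REGEX = /<\/head>/i;
const OLD_BODY_REGEX = /<\/body>/i;
const NEW_HEAD_TAG_REGEX = /<head.*>/i;
const NEW_BODY_TAG_REGEX = /<body.*>/i;

// Hypothetical helper (not part of the library): reports which elements
// each regex pair considers present in the input string.
function detect(html) {
  return {
    oldHead: OLD_HEAD_REGEX.test(html),
    oldBody: OLD_BODY_REGEX.test(html),
    newHead: NEW_HEAD_TAG_REGEX.test(html),
    newBody: NEW_BODY_TAG_REGEX.test(html)
  };
}

// Unclosed <body>: only the opening-tag regex detects it.
const unclosed = detect('<html><body>text');

// Closing tag with no opening tag: only the old regex matches.
const strayClose = detect('<html>text</body></html>');

console.log(unclosed);
console.log(strayClose);
```

Because `.*` is permissive, the new head regex would also match a tag such as `<header>`; whether that matters depends on the input the parser receives.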
@@ -10,3 +10,3 @@ (function webpackUniversalModuleDefinition(root, factory) { | ||
root["HTMLDOMParser"] = factory(); | ||
})(this, function() { | ||
})(window, function() { | ||
return /******/ (function(modules) { // webpackBootstrap | ||
@@ -47,16 +47,33 @@ /******/ // The module cache | ||
/******/ | ||
/******/ // identity function for calling harmony imports with the correct context | ||
/******/ __webpack_require__.i = function(value) { return value; }; | ||
/******/ | ||
/******/ // define getter function for harmony exports | ||
/******/ __webpack_require__.d = function(exports, name, getter) { | ||
/******/ if(!__webpack_require__.o(exports, name)) { | ||
/******/ Object.defineProperty(exports, name, { | ||
/******/ configurable: false, | ||
/******/ enumerable: true, | ||
/******/ get: getter | ||
/******/ }); | ||
/******/ Object.defineProperty(exports, name, { enumerable: true, get: getter }); | ||
/******/ } | ||
/******/ }; | ||
/******/ | ||
/******/ // define __esModule on exports | ||
/******/ __webpack_require__.r = function(exports) { | ||
/******/ if(typeof Symbol !== 'undefined' && Symbol.toStringTag) { | ||
/******/ Object.defineProperty(exports, Symbol.toStringTag, { value: 'Module' }); | ||
/******/ } | ||
/******/ Object.defineProperty(exports, '__esModule', { value: true }); | ||
/******/ }; | ||
/******/ | ||
/******/ // create a fake namespace object | ||
/******/ // mode & 1: value is a module id, require it | ||
/******/ // mode & 2: merge all properties of value into the ns | ||
/******/ // mode & 4: return value when already ns object | ||
/******/ // mode & 8|1: behave like require | ||
/******/ __webpack_require__.t = function(value, mode) { | ||
/******/ if(mode & 1) value = __webpack_require__(value); | ||
/******/ if(mode & 8) return value; | ||
/******/ if((mode & 4) && typeof value === 'object' && value && value.__esModule) return value; | ||
/******/ var ns = Object.create(null); | ||
/******/ __webpack_require__.r(ns); | ||
/******/ Object.defineProperty(ns, 'default', { enumerable: true, value: value }); | ||
/******/ if(mode & 2 && typeof value != 'string') for(var key in value) __webpack_require__.d(ns, key, function(key) { return value[key]; }.bind(null, key)); | ||
/******/ return ns; | ||
/******/ }; | ||
/******/ | ||
/******/ // getDefaultExport function for compatibility with non-harmony modules | ||
@@ -77,476 +94,54 @@ /******/ __webpack_require__.n = function(module) { | ||
/******/ | ||
/******/ | ||
/******/ // Load entry module and return exports | ||
/******/ return __webpack_require__(__webpack_require__.s = 3); | ||
/******/ return __webpack_require__(__webpack_require__.s = "./lib/html-to-dom-client.js"); | ||
/******/ }) | ||
/************************************************************************/ | ||
/******/ ([ | ||
/* 0 */ | ||
/***/ (function(module, exports, __webpack_require__) { | ||
/******/ ({ | ||
"use strict"; | ||
/***/ "./lib/constants.js": | ||
/*!**************************!*\ | ||
!*** ./lib/constants.js ***! | ||
\**************************/ | ||
/*! no static exports found */ | ||
/***/ (function(module, exports) { | ||
eval("/**\n * SVG elements are case-sensitive.\n *\n * @see {@link https://developer.mozilla.org/docs/Web/SVG/Element#SVG_elements_A_to_Z}\n */\nvar CASE_SENSITIVE_TAG_NAMES = [\n 'animateMotion',\n 'animateTransform',\n 'clipPath',\n 'feBlend',\n 'feColorMatrix',\n 'feComponentTransfer',\n 'feComposite',\n 'feConvolveMatrix',\n 'feDiffuseLighting',\n 'feDisplacementMap',\n 'feDropShadow',\n 'feFlood',\n 'feFuncA',\n 'feFuncB',\n 'feFuncG',\n 'feFuncR',\n 'feGaussainBlur',\n 'feImage',\n 'feMerge',\n 'feMergeNode',\n 'feMorphology',\n 'feOffset',\n 'fePointLight',\n 'feSpecularLighting',\n 'feSpotLight',\n 'feTile',\n 'feTurbulence',\n 'foreignObject',\n 'linearGradient',\n 'radialGradient',\n 'textPath'\n];\n\nmodule.exports = {\n CASE_SENSITIVE_TAG_NAMES: CASE_SENSITIVE_TAG_NAMES\n};\n\n\n//# sourceURL=webpack://HTMLDOMParser/./lib/constants.js?"); | ||
var CASE_SENSITIVE_TAG_NAMES = __webpack_require__(2).CASE_SENSITIVE_TAG_NAMES; | ||
/***/ }), | ||
var caseSensitiveTagNamesMap = {}; | ||
var tagName; | ||
for (var i = 0, len = CASE_SENSITIVE_TAG_NAMES.length; i < len; i++) { | ||
tagName = CASE_SENSITIVE_TAG_NAMES[i]; | ||
caseSensitiveTagNamesMap[tagName.toLowerCase()] = tagName; | ||
} | ||
/***/ "./lib/domparser.js": | ||
/*!**************************!*\ | ||
!*** ./lib/domparser.js ***! | ||
\**************************/ | ||
/*! no static exports found */ | ||
/***/ (function(module, exports, __webpack_require__) { | ||
/** | ||
* Gets case-sensitive tag name. | ||
* | ||
* @param {String} tagName - The lowercase tag name. | ||
* @return {String|undefined} | ||
*/ | ||
function getCaseSensitiveTagName(tagName) { | ||
return caseSensitiveTagNamesMap[tagName]; | ||
} | ||
eval("var utilities = __webpack_require__(/*! ./utilities */ \"./lib/utilities.js\");\n\n// constants\nvar HTML = 'html';\nvar HEAD = 'head';\nvar BODY = 'body';\nvar FIRST_TAG_REGEX = /<([a-zA-Z]+[0-9]?)/; // e.g., <h1>\nvar HEAD_TAG_REGEX = /<head.*>/i;\nvar BODY_TAG_REGEX = /<body.*>/i;\n// http://www.w3.org/TR/html/syntax.html#void-elements\nvar VOID_ELEMENTS_REGEX = /<(area|base|br|col|embed|hr|img|input|keygen|link|menuitem|meta|param|source|track|wbr)(.*?)\\/?>/gi;\n\n// detect IE browser\nvar isIE9 = utilities.isIE(9);\nvar isIE = isIE9 || utilities.isIE();\n\n/**\n * DOMParser (performance: slow).\n *\n * @see https://developer.mozilla.org/docs/Web/API/DOMParser#Parsing_an_SVG_or_HTML_document\n */\nvar parseFromString;\n\nif (typeof window.DOMParser === 'function') {\n var domParser = new window.DOMParser();\n\n // IE9 does not support 'text/html' MIME type\n // https://msdn.microsoft.com/en-us/library/ff975278(v=vs.85).aspx\n var mimeType = isIE9 ? 'text/xml' : 'text/html';\n\n /**\n * Creates an HTML document using `DOMParser.parseFromString`.\n *\n * @param {string} html - The HTML string.\n * @param {string} [tagName] - The element to render the HTML (with 'body' as fallback).\n * @return {HTMLDocument}\n */\n parseFromString = function domStringParser(html, tagName) {\n if (tagName) {\n html = '<' + tagName + '>' + html + '</' + tagName + '>';\n }\n\n // because IE9 only supports MIME type 'text/xml', void elements need to be self-closed\n if (isIE9) {\n html = html.replace(VOID_ELEMENTS_REGEX, '<$1$2$3/>');\n }\n\n return domParser.parseFromString(html, mimeType);\n };\n}\n\n/**\n * DOMImplementation (performance: fair).\n *\n * @see https://developer.mozilla.org/docs/Web/API/DOMImplementation/createHTMLDocument\n */\nvar parseFromDocument;\n\nif (typeof document.implementation === 'object') {\n // title parameter is required in IE\n // https://msdn.microsoft.com/en-us/library/ff975457(v=vs.85).aspx\n var doc = 
document.implementation.createHTMLDocument(\n isIE ? 'HTML_DOM_PARSER_TITLE' : undefined\n );\n\n /**\n * Use HTML document created by `document.implementation.createHTMLDocument`.\n *\n * @param {string} html - The HTML string.\n * @param {string} [tagName] - The element to render the HTML (with 'body' as fallback).\n * @return {HTMLDocument}\n */\n parseFromDocument = function createHTMLDocument(html, tagName) {\n if (tagName) {\n doc.documentElement.getElementsByTagName(tagName)[0].innerHTML = html;\n return doc;\n }\n\n try {\n doc.documentElement.innerHTML = html;\n return doc;\n // fallback when certain elements in `documentElement` are read-only (IE9)\n } catch (err) {\n if (parseFromString) {\n return parseFromString(html);\n }\n }\n };\n}\n\n/**\n * Template (performance: fast).\n *\n * @see https://developer.mozilla.org/docs/Web/HTML/Element/template\n */\nvar parseFromTemplate;\nvar template = document.createElement('template');\n\nif (template.content) {\n /**\n * Uses a template element (content fragment) to parse HTML.\n *\n * @param {string} html - The HTML string.\n * @return {NodeList}\n */\n parseFromTemplate = function templateParser(html) {\n template.innerHTML = html;\n return template.content.childNodes;\n };\n}\n\n// fallback document parser\nvar parseWithFallback = parseFromDocument || parseFromString;\n\n/**\n * Parses HTML string to DOM nodes.\n *\n * @param {string} html - The HTML string.\n * @return {NodeList|Array}\n */\nfunction domparser(html) {\n var firstTagName;\n var match = html.match(FIRST_TAG_REGEX);\n\n if (match && match[1]) {\n firstTagName = match[1].toLowerCase();\n }\n\n var doc;\n var element;\n var elements;\n\n switch (firstTagName) {\n case HTML:\n if (parseFromString) {\n doc = parseFromString(html);\n\n // the created document may come with filler head/body elements,\n // so make sure to remove them if they don't actually exist\n if (!HEAD_TAG_REGEX.test(html)) {\n element = doc.getElementsByTagName(HEAD)[0];\n if 
(element) {\n element.parentNode.removeChild(element);\n }\n }\n\n if (!BODY_TAG_REGEX.test(html)) {\n element = doc.getElementsByTagName(BODY)[0];\n if (element) {\n element.parentNode.removeChild(element);\n }\n }\n\n return doc.getElementsByTagName(HTML);\n }\n break;\n\n case HEAD:\n case BODY:\n if (parseWithFallback) {\n elements = parseWithFallback(html).getElementsByTagName(firstTagName);\n\n // account for possibility of sibling\n if (BODY_TAG_REGEX.test(html) && HEAD_TAG_REGEX.test(html)) {\n return elements[0].parentNode.childNodes;\n }\n\n return elements;\n }\n break;\n\n // low-level tag or text\n default:\n if (parseFromTemplate) {\n return parseFromTemplate(html);\n }\n\n if (parseWithFallback) {\n return parseWithFallback(html, BODY).getElementsByTagName(BODY)[0]\n .childNodes;\n }\n\n break;\n }\n\n return [];\n}\n\nmodule.exports = domparser;\n\n\n//# sourceURL=webpack://HTMLDOMParser/./lib/domparser.js?"); | ||
/** | ||
* Formats DOM attributes to a hash map. | ||
* | ||
* @param {NamedNodeMap} attributes - The list of attributes. | ||
* @return {Object} - A map of attribute name to value. | ||
*/ | ||
function formatAttributes(attributes) { | ||
var result = {}; | ||
var attribute; | ||
// `NamedNodeMap` is array-like | ||
for (var i = 0, len = attributes.length; i < len; i++) { | ||
attribute = attributes[i]; | ||
result[attribute.name] = attribute.value; | ||
} | ||
return result; | ||
} | ||
/***/ }), | ||
/** | ||
* Corrects the tag name if it is case-sensitive (SVG). | ||
* Otherwise, returns the lowercase tag name (HTML). | ||
* | ||
* @param {String} tagName - The lowercase tag name. | ||
* @return {String} - The formatted tag name. | ||
*/ | ||
function formatTagName(tagName) { | ||
tagName = tagName.toLowerCase(); | ||
var caseSensitiveTagName = getCaseSensitiveTagName(tagName); | ||
if (caseSensitiveTagName) { | ||
return caseSensitiveTagName; | ||
} | ||
return tagName; | ||
} | ||
/** | ||
* Formats the browser DOM nodes to mimic the output of `htmlparser2.parseDOM()`. | ||
* | ||
* @param {NodeList} nodes - The DOM nodes. | ||
* @param {Object} [parentObj] - The formatted parent node. | ||
* @param {String} [directive] - The directive. | ||
* @return {Object[]} - The formatted DOM object. | ||
*/ | ||
function formatDOM(nodes, parentObj, directive) { | ||
parentObj = parentObj || null; | ||
var result = []; | ||
var node; | ||
var prevNode; | ||
var nodeObj; | ||
// `NodeList` is array-like | ||
for (var i = 0, len = nodes.length; i < len; i++) { | ||
node = nodes[i]; | ||
// reset | ||
nodeObj = { | ||
next: null, | ||
prev: result[i - 1] || null, | ||
parent: parentObj | ||
}; | ||
// set the next node for the previous node (if applicable) | ||
prevNode = result[i - 1]; | ||
if (prevNode) { | ||
prevNode.next = nodeObj; | ||
} | ||
// set the node name if it's not "#text" or "#comment" | ||
// e.g., "div" | ||
if (node.nodeName[0] !== '#') { | ||
nodeObj.name = formatTagName(node.nodeName); | ||
// also, nodes of type "tag" have "attribs" | ||
nodeObj.attribs = {}; // default | ||
if (node.attributes && node.attributes.length) { | ||
nodeObj.attribs = formatAttributes(node.attributes); | ||
} | ||
} | ||
// set the node type | ||
// e.g., "tag" | ||
switch (node.nodeType) { | ||
// 1 = element | ||
case 1: | ||
if (nodeObj.name === 'script' || nodeObj.name === 'style') { | ||
nodeObj.type = nodeObj.name; | ||
} else { | ||
nodeObj.type = 'tag'; | ||
} | ||
// recursively format the children | ||
nodeObj.children = formatDOM(node.childNodes, nodeObj); | ||
break; | ||
// 2 = attribute | ||
// 3 = text | ||
case 3: | ||
nodeObj.type = 'text'; | ||
nodeObj.data = node.nodeValue; | ||
break; | ||
// 8 = comment | ||
case 8: | ||
nodeObj.type = 'comment'; | ||
nodeObj.data = node.nodeValue; | ||
break; | ||
default: | ||
break; | ||
} | ||
result.push(nodeObj); | ||
} | ||
if (directive) { | ||
result.unshift({ | ||
name: directive.substring(0, directive.indexOf(' ')).toLowerCase(), | ||
data: directive, | ||
type: 'directive', | ||
next: result[0] ? result[0] : null, | ||
prev: null, | ||
parent: parentObj | ||
}); | ||
if (result[1]) { | ||
result[1].prev = result[0]; | ||
} | ||
} | ||
return result; | ||
} | ||
/** | ||
* Detects IE with or without version. | ||
* | ||
* @param {Number} [version] - The IE version to detect. | ||
* @return {Boolean} - Whether IE or the version has been detected. | ||
*/ | ||
function isIE(version) { | ||
if (version) { | ||
return document.documentMode === version; | ||
} | ||
return /(MSIE |Trident\/|Edge\/)/.test(navigator.userAgent); | ||
} | ||
/** | ||
* Export utilities. | ||
*/ | ||
module.exports = { | ||
formatAttributes: formatAttributes, | ||
formatDOM: formatDOM, | ||
isIE: isIE | ||
}; | ||
/***/ }), | ||
/* 1 */ | ||
/***/ "./lib/html-to-dom-client.js": | ||
/*!***********************************!*\ | ||
!*** ./lib/html-to-dom-client.js ***! | ||
\***********************************/ | ||
/*! no static exports found */ | ||
/***/ (function(module, exports, __webpack_require__) { | ||
"use strict"; | ||
eval("var domparser = __webpack_require__(/*! ./domparser */ \"./lib/domparser.js\");\nvar utilities = __webpack_require__(/*! ./utilities */ \"./lib/utilities.js\");\n\nvar formatDOM = utilities.formatDOM;\nvar isIE9 = utilities.isIE(9);\n\nvar DIRECTIVE_REGEX = /<(![a-zA-Z\\s]+)>/; // e.g., <!doctype html>\n\n/**\n * Parses HTML and reformats DOM nodes output.\n *\n * @param {String} html - The HTML string.\n * @return {Array} - The formatted DOM nodes.\n */\nfunction parseDOM(html) {\n if (typeof html !== 'string') {\n throw new TypeError('First argument must be a string');\n }\n\n if (!html) {\n return [];\n }\n\n // match directive\n var match = html.match(DIRECTIVE_REGEX);\n var directive;\n\n if (match && match[1]) {\n directive = match[1];\n\n // remove directive in IE9 because DOMParser uses\n // MIME type 'text/xml' instead of 'text/html'\n if (isIE9) {\n html = html.replace(match[0], '');\n }\n }\n\n return formatDOM(domparser(html), null, directive);\n}\n\nmodule.exports = parseDOM;\n\n\n//# sourceURL=webpack://HTMLDOMParser/./lib/html-to-dom-client.js?"); | ||
/** | ||
* Module dependencies. | ||
*/ | ||
var utilities = __webpack_require__(0); | ||
var detectIE = utilities.isIE; | ||
/** | ||
* Constants. | ||
*/ | ||
var HTML_TAG_NAME = 'html'; | ||
var BODY_TAG_NAME = 'body'; | ||
var HEAD_TAG_NAME = 'head'; | ||
var FIRST_TAG_REGEX = /<([a-zA-Z]+[0-9]?)/; // e.g., <h1> | ||
var HEAD_REGEX = /<\/head>/i; | ||
var BODY_REGEX = /<\/body>/i; | ||
// http://www.w3.org/TR/html/syntax.html#void-elements | ||
var VOID_ELEMENTS_REGEX = /<(area|base|br|col|embed|hr|img|input|keygen|link|menuitem|meta|param|source|track|wbr)(.*?)\/?>/gi; | ||
// browser support | ||
var isIE = detectIE(); | ||
var isIE9 = detectIE(9); | ||
/** | ||
* DOMParser (performance: slow). | ||
* | ||
* https://developer.mozilla.org/docs/Web/API/DOMParser#Parsing_an_SVG_or_HTML_document | ||
*/ | ||
var parseFromString; | ||
if (typeof window.DOMParser === 'function') { | ||
var domParser = new window.DOMParser(); | ||
// IE9 does not support 'text/html' MIME type | ||
// https://msdn.microsoft.com/en-us/library/ff975278(v=vs.85).aspx | ||
var MIME_TYPE = isIE9 ? 'text/xml' : 'text/html'; | ||
/** | ||
* Creates an HTML document using `DOMParser.parseFromString`. | ||
* | ||
* @param {String} html - The HTML string. | ||
* @param {String} [tagName] - The element to render the HTML (with 'body' as fallback). | ||
* @return {HTMLDocument} | ||
*/ | ||
parseFromString = function domStringParser(html, tagName) { | ||
if (tagName) { | ||
html = ['<', tagName, '>', html, '</', tagName, '>'].join(''); | ||
} | ||
// because IE9 only supports MIME type 'text/xml', void elements need to be self-closed | ||
if (isIE9) { | ||
html = html.replace(VOID_ELEMENTS_REGEX, '<$1$2$3/>'); | ||
} | ||
return domParser.parseFromString(html, MIME_TYPE); | ||
}; | ||
} | ||
/** | ||
* DOMImplementation (performance: fair). | ||
* | ||
* https://developer.mozilla.org/docs/Web/API/DOMImplementation/createHTMLDocument | ||
*/ | ||
var parseFromDocument; | ||
if (typeof document.implementation === 'object') { | ||
// title parameter is required in IE | ||
// https://msdn.microsoft.com/en-us/library/ff975457(v=vs.85).aspx | ||
var doc = document.implementation.createHTMLDocument(isIE ? 'HTML_DOM_PARSER_TITLE' : undefined); | ||
/** | ||
* Use HTML document created by `document.implementation.createHTMLDocument`. | ||
* | ||
* @param {String} html - The HTML string. | ||
* @param {String} [tagName] - The element to render the HTML (with 'body' as fallback). | ||
* @return {HTMLDocument} | ||
*/ | ||
parseFromDocument = function createHTMLDocument(html, tagName) { | ||
if (tagName) { | ||
doc.documentElement.getElementsByTagName(tagName)[0].innerHTML = html; | ||
return doc; | ||
} | ||
try { | ||
doc.documentElement.innerHTML = html; | ||
return doc; | ||
// fallback when certain elements in `documentElement` are read-only (IE9) | ||
} catch (err) { | ||
if (parseFromString) return parseFromString(html); | ||
} | ||
}; | ||
} | ||
/** | ||
* Template (performance: fast). | ||
* | ||
* https://developer.mozilla.org/docs/Web/HTML/Element/template | ||
*/ | ||
var parseFromTemplate; | ||
var template = document.createElement('template'); | ||
if (template.content) { | ||
/** | ||
* Uses a template element (content fragment) to parse HTML. | ||
* | ||
* @param {String} html - The HTML string. | ||
* @return {NodeList} | ||
*/ | ||
parseFromTemplate = function templateParser(html) { | ||
template.innerHTML = html; | ||
return template.content.childNodes; | ||
}; | ||
} | ||
/** Fallback document parser. */ | ||
var parseWithFallback = parseFromDocument || parseFromString; | ||
/** | ||
* Parses HTML string to DOM nodes. | ||
* | ||
* @param {String} html - The HTML string. | ||
* @param {String} [tagName] - The tag name. | ||
* @return {NodeList|Array} | ||
*/ | ||
module.exports = function domparser(html) { | ||
// try to match first tag | ||
var tagName; | ||
var match = html.match(FIRST_TAG_REGEX); | ||
if (match && match[1]) { | ||
tagName = match[1].toLowerCase(); | ||
} | ||
var doc; | ||
var element; | ||
var elements; | ||
switch (tagName) { | ||
case HTML_TAG_NAME: | ||
if (parseFromString) { | ||
doc = parseFromString(html); | ||
// the created document may come with filler head/body elements, | ||
// so ake sure to remove them if they don't actually exist | ||
if (!HEAD_REGEX.test(html)) { | ||
element = doc.getElementsByTagName(HEAD_TAG_NAME)[0]; | ||
if (element) element.parentNode.removeChild(element); | ||
} | ||
if (!BODY_REGEX.test(html)) { | ||
element = doc.getElementsByTagName(BODY_TAG_NAME)[0]; | ||
if (element) element.parentNode.removeChild(element); | ||
} | ||
return doc.getElementsByTagName(HTML_TAG_NAME); | ||
} | ||
break; | ||
case HEAD_TAG_NAME: | ||
if (parseWithFallback) { | ||
elements = parseWithFallback(html).getElementsByTagName(HEAD_TAG_NAME); | ||
// account for possibility of sibling | ||
if (BODY_REGEX.test(html)) { | ||
return elements[0].parentNode.childNodes; | ||
} | ||
return elements; | ||
} | ||
break; | ||
case BODY_TAG_NAME: | ||
if (parseWithFallback) { | ||
elements = parseWithFallback(html).getElementsByTagName(BODY_TAG_NAME); | ||
// account for possibility of sibling (return both body and head) | ||
if (HEAD_REGEX.test(html)) { | ||
return elements[0].parentNode.childNodes; | ||
} | ||
return elements; | ||
} | ||
break; | ||
// low-level tag or text | ||
default: | ||
if (parseFromTemplate) return parseFromTemplate(html); | ||
if (parseWithFallback) { | ||
return parseWithFallback(html, BODY_TAG_NAME).getElementsByTagName(BODY_TAG_NAME)[0].childNodes; | ||
} | ||
break; | ||
} | ||
return []; | ||
}; | ||
/***/ }), | ||
/* 2 */ | ||
/***/ (function(module, exports, __webpack_require__) { | ||
"use strict"; | ||
/** | ||
* SVG elements, unlike HTML elements, are case-sensitive. | ||
* | ||
* @see {@link https://developer.mozilla.org/docs/Web/SVG/Element#SVG_elements_A_to_Z} | ||
*/ | ||
var CASE_SENSITIVE_TAG_NAMES = [ | ||
'animateMotion', | ||
'animateTransform', | ||
'clipPath', | ||
'feBlend', | ||
'feColorMatrix', | ||
'feComponentTransfer', | ||
'feComposite', | ||
'feConvolveMatrix', | ||
'feDiffuseLighting', | ||
'feDisplacementMap', | ||
'feDropShadow', | ||
'feFlood', | ||
'feFuncA', | ||
'feFuncB', | ||
'feFuncG', | ||
'feFuncR', | ||
'feGaussainBlur', | ||
'feImage', | ||
'feMerge', | ||
'feMergeNode', | ||
'feMorphology', | ||
'feOffset', | ||
'fePointLight', | ||
'feSpecularLighting', | ||
'feSpotLight', | ||
'feTile', | ||
'feTurbulence', | ||
'foreignObject', | ||
'linearGradient', | ||
'radialGradient', | ||
'textPath' | ||
]; | ||
module.exports = { | ||
CASE_SENSITIVE_TAG_NAMES: CASE_SENSITIVE_TAG_NAMES | ||
}; | ||
/***/ }), | ||
/* 3 */ | ||
/***/ "./lib/utilities.js": | ||
/*!**************************!*\ | ||
!*** ./lib/utilities.js ***! | ||
\**************************/ | ||
/*! no static exports found */ | ||
/***/ (function(module, exports, __webpack_require__) { | ||
"use strict"; | ||
eval("var CASE_SENSITIVE_TAG_NAMES = __webpack_require__(/*! ./constants */ \"./lib/constants.js\").CASE_SENSITIVE_TAG_NAMES;\n\nvar caseSensitiveTagNamesMap = {};\nvar tagName;\nfor (var i = 0, len = CASE_SENSITIVE_TAG_NAMES.length; i < len; i++) {\n tagName = CASE_SENSITIVE_TAG_NAMES[i];\n caseSensitiveTagNamesMap[tagName.toLowerCase()] = tagName;\n}\n\n/**\n * Gets case-sensitive tag name.\n *\n * @param {String} tagName - The lowercase tag name.\n * @return {String|undefined}\n */\nfunction getCaseSensitiveTagName(tagName) {\n return caseSensitiveTagNamesMap[tagName];\n}\n\n/**\n * Formats DOM attributes to a hash map.\n *\n * @param {NamedNodeMap} attributes - The list of attributes.\n * @return {Object} - A map of attribute name to value.\n */\nfunction formatAttributes(attributes) {\n var result = {};\n var attribute;\n // `NamedNodeMap` is array-like\n for (var i = 0, len = attributes.length; i < len; i++) {\n attribute = attributes[i];\n result[attribute.name] = attribute.value;\n }\n return result;\n}\n\n/**\n * Corrects the tag name if it is case-sensitive (SVG).\n * Otherwise, returns the lowercase tag name (HTML).\n *\n * @param {String} tagName - The lowercase tag name.\n * @return {String} - The formatted tag name.\n */\nfunction formatTagName(tagName) {\n tagName = tagName.toLowerCase();\n var caseSensitiveTagName = getCaseSensitiveTagName(tagName);\n if (caseSensitiveTagName) {\n return caseSensitiveTagName;\n }\n return tagName;\n}\n\n/**\n * Formats the browser DOM nodes to mimic the output of `htmlparser2.parseDOM()`.\n *\n * @param {NodeList} nodes - The DOM nodes.\n * @param {Object} [parentObj] - The formatted parent node.\n * @param {String} [directive] - The directive.\n * @return {Object[]} - The formatted DOM object.\n */\nfunction formatDOM(nodes, parentObj, directive) {\n parentObj = parentObj || null;\n\n var result = [];\n var node;\n var prevNode;\n var nodeObj;\n\n // `NodeList` is array-like\n for (var i = 0, len = nodes.length; i 
< len; i++) {\n node = nodes[i];\n // reset\n nodeObj = {\n next: null,\n prev: result[i - 1] || null,\n parent: parentObj\n };\n\n // set the next node for the previous node (if applicable)\n prevNode = result[i - 1];\n if (prevNode) {\n prevNode.next = nodeObj;\n }\n\n // set the node name if it's not \"#text\" or \"#comment\"\n // e.g., \"div\"\n if (node.nodeName[0] !== '#') {\n nodeObj.name = formatTagName(node.nodeName);\n // also, nodes of type \"tag\" have \"attribs\"\n nodeObj.attribs = {}; // default\n if (node.attributes && node.attributes.length) {\n nodeObj.attribs = formatAttributes(node.attributes);\n }\n }\n\n // set the node type\n // e.g., \"tag\"\n switch (node.nodeType) {\n // 1 = element\n case 1:\n if (nodeObj.name === 'script' || nodeObj.name === 'style') {\n nodeObj.type = nodeObj.name;\n } else {\n nodeObj.type = 'tag';\n }\n // recursively format the children\n nodeObj.children = formatDOM(node.childNodes, nodeObj);\n break;\n // 2 = attribute\n // 3 = text\n case 3:\n nodeObj.type = 'text';\n nodeObj.data = node.nodeValue;\n break;\n // 8 = comment\n case 8:\n nodeObj.type = 'comment';\n nodeObj.data = node.nodeValue;\n break;\n }\n\n result.push(nodeObj);\n }\n\n if (directive) {\n result.unshift({\n name: directive.substring(0, directive.indexOf(' ')).toLowerCase(),\n data: directive,\n type: 'directive',\n next: result[0] ? 
result[0] : null,\n prev: null,\n parent: parentObj\n });\n\n if (result[1]) {\n result[1].prev = result[0];\n }\n }\n\n return result;\n}\n\n/**\n * Detects IE with or without version.\n *\n * @param {Number} [version] - The IE version to detect.\n * @return {Boolean} - Whether IE or the version has been detected.\n */\nfunction isIE(version) {\n if (version) {\n return document.documentMode === version;\n }\n return /(MSIE |Trident\\/|Edge\\/)/.test(navigator.userAgent);\n}\n\nmodule.exports = {\n formatAttributes: formatAttributes,\n formatDOM: formatDOM,\n isIE: isIE\n};\n\n\n//# sourceURL=webpack://HTMLDOMParser/./lib/utilities.js?"); | ||
/***/ }) | ||
/** | ||
* Module dependencies. | ||
*/ | ||
var domparser = __webpack_require__(1); | ||
var utilities = __webpack_require__(0); | ||
var formatDOM = utilities.formatDOM; | ||
var isIE9 = utilities.isIE(9); | ||
/** | ||
* Constants. | ||
*/ | ||
var DIRECTIVE_REGEX = /<(![a-zA-Z\s]+)>/; // e.g., <!doctype html> | ||
/** | ||
* Parses HTML and reformats DOM nodes output. | ||
* | ||
* @param {String} html - The HTML string. | ||
* @return {Array} - The formatted DOM nodes. | ||
*/ | ||
module.exports = function parseDOM(html) { | ||
if (typeof html !== 'string') { | ||
throw new TypeError('First argument must be a string.'); | ||
} | ||
if (!html) return []; | ||
// match directive | ||
var match = html.match(DIRECTIVE_REGEX); | ||
var directive; | ||
if (match && match[1]) { | ||
directive = match[1]; | ||
// remove directive in IE9 because DOMParser uses | ||
// MIME type 'text/xml' instead of 'text/html' | ||
if (isIE9) { | ||
html = html.replace(match[0], ''); | ||
} | ||
} | ||
return formatDOM(domparser(html), null, directive); | ||
}; | ||
/***/ }) | ||
/******/ ]); | ||
/******/ }); | ||
}); |
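The `utilities.js` module in the bundle above lowercases each tag name and then restores the camelCase spelling of SVG elements through a lookup map. Below is a self-contained sketch of that lookup. It assumes a shortened list: the three names are taken from `CASE_SENSITIVE_TAG_NAMES`, and the truncation is only for brevity.

```javascript
// Shortened copy of the CASE_SENSITIVE_TAG_NAMES list from lib/constants.js.
const CASE_SENSITIVE_TAG_NAMES = ['clipPath', 'foreignObject', 'linearGradient'];

// Map each lowercase spelling to its case-sensitive form.
const caseSensitiveTagNamesMap = {};
for (const tagName of CASE_SENSITIVE_TAG_NAMES) {
  caseSensitiveTagNamesMap[tagName.toLowerCase()] = tagName;
}

// Mirrors formatTagName: return the SVG spelling when known, else lowercase.
function formatTagName(tagName) {
  const lower = tagName.toLowerCase();
  return caseSensitiveTagNamesMap[lower] || lower;
}

console.log(formatTagName('DIV'));      // div
console.log(formatTagName('CLIPPATH')); // clipPath
```

Building the map once at module load keeps each lookup to a single property access, which is presumably why the bundle precomputes it instead of searching the array per node.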
@@ -1,1 +0,1 @@ | ||
!function(e,t){"object"==typeof exports&&"object"==typeof module?module.exports=t():"function"==typeof define&&define.amd?define([],t):"object"==typeof exports?exports.HTMLDOMParser=t():e.HTMLDOMParser=t()}(this,function(){return function(e){function t(r){if(n[r])return n[r].exports;var a=n[r]={i:r,l:!1,exports:{}};return e[r].call(a.exports,a,a.exports,t),a.l=!0,a.exports}var n={};return t.m=e,t.c=n,t.i=function(e){return e},t.d=function(e,n,r){t.o(e,n)||Object.defineProperty(e,n,{configurable:!1,enumerable:!0,get:r})},t.n=function(e){var n=e&&e.__esModule?function(){return e.default}:function(){return e};return t.d(n,"a",n),n},t.o=function(e,t){return Object.prototype.hasOwnProperty.call(e,t)},t.p="",t(t.s=3)}([function(e,t,n){"use strict";function r(e){return c[e]}function a(e){for(var t,n={},r=0,a=e.length;r<a;r++)t=e[r],n[t.name]=t.value;return n}function o(e){e=e.toLowerCase();var t=r(e);return t||e}function i(e,t,n){t=t||null;for(var r,u,f,s=[],c=0,l=e.length;c<l;c++){switch(r=e[c],f={next:null,prev:s[c-1]||null,parent:t},u=s[c-1],u&&(u.next=f),"#"!==r.nodeName[0]&&(f.name=o(r.nodeName),f.attribs={},r.attributes&&r.attributes.length&&(f.attribs=a(r.attributes))),r.nodeType){case 1:"script"===f.name||"style"===f.name?f.type=f.name:f.type="tag",f.children=i(r.childNodes,f);break;case 3:f.type="text",f.data=r.nodeValue;break;case 8:f.type="comment",f.data=r.nodeValue}s.push(f)}return n&&(s.unshift({name:n.substring(0,n.indexOf(" ")).toLowerCase(),data:n,type:"directive",next:s[0]?s[0]:null,prev:null,parent:t}),s[1]&&(s[1].prev=s[0])),s}function u(e){return e?document.documentMode===e:/(MSIE |Trident\/|Edge\/)/.test(navigator.userAgent)}for(var f,s=n(2).CASE_SENSITIVE_TAG_NAMES,c={},l=0,m=s.length;l<m;l++)f=s[l],c[f.toLowerCase()]=f;e.exports={formatAttributes:a,formatDOM:i,isIE:u}},function(e,t,n){"use strict";var 
r,a=n(0),o=a.isIE,i=/<([a-zA-Z]+[0-9]?)/,u=/<\/head>/i,f=/<\/body>/i,s=/<(area|base|br|col|embed|hr|img|input|keygen|link|menuitem|meta|param|source|track|wbr)(.*?)\/?>/gi,c=o(),l=o(9);if("function"==typeof window.DOMParser){var m=new window.DOMParser,d=l?"text/xml":"text/html";r=function(e,t){return t&&(e=["<",t,">",e,"</",t,">"].join("")),l&&(e=e.replace(s,"<$1$2$3/>")),m.parseFromString(e,d)}}var p;if("object"==typeof document.implementation){var g=document.implementation.createHTMLDocument(c?"HTML_DOM_PARSER_TITLE":void 0);p=function(e,t){if(t)return g.documentElement.getElementsByTagName(t)[0].innerHTML=e,g;try{return g.documentElement.innerHTML=e,g}catch(t){if(r)return r(e)}}}var h,y=document.createElement("template");y.content&&(h=function(e){return y.innerHTML=e,y.content.childNodes});var b=p||r;e.exports=function(e){var t,n=e.match(i);n&&n[1]&&(t=n[1].toLowerCase());var a,o,s;switch(t){case"html":if(r)return a=r(e),u.test(e)||(o=a.getElementsByTagName("head")[0])&&o.parentNode.removeChild(o),f.test(e)||(o=a.getElementsByTagName("body")[0])&&o.parentNode.removeChild(o),a.getElementsByTagName("html");break;case"head":if(b)return s=b(e).getElementsByTagName("head"),f.test(e)?s[0].parentNode.childNodes:s;break;case"body":if(b)return s=b(e).getElementsByTagName("body"),u.test(e)?s[0].parentNode.childNodes:s;break;default:if(h)return h(e);if(b)return b(e,"body").getElementsByTagName("body")[0].childNodes}return[]}},function(e,t,n){"use strict";var r=["animateMotion","animateTransform","clipPath","feBlend","feColorMatrix","feComponentTransfer","feComposite","feConvolveMatrix","feDiffuseLighting","feDisplacementMap","feDropShadow","feFlood","feFuncA","feFuncB","feFuncG","feFuncR","feGaussainBlur","feImage","feMerge","feMergeNode","feMorphology","feOffset","fePointLight","feSpecularLighting","feSpotLight","feTile","feTurbulence","foreignObject","linearGradient","radialGradient","textPath"];e.exports={CASE_SENSITIVE_TAG_NAMES:r}},function(e,t,n){"use strict";var 
r=n(1),a=n(0),o=a.formatDOM,i=a.isIE(9),u=/<(![a-zA-Z\s]+)>/;e.exports=function(e){if("string"!=typeof e)throw new TypeError("First argument must be a string.");if(!e)return[];var t,n=e.match(u);return n&&n[1]&&(t=n[1],i&&(e=e.replace(n[0],""))),o(r(e),null,t)}}])}); | ||
!function(e,t){"object"==typeof exports&&"object"==typeof module?module.exports=t():"function"==typeof define&&define.amd?define([],t):"object"==typeof exports?exports.HTMLDOMParser=t():e.HTMLDOMParser=t()}(window,(function(){return function(e){var t={};function n(r){if(t[r])return t[r].exports;var o=t[r]={i:r,l:!1,exports:{}};return e[r].call(o.exports,o,o.exports,n),o.l=!0,o.exports}return n.m=e,n.c=t,n.d=function(e,t,r){n.o(e,t)||Object.defineProperty(e,t,{enumerable:!0,get:r})},n.r=function(e){"undefined"!=typeof Symbol&&Symbol.toStringTag&&Object.defineProperty(e,Symbol.toStringTag,{value:"Module"}),Object.defineProperty(e,"__esModule",{value:!0})},n.t=function(e,t){if(1&t&&(e=n(e)),8&t)return e;if(4&t&&"object"==typeof e&&e&&e.__esModule)return e;var r=Object.create(null);if(n.r(r),Object.defineProperty(r,"default",{enumerable:!0,value:e}),2&t&&"string"!=typeof e)for(var o in e)n.d(r,o,function(t){return e[t]}.bind(null,o));return r},n.n=function(e){var t=e&&e.__esModule?function(){return e.default}:function(){return e};return n.d(t,"a",t),t},n.o=function(e,t){return Object.prototype.hasOwnProperty.call(e,t)},n.p="",n(n.s=1)}([function(e,t,n){for(var r,o=n(3).CASE_SENSITIVE_TAG_NAMES,a={},i=0,u=o.length;i<u;i++)r=o[i],a[r.toLowerCase()]=r;function f(e){for(var t,n={},r=0,o=e.length;r<o;r++)n[(t=e[r]).name]=t.value;return n}function c(e){var t=function(e){return a[e]}(e=e.toLowerCase());return t||e}e.exports={formatAttributes:f,formatDOM:function e(t,n,r){n=n||null;for(var o,a,i,u=[],l=0,s=t.length;l<s;l++){switch(o=t[l],i={next:null,prev:u[l-1]||null,parent:n},(a=u[l-1])&&(a.next=i),"#"!==o.nodeName[0]&&(i.name=c(o.nodeName),i.attribs={},o.attributes&&o.attributes.length&&(i.attribs=f(o.attributes))),o.nodeType){case 1:"script"===i.name||"style"===i.name?i.type=i.name:i.type="tag",i.children=e(o.childNodes,i);break;case 3:i.type="text",i.data=o.nodeValue;break;case 8:i.type="comment",i.data=o.nodeValue}u.push(i)}return 
r&&(u.unshift({name:r.substring(0,r.indexOf(" ")).toLowerCase(),data:r,type:"directive",next:u[0]?u[0]:null,prev:null,parent:n}),u[1]&&(u[1].prev=u[0])),u},isIE:function(e){return e?document.documentMode===e:/(MSIE |Trident\/|Edge\/)/.test(navigator.userAgent)}}},function(e,t,n){var r=n(2),o=n(0),a=o.formatDOM,i=o.isIE(9),u=/<(![a-zA-Z\s]+)>/;e.exports=function(e){if("string"!=typeof e)throw new TypeError("First argument must be a string");if(!e)return[];var t,n=e.match(u);return n&&n[1]&&(t=n[1],i&&(e=e.replace(n[0],""))),a(r(e),null,t)}},function(e,t,n){var r,o,a,i=n(0),u="html",f="head",c="body",l=/<([a-zA-Z]+[0-9]?)/,s=/<head.*>/i,d=/<body.*>/i,m=/<(area|base|br|col|embed|hr|img|input|keygen|link|menuitem|meta|param|source|track|wbr)(.*?)\/?>/gi,p=i.isIE(9),g=p||i.isIE();if("function"==typeof window.DOMParser){var y=new window.DOMParser,b=p?"text/xml":"text/html";r=function(e,t){return t&&(e="<"+t+">"+e+"</"+t+">"),p&&(e=e.replace(m,"<$1$2$3/>")),y.parseFromString(e,b)}}if("object"==typeof document.implementation){var h=document.implementation.createHTMLDocument(g?"HTML_DOM_PARSER_TITLE":void 0);o=function(e,t){if(t)return h.documentElement.getElementsByTagName(t)[0].innerHTML=e,h;try{return h.documentElement.innerHTML=e,h}catch(t){if(r)return r(e)}}}var v=document.createElement("template");v.content&&(a=function(e){return v.innerHTML=e,v.content.childNodes});var M=o||r;e.exports=function(e){var t,n,o,i,m=e.match(l);switch(m&&m[1]&&(t=m[1].toLowerCase()),t){case u:if(r)return n=r(e),s.test(e)||(o=n.getElementsByTagName(f)[0])&&o.parentNode.removeChild(o),d.test(e)||(o=n.getElementsByTagName(c)[0])&&o.parentNode.removeChild(o),n.getElementsByTagName(u);break;case f:case c:if(M)return i=M(e).getElementsByTagName(t),d.test(e)&&s.test(e)?i[0].parentNode.childNodes:i;break;default:if(a)return a(e);if(M)return 
M(e,c).getElementsByTagName(c)[0].childNodes}return[]}},function(e,t){e.exports={CASE_SENSITIVE_TAG_NAMES:["animateMotion","animateTransform","clipPath","feBlend","feColorMatrix","feComponentTransfer","feComposite","feConvolveMatrix","feDiffuseLighting","feDisplacementMap","feDropShadow","feFlood","feFuncA","feFuncB","feFuncG","feFuncR","feGaussainBlur","feImage","feMerge","feMergeNode","feMorphology","feOffset","fePointLight","feSpecularLighting","feSpotLight","feTile","feTurbulence","foreignObject","linearGradient","radialGradient","textPath"]}}])})); |
@@ -1,9 +0,7 @@ | ||
'use strict'; | ||
/** | ||
* Use the server/node parser by default. | ||
* When running on Node.js, use the server parser. | ||
* When bundling for the browser, use the client parser. | ||
* | ||
* But use the client parser when bundling for the browser: | ||
* https://github.com/substack/node-browserify#browser-field | ||
* @see {@link https://github.com/substack/node-browserify#browser-field} | ||
*/ | ||
module.exports = require('./lib/html-to-dom-server'); |
@@ -1,5 +0,3 @@ | ||
'use strict'; | ||
/** | ||
* SVG elements, unlike HTML elements, are case-sensitive. | ||
* SVG elements are case-sensitive. | ||
* | ||
@@ -9,37 +7,37 @@ * @see {@link https://developer.mozilla.org/docs/Web/SVG/Element#SVG_elements_A_to_Z} | ||
var CASE_SENSITIVE_TAG_NAMES = [ | ||
'animateMotion', | ||
'animateTransform', | ||
'clipPath', | ||
'feBlend', | ||
'feColorMatrix', | ||
'feComponentTransfer', | ||
'feComposite', | ||
'feConvolveMatrix', | ||
'feDiffuseLighting', | ||
'feDisplacementMap', | ||
'feDropShadow', | ||
'feFlood', | ||
'feFuncA', | ||
'feFuncB', | ||
'feFuncG', | ||
'feFuncR', | ||
'feGaussainBlur', | ||
'feImage', | ||
'feMerge', | ||
'feMergeNode', | ||
'feMorphology', | ||
'feOffset', | ||
'fePointLight', | ||
'feSpecularLighting', | ||
'feSpotLight', | ||
'feTile', | ||
'feTurbulence', | ||
'foreignObject', | ||
'linearGradient', | ||
'radialGradient', | ||
'textPath' | ||
'animateMotion', | ||
'animateTransform', | ||
'clipPath', | ||
'feBlend', | ||
'feColorMatrix', | ||
'feComponentTransfer', | ||
'feComposite', | ||
'feConvolveMatrix', | ||
'feDiffuseLighting', | ||
'feDisplacementMap', | ||
'feDropShadow', | ||
'feFlood', | ||
'feFuncA', | ||
'feFuncB', | ||
'feFuncG', | ||
'feFuncR', | ||
'feGaussainBlur', | ||
'feImage', | ||
'feMerge', | ||
'feMergeNode', | ||
'feMorphology', | ||
'feOffset', | ||
'fePointLight', | ||
'feSpecularLighting', | ||
'feSpotLight', | ||
'feTile', | ||
'feTurbulence', | ||
'foreignObject', | ||
'linearGradient', | ||
'radialGradient', | ||
'textPath' | ||
]; | ||
module.exports = { | ||
CASE_SENSITIVE_TAG_NAMES: CASE_SENSITIVE_TAG_NAMES | ||
CASE_SENSITIVE_TAG_NAMES: CASE_SENSITIVE_TAG_NAMES | ||
}; |
@@ -1,24 +0,16 @@ | ||
'use strict'; | ||
/** | ||
* Module dependencies. | ||
*/ | ||
var utilities = require('./utilities'); | ||
var detectIE = utilities.isIE; | ||
/** | ||
* Constants. | ||
*/ | ||
var HTML_TAG_NAME = 'html'; | ||
var BODY_TAG_NAME = 'body'; | ||
var HEAD_TAG_NAME = 'head'; | ||
// constants | ||
var HTML = 'html'; | ||
var HEAD = 'head'; | ||
var BODY = 'body'; | ||
var FIRST_TAG_REGEX = /<([a-zA-Z]+[0-9]?)/; // e.g., <h1> | ||
var HEAD_REGEX = /<\/head>/i; | ||
var BODY_REGEX = /<\/body>/i; | ||
var HEAD_TAG_REGEX = /<head.*>/i; | ||
var BODY_TAG_REGEX = /<body.*>/i; | ||
// http://www.w3.org/TR/html/syntax.html#void-elements | ||
var VOID_ELEMENTS_REGEX = /<(area|base|br|col|embed|hr|img|input|keygen|link|menuitem|meta|param|source|track|wbr)(.*?)\/?>/gi; | ||
// browser support | ||
var isIE = detectIE(); | ||
var isIE9 = detectIE(9); | ||
// detect IE browser | ||
var isIE9 = utilities.isIE(9); | ||
var isIE = isIE9 || utilities.isIE(); | ||
@@ -28,28 +20,32 @@ /** | ||
* | ||
* https://developer.mozilla.org/docs/Web/API/DOMParser#Parsing_an_SVG_or_HTML_document | ||
* @see https://developer.mozilla.org/docs/Web/API/DOMParser#Parsing_an_SVG_or_HTML_document | ||
*/ | ||
var parseFromString; | ||
if (typeof window.DOMParser === 'function') { | ||
var domParser = new window.DOMParser(); | ||
// IE9 does not support 'text/html' MIME type | ||
// https://msdn.microsoft.com/en-us/library/ff975278(v=vs.85).aspx | ||
var MIME_TYPE = isIE9 ? 'text/xml' : 'text/html'; | ||
var domParser = new window.DOMParser(); | ||
/** | ||
* Creates an HTML document using `DOMParser.parseFromString`. | ||
* | ||
* @param {String} html - The HTML string. | ||
* @param {String} [tagName] - The element to render the HTML (with 'body' as fallback). | ||
* @return {HTMLDocument} | ||
*/ | ||
parseFromString = function domStringParser(html, tagName) { | ||
if (tagName) { | ||
html = ['<', tagName, '>', html, '</', tagName, '>'].join(''); | ||
} | ||
// because IE9 only supports MIME type 'text/xml', void elements need to be self-closed | ||
if (isIE9) { | ||
html = html.replace(VOID_ELEMENTS_REGEX, '<$1$2$3/>'); | ||
} | ||
return domParser.parseFromString(html, MIME_TYPE); | ||
}; | ||
// IE9 does not support 'text/html' MIME type | ||
// https://msdn.microsoft.com/en-us/library/ff975278(v=vs.85).aspx | ||
var mimeType = isIE9 ? 'text/xml' : 'text/html'; | ||
/** | ||
* Creates an HTML document using `DOMParser.parseFromString`. | ||
* | ||
* @param {string} html - The HTML string. | ||
* @param {string} [tagName] - The element to render the HTML (with 'body' as fallback). | ||
* @return {HTMLDocument} | ||
*/ | ||
parseFromString = function domStringParser(html, tagName) { | ||
if (tagName) { | ||
html = '<' + tagName + '>' + html + '</' + tagName + '>'; | ||
} | ||
// because IE9 only supports MIME type 'text/xml', void elements need to be self-closed | ||
if (isIE9) { | ||
html = html.replace(VOID_ELEMENTS_REGEX, '<$1$2$3/>'); | ||
} | ||
return domParser.parseFromString(html, mimeType); | ||
}; | ||
} | ||
@@ -60,31 +56,36 @@ | ||
* | ||
* https://developer.mozilla.org/docs/Web/API/DOMImplementation/createHTMLDocument | ||
* @see https://developer.mozilla.org/docs/Web/API/DOMImplementation/createHTMLDocument | ||
*/ | ||
var parseFromDocument; | ||
if (typeof document.implementation === 'object') { | ||
// title parameter is required in IE | ||
// https://msdn.microsoft.com/en-us/library/ff975457(v=vs.85).aspx | ||
var doc = document.implementation.createHTMLDocument(isIE ? 'HTML_DOM_PARSER_TITLE' : undefined); | ||
// title parameter is required in IE | ||
// https://msdn.microsoft.com/en-us/library/ff975457(v=vs.85).aspx | ||
var doc = document.implementation.createHTMLDocument( | ||
isIE ? 'HTML_DOM_PARSER_TITLE' : undefined | ||
); | ||
/** | ||
* Use HTML document created by `document.implementation.createHTMLDocument`. | ||
* | ||
* @param {String} html - The HTML string. | ||
* @param {String} [tagName] - The element to render the HTML (with 'body' as fallback). | ||
* @return {HTMLDocument} | ||
*/ | ||
parseFromDocument = function createHTMLDocument(html, tagName) { | ||
if (tagName) { | ||
doc.documentElement.getElementsByTagName(tagName)[0].innerHTML = html; | ||
return doc; | ||
} | ||
/** | ||
* Use HTML document created by `document.implementation.createHTMLDocument`. | ||
* | ||
* @param {string} html - The HTML string. | ||
* @param {string} [tagName] - The element to render the HTML (with 'body' as fallback). | ||
* @return {HTMLDocument} | ||
*/ | ||
parseFromDocument = function createHTMLDocument(html, tagName) { | ||
if (tagName) { | ||
doc.documentElement.getElementsByTagName(tagName)[0].innerHTML = html; | ||
return doc; | ||
} | ||
try { | ||
doc.documentElement.innerHTML = html; | ||
return doc; | ||
// fallback when certain elements in `documentElement` are read-only (IE9) | ||
} catch (err) { | ||
if (parseFromString) return parseFromString(html); | ||
} | ||
}; | ||
try { | ||
doc.documentElement.innerHTML = html; | ||
return doc; | ||
// fallback when certain elements in `documentElement` are read-only (IE9) | ||
} catch (err) { | ||
if (parseFromString) { | ||
return parseFromString(html); | ||
} | ||
} | ||
}; | ||
} | ||
@@ -95,21 +96,21 @@ | ||
* | ||
* https://developer.mozilla.org/docs/Web/HTML/Element/template | ||
* @see https://developer.mozilla.org/docs/Web/HTML/Element/template | ||
*/ | ||
var parseFromTemplate; | ||
var template = document.createElement('template'); | ||
if (template.content) { | ||
/** | ||
* Uses a template element (content fragment) to parse HTML. | ||
* | ||
* @param {String} html - The HTML string. | ||
* @return {NodeList} | ||
*/ | ||
parseFromTemplate = function templateParser(html) { | ||
template.innerHTML = html; | ||
return template.content.childNodes; | ||
}; | ||
/** | ||
* Uses a template element (content fragment) to parse HTML. | ||
* | ||
* @param {string} html - The HTML string. | ||
* @return {NodeList} | ||
*/ | ||
parseFromTemplate = function templateParser(html) { | ||
template.innerHTML = html; | ||
return template.content.childNodes; | ||
}; | ||
} | ||
/** Fallback document parser. */ | ||
// fallback document parser | ||
var parseWithFallback = parseFromDocument || parseFromString; | ||
@@ -120,72 +121,73 @@ | ||
* | ||
* @param {String} html - The HTML string. | ||
* @param {String} [tagName] - The tag name. | ||
* @param {string} html - The HTML string. | ||
* @return {NodeList|Array} | ||
*/ | ||
module.exports = function domparser(html) { | ||
// try to match first tag | ||
var tagName; | ||
var match = html.match(FIRST_TAG_REGEX); | ||
if (match && match[1]) { | ||
tagName = match[1].toLowerCase(); | ||
} | ||
function domparser(html) { | ||
var firstTagName; | ||
var match = html.match(FIRST_TAG_REGEX); | ||
var doc; | ||
var element; | ||
var elements; | ||
if (match && match[1]) { | ||
firstTagName = match[1].toLowerCase(); | ||
} | ||
switch (tagName) { | ||
case HTML_TAG_NAME: | ||
if (parseFromString) { | ||
doc = parseFromString(html); | ||
var doc; | ||
var element; | ||
var elements; | ||
// the created document may come with filler head/body elements, | ||
// so ake sure to remove them if they don't actually exist | ||
if (!HEAD_REGEX.test(html)) { | ||
element = doc.getElementsByTagName(HEAD_TAG_NAME)[0]; | ||
if (element) element.parentNode.removeChild(element); | ||
} | ||
if (!BODY_REGEX.test(html)) { | ||
element = doc.getElementsByTagName(BODY_TAG_NAME)[0]; | ||
if (element) element.parentNode.removeChild(element); | ||
} | ||
switch (firstTagName) { | ||
case HTML: | ||
if (parseFromString) { | ||
doc = parseFromString(html); | ||
return doc.getElementsByTagName(HTML_TAG_NAME); | ||
} | ||
break; | ||
// the created document may come with filler head/body elements, | ||
// so make sure to remove them if they don't actually exist | ||
if (!HEAD_TAG_REGEX.test(html)) { | ||
element = doc.getElementsByTagName(HEAD)[0]; | ||
if (element) { | ||
element.parentNode.removeChild(element); | ||
} | ||
} | ||
case HEAD_TAG_NAME: | ||
if (parseWithFallback) { | ||
elements = parseWithFallback(html).getElementsByTagName(HEAD_TAG_NAME); | ||
if (!BODY_TAG_REGEX.test(html)) { | ||
element = doc.getElementsByTagName(BODY)[0]; | ||
if (element) { | ||
element.parentNode.removeChild(element); | ||
} | ||
} | ||
// account for possibility of sibling | ||
if (BODY_REGEX.test(html)) { | ||
return elements[0].parentNode.childNodes; | ||
} | ||
return elements; | ||
} | ||
break; | ||
return doc.getElementsByTagName(HTML); | ||
} | ||
break; | ||
case BODY_TAG_NAME: | ||
if (parseWithFallback) { | ||
elements = parseWithFallback(html).getElementsByTagName(BODY_TAG_NAME); | ||
case HEAD: | ||
case BODY: | ||
if (parseWithFallback) { | ||
elements = parseWithFallback(html).getElementsByTagName(firstTagName); | ||
// account for possibility of sibling (return both body and head) | ||
if (HEAD_REGEX.test(html)) { | ||
return elements[0].parentNode.childNodes; | ||
} | ||
return elements; | ||
} | ||
break; | ||
// account for possibility of sibling | ||
if (BODY_TAG_REGEX.test(html) && HEAD_TAG_REGEX.test(html)) { | ||
return elements[0].parentNode.childNodes; | ||
} | ||
// low-level tag or text | ||
default: | ||
if (parseFromTemplate) return parseFromTemplate(html); | ||
if (parseWithFallback) { | ||
return parseWithFallback(html, BODY_TAG_NAME).getElementsByTagName(BODY_TAG_NAME)[0].childNodes; | ||
} | ||
break; | ||
} | ||
return elements; | ||
} | ||
break; | ||
return []; | ||
}; | ||
// low-level tag or text | ||
default: | ||
if (parseFromTemplate) { | ||
return parseFromTemplate(html); | ||
} | ||
if (parseWithFallback) { | ||
return parseWithFallback(html, BODY).getElementsByTagName(BODY)[0] | ||
.childNodes; | ||
} | ||
break; | ||
} | ||
return []; | ||
} | ||
module.exports = domparser; |
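
The hunk above replaces the closing-tag checks (`HEAD_REGEX`, `BODY_REGEX`) with opening-tag checks (`HEAD_TAG_REGEX`, `BODY_TAG_REGEX`). A minimal sketch of how the two head patterns behave differently follows. The regexes are copied from the diff; the sample strings and the `results` variable are assumptions made up for illustration.

```javascript
// Regexes copied from the diff above.
var OLD_HEAD_REGEX = /<\/head>/i; // 0.2.2: looks for a closing </head>
var NEW_HEAD_TAG_REGEX = /<head.*>/i; // 0.2.3: looks for an opening <head ...>

// Hypothetical inputs, chosen for illustration only.
var samples = [
  '<html><head><title>t</title></head></html>', // opening and closing tag
  '<html><head><title>t</title></html>', // closing </head> omitted
  '<html><body>text</body></html>' // no head element at all
];

// Record whether each pattern detects a head element in each sample.
var results = samples.map(function (html) {
  return {
    html: html,
    oldMatch: OLD_HEAD_REGEX.test(html),
    newMatch: NEW_HEAD_TAG_REGEX.test(html)
  };
});

results.forEach(function (r) {
  console.log(r.oldMatch, r.newMatch, r.html);
});
```

Only the second sample is treated differently: the old pattern misses a head element whose closing tag was omitted, while the new pattern still detects it.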
@@ -1,14 +0,7 @@ | ||
'use strict'; | ||
/** | ||
* Module dependencies. | ||
*/ | ||
var domparser = require('./domparser'); | ||
var utilities = require('./utilities'); | ||
var formatDOM = utilities.formatDOM; | ||
var isIE9 = utilities.isIE(9); | ||
/** | ||
* Constants. | ||
*/ | ||
var DIRECTIVE_REGEX = /<(![a-zA-Z\s]+)>/; // e.g., <!doctype html> | ||
@@ -22,22 +15,28 @@ | ||
*/ | ||
module.exports = function parseDOM(html) { | ||
if (typeof html !== 'string') { | ||
throw new TypeError('First argument must be a string.'); | ||
} | ||
if (!html) return []; | ||
function parseDOM(html) { | ||
if (typeof html !== 'string') { | ||
throw new TypeError('First argument must be a string'); | ||
} | ||
// match directive | ||
var match = html.match(DIRECTIVE_REGEX); | ||
var directive; | ||
if (match && match[1]) { | ||
directive = match[1]; | ||
if (!html) { | ||
return []; | ||
} | ||
// remove directive in IE9 because DOMParser uses | ||
// MIME type 'text/xml' instead of 'text/html' | ||
if (isIE9) { | ||
html = html.replace(match[0], ''); | ||
} | ||
// match directive | ||
var match = html.match(DIRECTIVE_REGEX); | ||
var directive; | ||
if (match && match[1]) { | ||
directive = match[1]; | ||
// remove directive in IE9 because DOMParser uses | ||
// MIME type 'text/xml' instead of 'text/html' | ||
if (isIE9) { | ||
html = html.replace(match[0], ''); | ||
} | ||
} | ||
return formatDOM(domparser(html), null, directive); | ||
}; | ||
return formatDOM(domparser(html), null, directive); | ||
} | ||
module.exports = parseDOM; |
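
The directive handling in `parseDOM` and the directive node built in `formatDOM` can be sketched together as below. `DIRECTIVE_REGEX` and the `substring`/`indexOf` expression are copied from the diff; the helper name `extractDirective` and the sample input are hypothetical and exist only for this illustration.

```javascript
// Regex copied from the diff above, e.g. matches <!doctype html>.
var DIRECTIVE_REGEX = /<(![a-zA-Z\s]+)>/;

// Hypothetical helper (not part of the library): returns the fields that
// formatDOM would put on the directive node, or null when no directive matches.
function extractDirective(html) {
  var match = html.match(DIRECTIVE_REGEX);
  if (!match || !match[1]) {
    return null;
  }
  var directive = match[1]; // text between "<" and ">"
  return {
    // same expression as in formatDOM: text before the first space, lowercased
    name: directive.substring(0, directive.indexOf(' ')).toLowerCase(),
    data: directive,
    type: 'directive'
  };
}

var node = extractDirective('<!DOCTYPE html><html></html>');
console.log(node);
```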
@@ -1,6 +0,1 @@ | ||
'use strict'; | ||
/** | ||
* Module dependencies. | ||
*/ | ||
var Parser = require('htmlparser2/lib/Parser'); | ||
@@ -19,9 +14,11 @@ var DomHandler = require('domhandler'); | ||
*/ | ||
module.exports = function parseDOM(html, options) { | ||
if (typeof html !== 'string') { | ||
throw new TypeError('First argument must be a string.'); | ||
} | ||
var handler = new DomHandler(options); | ||
new Parser(handler, options).end(html); | ||
return handler.dom; | ||
}; | ||
function parseDOM(html, options) { | ||
if (typeof html !== 'string') { | ||
throw new TypeError('First argument must be a string.'); | ||
} | ||
var handler = new DomHandler(options); | ||
new Parser(handler, options).end(html); | ||
return handler.dom; | ||
} | ||
module.exports = parseDOM; |
@@ -1,3 +0,1 @@ | ||
'use strict'; | ||
var CASE_SENSITIVE_TAG_NAMES = require('./constants').CASE_SENSITIVE_TAG_NAMES; | ||
@@ -8,4 +6,4 @@ | ||
for (var i = 0, len = CASE_SENSITIVE_TAG_NAMES.length; i < len; i++) { | ||
tagName = CASE_SENSITIVE_TAG_NAMES[i]; | ||
caseSensitiveTagNamesMap[tagName.toLowerCase()] = tagName; | ||
tagName = CASE_SENSITIVE_TAG_NAMES[i]; | ||
caseSensitiveTagNamesMap[tagName.toLowerCase()] = tagName; | ||
} | ||
@@ -20,3 +18,3 @@ | ||
function getCaseSensitiveTagName(tagName) { | ||
return caseSensitiveTagNamesMap[tagName]; | ||
return caseSensitiveTagNamesMap[tagName]; | ||
} | ||
@@ -31,10 +29,10 @@ | ||
function formatAttributes(attributes) { | ||
var result = {}; | ||
var attribute; | ||
// `NamedNodeMap` is array-like | ||
for (var i = 0, len = attributes.length; i < len; i++) { | ||
attribute = attributes[i]; | ||
result[attribute.name] = attribute.value; | ||
} | ||
return result; | ||
var result = {}; | ||
var attribute; | ||
// `NamedNodeMap` is array-like | ||
for (var i = 0, len = attributes.length; i < len; i++) { | ||
attribute = attributes[i]; | ||
result[attribute.name] = attribute.value; | ||
} | ||
return result; | ||
} | ||
@@ -50,8 +48,8 @@ | ||
function formatTagName(tagName) { | ||
tagName = tagName.toLowerCase(); | ||
var caseSensitiveTagName = getCaseSensitiveTagName(tagName); | ||
if (caseSensitiveTagName) { | ||
return caseSensitiveTagName; | ||
} | ||
return tagName; | ||
tagName = tagName.toLowerCase(); | ||
var caseSensitiveTagName = getCaseSensitiveTagName(tagName); | ||
if (caseSensitiveTagName) { | ||
return caseSensitiveTagName; | ||
} | ||
return tagName; | ||
} | ||
@@ -68,83 +66,81 @@ | ||
function formatDOM(nodes, parentObj, directive) { | ||
parentObj = parentObj || null; | ||
parentObj = parentObj || null; | ||
var result = []; | ||
var node; | ||
var prevNode; | ||
var nodeObj; | ||
var result = []; | ||
var node; | ||
var prevNode; | ||
var nodeObj; | ||
// `NodeList` is array-like | ||
for (var i = 0, len = nodes.length; i < len; i++) { | ||
node = nodes[i]; | ||
// reset | ||
nodeObj = { | ||
next: null, | ||
prev: result[i - 1] || null, | ||
parent: parentObj | ||
}; | ||
// `NodeList` is array-like | ||
for (var i = 0, len = nodes.length; i < len; i++) { | ||
node = nodes[i]; | ||
// reset | ||
nodeObj = { | ||
next: null, | ||
prev: result[i - 1] || null, | ||
parent: parentObj | ||
}; | ||
// set the next node for the previous node (if applicable) | ||
prevNode = result[i - 1]; | ||
if (prevNode) { | ||
prevNode.next = nodeObj; | ||
} | ||
// set the next node for the previous node (if applicable) | ||
prevNode = result[i - 1]; | ||
if (prevNode) { | ||
prevNode.next = nodeObj; | ||
} | ||
// set the node name if it's not "#text" or "#comment" | ||
// e.g., "div" | ||
if (node.nodeName[0] !== '#') { | ||
nodeObj.name = formatTagName(node.nodeName); | ||
// also, nodes of type "tag" have "attribs" | ||
nodeObj.attribs = {}; // default | ||
if (node.attributes && node.attributes.length) { | ||
nodeObj.attribs = formatAttributes(node.attributes); | ||
} | ||
} | ||
// set the node name if it's not "#text" or "#comment" | ||
// e.g., "div" | ||
if (node.nodeName[0] !== '#') { | ||
nodeObj.name = formatTagName(node.nodeName); | ||
// also, nodes of type "tag" have "attribs" | ||
nodeObj.attribs = {}; // default | ||
if (node.attributes && node.attributes.length) { | ||
nodeObj.attribs = formatAttributes(node.attributes); | ||
} | ||
} | ||
// set the node type | ||
// e.g., "tag" | ||
switch (node.nodeType) { | ||
// 1 = element | ||
case 1: | ||
if (nodeObj.name === 'script' || nodeObj.name === 'style') { | ||
nodeObj.type = nodeObj.name; | ||
} else { | ||
nodeObj.type = 'tag'; | ||
} | ||
// recursively format the children | ||
nodeObj.children = formatDOM(node.childNodes, nodeObj); | ||
break; | ||
// 2 = attribute | ||
// 3 = text | ||
case 3: | ||
nodeObj.type = 'text'; | ||
nodeObj.data = node.nodeValue; | ||
break; | ||
// 8 = comment | ||
case 8: | ||
nodeObj.type = 'comment'; | ||
nodeObj.data = node.nodeValue; | ||
break; | ||
default: | ||
break; | ||
// set the node type | ||
// e.g., "tag" | ||
switch (node.nodeType) { | ||
// 1 = element | ||
case 1: | ||
if (nodeObj.name === 'script' || nodeObj.name === 'style') { | ||
nodeObj.type = nodeObj.name; | ||
} else { | ||
nodeObj.type = 'tag'; | ||
} | ||
result.push(nodeObj); | ||
// recursively format the children | ||
nodeObj.children = formatDOM(node.childNodes, nodeObj); | ||
break; | ||
// 2 = attribute | ||
// 3 = text | ||
case 3: | ||
nodeObj.type = 'text'; | ||
nodeObj.data = node.nodeValue; | ||
break; | ||
// 8 = comment | ||
case 8: | ||
nodeObj.type = 'comment'; | ||
nodeObj.data = node.nodeValue; | ||
break; | ||
} | ||
if (directive) { | ||
result.unshift({ | ||
name: directive.substring(0, directive.indexOf(' ')).toLowerCase(), | ||
data: directive, | ||
type: 'directive', | ||
next: result[0] ? result[0] : null, | ||
prev: null, | ||
parent: parentObj | ||
}); | ||
result.push(nodeObj); | ||
} | ||
if (result[1]) { | ||
result[1].prev = result[0]; | ||
} | ||
if (directive) { | ||
result.unshift({ | ||
name: directive.substring(0, directive.indexOf(' ')).toLowerCase(), | ||
data: directive, | ||
type: 'directive', | ||
next: result[0] ? result[0] : null, | ||
prev: null, | ||
parent: parentObj | ||
}); | ||
if (result[1]) { | ||
result[1].prev = result[0]; | ||
} | ||
} | ||
return result; | ||
return result; | ||
} | ||
@@ -159,15 +155,12 @@ | ||
function isIE(version) { | ||
if (version) { | ||
return document.documentMode === version; | ||
} | ||
return /(MSIE |Trident\/|Edge\/)/.test(navigator.userAgent); | ||
if (version) { | ||
return document.documentMode === version; | ||
} | ||
return /(MSIE |Trident\/|Edge\/)/.test(navigator.userAgent); | ||
} | ||
/** | ||
* Export utilities. | ||
*/ | ||
module.exports = { | ||
formatAttributes: formatAttributes, | ||
formatDOM: formatDOM, | ||
isIE: isIE | ||
formatAttributes: formatAttributes, | ||
formatDOM: formatDOM, | ||
isIE: isIE | ||
}; |
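
The tag-name handling in `utilities.js` can be sketched in isolation as follows. The lookup-map construction and `formatTagName` mirror the diff above; the shortened `CASE_SENSITIVE_TAG_NAMES` list and the sample inputs are assumptions chosen to keep the example small.

```javascript
// Shortened list for illustration; the real list lives in lib/constants.js.
var CASE_SENSITIVE_TAG_NAMES = ['clipPath', 'foreignObject', 'linearGradient'];

// Map lowercased names to their case-sensitive form, as in the diff.
var caseSensitiveTagNamesMap = {};
for (var i = 0, len = CASE_SENSITIVE_TAG_NAMES.length; i < len; i++) {
  var name = CASE_SENSITIVE_TAG_NAMES[i];
  caseSensitiveTagNamesMap[name.toLowerCase()] = name;
}

// Lowercase the tag name, then restore the SVG casing when one is known.
function formatTagName(tagName) {
  tagName = tagName.toLowerCase();
  return caseSensitiveTagNamesMap[tagName] || tagName;
}

console.log(formatTagName('DIV')); // plain HTML element name
console.log(formatTagName('CLIPPATH')); // SVG element name with mixed case
```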
{ | ||
"name": "html-dom-parser", | ||
"version": "0.2.2", | ||
"description": "An HTML to DOM parser that works on the server and client.", | ||
"version": "0.2.3", | ||
"description": "HTML to DOM parser.", | ||
"author": "Mark <mark@remarkablemark.org>", | ||
"main": "index.js", | ||
"scripts": { | ||
"build": "npm run build:min && npm run build:unmin", | ||
"build:min": "webpack index.js dist/html-dom-parser.min.js -p --output-library HTMLDOMParser --output-library-target umd", | ||
"build:unmin": "webpack index.js dist/html-dom-parser.js --output-library HTMLDOMParser --output-library-target umd", | ||
"build": "run-s build:*", | ||
"build:min": "webpack index.js -o dist/html-dom-parser.min.js --mode production --output-library HTMLDOMParser --output-library-target umd", | ||
"build:unmin": "webpack index.js -o dist/html-dom-parser.js --mode development --output-library HTMLDOMParser --output-library-target umd", | ||
"clean": "rm -rf dist", | ||
"test": "mocha", | ||
"coveralls": "nyc report --reporter=text-lcov | coveralls", | ||
"dtslint": "dtslint .", | ||
"lint": "eslint . --ignore-path .gitignore", | ||
"dtslint": "dtslint .", | ||
"cover": "istanbul cover _mocha -- -R spec", | ||
"coveralls": "cat coverage/lcov.info | coveralls", | ||
"prepublish": "npm run clean && npm run build", | ||
"release": "standard-version --no-verify" | ||
"lint:fix": "npm run lint -- --fix", | ||
"prepublishOnly": "run-s lint dtslint test clean build", | ||
"release": "standard-version --no-verify", | ||
"test": "run-s test:server test:client", | ||
"test:client": "npm run test:client:watch -- --single-run", | ||
"test:client:build": "webpack node_modules/htmlparser2/lib/index.js -o dist/htmlparser2.js --mode production --output-library htmlparser2 --output-library-target umd", | ||
"test:client:setup": "test -f dist/htmlparser2.js || npm run test:client:build", | ||
"test:client:watch": "npm run test:client:setup && karma start", | ||
"test:server": "mocha test/server", | ||
"test:server:coverage": "nyc npm run test:server", | ||
"test:server:coverage:report": "nyc report --reporter=html" | ||
}, | ||
@@ -28,24 +35,48 @@ "repository": { | ||
"keywords": [ | ||
"html-dom-parser", | ||
"html", | ||
"dom", | ||
"parser", | ||
"htmlparser2" | ||
"htmlparser2", | ||
"pojo" | ||
], | ||
"dependencies": { | ||
"@types/domhandler": "2.4.1", | ||
"domhandler": "2.3.0", | ||
"htmlparser2": "3.9.1" | ||
"domhandler": "2.4.2", | ||
"htmlparser2": "3.10.1" | ||
}, | ||
"devDependencies": { | ||
"chai": "^3.5.0", | ||
"coveralls": "^2.11.14", | ||
"dtslint": "^0.5.9", | ||
"eslint": "^3.4.0", | ||
"html-minifier": "^3.1.0", | ||
"istanbul": "^0.4.5", | ||
"jsdomify": "^3.1.0", | ||
"mocha": "^3.4.2", | ||
"standard-version": "^5.0.2", | ||
"webpack": "^2.6.1" | ||
"@commitlint/cli": "^8.2.0", | ||
"@commitlint/config-conventional": "^8.2.0", | ||
"chai": "^4.2.0", | ||
"coveralls": "^3.0.7", | ||
"dtslint": "^1.0.2", | ||
"eslint": "^6.6.0", | ||
"eslint-plugin-prettier": "^3.1.1", | ||
"html-minifier": "^4.0.0", | ||
"husky": "^3.0.9", | ||
"jsdomify": "^3.1.1", | ||
"karma": "^4.4.1", | ||
"karma-chai": "^0.1.0", | ||
"karma-chrome-launcher": "^3.1.0", | ||
"karma-commonjs": "^1.0.0", | ||
"karma-mocha": "^1.3.0", | ||
"karma-mocha-reporter": "^2.2.5", | ||
"karma-phantomjs-launcher": "^1.0.4", | ||
"lint-staged": "^9.4.2", | ||
"mocha": "^6.2.2", | ||
"mock-require": "^3.0.3", | ||
"npm-run-all": "^4.1.5", | ||
"nyc": "^14.1.1", | ||
"prettier": "^1.18.2", | ||
"sinon": "^7.5.0", | ||
"standard-version": "^6", | ||
"webpack": "^4.41.2", | ||
"webpack-cli": "^3.3.10" | ||
}, | ||
"files": [ | ||
"/dist", | ||
"/index.d.ts", | ||
"/lib" | ||
], | ||
"browser": { | ||
@@ -52,0 +83,0 @@ "./index.js": "./lib/html-to-dom-client.js" |
README.md
@@ -9,4 +9,5 @@ # html-dom-parser | ||
[![Dependency status](https://david-dm.org/remarkablemark/html-dom-parser.svg)](https://david-dm.org/remarkablemark/html-dom-parser) | ||
[![NPM downloads](https://img.shields.io/npm/dm/html-dom-parser.svg?style=flat-square)](https://www.npmjs.com/package/html-dom-parser) | ||
An HTML to DOM parser that works on both the server and the browser: | ||
HTML to DOM parser that works on both the server (Node.js) and the client (browser): | ||
@@ -17,6 +18,30 @@ ``` | ||
The parser converts an HTML string to a JavaScript object that describes the DOM tree. | ||
It converts an HTML string to a JavaScript object that describes the DOM tree. | ||
[repl.it](https://repl.it/@remarkablemark/html-dom-parser) | [JSFiddle](https://jsfiddle.net/remarkablemark/ff9yg1yz/) | ||
#### Example: | ||
```js | ||
var parse = require('html-dom-parser'); | ||
parse('<div>text</div>'); | ||
``` | ||
Output: | ||
``` | ||
[ { type: 'tag', | ||
name: 'div', | ||
attribs: {}, | ||
children: | ||
[ { data: 'text', | ||
type: 'text', | ||
next: null, | ||
prev: null, | ||
parent: [Circular] } ], | ||
next: null, | ||
prev: null, | ||
parent: null } ] | ||
``` | ||
[Repl.it](https://repl.it/@remarkablemark/html-dom-parser) | [JSFiddle](https://jsfiddle.net/remarkablemark/ff9yg1yz/) | [Examples](https://github.com/remarkablemark/html-dom-parser/tree/master/examples) | ||
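The `parent: [Circular]` entry in the output above means each child points back to its parent, so passing the result straight to `JSON.stringify` throws. The following is a minimal sketch of one way to serialize such a tree, not part of the library: the node tree is built by hand to mirror the output shown above (so the snippet runs without the package installed), and `toJSON` is a hypothetical helper name.

```js
// Hand-built node tree mirroring the README output; not produced by the library.
var div = {
  type: 'tag',
  name: 'div',
  attribs: {},
  children: [],
  next: null,
  prev: null,
  parent: null
};
var text = { data: 'text', type: 'text', next: null, prev: null, parent: div };
div.children.push(text);

// Hypothetical helper: omit the linking keys (parent, next, prev) so that
// JSON.stringify never follows a reference back up or across the tree.
function toJSON(nodes) {
  return JSON.stringify(nodes, function (key, value) {
    if (key === 'parent' || key === 'next' || key === 'prev') {
      return undefined;
    }
    return value;
  });
}

console.log(toJSON([div]));
```

Dropping `parent`, `next`, and `prev` loses no structure, since the nesting of `children` already encodes it.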
## Installation | ||
@@ -36,3 +61,3 @@ | ||
[unpkg](https://unpkg.com/html-dom-parser/) (CDN): | ||
[CDN](https://unpkg.com/html-dom-parser/): | ||
@@ -48,26 +73,26 @@ ```html | ||
Import parser: | ||
Import the module: | ||
```js | ||
// server | ||
var parser = require('html-dom-parser'); | ||
// CommonJS | ||
var parse = require('html-dom-parser'); | ||
// client | ||
var parser = window.HTMLDOMParser; | ||
// ES Modules | ||
import parse from 'html-dom-parser'; | ||
``` | ||
Parse input: | ||
Parse markup: | ||
```js | ||
parser('<p>Hello, world!</p>'); | ||
parse('<p class="primary" style="color: skyblue;">Hello world</p>'); | ||
``` | ||
Get output: | ||
Output: | ||
```js | ||
``` | ||
[ { type: 'tag', | ||
name: 'p', | ||
attribs: {}, | ||
attribs: { class: 'primary', style: 'color: skyblue;' }, | ||
children: | ||
[ { data: 'Hello, world!', | ||
[ { data: 'Hello world', | ||
type: 'text', | ||
@@ -82,16 +107,58 @@ next: null, | ||
On the server-side (Node.js), the parser is a wrapper of `parseDOM` from [htmlparser2](https://github.com/fb55/htmlparser2). | ||
The _server parser_ is a wrapper of [htmlparser2](https://github.com/fb55/htmlparser2)'s `parseDOM`; the _client parser_ mimics the server parser by using the [DOM](https://developer.mozilla.org/docs/Web/API/Document_Object_Model/Introduction) API. | ||
On the client-side (browser), the parser uses the [DOM](https://developer.mozilla.org/docs/Web/API/Document_Object_Model/Introduction) API to mimic the output schema of the server parser. | ||
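Since the client parser is described as mimicking the server parser's output, traversal code written against that node shape should work in both environments. A rough sketch under that assumption follows: the input tree is written by hand from the README's `<p>` example rather than produced by the library, and `getText` is a hypothetical helper, not an exported function.

```js
// Hand-built tree for '<p class="primary" style="color: skyblue;">Hello world</p>',
// following the output format documented above (assumed, not library output).
var p = {
  type: 'tag',
  name: 'p',
  attribs: { class: 'primary', style: 'color: skyblue;' },
  children: [],
  next: null,
  prev: null,
  parent: null
};
p.children.push({
  data: 'Hello world',
  type: 'text',
  next: null,
  prev: null,
  parent: p
});

// Hypothetical helper: concatenate the data of every text node, depth-first.
function getText(nodes) {
  return nodes
    .map(function (node) {
      if (node.type === 'text') {
        return node.data;
      }
      // Tag nodes carry their content in `children`; other node types are skipped.
      return node.children ? getText(node.children) : '';
    })
    .join('');
}

console.log(getText([p]));
```

With the real library, the same helper could be called on the array returned by `parse(...)`, assuming the output matches the shape shown in this README.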
## Testing | ||
Run server and client tests: | ||
```sh | ||
$ npm test | ||
$ npm run lint # npm run lint:fix | ||
``` | ||
Run server tests with coverage: | ||
```sh | ||
$ npm run test:server:coverage | ||
# generate html report | ||
$ npm run test:server:coverage:report | ||
``` | ||
Run client tests: | ||
```sh | ||
$ npm run test:client | ||
``` | ||
Lint files: | ||
```sh | ||
$ npm run lint | ||
# fix lint errors | ||
$ npm run lint:fix | ||
``` | ||
Test TypeScript declaration file for style and correctness: | ||
```sh | ||
$ npm run dtslint | ||
``` | ||
## Release | ||
Only collaborators with credentials can release and publish: | ||
```sh | ||
$ npm run release | ||
$ git push --follow-tags && npm publish | ||
``` | ||
## Special Thanks | ||
- [Contributors](https://github.com/remarkablemark/html-dom-parser/graphs/contributors) | ||
- [htmlparser2](https://github.com/fb55/htmlparser2) | ||
## License | ||
[MIT](https://github.com/remarkablemark/html-dom-parser/blob/master/LICENSE) |
License Policy Violation
License: This package is not allowed per your license policy. Review the package's license to ensure compliance.
Found 1 instance in 1 package
+ Added domhandler@2.4.2 (transitive)
+ Added htmlparser2@3.10.1 (transitive)
+ Added readable-stream@3.6.2 (transitive)
+ Added safe-buffer@5.2.1 (transitive)
+ Added string_decoder@1.3.0 (transitive)
- Removed core-util-is@1.0.3 (transitive)
- Removed domhandler@2.3.0 (transitive)
- Removed htmlparser2@3.9.1 (transitive)
- Removed isarray@1.0.0 (transitive)
- Removed process-nextick-args@2.0.1 (transitive)
- Removed readable-stream@2.3.8 (transitive)
- Removed safe-buffer@5.1.2 (transitive)
- Removed string_decoder@1.1.1 (transitive)
Updated domhandler@2.4.2
Updated htmlparser2@3.10.1