
@desertnet/html-parser
This HTML parser and validator is not a strict HTML parser in the vein of the XHTML strict validators of old. Instead, it aims to parse HTML as browsers do and only surface errors that are likely to confuse browsers or that are indicative of a confused HTML author.
npm install --save @desertnet/html-parser
const HTMLParser = require('@desertnet/html-parser')
const html = `
<p>
<b>I forgot to close my b tag.
</p>
<p>
I&npsb;misspelled the no-break-space entity.
</p>
`.trim()
const errors = HTMLParser.validate(html)
errors.forEach(error => {
  console.log(`${error.message} (line: ${error.line}, column: ${error.column})`)
})
Outputs:
Could not find closing tag for "<b>". (line: 2, column: 3)
Unexpected closing tag, "</p>". Expected closing tag for "<b>". (line: 3, column: 1)
Invalid HTML entity name for "&npsb;". (line: 6, column: 5)
HTMLParser.validate(htmlString) is a static method on the HTMLParser constructor. It parses the HTML fragment in htmlString and returns an array of HTMLParseErrors. If there are no errors, it returns an empty array.
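Because the result is a plain array, it is straightforward to turn it into a pass/fail report, for example in a build script. The following is a minimal sketch and not part of the package: formatReport is a hypothetical helper, and sampleErrors is a hand-written stand-in for a real result that uses only the message, line, and column properties shown in the example above.

```javascript
// Hypothetical helper (not part of @desertnet/html-parser): turns an array
// of error-like objects into a short text report. It relies only on the
// message, line, and column properties.
function formatReport (errors) {
  if (errors.length === 0) {
    return 'No HTML errors found.'
  }
  const lines = errors.map(error =>
    `${error.line}:${error.column} ${error.message}`
  )
  return `${errors.length} HTML error(s) found:\n${lines.join('\n')}`
}

// Hand-written stand-in for the array returned by HTMLParser.validate(html).
const sampleErrors = [
  { message: 'Could not find closing tag for "<b>".', line: 2, column: 3 }
]

console.log(formatReport(sampleErrors))
console.log(formatReport([]))
```

In a real script, the array passed to formatReport would come from HTMLParser.validate, and a non-empty array could be used to fail the build.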
The HTMLParser constructor. It takes no arguments.
const parser = new HTMLParser()
Parses the HTML fragment in htmlString and returns an HTMLNode object containing the parsed HTML.
const parseTree = parser.parse('<h1>Hello world!</h1>')
An HTMLParseError represents an error discovered during parsing of an HTML fragment.
parseTree.errors.forEach(error => {
  console.log(`${error.message} (line: ${error.line}, column: ${error.column})`)
})
The message property is a string containing an English description of the error.
Property that is a number indicating the index into the source string where the error begins.
Property that is a number indicating the index into the source string where the error ends.
The line property is a number indicating the line number where the error begins. Line numbers begin at 1, not 0.
The column property is a number indicating the column of the line where the error begins. Columns also begin at 1, not 0.
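To illustrate how a source index relates to a 1-based line and column, here is a small sketch. It is not the package's implementation: indexToLineColumn is a hypothetical helper, and it assumes that indexes are 0-based and that lines are separated by a single "\n" character. How the package itself treats other line endings or tabs is not stated here.

```javascript
// Hypothetical helper (not part of @desertnet/html-parser): converts a
// 0-based index into a source string to a 1-based { line, column } pair.
// Assumption: lines are separated by '\n' only.
function indexToLineColumn (source, index) {
  let line = 1
  let column = 1
  for (let i = 0; i < index && i < source.length; i++) {
    if (source[i] === '\n') {
      // A newline starts the next line and resets the column.
      line += 1
      column = 1
    } else {
      column += 1
    }
  }
  return { line, column }
}

const source = '<p>\n  <b>Unclosed.\n</p>'
console.log(indexToLineColumn(source, 0))
console.log(indexToLineColumn(source, source.indexOf('<b>')))
```

Under these assumptions, the first character of the string maps to line 1, column 1, and the "<b>" tag indented by two spaces on the second line maps to line 2, column 3.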
HTMLNode is the base class for all node types. It is returned by HTMLParser's .parse method.
console.log(indentedNodeList(parseTree))
function indentedNodeList (node, indent = '') {
  let str = ''
  if (node.children) {
    str += node.children.reduce((prev, child) => {
      return prev + indentedNodeList(child, `${indent} `)
    }, '')
  }
  return `${indent}${node.type}\n${str}`
}
The type property indicates the type of node. It will be one of the following strings:
ROOT: Root node of the tree.
TAG: An HTML tag.
ATTRIBUTE: An attribute of an HTML tag.
TEXT: A text content node.
ENTITY: An HTML entity (e.g. &amp;).
COMMENT: An HTML comment tag.
CLOSETAG: A closing HTML tag.
The children property is either null or an array of HTMLNodes that are the node's children.
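The type and children properties are enough to walk a parse tree recursively. The sketch below counts nodes by type. countNodeTypes is a hypothetical helper, and sampleTree is a hand-built stand-in made of plain objects. The exact tree the parser produces for a given input is not shown here, so the shape of sampleTree is an assumption.

```javascript
// Hypothetical helper (not part of @desertnet/html-parser): counts the
// nodes in an HTMLNode-like tree, grouped by their type string.
function countNodeTypes (node, counts = {}) {
  counts[node.type] = (counts[node.type] || 0) + 1
  // children is either null or an array of child nodes.
  if (node.children) {
    node.children.forEach(child => countNodeTypes(child, counts))
  }
  return counts
}

// Hand-built stand-in for a parse tree; a real tree would come from
// parser.parse(htmlString) and may be shaped differently.
const sampleTree = {
  type: 'ROOT',
  children: [
    {
      type: 'TAG',
      children: [
        { type: 'TEXT', children: null },
        { type: 'ENTITY', children: null }
      ]
    },
    { type: 'COMMENT', children: null }
  ]
}

console.log(countNodeTypes(sampleTree))
```

The same recursion pattern can collect nodes of one type, or gather any other per-node information, by changing what is done at each visit.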
Property that, if non-null, will be an array of two numbers. The first is the index of the source string where the node begins, and the second is the index of the source string where the node ends.
The errors property is an array of HTMLParseErrors associated with this node and its descendants.
Property that is an array of HTMLParseErrors associated with only this node (not its descendants).
On TAG and CLOSETAG type nodes, this is a string of the lowercased tag name.
On TAG type nodes, this is an array of ATTRIBUTE nodes that belong to the tag.
On TAG type nodes, this is a CLOSETAG node, if a closing tag for this node was found.
FAQs
HTML parser and non-strict validator
The npm package @desertnet/html-parser receives a total of 7,258 weekly downloads, so its popularity is classified as popular.
@desertnet/html-parser shows an unhealthy version release cadence and level of project activity, because its last version was released a year ago. It has 1 open source maintainer collaborating on the project.