Research
Security News
Threat Actor Exposes Playbook for Exploiting npm to Build Blockchain-Powered Botnets
A threat actor's playbook for exploiting the npm ecosystem was exposed on the dark web, detailing how to build a blockchain-powered botnet.
parse-entities
The parse-entities npm package is used to parse HTML entities in text. It can decode named and numerical character references in HTML, making it useful for processing and sanitizing HTML content.
Decode named character references
This feature decodes named character references in a string. For example, it converts &quot; to ".
import {parseEntities} from 'parse-entities';
const decoded = parseEntities('The &quot;quick&quot; brown fox');
console.log(decoded); // Output: The "quick" brown fox
Decode numerical character references
This feature decodes numerical character references in a string. For example, it converts &#128512; to 😀.
import {parseEntities} from 'parse-entities';
const decoded = parseEntities('The &#128512; emoji');
console.log(decoded); // Output: The 😀 emoji
Decode mixed character references
This feature allows you to decode a mix of named and numerical character references in a string.
import {parseEntities} from 'parse-entities';
const decoded = parseEntities('The &quot;quick&quot; brown fox jumps over the lazy dog&#128054;');
console.log(decoded); // Output: The "quick" brown fox jumps over the lazy dog🐶
The 'he' package (short for HTML entities) is a robust HTML entity encoder/decoder. It supports both named and numerical character references and offers more configuration options compared to parse-entities.
The 'entities' package is another library for encoding and decoding XML and HTML entities. It provides similar functionality to parse-entities but also includes support for encoding entities, which parse-entities does not offer.
The 'html-entities' package provides methods to encode and decode HTML entities. It supports both named and numerical references and offers additional features like encoding non-ASCII characters.
Parse HTML character references.
This is a small and powerful decoder of HTML character references (often called entities).
You can use this for spec-compliant decoding of character references. It’s small and fast enough to do that well. You can also use this when making a linter, because there are different warnings emitted with reasons for why and positional info on where they happened.
This package is ESM only. In Node.js (version 14.14+, 16.0+), install with npm:
npm install parse-entities
In Deno with esm.sh:
import {parseEntities} from 'https://esm.sh/parse-entities@3'
In browsers with esm.sh:
<script type="module">
import {parseEntities} from 'https://esm.sh/parse-entities@3?bundle'
</script>
import {parseEntities} from 'parse-entities'
console.log(parseEntities('alpha &amp; bravo'))
// => alpha & bravo
console.log(parseEntities('charlie &copycat; delta'))
// => charlie ©cat; delta
console.log(parseEntities('echo &copy; foxtrot &#8800; golf &#x1D306; hotel'))
// => echo © foxtrot ≠ golf 𝌆 hotel
This package exports the identifier parseEntities.
There is no default export.
parseEntities(value[, options])
Parse HTML character references.
options
Configuration (optional).
options.additional
Additional character to accept (string?, default: '').
This allows other characters, without error, when following an ampersand.
options.attribute
Whether to parse value as an attribute value (boolean?, default: false).
This results in slightly different behavior.
options.nonTerminated
Whether to allow nonterminated references (boolean, default: true).
For example, &copycat for ©cat.
This behavior is compliant with the spec but can lead to unexpected results.
options.position
Starting position of value (Position or Point, optional).
Useful when dealing with values nested in some sort of syntax tree.
The default is:
{line: 1, column: 1, offset: 0}
options.warning
Error handler (Function?).
options.text
Text handler (Function?).
options.reference
Reference handler (Function?).
options.warningContext
Context used when calling warning ('*', optional).
options.textContext
Context used when calling text ('*', optional).
options.referenceContext
Context used when calling reference ('*', optional).
Returns the decoded value (string).
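As a rough illustration of options.position, the sketch below computes a starting point for a value that sits inside a larger document. The helper name pointAt and the sample document are hypothetical and not part of parse-entities; only the {line, column, offset} shape comes from the default shown above.

```javascript
// Hypothetical helper (not part of parse-entities): compute the point
// ({line, column, offset}) of a zero-based offset inside a larger document.
function pointAt(doc, offset) {
  let line = 1
  let column = 1
  for (let index = 0; index < offset && index < doc.length; index++) {
    if (doc.charCodeAt(index) === 10 /* line feed */) {
      line++
      column = 1
    } else {
      column++
    }
  }
  return {line, column, offset: Math.min(offset, doc.length)}
}

// Made-up document; the value of interest starts at `bravo`.
const doc = 'alpha\nbravo &amp charlie'
const start = doc.indexOf('bravo')
console.log(pointAt(doc, start)) // => { line: 2, column: 1, offset: 6 }

// Possible usage (assumes parse-entities is installed; not run here):
// import {parseEntities} from 'parse-entities'
// parseEntities(doc.slice(start), {position: pointAt(doc, start)})
```

Passing a point computed this way should make any positions reported by the handlers relative to the enclosing document rather than to the sliced value.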
function warning(reason, point, code)
Error handler.
this (*): refers to warningContext when given to parseEntities
reason (string): human readable reason for emitting a parse error
point (Point): place where the error occurred
code (number): machine readable code of the error
The following codes are used:
Code | Example | Note |
---|---|---|
1 | foo &amp bar | Missing semicolon (named) |
2 | foo &#123 bar | Missing semicolon (numeric) |
3 | Foo &bar baz | Empty (named) |
4 | Foo &# | Empty (numeric) |
5 | Foo &bar; baz | Unknown (named) |
6 | Foo &#128; baz | Disallowed reference |
7 | Foo &#xD800; baz | Prohibited: outside permissible unicode range |
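For a linter, one possible way to use the warning handler is to collect each call into an array. The sketch below is only an illustration: createWarningCollector and warningLabels are hypothetical names, the labels are copied from the table above, and the handler is called by hand with made-up arguments so that the block runs without the package installed.

```javascript
// Labels copied from the table of codes above.
const warningLabels = {
  1: 'Missing semicolon (named)',
  2: 'Missing semicolon (numeric)',
  3: 'Empty (named)',
  4: 'Empty (numeric)',
  5: 'Unknown (named)',
  6: 'Disallowed reference',
  7: 'Prohibited: outside permissible unicode range'
}

// Hypothetical helper: returns a `warning` handler plus the array it fills.
function createWarningCollector() {
  const messages = []
  function warning(reason, point, code) {
    messages.push({
      code,
      label: warningLabels[code] || 'Unknown code',
      reason,
      line: point.line,
      column: point.column,
      offset: point.offset
    })
  }
  return {messages, warning}
}

// Simulated call using the documented (reason, point, code) signature.
// The reason text and point values here are made up for illustration.
const collector = createWarningCollector()
collector.warning('Example reason', {line: 1, column: 5, offset: 4}, 1)
console.log(collector.messages[0].label) // => Missing semicolon (named)

// Possible usage (assumes parse-entities is installed; not run here):
// import {parseEntities} from 'parse-entities'
// parseEntities('foo &amp bar', {warning: collector.warning})
```

Because the collector closes over its own array, it does not rely on this, so options.warningContext is not needed in this sketch.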
function text(value, position)
Text handler.
this (*): refers to textContext when given to parseEntities
value (string): string of content
position (Position): place where value starts and ends
function reference(value, position, source)
Character reference handler.
this (*): refers to referenceContext when given to parseEntities
value (string): decoded character reference
position (Position): place where source starts and ends
source (string): raw source of character reference
This package is fully typed with TypeScript.
It exports the additional types Options, WarningHandler, ReferenceHandler, and TextHandler.
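The text and reference handlers described above could be combined to split a value into segments, for example to keep the raw source of each reference next to its decoded form. This is a minimal sketch: createSegmentCollector is a hypothetical name, the {start, end} shape of Position is an assumption, and the handlers are invoked by hand with made-up arguments rather than by parseEntities.

```javascript
// Hypothetical helper: returns `text` and `reference` handlers that record
// every segment they receive, in call order.
function createSegmentCollector() {
  const segments = []
  return {
    segments,
    text(value, position) {
      segments.push({type: 'text', value, position})
    },
    reference(value, position, source) {
      segments.push({type: 'reference', value, source, position})
    }
  }
}

// Simulated calls following the documented signatures. The positions are
// made up, and their {start, end} shape is an assumption.
const collector = createSegmentCollector()
collector.text('echo ', {
  start: {line: 1, column: 1, offset: 0},
  end: {line: 1, column: 6, offset: 5}
})
collector.reference(
  '©',
  {
    start: {line: 1, column: 6, offset: 5},
    end: {line: 1, column: 12, offset: 11}
  },
  '&copy;'
)

console.log(collector.segments.map((segment) => segment.value).join(''))
// => echo ©

// Possible usage (assumes parse-entities is installed; not run here):
// import {parseEntities} from 'parse-entities'
// parseEntities('echo &copy; foxtrot', {
//   text: collector.text,
//   reference: collector.reference
// })
```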
This package is at least compatible with all maintained versions of Node.js. As of now, that is Node.js 14.14+ and 16.0+. It also works in Deno and modern browsers.
This package is safe: it matches the HTML spec to parse character references.
wooorm/stringify-entities: encode HTML character references
wooorm/character-entities: info on character references
wooorm/character-entities-html4: info on HTML4 character references
wooorm/character-entities-legacy: info on legacy character references
wooorm/character-reference-invalid: info on invalid numeric character references
Yes please! See How to Contribute to Open Source.
FAQs
Parse HTML character references
The npm package parse-entities receives a total of 7,183,959 weekly downloads. As such, its popularity was classified as popular.
We found that parse-entities demonstrated an unhealthy version release cadence and project activity because the last version was released a year ago. It has 1 open source maintainer collaborating on the project.