What is parse-entities?
The parse-entities npm package decodes HTML character references (often called entities) in text. It handles both named references such as &amp; and numerical references such as &#169;, which makes it useful when processing HTML content.
What are the main features of parse-entities?
Decode named character references
This feature allows you to decode named character references in a string. For example, it converts &quot; to ".
const parseEntities = require('parse-entities');
const decoded = parseEntities('The &quot;quick&quot; brown fox');
console.log(decoded); // Output: The "quick" brown fox
Decode numerical character references
This feature allows you to decode numerical character references in a string. For example, it converts &#128512; to 😀.
const parseEntities = require('parse-entities');
const decoded = parseEntities('The &#128512; emoji');
console.log(decoded); // Output: The 😀 emoji
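As a rough illustration of what decoding a numerical reference involves, the sketch below maps a decimal or hexadecimal reference to its code point with the built-in String.fromCodePoint. The helper name decodeNumericReference is hypothetical and not part of parse-entities, and it ignores the edge cases the package handles (missing semicolons, disallowed and prohibited code points).

```javascript
// Hypothetical helper, not part of parse-entities: decodes a single,
// well-formed numeric character reference such as "&#128512;" or "&#x1F600;".
function decodeNumericReference(reference) {
  const match = /^&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));$/.exec(reference);
  if (!match) return reference; // Not a numeric reference: leave untouched.

  const codePoint = match[1] ? parseInt(match[1], 16) : parseInt(match[2], 10);

  // Code points above U+10FFFF would make String.fromCodePoint throw,
  // so this simplified sketch returns the input unchanged instead.
  if (codePoint > 0x10ffff) return reference;
  return String.fromCodePoint(codePoint);
}

console.log(decodeNumericReference('&#128512;')); // 😀
console.log(decodeNumericReference('&#x1F600;')); // 😀
console.log(decodeNumericReference('&quot;')); // &quot; (named, so unchanged)
```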
Decode mixed character references
This feature allows you to decode a mix of named and numerical character references in a string.
const parseEntities = require('parse-entities');
const decoded = parseEntities('The &quot;quick&quot; brown fox jumps over the lazy dog&#128054;');
console.log(decoded); // Output: The "quick" brown fox jumps over the lazy dog🐶
Other packages similar to parse-entities
he
The 'he' package (short for HTML entities) is a robust HTML entity encoder/decoder. It supports both named and numerical character references and offers more configuration options compared to parse-entities.
entities
The 'entities' package is another library for encoding and decoding XML and HTML entities. It provides similar functionality to parse-entities but also includes support for encoding entities, which parse-entities does not offer.
html-entities
The 'html-entities' package provides methods to encode and decode HTML entities. It supports both named and numerical references and offers additional features like encoding non-ASCII characters.
parse-entities
Parse HTML character references: fast, spec-compliant, positional information.
Installation
npm:
npm install parse-entities
Usage
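var decode = require('parse-entities');
decode('alpha &amp; bravo');
// => alpha & bravo
decode('charlie &copycat; delta');
// => charlie ©cat; delta
decode('echo &copy; foxtrot &#8800; golf &#x1D306; hotel');
// => echo © foxtrot ≠ golf 𝌆 hotel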
API
parseEntities(value[, options])
options
- additional (string, optional, default: ''): Additional character to accept when following an ampersand (without error)
- attribute (boolean, optional, default: false): Whether to parse value as an attribute value
- nonTerminated (boolean, default: true): Whether to allow non-terminated entities, such as decoding &copycat; to ©cat;. This behaviour is spec-compliant but can lead to unexpected results
- warning (Function, optional): Error handler
- text (Function, optional): Text handler
- reference (Function, optional): Reference handler
- warningContext ('*', optional): Context used when invoking warning
- textContext ('*', optional): Context used when invoking text
- referenceContext ('*', optional): Context used when invoking reference
- position (Location or Position, optional): Starting position of value, useful when dealing with values nested in some sort of syntax tree. The default is:
{
  "start": {
    "line": 1,
    "column": 1,
    "offset": 0
  },
  "indent": []
}
Returns
string: Decoded value.
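Several of the options above can be passed in a single call. The following is a minimal sketch rather than a recommended setup: the helper name decodeStrict and the sample position are made up for illustration, and the decoder is passed in as an argument so the sketch does not depend on how the package is loaded. It turns off nonTerminated and supplies a starting position for a value that sits inside a larger document.

```javascript
// Hypothetical helper: `parseEntities` is injected by the caller.
function decodeStrict(parseEntities, value, start) {
  return parseEntities(value, {
    // Ask the parser not to accept non-terminated references.
    nonTerminated: false,
    // Where `value` begins in the enclosing document; defaults to the
    // start of a document when the caller does not know better.
    position: start || { line: 1, column: 1, offset: 0 }
  });
}

// Usage, assuming the CommonJS build of parse-entities is installed:
// const parseEntities = require('parse-entities');
// decodeStrict(parseEntities, 'charlie &copy; delta', { line: 3, column: 5, offset: 42 });
```

Passing the decoder in keeps the helper easy to exercise with a stand-in function in tests.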
function warning(reason, position, code)
Error handler.
Context
this refers to warningContext when given to parseEntities.
Parameters
- reason (string): Reason (human-readable) for triggering a parse error
- position (Position): Place at which the parse error occurred
- code (number): Identifier of reason for triggering a parse error
The following codes are used:
Code | Example | Note |
---|---|---|
1 | foo &amp bar | Missing semicolon (named) |
2 | foo &#123 bar | Missing semicolon (numeric) |
3 | Foo &bar baz | Ampersand did not start a reference |
4 | Foo &# | Empty reference |
5 | Foo &bar; baz | Unknown entity |
6 | Foo &#128; baz | Disallowed reference |
7 | Foo &#xD800; baz | Prohibited: outside permissible unicode range |
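A warning handler might use these codes to build diagnostics. The following is a minimal sketch, assuming the handler signature and the code table documented above; collectWarnings and WARNING_NAMES are illustrative names, not part of the package. It also uses warningContext so that `this` inside the handler is the array that collects the results.

```javascript
// Short labels for the codes in the table above (the labels are made up).
const WARNING_NAMES = {
  1: 'missing-semicolon-named',
  2: 'missing-semicolon-numeric',
  3: 'not-a-reference',
  4: 'empty-reference',
  5: 'unknown-entity',
  6: 'disallowed-reference',
  7: 'prohibited-reference'
};

// Hypothetical helper: `parseEntities` is injected by the caller.
function collectWarnings(parseEntities, value) {
  const warnings = [];

  // A regular function (not an arrow) so that `this` is `warningContext`.
  function onWarning(reason, position, code) {
    this.push({
      code,
      name: WARNING_NAMES[code] || 'unknown',
      reason,
      line: position.line,
      column: position.column
    });
  }

  const result = parseEntities(value, {
    warning: onWarning,
    warningContext: warnings
  });

  return { result, warnings };
}

// Usage, assuming the CommonJS build of parse-entities is installed:
// const parseEntities = require('parse-entities');
// console.log(collectWarnings(parseEntities, 'foo &amp bar'));
```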
function text(value, location)
Text handler.
Context
this refers to textContext when given to parseEntities.
Parameters
- value (string): String of content
- location (Location): Location at which value starts and ends
function reference(value, location, source)
Character reference handler.
Context
this refers to referenceContext when given to parseEntities.
Parameters
- value (string): Encoded character reference
- location (Location): Location at which value starts and ends
- source (Location): Source of character reference
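The text and reference handlers can be combined to split a value into segments. The sketch below is one way to do that under two assumptions: that the handlers are invoked in document order as the value is parsed, and that a Location carries start and end points with line, column, and offset, like the default position shown earlier. The helper name toSegments is hypothetical.

```javascript
// Hypothetical helper: records text runs and character references
// in the order the handlers are called. `parseEntities` is injected.
function toSegments(parseEntities, value) {
  const segments = [];

  parseEntities(value, {
    text(content, location) {
      segments.push({
        type: 'text',
        value: content,
        start: location.start.offset
      });
    },
    reference(content, location, source) {
      segments.push({
        type: 'reference',
        value: content,
        source, // stored as received, whatever the handler is given
        start: location.start.offset
      });
    }
  });

  return segments;
}

// Usage, assuming the CommonJS build of parse-entities is installed:
// const parseEntities = require('parse-entities');
// console.log(toSegments(parseEntities, 'alpha &amp; bravo'));
```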
License
MIT © Titus Wormer