Huge News!Announcing our $40M Series B led by Abstract Ventures.Learn More →

Dependencies

Maintainers

Versions

Alerts

File Explorer

Advanced tools

License

Install Socket

Detect and block malicious and high-risk dependencies

Install

he

A robust HTML entities encoder/decoder with full Unicode support.

0.4.0
Source
npm

Version published: 10 years ago

Weekly downloads: 21M; decreased by-7.48%

Maintainers: 1

Weekly downloads

Created: 11 years ago

What is he?

The 'he' npm package is a robust HTML entity encoder/decoder written in JavaScript. It supports all standardized named character references, numeric character references, and it can encode/decode any arbitrary data in UTF-8 encoding. It is designed to work in all modern web browsers and can be used with Node.js.

What are he's main functionalities?

Encode HTML entities

This feature allows you to encode text into HTML entities, which is useful for preventing XSS attacks by sanitizing user input before inserting it into the DOM.

const he = require('he');
const encoded = he.encode('Schrödinger’s cat & Co.');
console.log(encoded); // 'Schrödinger’s cat &amp; Co.'

Decode HTML entities

This feature allows you to decode HTML entities back into their original form, which is useful for displaying encoded content as plain text.

const he = require('he');
const decoded = he.decode('Schrödinger’s cat &amp; Co.');
console.log(decoded); // 'Schrödinger’s cat & Co.'

Escape XML entities

This feature allows you to escape XML entities, which is similar to encoding but specifically for XML content.

const he = require('he');
const escaped = he.escape('Schrödinger’s cat & Co.');
console.log(escaped); // 'Schrödinger’s cat &amp; Co.'

Unescape XML entities

This feature allows you to unescape XML entities, which is the reverse process of escaping and is used to convert XML-encoded content back to its original form.

const he = require('he');
const unescaped = he.unescape('Schrödinger’s cat &amp; Co.');
console.log(unescaped); // 'Schrödinger’s cat & Co.'

Other packages similar to he

he

he (for “HTML entities”) is a robust HTML entity encoder/decoder written in JavaScript. It supports all standardized named character references as per HTML, handles ambiguous ampersands and other edge cases just like a browser would, has an extensive test suite, and — contrary to many other JavaScript solutions — he handles astral Unicode symbols just fine. An online demo is available.

Installation

Via npm:

npm install he

Via Bower:

bower install he

Via Component:

component install mathiasbynens/he

In a browser:

<script src="he.js"></script>

In Narwhal, Node.js, and RingoJS:

var he = require('he');

In Rhino:

load('he.js');

Using an AMD loader like RequireJS:

require(
  {
    'paths': {
      'he': 'path/to/he'
    }
  },
  ['he'],
  function(he) {
    console.log(he);
  }
);

API

`he.version`

A string representing the semantic version number.

`he.encode(text, options)`

This function takes a string of text and encodes (by default) any symbols that aren’t printable ASCII symbols, replacing them with character references. As long as the input string contains allowed code points only, the return value of this function is always valid HTML.

he.encode('foo © bar ≠ baz 𝌆 qux');
// → 'foo &#xA9; bar &#x2260; baz &#x1D306; qux'

The options object is optional. It recognizes the following properties:

`useNamedReferences`

The default value for the useNamedReferences option is false. This means that encode() will not use any named character references (e.g. ©) in the output — hexadecimal escapes (e.g. ©) will be used instead. Set it to true to enable the use of named references.

Note that if compatibility with older browsers is a concern, this option should remain disabled.

// Using the global default setting (defaults to `false`):
he.encode('foo © bar ≠ baz 𝌆 qux');
// → 'foo &#xA9; bar &#x2260; baz &#x1D306; qux'

// Passing an `options` object to `encode`, to explicitly disallow named references:
he.encode('foo © bar ≠ baz 𝌆 qux', {
  'useNamedReferences': false
});
// → 'foo &#xA9; bar &#x2260; baz &#x1D306; qux'

// Passing an `options` object to `encode`, to explicitly allow named references:
he.encode('foo © bar ≠ baz 𝌆 qux', {
  'useNamedReferences': true
});
// → 'foo &copy; bar &ne; baz &#x1D306; qux'

`encodeEverything`

The default value for the encodeEverything option is false. This means that encode() will not use any character references for printable ASCII symbols that don’t need escaping. Set it to true to encode every symbol in the input string.

// Using the global default setting (defaults to `false`):
he.encode('foo © bar ≠ baz 𝌆 qux');
// → 'foo &#xA9; bar &#x2260; baz &#x1D306; qux'

// Passing an `options` object to `encode`, to explicitly encode all symbols:
he.encode('foo © bar ≠ baz 𝌆 qux', {
  'encodeEverything': true
});
// → '&#x66;&#x6F;&#x6F;&#x20;&#xA9;&#x20;&#x62;&#x61;&#x72;&#x20;&#x2260;&#x20;&#x62;&#x61;&#x7A;&#x20;&#x1D306;&#x20;&#x71;&#x75;&#x78;'

// This setting can be combined with the `useNamedReferences` option:
he.encode('foo © bar ≠ baz 𝌆 qux', {
  'encodeEverything': true,
  'useNamedReferences': true
});
// → '&#x66;&#x6F;&#x6F;&#x20;&copy;&#x20;&#x62;&#x61;&#x72;&#x20;&ne;&#x20;&#x62;&#x61;&#x7A;&#x20;&#x1D306;&#x20;&#x71;&#x75;&#x78;'

Overriding default `encode` options globally

The global default setting can be overridden by modifying the he.encode.options object. This saves you from passing in an options object for every call to encode if you want to use the non-default setting.

// Read the global default setting:
he.encode.options.useNamedReferences;
// → `false` by default

// Override the global default setting:
he.encode.options.useNamedReferences = true;

// Using the global default setting, which is now `true`:
he.encode('foo © bar ≠ baz 𝌆 qux');
// → 'foo &copy; bar &ne; baz &#x1D306; qux'

`he.decode(html, options)`

This function takes a string of HTML and decodes any named and numerical character references in it using the algorithm described in section 12.2.4.69 of the HTML spec.

he.decode('foo &copy; bar &ne; baz &#x1D306; qux');
// → 'foo © bar ≠ baz 𝌆 qux'

The options object is optional. It recognizes the following properties:

`isAttributeValue`

The default value for the isAttributeValue option is false. This means that decode() will decode the string as if it were used in a text context in an HTML document. HTML has different rules for parsing character references in attribute values — set this option to true to treat the input string as if it were used as an attribute value.

// Using the global default setting (defaults to `false`, i.e. HTML text context):
he.decode('foo&ampbar');
// → 'foo&bar'

// Passing an `options` object to `decode`, to explicitly assume an HTML text context:
he.decode('foo&ampbar', {
  'isAttributeValue': false
});
// → 'foo&bar'

// Passing an `options` object to `decode`, to explicitly assume an HTML attribute value context:
he.decode('foo&ampbar', {
  'isAttributeValue': true
});
// → 'foo&ampbar'

`strict`

The default value for the strict option is false. This means that decode() will decode any HTML text content you feed it, even if it contains any entities that cause parse errors. To throw an error when such invalid HTML is encountered, set the strict option to true. This option makes it possible to use he as part of HTML parsers and HTML validators.

// Using the global default setting (defaults to `false`, i.e. error-tolerant mode):
he.decode('foo&ampbar');
// → 'foo&bar'

// Passing an `options` object to `decode`, to explicitly enable error-tolerant mode:
he.decode('foo&ampbar', {
  'strict': false
});
// → 'foo&bar'

// Passing an `options` object to `decode`, to explicitly enable strict mode:
he.decode('foo&ampbar', {
  'strict': true
});
// → Parse error

Overriding default `decode` options globally

The global default settings for the decode function can be overridden by modifying the he.decode.options object. This saves you from passing in an options object for every call to decode if you want to use a non-default setting.

// Read the global default setting:
he.decode.options.isAttributeValue;
// → `false` by default

// Override the global default setting:
he.decode.options.isAttributeValue = true;

// Using the global default setting, which is now `true`:
he.decode('foo&ampbar');
// → 'foo&ampbar'

`he.escape(text)`

This function takes a string of text and escapes it for use in text contexts in XML or HTML documents. Only the following characters are escaped: &, <, >, ", and '.

he.escape('<img src=\'x\' onerror="prompt(1)">');
// → '&lt;img src=&#x27;x&#x27; onerror=&quot;prompt(1)&quot;&gt;'

`he.unescape(html, options)`

he.unescape is an alias for he.decode. It takes a string of HTML and decodes any named and numerical character references in it.

Using the `he` binary

To use the he binary in your shell, simply install he globally using npm:

npm install -g he

After that you will be able to encode/decode HTML entities from the command line:

$ he --encode 'föo ♥ bår 𝌆 baz'
f&#xF6;o &#x2665; b&#xE5;r &#x1D306; baz

$ he --encode --use-named-refs 'föo ♥ bår 𝌆 baz'
f&ouml;o &hearts; b&aring;r &#x1D306; baz

$ he --decode 'f&ouml;o &hearts; b&aring;r &#x1D306; baz'
föo ♥ bår 𝌆 baz

Read a local text file, encode it for use in an HTML text context, and save the result to a new file:

$ he --encode < foo.txt > foo-escaped.html

Or do the same with an online text file:

$ curl -sL "http://git.io/HnfEaw" | he --encode > escaped.html

Or, the opposite — read a local file containing a snippet of HTML in a text context, decode it back to plain text, and save the result to a new file:

$ he --decode < foo-escaped.html > foo.txt

Or do the same with an online HTML snippet:

$ curl -sL "http://git.io/HnfEaw" | he --decode > decoded.txt

See he --help for the full list of options.

Support

he has been tested in at least Chrome 27-29, Firefox 3-22, Safari 4-6, Opera 10-12, IE 6-10, Node.js v0.10.0, Narwhal 0.3.2, RingoJS 0.8-0.9, PhantomJS 1.9.0, and Rhino 1.7RC4.

Unit tests & code coverage

After cloning this repository, run npm install to install the dependencies needed for he development and testing. You may want to install Istanbul globally using npm install istanbul -g.

Once that’s done, you can run the unit tests in Node using npm test or node tests/tests.js. To run the tests in Rhino, Ringo, Narwhal, and web browsers as well, use grunt test.

To generate the code coverage report, use grunt cover.

Acknowledgements

Thanks to Simon Pieters (@zcorpan) for the many suggestions.

Author


Mathias Bynens

License

he is available under the MIT license.

Keywords

FAQs

What is he?

Is he popular?

Is he well maintained?

Package last updated on 23 May 2014

Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

he

What is he?

What are he's main functionalities?

Other packages similar to he

entities

html-entities

escape-html

Installation

API

he.version

he.encode(text, options)

useNamedReferences

encodeEverything

Overriding default encode options globally

he.decode(html, options)

isAttributeValue

strict

Overriding default decode options globally

he.escape(text)

he.unescape(html, options)

Using the he binary

Support

Unit tests & code coverage

Acknowledgements

Author

License

Keywords

Related posts

RubyGems.org Adds New Maintainer Role

Node.js Implements Stricter Policies for Semver-Major Pull Requests Ahead of Release Deadlines

Roblox Developers Targeted with npm Packages Infected with Skuld Infostealer and Blank Grabber

`he.version`

`he.encode(text, options)`

`useNamedReferences`

`encodeEverything`

Overriding default `encode` options globally

`he.decode(html, options)`

`isAttributeValue`

`strict`

Overriding default `decode` options globally

`he.escape(text)`

`he.unescape(html, options)`

Using the `he` binary