What is hast-util-sanitize?
The `hast-util-sanitize` package is a utility for sanitizing HTML content represented as HAST (Hypertext Abstract Syntax Tree). It helps in cleaning and securing HTML by removing potentially harmful elements and attributes, making it safe for use in web applications.
What are hast-util-sanitize's main functionalities?
Basic Sanitization
Calling sanitize with only a HAST tree applies the default GitHub-style schema and returns a new tree with potentially harmful elements and attributes removed.
const sanitize = require('hast-util-sanitize');
const hast = { type: 'element', tagName: 'div', properties: { className: 'foo' }, children: [] };
const cleanHast = sanitize(hast);
console.log(cleanHast);
Custom Schema
This feature allows you to define a custom schema for sanitization, specifying which tags and attributes are allowed.
const sanitize = require('hast-util-sanitize');
const schema = { tagNames: ['div', 'span'], attributes: { '*': ['className'] } };
const hast = { type: 'element', tagName: 'div', properties: { className: 'foo', id: 'bar' }, children: [] };
const cleanHast = sanitize(hast, schema);
console.log(cleanHast);
Sanitizing Nested Elements
Sanitization applies to the whole tree, so potentially harmful elements are removed even when they are deeply nested. Here, a script element inside a div is stripped.
const sanitize = require('hast-util-sanitize');
const hast = { type: 'element', tagName: 'div', properties: {}, children: [ { type: 'element', tagName: 'script', properties: {}, children: [] } ] };
const cleanHast = sanitize(hast);
console.log(cleanHast);
Other packages similar to hast-util-sanitize
sanitize-html
The `sanitize-html` package is a powerful library for cleaning up user-submitted HTML, removing any potentially harmful elements and attributes. It offers a high level of customization and is widely used for sanitizing HTML content. Compared to `hast-util-sanitize`, `sanitize-html` works directly with HTML strings rather than HAST.
dompurify
The `dompurify` package is a fast and tolerant XSS sanitizer for HTML, MathML, and SVG. It works by parsing the input and then serializing it back to a string, ensuring that only safe content is retained. Unlike `hast-util-sanitize`, `dompurify` operates on DOM nodes and HTML strings.
xss
The `xss` package is a robust library for filtering and sanitizing HTML to prevent XSS attacks. It provides a rich set of options for customizing the sanitization process. While `hast-util-sanitize` focuses on HAST, `xss` works directly with HTML strings.
hast-util-sanitize

Sanitize HAST.
Installation
npm:
npm install hast-util-sanitize
Usage
var h = require('hastscript')
var u = require('unist-builder')
var sanitize = require('hast-util-sanitize')
var toHTML = require('hast-util-to-html')
var tree = h('div', {onmouseover: 'alert("alpha")'}, [
  h(
    'a',
    {href: 'jAva script:alert("bravo")', onclick: 'alert("charlie")'},
    'delta'
  ),
  u('text', '\n'),
  h('script', 'alert("charlie")'),
  u('text', '\n'),
  h('img', {src: 'x', onerror: 'alert("delta")'}),
  u('text', '\n'),
  h('iframe', {src: 'javascript:alert("echo")'}),
  u('text', '\n'),
  h('math', h('mi', {'xlink:href': 'data:x,<script>alert("foxtrot")</script>'}))
])
var unsanitized = toHTML(tree)
var sanitized = toHTML(sanitize(tree))
console.log(unsanitized)
console.log(sanitized)
Unsanitized:
<div onmouseover="alert(&#x22;alpha&#x22;)"><a href="jAva script:alert(&#x22;bravo&#x22;)" onclick="alert(&#x22;charlie&#x22;)">delta</a>
<script>alert("charlie")</script>
<img src="x" onerror="alert(&#x22;delta&#x22;)">
<iframe src="javascript:alert(&#x22;echo&#x22;)"></iframe>
<math><mi xlink:href="data:x,<script>alert(&#x22;foxtrot&#x22;)</script>"></mi></math></div>
Sanitized:
<div><a>delta</a>
<img src="x">
</div>
API
sanitize(node[, schema])
Sanitize the given HAST tree.
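Parameters
node (HASTNode): HAST tree to sanitize.
schema (Object, optional): Schema defining how to sanitize; see Schema below.
Returns
HASTNode: A new node.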
Schema
Configuration. If not given, defaults to GitHub-style sanitization.
If any top-level key isn’t given, it defaults to GitHub’s style too.
For a thorough sample, see the package’s github.json.
To extend the standard schema with a few changes, clone github.json like so:
var h = require('hastscript')
var merge = require('deepmerge')
var gh = require('hast-util-sanitize/lib/github')
var sanitize = require('hast-util-sanitize')
var schema = merge(gh, {attributes: {'*': ['className']}})
var tree = sanitize(h('div', {className: ['foo']}), schema)
console.log(tree)
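The reason for a deep merge rather than a plain override can be sketched without any dependencies. The `defaults` object below is a small stand-in, not the real github.json, and the sketch assumes that top-level keys are combined shallowly, as the sentence about top-level defaults suggests.

```javascript
// Stand-in for a small slice of the default schema (NOT the real github.json).
const defaults = {
  tagNames: ['div', 'a'],
  attributes: {a: ['href'], '*': ['title']}
}

// Shallow override: the whole `attributes` map is replaced, so `a: ['href']` is lost.
const shallow = Object.assign({}, defaults, {attributes: {'*': ['className']}})

// Hand-written deep extension: keep every default and append `className` to '*'.
const extended = Object.assign({}, defaults, {
  attributes: Object.assign({}, defaults.attributes, {
    '*': defaults.attributes['*'].concat('className')
  })
})

console.log(shallow.attributes) // { '*': [ 'className' ] }
console.log(extended.attributes) // { a: [ 'href' ], '*': [ 'title', 'className' ] }
```

With the shallow version, links would lose their href because the default attribute lists are gone; the extended version only adds to them.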
attributes
Map of tag-names to allowed attributes (Object.<Array.<string>>).
The special '*' key sets attributes allowed on all elements.
One special value, namely 'data*', can be used to allow all data properties.
"attributes": {
"a": [
"href"
],
"img": [
"src",
"longDesc"
],
"*": [
"abbr",
"accept",
"acceptCharset",
"vspace",
"width",
"itemProp"
]
}
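As a rough illustration of how such a map can be consulted, the sketch below checks one property name against a tag's own list plus the '*' list. The helpers `listFor` and `isAllowedAttribute` are hypothetical and not part of the package, and the test for what counts as a data property is an assumption.

```javascript
// Sample map in the shape shown above (a small stand-in, not the full default).
const attributes = {
  a: ['href'],
  img: ['src', 'longDesc'],
  '*': ['width', 'data*']
}

// Hypothetical helper: read a list from the map without touching the prototype chain.
function listFor(map, key) {
  return Object.prototype.hasOwnProperty.call(map, key) ? map[key] : []
}

// Hypothetical helper: is property `name` allowed on element `tagName`?
function isAllowedAttribute(map, tagName, name) {
  const allowed = listFor(map, tagName).concat(listFor(map, '*'))
  if (allowed.includes(name)) return true
  // Assumption: a data property is any name longer than 4 characters starting with "data".
  const isData = name.length > 4 && name.slice(0, 4).toLowerCase() === 'data'
  return isData && allowed.includes('data*')
}

console.log(isAllowedAttribute(attributes, 'a', 'href')) // true
console.log(isAllowedAttribute(attributes, 'a', 'onClick')) // false
console.log(isAllowedAttribute(attributes, 'div', 'dataFoo')) // true
```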
tagNames
List of allowed tag-names (Array.<string>).
"tagNames": [
"h1",
"h2",
"h3",
"strike",
"summary",
"details"
]
protocols
Map of protocols to support for attributes (Object.<Array.<string>>).
"protocols": {
"href": [
"http",
"https",
"mailto"
],
"longDesc": [
"http",
"https"
]
}
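A protocol check of this kind might look like the following simplified sketch. The function `hasAllowedProtocol` is hypothetical, comparison is case-sensitive by assumption, and the package's actual rules may differ in their details.

```javascript
// Sample map in the shape shown above.
const protocols = {
  href: ['http', 'https', 'mailto'],
  longDesc: ['http', 'https']
}

// Hypothetical helper: does `value` use a protocol allowed for property `prop`?
function hasAllowedProtocol(map, prop, value) {
  const allowed = Object.prototype.hasOwnProperty.call(map, prop) ? map[prop] : []
  // Assumption: properties without an entry are not restricted.
  if (allowed.length === 0) return true

  const url = String(value)
  const colon = url.indexOf(':')
  const stop = url.search(/[\/?#]/)
  // No colon, or a colon only after a path, query, or fragment: a relative URL.
  if (colon === -1 || (stop !== -1 && stop < colon)) return true

  return allowed.includes(url.slice(0, colon))
}

console.log(hasAllowedProtocol(protocols, 'href', 'https://example.com')) // true
console.log(hasAllowedProtocol(protocols, 'href', 'jAva script:alert("bravo")')) // false
console.log(hasAllowedProtocol(protocols, 'href', '/a:b')) // true
```

The second call mirrors the link in the usage example, whose href is dropped in the sanitized output.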
ancestors
Map of tag-names to their required ancestral elements (Object.<Array.<string>>).
"ancestors": {
"li": [
"ol",
"ul"
],
"tr": [
"table"
]
}
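The idea can be sketched as a lookup against the names of the enclosing elements. The helper `hasRequiredAncestor` and its `stack` argument are hypothetical, and the sketch assumes that any one of the listed ancestors is sufficient.

```javascript
// Sample map in the shape shown above.
const ancestors = {
  li: ['ol', 'ul'],
  tr: ['table']
}

// Hypothetical helper. `stack` holds the tag names of the enclosing elements,
// outermost first.
function hasRequiredAncestor(map, tagName, stack) {
  // Tags without an entry have no ancestor requirement.
  if (!Object.prototype.hasOwnProperty.call(map, tagName)) return true
  // Assumption: one matching ancestor anywhere up the stack is enough.
  return map[tagName].some((name) => stack.includes(name))
}

console.log(hasRequiredAncestor(ancestors, 'li', ['div', 'ul'])) // true
console.log(hasRequiredAncestor(ancestors, 'li', ['div'])) // false
console.log(hasRequiredAncestor(ancestors, 'p', [])) // true
```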
clobber
List of allowed attribute-names which can clobber (Array.<string>).
"clobber": [
"name",
"id"
]
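Values of these attributes are kept but prefixed, using the clobberPrefix setting described next, so that user content cannot shadow names the host page relies on. A minimal sketch follows; the helper `prefixClobbering` is hypothetical, and both the plain string concatenation and the trailing hyphen in the prefix are assumptions of this sketch.

```javascript
// Sample settings. The trailing hyphen in the prefix is an assumption of this sketch.
const schema = {
  clobber: ['name', 'id'],
  clobberPrefix: 'user-content-'
}

// Hypothetical helper: prefix the value of a potentially clobbering attribute.
function prefixClobbering(settings, name, value) {
  return settings.clobber.includes(name) ? settings.clobberPrefix + value : value
}

console.log(prefixClobbering(schema, 'id', 'toc')) // user-content-toc
console.log(prefixClobbering(schema, 'title', 'toc')) // toc
```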
clobberPrefix
Prefix (string) to use before potentially clobbering properties.
"clobberPrefix": "user-content"
strip
Tag-names to strip from the tree (Array.<string>).
By default, unsafe elements are replaced by their content. Some elements should, however, be entirely stripped from the tree.
"strip": [
"script"
]
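The difference between replacing an element by its content and stripping it entirely can be shown with a small tree walk. The function `clean` is a hypothetical sketch that looks only at tagNames and strip; it ignores attributes, ancestors, and every other setting.

```javascript
// Sample settings: `blink` is not allowed, `script` is stripped.
const schema = {tagNames: ['div', 'a'], strip: ['script']}

// Hypothetical helper: returns the list of nodes that replace `node`.
function clean(node, settings) {
  if (node.type !== 'element') return [node]
  // Stripped elements are dropped together with their content.
  if (settings.strip.includes(node.tagName)) return []
  const children = node.children.flatMap((child) => clean(child, settings))
  // Other disallowed elements are replaced by their (cleaned) content.
  if (!settings.tagNames.includes(node.tagName)) return children
  // Allowed elements are copied, never mutated.
  return [Object.assign({}, node, {children})]
}

const tree = {
  type: 'element',
  tagName: 'div',
  children: [
    {type: 'element', tagName: 'blink', children: [{type: 'text', value: 'kept'}]},
    {type: 'element', tagName: 'script', children: [{type: 'text', value: 'alert(1)'}]}
  ]
}

const result = clean(tree, schema)
console.log(JSON.stringify(result))
// [{"type":"element","tagName":"div","children":[{"type":"text","value":"kept"}]}]
```

The text inside blink survives because only the wrapper is removed, while the text inside script disappears with its element.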
allowComments
Whether to allow comment nodes (boolean, default: false).
"allowComments": true
allowDoctypes
Whether to allow doctype nodes (boolean, default: false).
"allowDoctypes": true
Contribute
See contributing.md in syntax-tree/hast for ways to get started.
This organisation has a Code of Conduct. By interacting with this repository, organisation, or community you agree to abide by its terms.
License
MIT © Titus Wormer