What is striptags?
The 'striptags' npm package is a utility for stripping HTML and XML tags from a string. It is useful for sanitizing user input, cleaning up text for display, and ensuring that text content is free from potentially harmful or unwanted HTML tags.
What are striptags's main functionalities?
Basic HTML Tag Removal
This feature allows you to remove all HTML tags from a string, leaving only the text content.
const striptags = require('striptags');
const text = striptags('<p>Hello <strong>world</strong>!</p>');
console.log(text); // Output: 'Hello world!'
Allow Specific Tags
This feature allows you to specify which HTML tags should be allowed to remain in the string while stripping all others.
const striptags = require('striptags');
const text = striptags('<p>Hello <strong>world</strong>!</p>', ['strong']);
console.log(text); // Output: 'Hello <strong>world</strong>!'
Replace Tags with a String
This feature allows you to pass a replacement string as the third argument; it is inserted wherever a tag is stripped. Here the allowed-tags list is empty, so every tag is replaced.
const striptags = require('striptags');
const text = striptags('<p>Hello <strong>world</strong>!</p>', [], '<>');
console.log(text); // Output: '<>Hello <>world<>!<>'
Other packages similar to striptags
sanitize-html
The 'sanitize-html' package provides a more comprehensive solution for sanitizing HTML content. It allows for more granular control over which tags and attributes are allowed, and can also handle nested tags and complex HTML structures. Compared to 'striptags', 'sanitize-html' offers more advanced sanitization options but may be more complex to configure.
xss
The 'xss' package is designed to filter out potential XSS (Cross-Site Scripting) attacks by sanitizing HTML content. It provides a high level of security by default and allows for customization of allowed tags and attributes. 'xss' is more focused on security compared to 'striptags', making it a better choice for applications where preventing XSS is a primary concern.
html-entities
The 'html-entities' package is used to encode and decode HTML entities. While it does not strip tags, it can be used in conjunction with other packages to ensure that HTML entities are properly handled. It is more focused on encoding and decoding rather than sanitization, making it a complementary tool rather than a direct alternative to 'striptags'.
striptags
An implementation of PHP's strip_tags in Node.js.
Features
- Fast
- Zero dependencies
- 100% test code coverage
- No unsafe regular expressions
Installing
npm install striptags
Basic Usage
striptags(html, allowed_tags, tag_replacement);
Example
var striptags = require('striptags');
var html =
'<a href="https://example.com">' +
'lorem ipsum <strong>dolor</strong> <em>sit</em> amet' +
'</a>';
striptags(html);
striptags(html, '<strong>');
striptags(html, ['a']);
striptags(html, [], '\n');
Outputs:
'lorem ipsum dolor sit amet'
'lorem ipsum <strong>dolor</strong> sit amet'
'<a href="https://example.com">lorem ipsum dolor sit amet</a>'
lorem ipsum
dolor
sit amet
Streaming Mode
striptags can also operate in streaming mode. Simply call init_streaming_mode to get back a function that accepts HTML and outputs stripped HTML. State is saved between calls so that partial HTML can be safely passed in.
let stream_function = striptags.init_streaming_mode(
allowed_tags,
tag_replacement
);
let partial_text = stream_function(partial_html);
let more_text = stream_function(more_html);
Check out test/striptags-test.js for a concrete example.
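To illustrate why state has to persist between calls, here is a minimal sketch of a streaming stripper built on a closure. It is not the striptags implementation: makeStreamingStripper and its tagReplacement parameter are hypothetical names chosen for this example, and the sketch ignores allowed tags, quoted attribute values, and comments.

```javascript
// Hypothetical helper, NOT part of striptags: a simplified streaming
// stripper that remembers whether the previous chunk ended inside a tag.
function makeStreamingStripper(tagReplacement = '') {
  let inTag = false; // state that survives between calls

  return function strip(chunk) {
    let output = '';
    for (const ch of chunk) {
      if (inTag) {
        // Characters inside a tag are dropped; '>' ends the tag.
        if (ch === '>') {
          inTag = false;
          output += tagReplacement;
        }
      } else if (ch === '<') {
        inTag = true;
      } else {
        output += ch;
      }
    }
    return output;
  };
}

const strip = makeStreamingStripper();
const first = strip('Hello <str');            // chunk ends in the middle of a tag
const second = strip('ong>world</strong>!');  // the rest of the tag arrives later
console.log(first + second); // 'Hello world!'
```

Because inTag lives in the closure, the second call knows it is still inside the tag that the first chunk opened, so the split tag never leaks into the output.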
Tests
You can run tests (powered by mocha) locally via:
npm test
Generate test coverage (powered by istanbul) via:
npm run coverage
Doesn't use regular expressions
striptags does not use any regular expressions for stripping HTML tags.
Regular expressions are not capable of preventing all possible scripting attacks. A StackOverflow answer on the subject explains how strip_tags (when used without specifying allowableTags) is not vulnerable to scripting attacks.
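As an illustration of the difference, the sketch below compares a naive regular expression with a character-by-character scanner that tracks quoted attribute values. It is a simplified example under stated assumptions, not the striptags source: stripTagsSketch is a hypothetical name, and the scanner treats every '<' as the start of a tag, which a production implementation would need to refine.

```javascript
// Hypothetical sketch, NOT the striptags source: a quote-aware scanner.
function stripTagsSketch(html) {
  let output = '';
  let inTag = false;
  let quote = ''; // the quote character we are currently inside, if any

  for (const ch of html) {
    if (!inTag) {
      if (ch === '<') inTag = true;
      else output += ch;
    } else if (quote) {
      // Inside a quoted attribute value: only the matching quote ends it.
      if (ch === quote) quote = '';
    } else if (ch === '"' || ch === "'") {
      quote = ch;
    } else if (ch === '>') {
      inTag = false; // a '>' outside quotes closes the tag
    }
  }
  return output;
}

const tricky = '<a title="1 > 0">link</a> text';
const scanned = stripTagsSketch(tricky);
const naive = tricky.replace(/<[^>]*>/g, '');

console.log(scanned); // 'link text'
console.log(naive);   // ' 0">link text'
```

The naive pattern stops at the first '>' it sees, even though that character sits inside an attribute value, so part of the tag leaks into the output. The scanner avoids this by remembering that it is inside a quoted value.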