What is striptags?
The 'striptags' npm package is a utility for stripping HTML and XML tags from a string. It is useful for sanitizing user input, cleaning up text for display, and ensuring that text content is free from potentially harmful or unwanted HTML tags.
What are striptags's main functionalities?
Basic HTML Tag Removal
This feature allows you to remove all HTML tags from a string, leaving only the text content.
const striptags = require('striptags');
const text = striptags('<p>Hello <strong>world</strong>!</p>');
console.log(text); // Output: 'Hello world!'
Allow Specific Tags
This feature allows you to specify which HTML tags should be allowed to remain in the string while stripping all others.
const striptags = require('striptags');
const text = striptags('<p>Hello <strong>world</strong>!</p>', ['strong']);
console.log(text); // Output: 'Hello <strong>world</strong>!'
Replace Stripped Tags
This feature lets you pass a replacement string as the third argument. Each stripped tag is replaced with that string instead of simply being removed.
const striptags = require('striptags');
const text = striptags('<p>Hello <strong>world</strong>!</p>', [], '\n');
console.log(text); // Each removed tag is replaced with a newline
Other packages similar to striptags
sanitize-html
The 'sanitize-html' package provides a more comprehensive solution for sanitizing HTML content. It allows for more granular control over which tags and attributes are allowed, and can also handle nested tags and complex HTML structures. Compared to 'striptags', 'sanitize-html' offers more advanced sanitization options but may be more complex to configure.
xss
The 'xss' package is designed to filter out potential XSS (Cross-Site Scripting) attacks by sanitizing HTML content. It provides a high level of security by default and allows for customization of allowed tags and attributes. 'xss' is more focused on security compared to 'striptags', making it a better choice for applications where preventing XSS is a primary concern.
html-entities
The 'html-entities' package is used to encode and decode HTML entities. While it does not strip tags, it can be used in conjunction with other packages to ensure that HTML entities are properly handled. It is more focused on encoding and decoding rather than sanitization, making it a complementary tool rather than a direct alternative to 'striptags'.
striptags
An implementation of PHP's strip_tags in Node.js.
Features
- Fast
- Zero dependencies
- 100% test code coverage
- No unsafe regular expressions!
Installing
npm install striptags
Usage
striptags(html, allowedTags, tagReplacement);
Example
var striptags = require('striptags');
var html =
'<a href="https://example.com">' +
'lorem ipsum <strong>dolor</strong> <em>sit</em> amet' +
'</a>';
striptags(html);
striptags(html, '<a><strong>');
striptags(html, ['a']);
striptags(html, [], '\n');
Outputs:
'lorem ipsum dolor sit amet'
'<a href="https://example.com">lorem ipsum <strong>dolor</strong> sit amet</a>'
'<a href="https://example.com">lorem ipsum dolor sit amet</a>'
lorem ipsum
dolor
sit amet
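The calls above pass allowedTags in two forms: a PHP-style string ('<a><strong>') and an array (['a']). As a rough illustration of how both forms can be reduced to the same set of tag names, here is a minimal sketch. The helper name normalizeAllowedTags is hypothetical and is not part of the striptags API; the library's own parsing may differ.

```javascript
// Hypothetical helper (not part of striptags): reduce either form of
// allowedTags to a Set of lowercase tag names.
function normalizeAllowedTags(allowedTags) {
  if (typeof allowedTags === 'string') {
    // '<a><strong>' -> ['a', 'strong']
    const names = [];
    const pattern = /<(\w+)>/g;
    let match;
    while ((match = pattern.exec(allowedTags)) !== null) {
      names.push(match[1].toLowerCase());
    }
    return new Set(names);
  }
  // Arrays are taken as-is, lowercased; a missing value means "allow nothing".
  return new Set(Array.from(allowedTags || [], (name) => String(name).toLowerCase()));
}

console.log([...normalizeAllowedTags('<a><strong>')]); // [ 'a', 'strong' ]
console.log([...normalizeAllowedTags(['a'])]);         // [ 'a' ]
```

Normalizing to a Set keeps the later "is this tag allowed?" check to a single lookup, whichever form the caller used.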
Tests
You can run tests (powered by mocha) locally via:
npm test
Generate test coverage (powered by blanket.js) via:
npm run test-coverage
Differences between PHP strip_tags and striptags
In this version, not much! This now closely resembles a 'port' from PHP 5.5's internal implementation of strip_tags, php_strip_tags_ex.
One major difference is that this JS version does not strip PHP-style tags; it seemed out of place in a node.js project. Let me know if this is important enough to consider including.
Doesn't use regular expressions
striptags does not use regular expressions to strip HTML tags. The regular expressions it does contain are used only for detecting whitespace and parsing the allowedTags parameter, never for finding HTML.
Regular expressions are not capable of preventing all possible scripting attacks (see this). Here is a great StackOverflow answer regarding how strip_tags (when used without specifying allowableTags) is not vulnerable to scripting attacks.
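To make the idea of stripping tags without regular expressions concrete, here is a minimal character-by-character sketch. It is not the striptags implementation: the names stripTagsSketch and tagName are hypothetical, and the sketch ignores comments, '>' inside quoted attribute values, and other edge cases that a real port of php_strip_tags_ex has to handle.

```javascript
// Illustrative sketch only (NOT the striptags source): walk the input one
// character at a time and track whether we are inside a tag.
function stripTagsSketch(html, allowedTags = [], tagReplacement = '') {
  const allowed = new Set(allowedTags.map((name) => name.toLowerCase()));
  let output = '';
  let tagBuffer = '';
  let inTag = false;

  for (const char of html) {
    if (!inTag) {
      if (char === '<') {
        inTag = true;      // start buffering a tag
        tagBuffer = char;
      } else {
        output += char;    // plain text passes through
      }
      continue;
    }

    tagBuffer += char;
    if (char === '>') {
      inTag = false;
      // Keep allowed tags verbatim; replace everything else.
      output += allowed.has(tagName(tagBuffer)) ? tagBuffer : tagReplacement;
      tagBuffer = '';
    }
  }
  return output; // an unterminated '<...' at the end of input is dropped
}

// Extract the tag name from a buffer such as '<a href="x">' or '</a>'.
function tagName(tagBuffer) {
  let name = '';
  for (const char of tagBuffer.slice(1)) {
    if (char === '/' && name === '') continue; // leading slash of a closing tag
    if (char === '>' || char === '/' || char === ' ' || char === '\t' || char === '\n') break;
    name += char;
  }
  return name.toLowerCase();
}

const html = '<p>Hello <strong>world</strong>!</p>';
console.log(stripTagsSketch(html));             // Hello world!
console.log(stripTagsSketch(html, ['strong'])); // Hello <strong>world</strong>!
console.log(stripTagsSketch(html, [], '|'));    // |Hello |world|!|
```

Because the scanner holds explicit state instead of matching patterns, its running time grows linearly with the input and it has no backtracking behavior to exploit.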