
html-dom-parser

Package Overview

Dependencies: 6
Maintainers: 1
Versions: 44

HTML to DOM parser.


Weekly downloads: 1.4M (decreased by 0.71%)
Maintainers: 1
Install size: 1.02 MB

Package description

What is html-dom-parser?

The html-dom-parser npm package parses HTML strings into a JavaScript object that describes the DOM tree, making it easier to traverse and work with HTML content programmatically. It works on both the server (Node.js) and the client (browser), and is useful for server-side rendering, web scraping, and building web crawlers or SEO tools.

What are html-dom-parser's main functionalities?

Parsing HTML string to DOM nodes

This feature allows you to convert an HTML string into DOM nodes, enabling programmatic manipulation of the resulting structure. It's useful for extracting information from HTML content or preparing it for further processing.

const parse = require('html-dom-parser').default; // v5 CommonJS imports require the .default key
const domNodes = parse('<div><p>Hello World</p></div>');

Converting DOM nodes back to an HTML string

html-dom-parser only parses; its README does not document a serializer such as domToHtml. Because the server parser is a wrapper of htmlparser2 and returns domhandler nodes, a separate serializer can turn the nodes back into HTML. The snippet below assumes the dom-serializer package, which is not part of html-dom-parser and must be installed separately.

const parse = require('html-dom-parser').default;
const render = require('dom-serializer').default; // assumption: separate package, installed separately
const htmlString = render(parse('<div><p>Hello World</p></div>'));


Changelog


5.0.8 (2024-02-12)

Bug Fixes

  • esm: fix exported types (b6918ae)

Readme


html-dom-parser


HTML to DOM parser that works on both the server (Node.js) and the client (browser):

HTMLDOMParser(string[, options])

The parser converts an HTML string to a JavaScript object that describes the DOM tree.

Example:

import parse from 'html-dom-parser';

parse('<p>Hello, World!</p>');

Output:

[
  Element {
    type: 'tag',
    parent: null,
    prev: null,
    next: null,
    startIndex: null,
    endIndex: null,
    children: [
      Text {
        type: 'text',
        parent: [Circular],
        prev: null,
        next: null,
        startIndex: null,
        endIndex: null,
        data: 'Hello, World!'
      }
    ],
    name: 'p',
    attribs: {}
  }
]
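The output above is a plain array of nodes, so it can be walked with ordinary recursion. The following is a minimal sketch that collects the text of a tree. It uses a hand-written stand-in with the same fields as the output shown above rather than calling the parser, and `getText` is a hypothetical helper, not part of html-dom-parser.

```javascript
// Hand-written stand-in for the output of parse('<p>Hello, World!</p>').
// Only fields that appear in the documented output are used.
const paragraph = {
  type: 'tag',
  name: 'p',
  attribs: {},
  parent: null,
  children: [],
};
const text = { type: 'text', data: 'Hello, World!', parent: paragraph };
paragraph.children.push(text);
const nodes = [paragraph];

// getText is a hypothetical helper, not part of html-dom-parser.
// It concatenates the data of every text node, depth first.
function getText(nodeList) {
  let result = '';
  for (const node of nodeList) {
    if (node.type === 'text') {
      result += node.data;
    } else if (Array.isArray(node.children)) {
      result += getText(node.children);
    }
  }
  return result;
}

console.log(getText(nodes)); // Hello, World!
```

With the real package, the same helper would presumably be called as `getText(parse('<p>Hello, World!</p>'))`.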


Install

NPM:

npm install html-dom-parser --save

Yarn:

yarn add html-dom-parser

CDN:

<script src="https://unpkg.com/html-dom-parser@latest/dist/html-dom-parser.min.js"></script>
<script>
  window.HTMLDOMParser(/* string */);
</script>

Usage

Import with ES Modules:

import parse from 'html-dom-parser';

Require with CommonJS:

const parse = require('html-dom-parser').default;

Parse empty string:

parse('');

Output:

[]

Parse string:

parse('Hello, World!');

Output:

[
  Text {
    type: 'text',
    parent: null,
    prev: null,
    next: null,
    startIndex: null,
    endIndex: null,
    data: 'Hello, World!'
  }
]

Parse element with attributes:

parse('<p class="foo" style="color: #bada55">Hello, <em>world</em>!</p>');

Output:

[
  Element {
    type: 'tag',
    parent: null,
    prev: null,
    next: null,
    startIndex: null,
    endIndex: null,
    children: [ [Text], [Element], [Text] ],
    name: 'p',
    attribs: { class: 'foo', style: 'color: #bada55' }
  }
]
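The `name`, `attribs`, and `children` fields in the output above are enough to search the tree. The sketch below finds elements by tag name and reads their attributes. The tree is a hand-written stand-in for the output above (with `prev`, `next`, and index fields omitted), and `findByName` is a hypothetical helper, not part of html-dom-parser.

```javascript
// Hand-written stand-in for the parse() output shown above.
const nodes = [
  {
    type: 'tag',
    name: 'p',
    attribs: { class: 'foo', style: 'color: #bada55' },
    children: [
      { type: 'text', data: 'Hello, ' },
      {
        type: 'tag',
        name: 'em',
        attribs: {},
        children: [{ type: 'text', data: 'world' }],
      },
      { type: 'text', data: '!' },
    ],
  },
];

// findByName is a hypothetical helper, not part of html-dom-parser.
// It returns every element with the given tag name, in document order.
function findByName(nodeList, name, found = []) {
  for (const node of nodeList) {
    if (node.type === 'tag') {
      if (node.name === name) found.push(node);
      findByName(node.children, name, found);
    }
  }
  return found;
}

const [p] = findByName(nodes, 'p');
console.log(p.attribs.class); // foo
console.log(findByName(nodes, 'em').length); // 1
```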

The server parser is a wrapper of htmlparser2 parseDOM but with the root parent node excluded. The next section shows the available options you can use with the server parser.

The client parser mimics the server parser by using the DOM API to parse the HTML string.
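To illustrate what "root parent node excluded" can mean for the server parser, here is a minimal sketch. It assumes the underlying tree has top-level nodes whose `parent` points at a root container, and it resets that pointer to `null`, which matches the `parent: null` seen in the outputs above. `detachFromRoot` and the sample objects are hypothetical; this is not the library's actual implementation.

```javascript
// Stand-in for a tree whose top-level nodes point at a root container
// (an assumption about the shape produced before the root is excluded).
const root = { type: 'root', children: [] };
const p = { type: 'tag', name: 'p', attribs: {}, children: [], parent: root };
root.children.push(p);

// detachFromRoot is a hypothetical helper that only illustrates the idea.
// It clears the parent pointer of each top-level node and returns the list.
function detachFromRoot(topLevelNodes) {
  for (const node of topLevelNodes) {
    node.parent = null;
  }
  return topLevelNodes;
}

const result = detachFromRoot(root.children);
console.log(result[0].parent); // null
```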

Options (server only)

Because the server parser is a wrapper of htmlparser2, which implements domhandler, you can alter how the server parser parses your code with the following options:

/**
 * These are the default options being used if you omit the optional options object.
 * htmlparser2 will use the same options object for its domhandler so the options
 * should be combined into a single object like so:
 */
const options = {
  /**
   * Options for the domhandler class.
   * https://github.com/fb55/domhandler/blob/master/src/index.ts#L16
   */
  withStartIndices: false,
  withEndIndices: false,
  xmlMode: false,
  /**
   * Options for the htmlparser2 class.
   * https://github.com/fb55/htmlparser2/blob/master/src/Parser.ts#L104
   */
  xmlMode: false, // Will overwrite what is used for the domhandler, otherwise inherited.
  decodeEntities: true,
  lowerCaseTags: true, // !xmlMode by default
  lowerCaseAttributeNames: true, // !xmlMode by default
  recognizeCDATA: false, // xmlMode by default
  recognizeSelfClosing: false, // xmlMode by default
  Tokenizer: Tokenizer,
};
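The comments in the options object above say that several defaults depend on `xmlMode`. The sketch below spells out that relationship. `resolveOptions` is a hypothetical helper that only mirrors those comments; it is not how html-dom-parser or htmlparser2 resolve options internally.

```javascript
// resolveOptions is a hypothetical helper, not part of html-dom-parser.
// It mirrors the documented defaults: four options follow xmlMode unless
// the caller sets them explicitly.
function resolveOptions(userOptions = {}) {
  const xmlMode = userOptions.xmlMode ?? false;
  return {
    withStartIndices: false,
    withEndIndices: false,
    decodeEntities: true,
    lowerCaseTags: !xmlMode, // !xmlMode by default
    lowerCaseAttributeNames: !xmlMode, // !xmlMode by default
    recognizeCDATA: xmlMode, // xmlMode by default
    recognizeSelfClosing: xmlMode, // xmlMode by default
    ...userOptions, // explicit values win over the derived defaults
    xmlMode,
  };
}

console.log(resolveOptions().lowerCaseTags); // true
console.log(resolveOptions({ xmlMode: true }).lowerCaseTags); // false
console.log(resolveOptions({ xmlMode: true, lowerCaseTags: true }).lowerCaseTags); // true
```

With the real package, an options object is passed as the second argument, as in `parse(html, { xmlMode: true })`.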

If you're parsing SVG, you can set lowerCaseTags to false without having to enable xmlMode. This preserves the original case of tag names, such as camelCase SVG elements, instead of lowercasing them as the HTML standard does.

Note: If you're parsing code client-side (in-browser), you cannot control the parsing options. Client-side parsing automatically handles returning some HTML tags in camelCase, such as specific SVG elements, but returns all other tags lowercased according to the HTML standard.

Migration

v5

Migrated to TypeScript. CommonJS imports require the .default key:

const parse = require('html-dom-parser').default;
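Code that has to run against both v5 and an earlier version could unwrap the export defensively. The sketch below assumes that earlier versions exported the parse function directly from CommonJS. `unwrapDefault` is a hypothetical helper, and the module objects are stand-ins, so no real package is loaded.

```javascript
// unwrapDefault is a hypothetical helper, not part of html-dom-parser.
// It returns mod.default when that is a function, otherwise mod itself.
function unwrapDefault(mod) {
  return mod && typeof mod.default === 'function' ? mod.default : mod;
}

// Stand-ins for the two module shapes.
const fakeParse = (html) => [];
const v5Shape = { default: fakeParse }; // v5: function lives on .default
const olderShape = fakeParse; // assumption: function exported directly

console.log(unwrapDefault(v5Shape) === fakeParse); // true
console.log(unwrapDefault(olderShape) === fakeParse); // true
```

In a project, this would presumably be used as `const parse = unwrapDefault(require('html-dom-parser'));`.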

v4

Upgraded htmlparser2 to v9.

v3

Upgraded domhandler to v5. Parser options like normalizeWhitespace have been removed.

v2

Removed Internet Explorer (IE11) support.

v1

Upgraded domhandler to v4 and htmlparser2 to v6.

Release

Release and publish are automated by Release Please.

License

MIT

Last updated on 12 Feb 2024
