@thenja/html-parser

A simple forgiving html parser

Version 1.1.3 (latest)

Html-Parser

A simple, forgiving HTML parser for JavaScript (browser and Node.js).

Features

  • Works in Node.js or the browser
  • Parses HTML that may not be valid
  • Parses HTML into a JSON object that can be modified and converted back into HTML
  • Can clean HTML, for example by removing empty tags and whitespace-only text nodes

Not supported

  • CDATA sections in HTML are not supported

How to use

Installation

npm install @thenja/html-parser --save

TypeScript

import { HtmlParser } from "@thenja/html-parser";
let htmlParser = new HtmlParser();

JavaScript (browser)

<script src="dist/thenja-html-parser.min.js" type="text/javascript"></script>
<script type="text/javascript">
  var htmlParser = new Thenja.HtmlParser();
</script>

Parse HTML

Basic usage

let html = "<div><p>Hello world!</p></div>";
let output = htmlParser.parse(html);

Example output

[
  {
    "type": "tag",
    "tagType": "default",
    "name": "div",
    "attributes": {},
    "children": [
      {
        "type": "tag",
        "tagType": "default",
        "name": "p",
        "attributes": {},
        "children": [
          {
            "type": "text",
            "data": "Hello world!"
          }
        ]
      }
    ]
  }
]
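
Because the output is plain JSON, it can be walked with ordinary recursion. The sketch below is a minimal illustration, not part of the library: the `TextNode`, `TagNode` and `ParsedNode` type names and the `findTags` helper are hypothetical, and the node shapes are inferred only from the example output above, so other node types may exist.

```typescript
// Node shapes inferred from the example output above. These type names are
// assumptions for illustration, not types exported by the library.
interface TextNode {
  type: "text";
  data: string;
}

interface TagNode {
  type: "tag";
  tagType: string;
  name: string;
  attributes: Record<string, string>;
  children: ParsedNode[];
}

type ParsedNode = TextNode | TagNode;

// Depth-first search for every tag node with the given name.
function findTags(nodes: ParsedNode[], name: string): TagNode[] {
  const matches: TagNode[] = [];
  for (const node of nodes) {
    if (node.type !== "tag") continue;
    if (node.name === name) matches.push(node);
    // Guard against a missing children array, in case some tags omit it.
    matches.push(...findTags(node.children || [], name));
  }
  return matches;
}

// The same tree as the example output, written by hand.
const exampleOutput: ParsedNode[] = [
  {
    type: "tag", tagType: "default", name: "div", attributes: {},
    children: [
      {
        type: "tag", tagType: "default", name: "p", attributes: {},
        children: [{ type: "text", data: "Hello world!" }],
      },
    ],
  },
];

const paragraphs = findTags(exampleOutput, "p");
console.log(paragraphs.length); // 1
```

In real use, the array returned by htmlParser.parse(html) would take the place of the hand-written `exampleOutput`.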

Parse HTML and reverse the output back into an HTML string

let html = "<div><p>Hello world!</p></div>";
let output = htmlParser.parse(html);
let reversedHtml = htmlParser.reverse(output);

Listen for errors

let html = "<div><p>Hello world!</p></div>";
let output = htmlParser.parse(html, (err) => {
  // handle errors here
});

Listen for nodes being added when parsing

// In this example we will replace .jpg extensions with .png
let html = "<div><img src='my-picture.jpg' /></div>";
let output = htmlParser.parse(html, null, (node, parentNode) => {
  if(node.name === 'img' && node.attributes && node.attributes.src) {
    node.attributes.src = node.attributes.src.replace('.jpg', '.png');
  }
});
let newHtml = htmlParser.reverse(output);
// newHtml will equal: <div><img src='my-picture.png' /></div>
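
The callback can also be written as a named function, which makes it easy to exercise without running the parser. The sketch below is an illustration under assumptions: the `CallbackNode` interface and the `swapJpgForPng` name are hypothetical, and the node shape is inferred from the example above. It anchors the match to the end of the string, so a ".jpg" that appears earlier in the path is left alone, which the plain replace('.jpg', '.png') call does not guarantee.

```typescript
// Minimal node shape for this sketch, inferred from the example above.
// The interface name is an assumption, not a type exported by the library.
interface CallbackNode {
  type: string;
  name?: string;
  attributes?: Record<string, string>;
}

// Rewrites a trailing .jpg extension on <img> src attributes to .png.
// The second parameter mirrors the (node, parentNode) callback signature
// shown above and is unused here.
function swapJpgForPng(node: CallbackNode, _parentNode?: CallbackNode | null): void {
  const attrs = node.attributes;
  if (node.name === "img" && attrs && attrs.src) {
    // Anchored to the end of the string, case-insensitive.
    attrs.src = attrs.src.replace(/\.jpg$/i, ".png");
  }
}

// Exercise the callback against a hand-built node.
const imgNode = {
  type: "tag",
  name: "img",
  attributes: { src: "photos.jpg-archive/my-picture.jpg" },
};
swapJpgForPng(imgNode);
console.log(imgNode.attributes.src); // "photos.jpg-archive/my-picture.png"
```

Following the call shape in the example above, the function would be passed as the third argument: htmlParser.parse(html, null, swapJpgForPng).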

Listen for nodes being stringified when reversing

// In this example we will remove the class attribute
let html = "<div class='my-style'></div>";
let output = htmlParser.parse(html);
let newHtml = htmlParser.reverse(output, (node) => {
  if(node.name === 'div') {
    delete node.attributes['class'];
  }
});
// newHtml will equal: <div></div>

Clean up the HTML

The clean function removes unwanted HTML tags (such as empty tags) and empty text nodes from parsed output.

Available options:

Option                  Description
removeEmptyTags         Remove empty HTML tags, such as <p></p>
removeEmptyTextNodes    Remove a text node if it contains only whitespace

let html = "<div>Hi there<p></p></div>";
// Both clean options default to true; they are spelled out here for demonstration only
let cleanOptions = { removeEmptyTags: true, removeEmptyTextNodes: true };
let output = htmlParser.parse(html);
output = htmlParser.clean(output, cleanOptions);
let newHtml = htmlParser.reverse(output);
// newHtml will equal: <div>Hi there</div>
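
To make the two options concrete, here is a minimal sketch of how such a cleaning pass could work on the parsed tree. It is not the library's implementation: the type names, the `cleanSketch` function and the `VOID_TAGS` list are all assumptions made for illustration, and how the real clean function treats tags such as img is not documented here.

```typescript
// Node shapes inferred from the parser's example output; the type names
// are assumptions, not types exported by the library.
interface TextNode { type: "text"; data: string; }
interface TagNode {
  type: "tag";
  tagType: string;
  name: string;
  attributes: Record<string, string>;
  children: ParsedNode[];
}
type ParsedNode = TextNode | TagNode;

interface CleanOptions {
  removeEmptyTags?: boolean;
  removeEmptyTextNodes?: boolean;
}

// Assumption: tags that never have children are not treated as "empty".
const VOID_TAGS = new Set(["img", "br", "hr", "input"]);

// Returns a cleaned copy of the tree; the input is not modified.
function cleanSketch(nodes: ParsedNode[], options: CleanOptions = {}): ParsedNode[] {
  // Both options default to true, matching the documented defaults.
  const removeEmptyTags = options.removeEmptyTags !== false;
  const removeEmptyTextNodes = options.removeEmptyTextNodes !== false;
  const result: ParsedNode[] = [];
  for (const node of nodes) {
    if (node.type === "text") {
      // Drop text nodes that contain only whitespace.
      if (removeEmptyTextNodes && node.data.trim() === "") continue;
      result.push(node);
      continue;
    }
    // Clean children first so a tag left with nothing counts as empty.
    const children = cleanSketch(node.children || [], options);
    if (removeEmptyTags && children.length === 0 && !VOID_TAGS.has(node.name)) continue;
    result.push({ ...node, children });
  }
  return result;
}

// Hand-built tree for: <div>Hi there<p></p></div>
const dirty: ParsedNode[] = [
  {
    type: "tag", tagType: "default", name: "div", attributes: {},
    children: [
      { type: "text", data: "Hi there" },
      { type: "tag", tagType: "default", name: "p", attributes: {}, children: [] },
    ],
  },
];

const cleaned = cleanSketch(dirty);
console.log(JSON.stringify(cleaned)); // the <div> now holds only the text node
```

Cleaning children before testing the parent means a tag whose only content was whitespace or empty tags is itself removed.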

Development

npm run init - Set up the app for development (run once after cloning)

npm run dev - Run this command while working on the app. It compiles the TypeScript, runs the tests and watches for file changes.

Distribution

npm run build -- -v <version> - Create a distribution build of the app.

-v (version) - [Optional] Either "patch", "minor" or "major". Increases the version number in the package.json file.

The build command creates a /compiled directory containing the compiled JavaScript and the TypeScript definitions. It also creates a /dist directory containing a minified JavaScript file.

Testing

Tests are run automatically when you do a build.

npm run test - Run the tests in a Node.js environment. To run them in a browser environment instead, open the file /spec/in-browser/SpecRunner.html.

License

MIT © Nathan Anderson

ToDo

  1. Add more unit tests

  2. Add a flattenText() function that flattens nested text nodes into a single text node. For example:

<p>My name is <strong>Nathan</strong></p>
Flattened to:
<p>My name is Nathan</p>
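
Since flattenText() does not exist yet, the following is only a sketch of one way it could behave on the parsed tree, under assumptions: the type names, the `textOf` helper and the function signature are hypothetical, and the node shapes are inferred from the parser's example output.

```typescript
// Node shapes inferred from the parser's example output; the type names
// are assumptions, not types exported by the library.
interface TextNode { type: "text"; data: string; }
interface TagNode {
  type: "tag";
  tagType: string;
  name: string;
  attributes: Record<string, string>;
  children: ParsedNode[];
}
type ParsedNode = TextNode | TagNode;

// Concatenates the data of all text nodes below `nodes`, in document order.
function textOf(nodes: ParsedNode[]): string {
  let text = "";
  for (const node of nodes) {
    text += node.type === "text" ? node.data : textOf(node.children || []);
  }
  return text;
}

// Hypothetical flattenText: returns a copy of the tag whose descendants
// are replaced by a single text node (or no children if there is no text).
function flattenText(node: TagNode): TagNode {
  const data = textOf(node.children || []);
  const children: ParsedNode[] = data === "" ? [] : [{ type: "text", data }];
  return { ...node, children };
}

// Hand-built tree for: <p>My name is <strong>Nathan</strong></p>
const paragraph: TagNode = {
  type: "tag", tagType: "default", name: "p", attributes: {},
  children: [
    { type: "text", data: "My name is " },
    {
      type: "tag", tagType: "default", name: "strong", attributes: {},
      children: [{ type: "text", data: "Nathan" }],
    },
  ],
};

const flattened = flattenText(paragraph);
console.log(JSON.stringify(flattened.children));
// [{"type":"text","data":"My name is Nathan"}]
```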

Package last updated on 01 Feb 2022