generator-html-parser

A generator for Yeoman

Version 2.0.2 (latest), published on npm. Maintainers: 2.

It generates the basic structure of an HTML parser for Node.js.

Useful if you are scraping web pages with Node.js.

Getting Started

How to install it

To install generator-html-parser from npm, run:

$ npm install -g generator-html-parser

How to use it

  • mkdir facebook-html-parser && cd $_
  • yo html-parser

That's it!

How to customize it to parse any html string you need

The main file is <site-name>-html-parser.js.

It contains two methods:

  • parse(html, url): takes the HTML to parse (string) and the URL of the page (string). The URL is useful when you need to resolve relative URLs with the Node.js url module (already imported).
  • getNextPages(html, url): returns the URLs of the next pages to visit, which is useful when you are scraping a list that spans several pages. It takes the same inputs: the HTML to parse (string) and the URL (string) used to resolve any relative URLs extracted from the HTML.
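As a rough sketch of why both methods receive the page URL: links extracted from the HTML are often relative and have to be resolved against the page they came from. The helper name resolveLinks and the example.com URLs below are hypothetical and are not part of the generated code; the sketch assumes the URL class exported by the Node.js url module.

```javascript
// Hypothetical helper for illustration; not part of the generated code.
const { URL } = require('url');

// Resolve each href found in a page against the URL of that page.
function resolveLinks(hrefs, pageUrl) {
  return hrefs.map(function (href) {
    return new URL(href, pageUrl).href;
  });
}

// Example: hrefs as they might appear in a scraped list page.
const nextPages = resolveLinks(
  ['/list?page=2', 'item/42', 'https://other.example/about'],
  'https://example.com/catalog/list?page=1'
);
console.log(nextPages);
```

Absolute hrefs pass through unchanged, so a helper like this can be applied to every link without checking first whether it is relative.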

Test

The generated code also includes tests. Have a look at the test/ folder.

Details of implementation

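It uses cheerio to parse the HTML.

Cheerio implements a subset of the jQuery API for use on the server, so selectors and traversal work much as they do in jQuery, without a browser.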

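var cheerio = require('cheerio');

// Load the HTML string and collect the text of every .item element.
var $ = cheerio.load(html);
var result = [];

$('.item').each(function() {
    var el = $(this);
    result.push(el.text());
});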

License

MIT License

Keywords

yeoman-generator

Package last updated on 31 Oct 2017
