webidl-scraper

Scrape IDL definitions from Web standard specs

Version 0.0.4 (latest) on npm, 1 maintainer

Installation

Download node at nodejs.org and install it, if you haven't already.

npm install -g webidl-scraper

Usage

  webidl-scraper [options] <inputs: file | URL | "-" ...> (use - for stdin)

  Scrape IDL definitions from Web standard specs.

  Options:

    -h, --help                output usage information
    -V, --version             output the version number
    -o, --output-file <file>  output the scraped IDL to <file> (use - for stdout, the default)
    --with-class-extract      do not ignore <pre class="idl extract" />
    --with-data-no-idl        do not ignore <pre data-no-idl />
    --with-idl-index          do not ignore IDL after id="idl-index"

Examples

Scrape a Web page for IDL fragments:

webidl-scraper https://html.spec.whatwg.org/
# Output to stdout

webidl-scraper http://dev.w3.org/csswg/cssom/ -o cssom.idl
# Save to cssom.idl

curl -sL http://dev.w3.org/csswg/cssom/ | webidl-scraper - > cssom.idl 
# Use curl for HTTP and redirect stdout to cssom.idl

Scrape an HTML file for IDL fragments:

webidl-scraper html5-spec.html -o html5-spec.idl

Scraping algorithm

These steps are derived experimentally and may change. I have tried to include links to sources and/or motivating examples for all the rules.

  • Get the contents of <pre class="idl" /> tags, excluding those with class="idl extract" (reference #1, #2).
  • If the document has an IDL Index section (example) - marked by an element with id="idl-index" - ignore IDL fragments that follow, on the assumption that they will contain no new IDL.
  • Also ignore tags that have the data-no-idl attribute (following Bikeshed).
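
The three rules above can be sketched in a few lines. This is a minimal illustration in Python using only the standard library's html.parser, not the package's actual implementation (webidl-scraper is a Node.js tool that lists htmlparser2 among its dependencies). The class name IdlScraper, the function scrape_idl, and the keyword arguments are hypothetical; they are named after the CLI flags only to show how each flag would relax one rule.

```python
from html.parser import HTMLParser


class IdlScraper(HTMLParser):
    """Hypothetical sketch: collect the text of <pre class="idl"> blocks."""

    def __init__(self, with_class_extract=False, with_data_no_idl=False,
                 with_idl_index=False):
        super().__init__(convert_charrefs=True)
        self.with_class_extract = with_class_extract
        self.with_data_no_idl = with_data_no_idl
        self.with_idl_index = with_idl_index
        self.fragments = []        # finished IDL fragments, in document order
        self._buffer = None        # text pieces of the <pre> being collected
        self._depth = 0            # <pre> nesting depth inside the current block
        self._past_index = False   # True once id="idl-index" has been seen

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Rule 2: remember that the IDL Index section has started.
        if attrs.get("id") == "idl-index":
            self._past_index = True
        if tag != "pre":
            return
        if self._buffer is not None:
            self._depth += 1       # nested <pre>; keep collecting its text
            return
        classes = (attrs.get("class") or "").split()
        # Rule 1: only class="idl", skipping class="idl extract".
        if "idl" not in classes:
            return
        if "extract" in classes and not self.with_class_extract:
            return
        # Rule 3: skip blocks that carry the data-no-idl attribute.
        if "data-no-idl" in attrs and not self.with_data_no_idl:
            return
        # Rule 2: skip anything that follows the IDL Index marker.
        if self._past_index and not self.with_idl_index:
            return
        self._buffer = []
        self._depth = 1

    def handle_endtag(self, tag):
        if tag == "pre" and self._buffer is not None:
            self._depth -= 1
            if self._depth == 0:
                self.fragments.append("".join(self._buffer).strip())
                self._buffer = None

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)


def scrape_idl(html, **options):
    """Return the list of IDL fragments found in an HTML string."""
    parser = IdlScraper(**options)
    parser.feed(html)
    parser.close()
    return parser.fragments
```

With the defaults, scrape_idl(page) keeps only plain class="idl" blocks that appear before the index. Passing with_class_extract=True, with_data_no_idl=True, or with_idl_index=True would re-admit the corresponding blocks, mirroring what the CLI options describe.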

Tests

npm install
npm test
  Scraper CLI
    fixtures/html/*.html
      cssom-with-class-extract.html [--with-class-extract]
        √ should match cssom-with-class-extract.idl (111ms)
      cssom-with-idl-index.html [--with-idl-index]
        √ should match cssom-with-idl-index.idl
      cssom.html
        √ should match cssom.idl
      dom-with-data-no-idl.html [--with-data-no-idl]
        √ should match dom-with-data-no-idl.idl
      dom.html
        √ should match dom.idl
      html5.html
        √ should match html5.idl (553ms)
      noidl.html
        √ should match noidl.idl
    with input type
      URL
        √ should complete without errors
      glob pattern (test/**/cs*.html)
        √ should complete without errors
      file name (test/fixtures/html/cssom.html)
        √ should complete without errors
      stdin (-)
        √ should complete without errors
    with -o <file>
      √ should create the file
  scraper-core
    fixtures/html/*.html
      cssom-with-class-extract.html + options/cssom-with-class-extract.json
        √ should match cssom-with-class-extract.idl
      cssom-with-idl-index.html + options/cssom-with-idl-index.json
        √ should match cssom-with-idl-index.idl
      cssom.html
        √ should match cssom.idl
      dom-with-data-no-idl.html + options/dom-with-data-no-idl.json
        √ should match dom-with-data-no-idl.idl
      dom.html
        √ should match dom.idl
      html5.html
        √ should match html5.idl (566ms)
      noidl.html
        √ should match noidl.idl
  19 passing (2s)

Dependencies

  • commander: the complete solution for node.js command-line programs
  • glob: a little globber
  • htmlparser2: Fast & forgiving HTML/XML/RSS parser
  • request: Simplified HTTP request client.
  • rx: Library for composing asynchronous and event-based operations in JavaScript

Dev Dependencies

  • chai: BDD/TDD assertion library for node.js and the browser. Test framework agnostic.
  • find-port: find an unused port in your localhost
  • mocha: simple, flexible, fun test framework
  • node-static: simple, compliant file streaming module for node
  • rx-node: RxJS Bindings for Node.js and io.js
  • temp: Temporary files and directories

License

MIT

Keywords

webidl

Package last updated on 01 Jun 2015
