async-saxophone

Fast and lightweight asynchronous iterator XML parser in pure JavaScript



Async-saxophone is based upon Saxophone, which, in turn, is inspired by SAX parsers such as sax-js and EasySax: unlike most XML parsers, but like Saxophone, async-saxophone does not create a Document Object Model (DOM) tree as a result of parsing documents.

Instead, it implements an async iterator. Async-saxophone takes as input an XML document in the form of a string or any iterator, including a stream. It parses the XML and outputs the nodes (tags, text, comments, etc.) as they are encountered. As an async iterator, it is suitable for iteration using for await...of.

Async-saxophone was developed to ensure that a new chunk of XML is not taken from its input until all nodes already encountered have been processed, even if processing is delayed. The asynchronous design keeps input and output synchronized.
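The pull-based behavior described above can be sketched without the package itself. The following is a minimal illustration, assuming a stand-in async generator (standInParser, a hypothetical name) that merely splits chunks into words; it is not async-saxophone's parser, and only shows that the next chunk is not pulled until the consumer has finished with everything produced from the previous one.

```javascript
// Stand-in for a parser (hypothetical, NOT async-saxophone): an async
// generator that takes a chunk from its input only when the consumer asks
// for another item and the current chunk is exhausted.
async function* standInParser(chunks, log) {
    for await (const chunk of chunks) {
        log.push(`pulled ${chunk}`);
        for (const piece of chunk.split(' ')) {
            yield piece;
        }
    }
}

// Slow consumer: each item takes 10 ms to process.
async function demo() {
    const log = [];
    for await (const piece of standInParser(['a b', 'c d'], log)) {
        await new Promise(resolve => setTimeout(resolve, 10));
        log.push(`processed ${piece}`);
    }
    return log;
}

demo().then(log => console.log(log.join('\n')));
```

In this sketch, "pulled c d" is logged only after "processed a" and "processed b", even though the consumer is slow.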

The async-saxophone parser is based upon the Saxophone parser and inherits its light weight and speed. It does not maintain document state nor check the validity of the document. Modifications to the Saxophone parser include structuring it as an async generator function, substituting yield for emit, expecting an input string or iterator as an argument, rather than being piped to, and representing each node as a tuple-like array.

The parser does not parse the attribute string in a tag, nor does it parse entities in text. Saxophone's parseAttrs and parseEntities functions may be used to parse the attribute string or entities. To avoid unnecessary dependencies, Saxophone must be installed separately if these functions are needed.
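One way to combine the two packages is to wrap the node stream in a second async generator. The sketch below assumes the tagopen tuple layout documented in the Output section (['tagopen', tag-name, attr-string, is-self-closing]); withParsedAttrs is a hypothetical helper name, and the attribute parser is passed in as an argument so the helper itself has no dependency on Saxophone.

```javascript
// Hypothetical helper: wraps any (async) iterable of node tuples and replaces
// the attribute string of each 'tagopen' node with whatever the supplied
// parseAttrs function returns. All other nodes pass through unchanged.
async function* withParsedAttrs(nodes, parseAttrs) {
    for await (const node of nodes) {
        if (node[0] === 'tagopen') {
            const [type, name, attrString, selfClosing] = node;
            yield [type, name, parseAttrs(attrString), selfClosing];
        } else {
            yield node;
        }
    }
}

// Possible usage, assuming Saxophone is installed and exposes parseAttrs
// as Saxophone.parseAttrs:
//   const Saxophone = require('saxophone');
//   for await (const node of withParsedAttrs(parser(xml), Saxophone.parseAttrs)) { ... }
```

Passing the function in keeps Saxophone optional, in line with the package's aim of avoiding unnecessary dependencies.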

Installation

This package requires Node.js 10.0 or later. It may also work in recent browsers that support async generator functions and for await...of. To install with npm:

$ npm install async-saxophone

Tests

To run tests, use the following commands:

$ git clone https://github.com/randymized/async-saxophone.git
$ cd async-saxophone
$ npm install
$ npm test

Example

const {makeAsyncXMLParser} = require('async-saxophone');

// Resolves after ms milliseconds; simulates slow processing of each node.
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

const xml = '<root><example id="1" /><example id="2" /></root>';

async function main() {
    const parser = makeAsyncXMLParser();
    // Each node is pulled from the parser only after the previous one is handled.
    for await (const node of parser(xml)) {
        console.dir(node);
        await delay(500);
    }
    console.log('done');
}
main().catch(console.error);

Output:

[ 'tagopen', 'root', '', '' ]
[ 'tagopen', 'example', 'id="1"', '/' ]
[ 'tagopen', 'example', 'id="2"', '/' ]
[ 'tagclose', 'root' ]
done
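Because the parser keeps no document state, any notion of nesting has to be tracked by the consumer. The following sketch, assuming a hypothetical helper named elementPaths, shows one way to do that with a stack. It runs on the node tuples copied from the output above rather than on a live parser, and it assumes the alwaysTagClose option is not set, so self-closing tags produce no tagclose node.

```javascript
// Node tuples copied from the example output above.
const nodes = [
    ['tagopen', 'root', '', ''],
    ['tagopen', 'example', 'id="1"', '/'],
    ['tagopen', 'example', 'id="2"', '/'],
    ['tagclose', 'root'],
];

// Hypothetical helper: returns the slash-separated path of every element,
// using a stack of currently open tag names.
async function elementPaths(nodeIterable) {
    const stack = [];
    const paths = [];
    for await (const [type, name, , selfClosing] of nodeIterable) {
        if (type === 'tagopen') {
            paths.push([...stack, name].join('/'));
            // Assumption: without alwaysTagClose, a self-closing tag is never
            // followed by a tagclose node, so it must not be pushed.
            if (!selfClosing) stack.push(name);
        } else if (type === 'tagclose') {
            stack.pop();
        }
    }
    return paths;
}

elementPaths(nodes).then(paths => console.log(paths));
// → [ 'root', 'root/example', 'root/example' ]
```

The same function should accept the async iterator returned by the parser in place of the array, since for await...of works with both.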

Exports:

const {makeAsyncXMLParser} = require('async-saxophone');

  • makeAsyncXMLParser(options) takes parser options and returns an async generator function that will parse an XML document.

    • options are detailed below.
    • parser(iterator) is the async generator function returned from makeAsyncXMLParser.
      • It takes as an argument any iterator over an XML document.
      • It returns an async iterator over the nodes encountered as the document is parsed.
  • options

    • include: a list of node types to be output. See the node types listed under Output below for a complete list. If options = {include: ['tagopen', 'tagclose']}, for example, only opening and closing tags will be output. If include is not specified, all nodes will be output.
    • alwaysTagClose: if truthy, a tagclose node will also be output whenever a self-closing tag is encountered.
    • noEmptyText: if truthy, text nodes that are empty or consist only of whitespace will not be output.
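The effect of these options can be modeled on a plain array of node tuples. The sketch below is one plausible reading of the descriptions above, not the library's implementation: modelOptions is a hypothetical function, and details such as where the extra tagclose node is placed, or how alwaysTagClose interacts with include, are assumptions.

```javascript
// Hypothetical model of the documented options, applied to an array of node
// tuples. This is NOT the library's code, only an illustration.
function modelOptions(nodes, { include, alwaysTagClose, noEmptyText } = {}) {
    const wanted = type => !include || include.includes(type);
    const out = [];
    for (const node of nodes) {
        const [type, value] = node;
        // noEmptyText: drop text that is empty or whitespace only.
        if (type === 'text' && noEmptyText && value.trim() === '') continue;
        // include: keep only the listed node types (all types if omitted).
        if (wanted(type)) out.push(node);
        // alwaysTagClose: assumed to add a tagclose right after a
        // self-closing tagopen, provided tagclose nodes are wanted at all.
        if (type === 'tagopen' && node[3] && alwaysTagClose && wanted('tagclose')) {
            out.push(['tagclose', value]);
        }
    }
    return out;
}

const sample = [
    ['tagopen', 'root', '', ''],
    ['text', '\n  '],
    ['tagopen', 'example', 'id="1"', '/'],
    ['tagclose', 'root'],
];
console.log(modelOptions(sample, { alwaysTagClose: true, noEmptyText: true }));
```

With the real package, the same options object would be passed to makeAsyncXMLParser instead.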

Output:

The parser returned from makeAsyncXMLParser is an async generator function. It takes an iterator as an argument and returns an async iterator over the nodes encountered during parsing. The types of nodes and their representations are as follows:

  • tagopen: ['tagopen', tag-name, attr-string, is-self-closing]
    • tag-name: the tag's name, as found in the XML: <tag-name ...>
    • attr-string: everything between the tag name and > or />, with any leading or trailing whitespace trimmed off. This string may be parsed with Saxophone.parseAttrs to convert it into a key/value object.
    • is-self-closing: either '/' (truthy) if the tag is self-closing or '' (falsy) if it is not.
  • tagclose: ['tagclose', tag-name]
  • text: ['text', content]. Entities in the text may be parsed with the Saxophone.parseEntities function.
  • cdata: ['cdata', content]
  • comment: ['comment', content]
  • processinginstruction: ['processinginstruction', content]. The content of the processing instruction is not parsed.
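Since every node is an array whose first element names its type, a consumer can dispatch on that element and destructure the rest. The following is a small sketch using a hypothetical describeNode helper; the summary strings it returns are arbitrary and chosen only for illustration.

```javascript
// Hypothetical helper: turns one node tuple into a one-line summary by
// switching on the node type in position 0.
function describeNode(node) {
    const [type, ...rest] = node;
    switch (type) {
        case 'tagopen': {
            const [name, attrString, selfClosing] = rest;
            // selfClosing is '/' (truthy) or '' (falsy), as documented above.
            return `open <${name}> attrs=[${attrString}]${selfClosing ? ' (self-closing)' : ''}`;
        }
        case 'tagclose':
            return `close </${rest[0]}>`;
        case 'text':
        case 'cdata':
        case 'comment':
        case 'processinginstruction':
            // These four types all carry a single content string.
            return `${type} ${JSON.stringify(rest[0])}`;
        default:
            throw new Error(`Unknown node type: ${type}`);
    }
}

console.log(describeNode(['tagopen', 'example', 'id="1"', '/']));
// → open <example> attrs=[id="1"] (self-closing)
```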

Contributions

This is free and open source software. All contributions (even small ones) are welcome. Check out the contribution guide to get started!

Thanks to:

  • Mattéo Delabre for Saxophone. The (modified) Saxophone parser is at the heart of this package.
  • Norman Rzepka for the check in Saxophone for opening and closing tags mismatch.
  • winston01 for spotting and fixing an error in the Saxophone parser when a tag sits astride two chunks.

License

Released under the MIT license. See the full license text.

Package last updated on 22 Aug 2020
