@thi.ng/sax

Transducer-based, SAX-like, non-validating, speedy & tiny XML parser

Version 0.2.0

This project is part of the @thi.ng/umbrella monorepo.

About

@thi.ng/transducers-based, SAX-like, non-validating, speedy & tiny XML parser (1.4KB gzipped).

Unlike the classic event-driven approach of SAX, this parser is implemented as a transducer function that transforms XML input into a stream of SAX-event-like objects. Being a transducer, the parser can be used in novel ways as part of a larger processing pipeline and can be composed with other pre- or post-processing steps, e.g. to filter or transform element or attribute values.
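
To make the transducer idea concrete, here is a library-free sketch of how such steps compose. The helpers `map`, `filter`, `comp` and `into` are local stand-ins written for this illustration and are not the @thi.ng/transducers implementations; the `events` array is hand-written data shaped like the parser's event objects.

```javascript
// A transducer wraps a reducing function `rfn(acc, x)` and returns a new one.
const map = (fn) => (rfn) => (acc, x) => rfn(acc, fn(x));
const filter = (pred) => (rfn) => (acc, x) => (pred(x) ? rfn(acc, x) : acc);

// Left-to-right composition: each input passes through the first step first.
const comp = (...xforms) => (rfn) => xforms.reduceRight((r, xf) => xf(r), rfn);

// Drive a (composed) transducer over any iterable, collecting into an array.
const into = (xform, input) => {
    const step = xform((acc, x) => {
        acc.push(x);
        return acc;
    });
    let acc = [];
    for (const x of input) acc = step(acc, x);
    return acc;
};

// Hand-written stand-ins for parse events (assumed shape: type, tag, ...).
const events = [
    { type: 3, tag: "a", attribs: {} },
    { type: 5, tag: "a", body: "\n    " },
    { type: 3, tag: "b1", attribs: {} },
    { type: 3, tag: "c", attribs: { x: "23", y: "42" } },
    { type: 5, tag: "c", body: "ccc" },
];

// Keep only element-start events (type 3), then extract their tag names.
const tagNames = into(
    comp(
        filter((e) => e.type === 3),
        map((e) => e.tag)
    ),
    events
);
// tagNames is [ 'a', 'b1', 'c' ]
```

Because every step has the same shape, a parsing step can sit in the same `comp` call as any filtering or mapping step, with no intermediate arrays built between them.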

Installation

yarn add @thi.ng/sax

Dependencies

Usage examples

import * as sax from "@thi.ng/sax";
import * as tx from "@thi.ng/transducers";

const src = `<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE foo bar>
<a>
    <b1>
        <c x="23" y="42">ccc
            <d>dd</d>
        </c>
    </b1>
    <b2 foo="bar" />
</a>`

const doc = [...tx.iterator(sax.parse(), src)];

// [ { type: 0,
//     tag: 'xml',
//     attribs: { version: '1.0', encoding: 'utf-8' } },
//   { type: 1, body: 'foo bar' },
//   { type: 3, tag: 'a', attribs: {} },
//   { type: 5, tag: 'a', body: '\n    ' },
//   { type: 3, tag: 'b1', attribs: {} },
//   { type: 5, tag: 'b1', body: '\n        ' },
//   { type: 3, tag: 'c', attribs: { x: '23', y: '42' } },
//   { type: 5, tag: 'c', body: 'ccc\n            ' },
//   { type: 3, tag: 'd', attribs: {} },
//   { type: 5, tag: 'd', body: 'dd' },
//   { type: 4, tag: 'd', attribs: {}, children: [], body: 'dd' },
//   { type: 4,
//     tag: 'c',
//     attribs: { x: '23', y: '42' },
//     children: [ { tag: 'd', attribs: {}, children: [], body: 'dd' } ],
//     body: 'ccc\n            ' },
//   { type: 4,
//     tag: 'b1',
//     attribs: {},
//     children: [ [Object] ],
//     body: '\n        ' },
//   { type: 4, tag: 'b2', attribs: { foo: 'bar' } },
//   { type: 4,
//     tag: 'a',
//     attribs: {},
//     children: [ [Object], [Object] ],
//     body: '\n    ' } ]
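
As the output above shows, each element-end event (type 4) carries the element's attribs, body and already-parsed children, so a subtree can be walked from that single event. The following is a minimal sketch that uses the `c` and `b2` events copied from the output above as literal data instead of calling the parser; `outline` is a hypothetical helper and not part of @thi.ng/sax.

```javascript
// Element-end event for <c>, copied from the example output.
const cEnd = {
    type: 4,
    tag: "c",
    attribs: { x: "23", y: "42" },
    children: [{ tag: "d", attribs: {}, children: [], body: "dd" }],
    body: "ccc\n            ",
};

// Element-end event for the self-closing <b2 />, which has no `children` key.
const b2End = { type: 4, tag: "b2", attribs: { foo: "bar" } };

// Recursively list tag names, indenting each nesting level by two spaces.
// `children` may be absent (self-closing elements), hence the fallback.
const outline = (node, depth = 0) => [
    "  ".repeat(depth) + node.tag,
    ...(node.children || []).flatMap((child) => outline(child, depth + 1)),
];

const cOutline = outline(cEnd);   // [ 'c', '  d' ]
const b2Outline = outline(b2End); // [ 'b2' ]
```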

Result post-processing

As mentioned earlier, the transducer nature of this parser allows for its easy integration into larger transformation pipelines. The next example parses an SVG file, then extracts and selectively applies transformations to only the <circle> elements in the first group (<g>) element.

const svg = `
<?xml version="1.0"?>
<svg version="1.1" height="300" width="300" xmlns="http://www.w3.org/2000/svg">
    <g fill="yellow">
        <circle cx="50.00" cy="150.00" r="50.00" />
        <circle cx="250.00" cy="150.00" r="50.00" />
        <circle cx="150.00" cy="150.00" fill="rgba(0,255,255,0.25)" r="100.00" stroke="#ff0000" />
        <rect x="80" y="80" width="140" height="140" fill="none" stroke="black" />
    </g>
    <g fill="none" stroke="black">
        <circle cx="150.00" cy="150.00" r="50.00" />
        <circle cx="150.00" cy="150.00" r="25.00" />
    </g>
</svg>`;

[...tx.iterator(
    tx.comp(
        // transform into parse events
        sax.parse(),
        // match 1st group end
        tx.matchFirst((e) => e.type == sax.Type.ELEM_END && e.tag == "g"),
        // extract group's children
        tx.mapcat((e) => e.children),
        // select circles only
        tx.filter((e) => e.tag == "circle"),
        // transform attributes
        tx.map((e)=> [e.tag, {
            ...e.attribs,
            cx: parseFloat(e.attribs.cx),
            cy: parseFloat(e.attribs.cy),
            r: parseFloat(e.attribs.r),
        }])
    ),
    svg
)]
// [ [ 'circle', { cx: 50, cy: 150, r: 50 } ],
//   [ 'circle', { cx: 250, cy: 150, r: 50 } ],
//   [ 'circle', { cx: 150, cy: 150, fill: 'rgba(0,255,255,0.25)', r: 100, stroke: '#ff0000' } ] ]

Emitted result type IDs

The type key in each emitted result object is a TypeScript enum with the following values:

ID  Enum             Description
0   Type.PROC        Processing instruction incl. attribs
1   Type.DOCTYPE     Doctype declaration body
2   Type.COMMENT     Comment body
3   Type.ELEM_START  Element start incl. attributes
4   Type.ELEM_END    Element end incl. attributes, body & children
5   Type.ELEM_BODY   Element text body
6   Type.ERROR       Parse error description
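
One way to consume these IDs is a `switch` over the `type` key. The sketch below mirrors the table in a local `Type` object so that it runs without the package (in real code the enum would come from `sax.Type`); `summarize` is a hypothetical helper. The assumption that comment and error events keep their text under a `body` key is inferred from the table, since neither appears in the example output.

```javascript
// Local mirror of the type IDs listed in the table above.
const Type = Object.freeze({
    PROC: 0,
    DOCTYPE: 1,
    COMMENT: 2,
    ELEM_START: 3,
    ELEM_END: 4,
    ELEM_BODY: 5,
    ERROR: 6,
});

// Produce a one-line, human-readable summary of a single parse event.
const summarize = (e) => {
    switch (e.type) {
        case Type.PROC:
            return `processing instruction: ${e.tag}`;
        case Type.DOCTYPE:
            return `doctype: ${e.body}`;
        case Type.COMMENT:
            return `comment: ${e.body}`; // `body` key assumed
        case Type.ELEM_START:
            return `start: ${e.tag}`;
        case Type.ELEM_END:
            return `end: ${e.tag}`;
        case Type.ELEM_BODY:
            return `text in ${e.tag}: ${JSON.stringify(e.body)}`;
        case Type.ERROR:
            return `error: ${e.body}`; // `body` key assumed
        default:
            return `unknown type: ${e.type}`;
    }
};

// Events copied from the first usage example's output.
const summaries = [
    { type: 0, tag: "xml", attribs: { version: "1.0", encoding: "utf-8" } },
    { type: 1, body: "foo bar" },
    { type: 3, tag: "a", attribs: {} },
    { type: 4, tag: "b2", attribs: { foo: "bar" } },
].map(summarize);
// [ 'processing instruction: xml', 'doctype: foo bar', 'start: a', 'end: b2' ]
```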

Authors

  • Karsten Schmidt

License

© 2018 Karsten Schmidt // Apache Software License 2.0

Package last updated on 19 Jun 2018
