What is node-expat?
The node-expat package is a fast XML parser for Node.js, built as a binding to the Expat (libexpat) XML parser library. It exposes an event-based streaming interface, so XML can be processed chunk by chunk as it arrives instead of being loaded into memory as a whole. This makes it suitable for large XML files and for applications that process XML data in real time.
What are node-expat's main functionalities?
Basic XML Parsing
This feature allows you to parse XML documents and handle different events such as start and end of elements, and text nodes. The code sample demonstrates how to set up a basic XML parser and handle these events.
const expat = require('node-expat');
const parser = new expat.Parser('UTF-8');
parser.on('startElement', (name, attrs) => {
  console.log(`Start element: ${name}`);
  console.log(attrs);
});
parser.on('endElement', (name) => {
  console.log(`End element: ${name}`);
});
parser.on('text', (text) => {
  console.log(`Text: ${text}`);
});
const xml = '<root><child id="1">Hello</child></root>';
parser.write(xml);
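Expat may deliver the character data of a single element in more than one text event, so code that needs the complete text usually buffers it until the matching end tag. The following is a minimal sketch of that pattern, not part of node-expat: `collectText` and its parameters are hypothetical names, and the function only assumes an object that emits the three events shown above.

```javascript
// Hypothetical helper: collects the full text content of each element
// named `elementName` and passes it to `onText`.
// `parser` is any object that emits node-expat style events
// ('startElement', 'endElement', 'text').
function collectText(parser, elementName, onText) {
  let buffer = null; // null while we are outside the target element

  parser.on('startElement', (name) => {
    if (name === elementName) buffer = '';
  });

  parser.on('text', (text) => {
    // Text may arrive in several pieces, so append instead of overwriting.
    if (buffer !== null) buffer += text;
  });

  parser.on('endElement', (name) => {
    if (name === elementName && buffer !== null) {
      onText(buffer);
      buffer = null;
    }
  });
}

// Usage with node-expat (requires the package to be installed):
// const expat = require('node-expat');
// const parser = new expat.Parser('UTF-8');
// collectText(parser, 'child', (text) => console.log(text));
// parser.write('<root><child id="1">Hello</child></root>');
```

The sketch does not handle nested elements that share the same name; that case would need a depth counter.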
Streaming XML Parsing
This feature allows you to parse large XML files by streaming the data. The code sample demonstrates how to set up a streaming XML parser that reads from a file stream, making it suitable for processing large XML files without loading the entire file into memory.
const fs = require('fs');
const expat = require('node-expat');
const parser = new expat.Parser('UTF-8');
parser.on('startElement', (name, attrs) => {
  console.log(`Start element: ${name}`);
  console.log(attrs);
});
parser.on('endElement', (name) => {
  console.log(`End element: ${name}`);
});
parser.on('text', (text) => {
  console.log(`Text: ${text}`);
});
const stream = fs.createReadStream('path/to/large.xml');
stream.pipe(parser);
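A file stream hands its data to the parser in chunks whose boundaries have nothing to do with the XML structure, so a tag can be split across two writes. To experiment with that behavior without a large file, a small helper can feed a string in fixed-size pieces. This is only a sketch: `writeInChunks` is a hypothetical name, and the only thing it assumes about `parser` is a `write()` method like the one used in the basic example.

```javascript
// Hypothetical helper: feeds `xml` to `parser` in pieces of at most
// `chunkSize` characters and returns the number of pieces written.
function writeInChunks(parser, xml, chunkSize) {
  let chunks = 0;
  for (let offset = 0; offset < xml.length; offset += chunkSize) {
    parser.write(xml.slice(offset, offset + chunkSize));
    chunks += 1;
  }
  return chunks;
}

// Usage with node-expat (requires the package to be installed):
// const expat = require('node-expat');
// const parser = new expat.Parser('UTF-8');
// parser.on('startElement', (name) => console.log(name));
// writeInChunks(parser, '<root><child id="1">Hello</child></root>', 7);
```

With a real parser, the same events should be reported regardless of the chunk size chosen.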
Other packages similar to node-expat
sax
The sax package is a simple, fast, and streaming XML parser for Node.js. It provides a similar streaming interface for XML parsing but is written entirely in JavaScript, making it more portable. Compared to node-expat, sax may be slower for very large XML files but is easier to install and use since it does not require native bindings.
xml2js
The xml2js package is a popular XML parser that converts XML documents into JavaScript objects. It is not a streaming parser like node-expat but is very convenient for converting XML data into a more easily manipulable format. It is suitable for smaller XML files or when you need to work with the entire XML document as a JavaScript object.
fast-xml-parser
The fast-xml-parser package is a fast and lightweight XML parser that can parse XML into JavaScript objects. It supports both synchronous and asynchronous parsing and can handle large XML files efficiently. Compared to node-expat, it offers more flexibility in terms of parsing options and does not require native bindings, making it easier to install and use.
node-expat
Motivation
You use Node.js for speed? You process XML streams? Then you want the fastest XML parser: libexpat!
Install
npm install node-expat
Usage
Important events emitted by a parser:
(function () {
  "use strict";
  var expat = require('node-expat')
  var parser = new expat.Parser('UTF-8')
  parser.on('startElement', function (name, attrs) {
    console.log(name, attrs)
  })
  parser.on('endElement', function (name) {
    console.log(name)
  })
  parser.on('text', function (text) {
    console.log(text)
  })
  parser.on('error', function (error) {
    console.error(error)
  })
  parser.write('<html><head><title>Hello World</title></head><body><p>Foobar</p></body></html>')
}())
API
#on('startElement', function (name, attrs) {})
#on('endElement', function (name) {})
#on('text', function (text) {})
#on('processingInstruction', function (target, data) {})
#on('comment', function (s) {})
#on('xmlDecl', function (version, encoding, standalone) {})
#on('startCdata', function () {})
#on('endCdata', function () {})
#on('entityDecl', function (entityName, isParameterEntity, value, base, systemId, publicId, notationName) {})
#on('error', function (e) {})
#stop() pauses
#resume() resumes
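One possible use of stop() and resume() is to hold the parser back while a slow consumer deals with an element. The sketch below is an illustration under assumptions rather than documented behavior: `handleElementsOneByOne` and `done` are hypothetical names, `done` is assumed to be called asynchronously (after the handler has returned), and the exact events that may still arrive between stop() and resume() depend on libexpat.

```javascript
// Hypothetical helper: asks `parser` to pause after every start tag and
// resumes it once `handle` calls `done`.
// Assumes `parser` exposes stop() and resume() as listed in the API.
function handleElementsOneByOne (parser, handle) {
  parser.on('startElement', function (name, attrs) {
    parser.stop() // ask the parser to pause
    handle(name, attrs, function done () {
      parser.resume() // continue with the rest of the input
    })
  })
}

// Usage with node-expat (requires the package to be installed):
// var expat = require('node-expat')
// var parser = new expat.Parser('UTF-8')
// handleElementsOneByOne(parser, function (name, attrs, done) {
//   setTimeout(done, 100) // stand-in for slow work such as a database write
// })
// parser.write('<root><a/><b/></root>')
```

When the input comes from a stream, the source would presumably have to be paused as well, since this sketch does nothing about data written while the parser is stopped.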
Error handling
parse() does not emit an error event, because libexpat doesn't use a callback for errors either. Instead, check that parse() returns true. A descriptive string can be obtained via getError() to provide user feedback.
Alternatively, use the Parser like a node Stream. write() will emit error events.
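The return-value check can be wrapped in a small function. This is a minimal sketch, not part of the library: `parseOrThrow` is a hypothetical name, and passing `true` as the second argument of parse() to mark the input as final is an assumption about the method's signature.

```javascript
// Hypothetical helper: parses a complete document and throws if the
// parser reports a failure.
// Assumes `parser` exposes parse() and getError(); the second argument
// to parse() is assumed to mark the final chunk of input.
function parseOrThrow (parser, xml) {
  var ok = parser.parse(xml, true)
  if (!ok) {
    throw new Error('XML parse error: ' + parser.getError())
  }
}

// Usage with node-expat (requires the package to be installed):
// var expat = require('node-expat')
// var parser = new expat.Parser('UTF-8')
// try {
//   parseOrThrow(parser, '<root><unclosed></root>')
// } catch (e) {
//   console.error(e.message)
// }
```

Throwing keeps the failure close to the call site; code that already treats the Parser as a stream may prefer the error event instead.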
Namespace handling
A word about special parsing of xmlns: this is not necessary in a
bare SAX parser like this, given that the DOM replacement you are
using (if any) is not relevant to the parser.
Benchmark
npm run benchmark
Higher is better.
Testing
npm install -g standard
npm test
Windows
If you fail to install node-expat as a dependency of node-xmpp, please update node-xmpp as it doesn't use node-expat anymore.
Dependencies for node-gyp
https://github.com/TooTallNate/node-gyp#installation
See https://github.com/astro/node-expat/issues/78 if you are getting errors about not finding nan.h.
expat.vcproj
VCBUILD : error : project file 'node-expat\build\deps\libexpat\expat.vcproj' was not found or not a valid project file. [C:\Users\admin\AppData\Roaming\npm\node_modules\node-expat\build\binding.sln]
Install Visual Studio C++ 2012 and run npm with the --msvs_version=2012 flag.