
An extensible RSS/RDF/Atom feed parser for use with htmlparser2.
Suppose you have the following RSS feed, which contains a custom my:tag element
that you want to extract into your object model:
<?xml version="1.0"?>
<rss version="2.0" xmlns:my="http://example.com/my">
  <channel>
    <title>Liftoff News</title>
    <link>http://liftoff.msfc.nasa.gov/</link>
    <description>Liftoff to Space Exploration.</description>
    <pubDate>Tue, 10 Jun 2003 04:00:00 GMT</pubDate>
    <lastBuildDate>Tue, 10 Jun 2003 09:41:01 GMT</lastBuildDate>
    <item>
      <title>Star City</title>
      <link>http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp</link>
      <description>How do Americans get ready to work with Russians aboard the International Space Station? They take a crash course in culture, language and protocol at Russia's &lt;a href="http://howe.iki.rssi.ru/GCTC/gctc_e.htm"&gt;Star City&lt;/a&gt;.</description>
      <pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate>
      <guid>http://liftoff.msfc.nasa.gov/2003/06/03.html#item573</guid>
      <my:tag>So glad I could sneak this new content in</my:tag>
    </item>
  </channel>
</rss>
You can extract the value of the custom tag with the extensions option:
var htmlparser2 = require('htmlparser2');
var FeedHandler = require('feedhandler');

function parse(xml, cb) {
  // Map the custom <my:tag> element to a my_tag property on each item.
  var handler = new FeedHandler(cb, {
    extensions: [
      {input: "my:tag", output: "my_tag"}
    ]
  });
  try {
    new htmlparser2.Parser(handler, {xmlMode: true}).parseComplete(xml);
  } catch (ex) {
    // Report parser exceptions through the same callback.
    cb(ex);
  }
}

module.exports = parse;
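To make the mapping concrete, here is a minimal sketch of what an extensions entry expresses: the text of the element named by input is copied to the property named by output. The applyExtensions helper and the simplified {name, text} node shape are hypothetical, chosen only for illustration; this is not feedhandler's actual implementation.

```javascript
// Hypothetical helper: NOT part of feedhandler or htmlparser2.
// Copies the text of each child element whose name matches an
// extension's `input` onto `target` under the extension's `output` key.
function applyExtensions(target, children, extensions) {
  extensions.forEach(function (ext) {
    for (var i = 0; i < children.length; i++) {
      if (children[i].name === ext.input) {
        target[ext.output] = children[i].text;
        break; // use the first matching element only
      }
    }
  });
  return target;
}

// Assumed simplified node shape: {name, text}.
var item = applyExtensions(
  {title: "Star City"},
  [
    {name: "title", text: "Star City"},
    {name: "my:tag", text: "So glad I could sneak this new content in"}
  ],
  [{input: "my:tag", output: "my_tag"}]
);
console.log(item.my_tag);
```

Elements with no matching extension entry are left alone, so the standard fields are unaffected by the mapping.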
Running parse against the XML sample produces the following feed object:
{
  type: "rss",
  id: "",
  title: "Liftoff News",
  link: "http://liftoff.msfc.nasa.gov/",
  description: "Liftoff to Space Exploration.",
  updated: new Date("Tue, 10 Jun 2003 09:41:01 GMT"),
  items: [{
    id: "http://liftoff.msfc.nasa.gov/2003/06/03.html#item573",
    title: "Star City",
    link: "http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp",
    description: "How do Americans get ready to work with Russians aboard the International Space Station? They take a crash course in culture, language and protocol at Russia's <a href=\"http://howe.iki.rssi.ru/GCTC/gctc_e.htm\">Star City</a>.",
    pubDate: new Date("Tue, 03 Jun 2003 09:39:21 GMT"),
    my_tag: "So glad I could sneak this new content in"
  }]
}
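The extension value sits alongside the standard item fields, so a consumer can read it like any other property. The sketch below shows one way to do that; listTags is a hypothetical helper, and the feed literal is a trimmed copy of the object above rather than live parser output.

```javascript
// Hypothetical helper: collects "title: my_tag" strings from a feed,
// skipping items that carry no my_tag extension value.
function listTags(feed) {
  return (feed.items || [])
    .filter(function (item) { return item.my_tag !== undefined; })
    .map(function (item) { return item.title + ": " + item.my_tag; });
}

// Trimmed copy of the parsed feed object, for illustration only.
var feed = {
  type: "rss",
  title: "Liftoff News",
  items: [{
    title: "Star City",
    my_tag: "So glad I could sneak this new content in"
  }]
};

console.log(listTags(feed));
```

Checking for undefined rather than truthiness keeps items whose custom tag is present but empty.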