rss-parser
A simple, light-weight RSS parser. Parse strings, URLs, or files and get a JS object back
rss-parser is a lightweight and easy-to-use library for parsing RSS and Atom feeds in Node.js. It provides a simple API to fetch and parse feeds, making it easy to integrate RSS feed reading functionality into your applications.
Parsing a feed from a URL
This feature allows you to parse an RSS feed from a given URL. The code sample demonstrates how to fetch and parse the feed, then log the title of the feed and each item within it.
const Parser = require('rss-parser');
let parser = new Parser();
(async () => {
  let feed = await parser.parseURL('https://example.com/rss');
  console.log(feed.title);
  feed.items.forEach(item => {
    console.log(item.title + ':' + item.link);
  });
})();
Parsing a feed from a string
This feature allows you to parse an RSS feed from a raw XML string. The code sample demonstrates how to parse the XML string and log the title of the feed and each item within it.
const Parser = require('rss-parser');
let parser = new Parser();
let xml = `<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
  <channel>
    <title>Example RSS Feed</title>
    <link>https://example.com/</link>
    <description>This is an example RSS feed</description>
    <item>
      <title>Example Item</title>
      <link>https://example.com/example-item</link>
      <description>This is an example item</description>
    </item>
  </channel>
</rss>`;
(async () => {
  let feed = await parser.parseString(xml);
  console.log(feed.title);
  feed.items.forEach(item => {
    console.log(item.title + ':' + item.link);
  });
})();
Customizing the parser
This feature allows you to customize the parser to include additional fields that are not part of the default RSS/Atom specification. The code sample demonstrates how to include a custom field ('media:content') and log it for each item in the feed.
const Parser = require('rss-parser');
let parser = new Parser({
  customFields: {
    item: ['media:content']
  }
});
(async () => {
  let feed = await parser.parseURL('https://example.com/rss');
  console.log(feed.title);
  feed.items.forEach(item => {
    console.log(item.title + ':' + item['media:content']);
  });
})();
Feedparser is a robust RSS and Atom feed parsing library for Node.js. It offers more control and customization options compared to rss-parser, but it has a steeper learning curve and requires more boilerplate code to get started.
The rss package is primarily focused on generating RSS feeds rather than parsing them. It allows you to create RSS feeds programmatically, which can be useful if you need to provide RSS feeds for your own content.
xml2js is a general-purpose XML parser for Node.js. While it is not specifically designed for RSS or Atom feeds, it can be used to parse any XML, including RSS feeds. It provides more flexibility but requires more manual handling of the XML structure.
A small library for turning RSS XML feeds into JavaScript objects.
npm install --save rss-parser
You can parse RSS from a URL (parser.parseURL) or an XML string (parser.parseString).
Both callbacks and Promises are supported.
Here's an example in NodeJS using Promises with async/await:
let Parser = require('rss-parser');
let parser = new Parser();
(async () => {
  let feed = await parser.parseURL('https://www.reddit.com/.rss');
  console.log(feed.title);
  feed.items.forEach(item => {
    console.log(item.title + ':' + item.link);
  });
})();
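The example above assumes the request succeeds. A minimal sketch of the same flow with error handling follows. `printFeed` is a hypothetical helper introduced here for illustration; it accepts any object with a Promise-returning `parseURL` method, such as an instance created with `new Parser()`.

```javascript
// Hypothetical helper (not part of rss-parser): prints a feed's title and
// items, and returns the printed item lines. `parser` is any object with a
// Promise-returning parseURL(url), e.g. an rss-parser instance.
async function printFeed(parser, url) {
  try {
    const feed = await parser.parseURL(url);
    const lines = feed.items.map(item => item.title + ':' + item.link);
    console.log(feed.title);
    lines.forEach(line => console.log(line));
    return lines;
  } catch (err) {
    // Any rejection from parseURL (for example a failed request) lands here.
    console.error('Could not load feed:', err.message);
    return [];
  }
}
```

With rss-parser installed, this could be called as `printFeed(new Parser(), 'https://www.reddit.com/.rss')`. Returning an empty list on failure is one design choice; rethrowing the error may suit callers that need to distinguish an empty feed from a failed request.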
We recommend using a bundler like webpack, but we also provide pre-built browser distributions in the dist/ folder. If you use the pre-built distribution, you'll need a polyfill for Promise support.
Here's an example in the browser using callbacks:
<script src="/node_modules/rss-parser/dist/rss-parser.min.js"></script>
<script>
// Note: some RSS feeds can't be loaded in the browser due to CORS security.
// To get around this, you can use a proxy.
const CORS_PROXY = "https://cors-anywhere.herokuapp.com/";
let parser = new RSSParser();
parser.parseURL(CORS_PROXY + 'https://www.reddit.com/.rss', function(err, feed) {
  // Stop early if the request or parsing failed; feed is undefined in that case.
  if (err) {
    console.error(err);
    return;
  }
  console.log(feed.title);
  feed.items.forEach(function(entry) {
    console.log(entry.title + ':' + entry.link);
  });
});
</script>
A few minor breaking changes were made in v3. Here's what you need to know:
- You need to construct a new Parser() before calling parseString or parseURL
- parseFile is no longer available (for better browser support)
- options are now passed to the Parser constructor
- parsed.feed is now just feed (top-level object removed)
- feed.entries is now feed.items (to better match RSS XML)
Check out the full output format in test/output/reddit.json
feedUrl: 'https://www.reddit.com/.rss'
title: 'reddit: the front page of the internet'
description: ""
link: 'https://www.reddit.com/'
items:
    - title: 'The water is too deep, so he improvises'
      link: 'https://www.reddit.com/r/funny/comments/3skxqc/the_water_is_too_deep_so_he_improvises/'
      pubDate: 'Thu, 12 Nov 2015 21:16:39 +0000'
      creator: "John Doe"
      content: '<a href="http://example.com">this is a link</a> - <b>this is bold text</b>'
      contentSnippet: 'this is a link - this is bold text'
      guid: 'https://www.reddit.com/r/funny/comments/3skxqc/the_water_is_too_deep_so_he_improvises/'
      categories:
          - funny
      isoDate: '2015-11-12T21:16:39.000Z'
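Because items carry an `isoDate` string, the parsed output can be sorted without re-parsing `pubDate`. A minimal sketch, assuming a feed object shaped like the sample output above; `newestFirst` and the sample data are hypothetical and the code does not call rss-parser itself.

```javascript
// Hypothetical helper: returns the feed's items sorted newest first,
// skipping any item that has no isoDate.
function newestFirst(feed) {
  return feed.items
    .filter(item => item.isoDate)
    .sort((a, b) => Date.parse(b.isoDate) - Date.parse(a.isoDate));
}

// Sample data in the documented output shape (values are made up).
const sampleFeed = {
  title: 'Example feed',
  items: [
    { title: 'Older post', isoDate: '2015-11-12T21:16:39.000Z' },
    { title: 'Undated post' },
    { title: 'Newer post', isoDate: '2015-11-13T08:00:00.000Z' },
  ],
};

newestFirst(sampleFeed).forEach(item => {
  console.log(item.isoDate + ' ' + item.title);
});
```

Dropping undated items is a choice made for this sketch; an application might prefer to keep them at the end of the list.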
- The contentSnippet field strips out HTML tags and unescapes HTML entities
- The dc: prefix will be removed from all fields
- Both dc:date and pubDate will be available in ISO 8601 format as isoDate
- If author is specified, but not dc:creator, creator will be set to author (see article)
If your RSS feed contains fields that aren't currently returned, you can access them using the customFields option.
let parser = new Parser({
  customFields: {
    feed: ['otherTitle', 'extendedDescription'],
    item: ['coAuthor', 'subtitle'],
  }
});
parser.parseURL('https://www.reddit.com/.rss', function(err, feed) {
  // Stop early if the request or parsing failed; feed is undefined in that case.
  if (err) {
    console.error(err);
    return;
  }
  console.log(feed.extendedDescription);
  feed.items.forEach(function(entry) {
    console.log(entry.coAuthor + ':' + entry.subtitle);
  });
});
To rename fields, you can pass in an array with two items, in the format [fromField, toField]:
let parser = new Parser({
  customFields: {
    item: [
      ['dc:coAuthor', 'coAuthor'],
    ]
  }
})
To pass additional flags, provide an object as the third array item. Currently there is one such flag:
- keepArray: true to return all values for fields that can have multiple entries. Default: return the first item only.
let parser = new Parser({
  customFields: {
    item: [
      ['media:content', 'media:content', {keepArray: true}],
    ]
  }
})
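With `keepArray: true` the field holds every value, while the default returns only the first one, so code that reads the field may want to handle both shapes. The sketch below assumes each entry is an xml2js-style element object whose XML attributes sit under `$` (xml2js's default attribute key) and that the element has a `url` attribute. `mediaUrls` and the sample items are hypothetical.

```javascript
// Hypothetical helper: collects the url attribute of every media:content
// entry on an item, whether the field is a single entry or an array.
function mediaUrls(item) {
  const value = item['media:content'];
  if (value === undefined) return [];
  const entries = Array.isArray(value) ? value : [value];
  return entries
    .map(entry => entry && entry.$ && entry.$.url)
    .filter(Boolean); // drop entries that have no url attribute
}

// Made-up items illustrating the two assumed shapes.
const withArray = {
  'media:content': [
    { $: { url: 'https://example.com/a.jpg' } },
    { $: { url: 'https://example.com/b.jpg' } },
  ],
};
const withSingle = { 'media:content': { $: { url: 'https://example.com/a.jpg' } } };

console.log(mediaUrls(withArray).length);
console.log(mediaUrls(withSingle).length);
```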
rss-parser uses xml2js to parse XML. You can pass these options to new xml2js.Parser() by specifying options.xml2js:
let parser = new Parser({
  xml2js: {
    emptyTag: '--EMPTY--',
  }
});
You can pass headers to the HTTP request:
let parser = new Parser({
  headers: {'User-Agent': 'something different'},
});
By default, parseURL will follow up to five redirects. You can change this with options.maxRedirects.
let parser = new Parser({maxRedirects: 100});
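The options described above can be combined in a single object passed to the constructor. A small sketch follows; the specific values (the User-Agent string and the redirect limit) are placeholders chosen for illustration, not recommendations.

```javascript
// One options object combining the settings described above.
const options = {
  headers: { 'User-Agent': 'my-feed-reader/1.0' }, // placeholder value
  maxRedirects: 3,                                 // placeholder value
  xml2js: { emptyTag: '--EMPTY--' },
  customFields: {
    item: [['media:content', 'media:content', { keepArray: true }]],
  },
};

// With rss-parser installed, the object would be used like this:
// const parser = new Parser(options);
console.log(Object.keys(options).join(', '));
```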
Contributions welcome!
The tests run the RSS parser on several sample RSS feeds in test/input and write the resulting JSON into test/output. If there are any changes to the output files, the tests will fail.
To check if your changes affect the output of any test cases, run
npm test
To update the output files with your changes, run
WRITE_GOLDEN=true npm test
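The golden-file pattern described above can be sketched roughly as follows. This illustrates the general technique and is not the project's actual test code: `checkGolden`, the temporary directory, and the sample data are hypothetical, while the `WRITE_GOLDEN` variable name comes from the command above.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: compares `actual` with the JSON stored at goldenPath.
// When WRITE_GOLDEN is 'true', it overwrites the stored file instead.
function checkGolden(goldenPath, actual, env = process.env) {
  const serialized = JSON.stringify(actual, null, 2);
  if (env.WRITE_GOLDEN === 'true') {
    fs.writeFileSync(goldenPath, serialized);
    return true;
  }
  const expected = fs.readFileSync(goldenPath, 'utf8');
  return expected === serialized;
}

// A throwaway directory stands in for test/output.
const goldenDir = fs.mkdtempSync(path.join(os.tmpdir(), 'golden-'));
const goldenFile = path.join(goldenDir, 'example.json');

checkGolden(goldenFile, { title: 'Example' }, { WRITE_GOLDEN: 'true' }); // record
console.log(checkGolden(goldenFile, { title: 'Example' }, {})); // true
console.log(checkGolden(goldenFile, { title: 'Changed' }, {})); // false
```

Comparing serialized strings keeps the sketch short; a real test suite would more likely use a deep-equality assertion so that failures show a readable diff.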
npm run build
git commit -a -m "Build distribution"
npm version minor # or major/patch
npm publish
git push --follow-tags
FAQs
A lightweight RSS parser, for Node and the browser
The npm package rss-parser receives a total of 190,759 weekly downloads, so its popularity is classified as popular.
We found that rss-parser has an unhealthy version release cadence and level of project activity because the last version was released a year ago. It has 1 open source maintainer collaborating on the project.