post-feed-reader
A library to fetch news, blog or podcast posts from any site.
It works by auto-discovering a post source, which can be an RSS/Atom/JSON feed or the WordPress REST API, and then fetching and parsing the list of posts.
It's meant for Node.js, but since it is built on isomorphic JavaScript, it can also work in browsers if the website allows cross-origin requests.
It was originally built for apps that need to list posts with their own UI but don't manage the blog themselves, and so need automatic fallbacks when the blog's technology changes.
Features
Getting Started
Install it with NPM or Yarn:
npm install post-feed-reader
You first need to discover the post source, which will return an object containing a URL to the RSS/Atom/JSON Feed or the Wordpress REST API.
Then you can pass the discovered source to getPostList, which will fetch and parse it.
import { discoverPostSource, getPostList } from 'post-feed-reader';
const source = await discoverPostSource('https://www.nytimes.com');
const list = await getPostList(source);
console.log(list.posts.map(post => post.title));
Simple enough, eh? Try it on RunKit
Output
See an example of the post list based on the Mozilla blog.
Options
const source = await discoverPostSource('https://techcrunch.com', {
axios: axios.create(...),
preferFeeds: false,
canUseSource: (source: DiscoveredSource) => true,
tryToGuessPaths: false,
wpApiPaths: ['./wp-json', '?rest_route=/'],
feedPaths: ['./feed', './atom', './rss', './feed.json', './feed.xml', '?feed=atom'],
});
const posts = await getPostList(source, {
axios: axios.create(...),
fillTextContents: false,
wordpress: {
includeEmbedded: true,
fetchBlogInfo: false,
limit: 10,
search: '',
authors: [...],
categories: [...],
tags: [...],
additionalParams: { ... },
},
});
Skip the auto-discovery
If you already have the URL of an Atom/RSS/JSON Feed or of the WordPress REST API on hand, you can fetch the posts directly:
const feedPosts = await getFeedPostList('https://news.google.com/atom');
const wpApiPosts = await getWordpressPostList('https://blog.mozilla.org/en/wp-json/');
The post list may have pagination metadata attached. You can use it to navigate through pages. Here's an example:
const result = await getPostList(...);
if (result.pagination.next) {
const nextResult = await getPostList(result.pagination.next);
}
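To walk through several pages, the same check can be repeated in a loop. The helper below is only a sketch: `collectPosts`, `Page` and `maxPages` are hypothetical names that are not part of the library, and the only things taken from the example above are the `posts` list and the `pagination.next` value. The fetching function is passed in, so in practice it would be something like `(next) => getPostList(next)`.

```typescript
// Hypothetical shape: only `posts` and `pagination.next` come from the README;
// the real types exported by post-feed-reader may differ.
interface Page<TPost, TCursor> {
  posts: TPost[];
  pagination?: { next?: TCursor };
}

// Follows `pagination.next` until there is no next page or `maxPages` is reached.
async function collectPosts<TPost, TCursor>(
  first: Page<TPost, TCursor>,
  fetchNext: (cursor: TCursor) => Promise<Page<TPost, TCursor>>,
  maxPages = 5,
): Promise<TPost[]> {
  const posts = [...first.posts];
  let current = first;
  // The first page is already loaded, so start counting at 1.
  for (let page = 1; page < maxPages; page++) {
    const next = current.pagination ? current.pagination.next : undefined;
    if (next === undefined || next === null) break;
    current = await fetchNext(next);
    posts.push(...current.posts);
  }
  return posts;
}
```

The `maxPages` cap is there so that a very long archive doesn't trigger an unbounded number of requests.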
RSS is the most widely used feed format on the web, but not only does it lack information that your application might need, its specification is also loose, leaving many properties open to interpretation, so how the information is formatted differs from feed to feed.
For instance, the description can be the full post as HTML, or just an excerpt, or in plain text, or even just an HTML link to the post page.
Atom's specification is far more rigid and robust, which makes its data more trustworthy. It's definitely the way to go when it comes to feeds. But it still lacks some properties that can only be fetched through the WordPress REST API.
Since WordPress is by far the most used CMS, supporting its API is a great alternative. The WordPress REST API offers the following on top of what RSS and Atom feeds provide:
- Filtering by category, tag and/or author
- Searching
- Pagination
- Featured media
- Author profile
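To make the list above more concrete, here is a rough sketch of how such filters look as a raw WordPress REST API request. It does not show how this library builds its requests internally, which is an assumption left open here; `buildWpPostsUrl` and `WpQuery` are hypothetical names. The query parameters themselves (`per_page`, `search`, `author`, `categories`, `tags`, `_embed`) are the ones the WordPress `/wp/v2/posts` endpoint accepts. The sketch assumes a path-style API root ending in `/wp-json/`, not the `?rest_route=/` form.

```typescript
// Hypothetical helper, not part of post-feed-reader.
interface WpQuery {
  limit?: number;
  search?: string;
  authors?: number[];
  categories?: number[];
  tags?: number[];
  includeEmbedded?: boolean;
}

function buildWpPostsUrl(apiRoot: string, query: WpQuery = {}): string {
  // apiRoot is expected to end with a slash, e.g. https://example.com/wp-json/
  const url = new URL('wp/v2/posts', apiRoot);
  if (query.limit !== undefined) url.searchParams.set('per_page', String(query.limit));
  if (query.search) url.searchParams.set('search', query.search);
  if (query.authors && query.authors.length) url.searchParams.set('author', query.authors.join(','));
  if (query.categories && query.categories.length) url.searchParams.set('categories', query.categories.join(','));
  if (query.tags && query.tags.length) url.searchParams.set('tags', query.tags.join(','));
  // `_embed` asks WordPress to inline related resources such as the author and featured media.
  if (query.includeEmbedded) url.searchParams.set('_embed', '1');
  return url.href;
}
```

None of these filters has an equivalent in a plain RSS or Atom feed URL, which is why the API is preferred when it is available.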
The JSON Feed format is just as good as the Atom format, but at the moment very few websites produce it.
How does the auto-discovery work?
- Fetches the site's main page
- Looks for WordPress API Link headers
- Looks for RSS, Atom and JSON Feed <link> metatags
- If tryToGuessPaths is set to true, it will query a few common paths to try to find a feed or the WP API.
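The two lookup steps can be sketched roughly as follows. This is an illustration and not the library's actual implementation: `findWpApiUrl` and `findFeedLinks` are hypothetical names, and the regular expressions are a simplification (a real implementation would more likely use an HTML parser). What is standard is that WordPress advertises its API with a `Link` header whose `rel` is `https://api.w.org/`, and that feeds are advertised with `<link rel="alternate">` tags carrying a feed MIME type.

```typescript
// MIME types commonly used to advertise RSS, Atom and JSON Feed.
const FEED_TYPES = ['application/rss+xml', 'application/atom+xml', 'application/feed+json'];

// Extracts the WordPress REST API root from a `Link` response header, if present.
function findWpApiUrl(linkHeader: string): string | undefined {
  const match = /<([^>]+)>\s*;\s*rel="?https:\/\/api\.w\.org\/"?/i.exec(linkHeader);
  return match ? match[1] : undefined;
}

// Collects absolute feed URLs from <link rel="alternate" type="..."> tags.
function findFeedLinks(html: string, baseUrl: string): string[] {
  const urls: string[] = [];
  const linkTag = /<link\s[^>]*>/gi;
  let found: RegExpExecArray | null;
  while ((found = linkTag.exec(html)) !== null) {
    const tag = found[0];
    // Reads a quoted attribute value from the current tag.
    const attr = (name: string): string | undefined => {
      const m = new RegExp(`\\s${name}\\s*=\\s*["']([^"']*)["']`, 'i').exec(tag);
      return m ? m[1] : undefined;
    };
    const rel = attr('rel');
    const type = attr('type');
    const href = attr('href');
    if (rel && type && href && rel.toLowerCase() === 'alternate'
        && FEED_TYPES.indexOf(type.toLowerCase()) !== -1) {
      // Resolve relative hrefs against the page URL.
      urls.push(new URL(href, baseUrl).href);
    }
  }
  return urls;
}
```

Guessing common paths, the last step, would then only be needed when both lookups come back empty.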
Most properties are optional. What am I guaranteed to have?
Nothing.
Yeah, there's no property that is required in all specs, thus we can't guarantee any of them will be present.
But! The most basic properties are very likely to be present, such as guid, title and link.
For all the other properties, it's highly recommended to implement your own fallbacks.
For instance, showing a substring of the content when the summary isn't available.
The library will try its best to fetch the most data available.
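As an illustration of the summary fallback mentioned above, here is a minimal sketch. The `PostLike` shape and the `summaryOf` helper are hypothetical: the property names `summary` and `content` are assumed for the example and may not match the library's actual post type. Tag stripping is done with a crude regular expression, which is good enough for a preview but not for untrusted HTML.

```typescript
// Hypothetical post shape; every property is optional, as in the parsed posts.
interface PostLike {
  title?: string;
  summary?: string;
  content?: string;
}

// Returns the summary when present, otherwise a plain-text excerpt of the content.
function summaryOf(post: PostLike, maxLength = 200): string {
  const summary = (post.summary || '').trim();
  if (summary) return summary;
  // Replace tags with spaces, then collapse runs of whitespace.
  const text = (post.content || '').replace(/<[^>]*>/g, ' ').replace(/\s+/g, ' ').trim();
  if (text.length <= maxLength) return text;
  return text.slice(0, maxLength) + '…';
}
```

A post with neither property yields an empty string, so the UI can decide whether to hide the excerpt entirely.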