reddit-rss-reader

A Wrapper around Reddit RSS feed

1.3.2
PyPI

Maintainers: 1

Reddit RSS Reader

This is a wrapper around publicly and privately available Reddit RSS feeds. It can fetch content from the front page, a subreddit, all comments of a subreddit, all comments of a certain post, comments of a certain Reddit user, search pages, and more. For details about the types of RSS feeds Reddit provides, refer to these links: link1 and link2.

*Note: These feeds are rate limited, so they are suitable only for testing purposes. For serious scraping, register your bot at apps to get client details and use them with PRAW.

Installation

Install via PyPI:

pip install reddit-rss-reader

Install from master branch (if you want to try the latest features):

git clone https://github.com/lalitpagaria/reddit-rss-reader
cd reddit-rss-reader
pip install --editable .

How to use

RedditRSSReader requires a feed URL; refer to link to generate one. For example, to fetch all comments on the subreddit r/wallstreetbets -

https://www.reddit.com/r/wallstreetbets/comments/.rss?sort=new
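If you need the same kind of feed for several subreddits, the URL above can be assembled programmatically. The following is a minimal sketch that only generalizes the pattern of that one example URL; the helper name `subreddit_comments_feed_url` and the `sort` parameter default are assumptions made for illustration and are not part of reddit-rss-reader.

```python
from urllib.parse import quote, urlencode

REDDIT_BASE = "https://www.reddit.com"


def subreddit_comments_feed_url(subreddit: str, sort: str = "new") -> str:
    """Build a subreddit comments feed URL (hypothetical helper, not library API)."""
    name = subreddit.strip()
    # Accept both "wallstreetbets" and "r/wallstreetbets".
    if name.lower().startswith("r/"):
        name = name[2:]
    if not name:
        raise ValueError("subreddit name must not be empty")
    query = urlencode({"sort": sort})
    return f"{REDDIT_BASE}/r/{quote(name, safe='')}/comments/.rss?{query}"


print(subreddit_comments_feed_url("r/wallstreetbets"))
```

The resulting string can be passed as the `url` argument of `RedditRSSReader`.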

Now you can run the following example -

import pprint
from datetime import datetime, timedelta

import pytz

from reddit_rss_reader.reader import RedditRSSReader


reader = RedditRSSReader(
    url="https://www.reddit.com/r/wallstreetbets/comments/.rss?sort=new"
)

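# Consider only comments posted in the past 5 days.
# datetime.now(pytz.utc) returns a timezone-aware UTC timestamp; calling
# astimezone() on the naive utcnow() result would treat it as local time.
since_time = datetime.now(pytz.utc) - timedelta(days=5)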

# fetch_content fetches all content if no parameters are passed.
# If `after` is passed, it fetches content after that date.
# If `since_id` is passed, it fetches content after that id.
reviews = reader.fetch_content(
    after=since_time
)

pp = pprint.PrettyPrinter(indent=4)
for review in reviews:
    pp.pprint(review.__dict__)

The reader returns RedditContent objects, which have the following information (extracted_text and image_alt_text are extracted from the Reddit content via BeautifulSoup) -

@dataclass
class RedditContent:
    title: str
    link: str
    id: str
    content: str
    extracted_text: Optional[str]
    image_alt_text: Optional[str]
    updated: datetime
    author_uri: str
    author_name: str
    category: str
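To store or serialize fetched items, a dataclass like this can be converted to a plain dictionary. The sketch below uses a stand-in class, `RedditContentExample`, that mirrors the fields listed above, with `link` typed as a string on the assumption that it holds a URL. The stand-in class, the `to_record` helper, and all sample values are placeholders for illustration, not library code or real Reddit data.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class RedditContentExample:
    # Stand-in mirroring the documented fields; not the library's own class.
    title: str
    link: str
    id: str
    content: str
    extracted_text: Optional[str]
    image_alt_text: Optional[str]
    updated: datetime
    author_uri: str
    author_name: str
    category: str


def to_record(item) -> dict:
    """Convert a dataclass item into a JSON-friendly dict (hypothetical helper)."""
    record = asdict(item)
    # datetime is not JSON serializable, so store it as an ISO 8601 string.
    record["updated"] = item.updated.isoformat()
    return record


# Placeholder values only.
sample = RedditContentExample(
    title="Example title",
    link="https://www.reddit.com/r/example/comments/abc/",
    id="t1_example",
    content="<p>Example</p>",
    extracted_text="Example",
    image_alt_text=None,
    updated=datetime(2021, 1, 1, 12, 0, tzinfo=timezone.utc),
    author_uri="https://www.reddit.com/user/example",
    author_name="/u/example",
    category="example",
)
print(json.dumps(to_record(sample), indent=4))
```

Using `dataclasses.asdict` rather than `__dict__` also converts nested dataclasses, which is why it is chosen here.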

The output uses the UTF-8 charset. If you are scraping non-English subreddits, set the environment to use UTF-8 -

export LANG=en_US.UTF-8
export PYTHONIOENCODING=utf-8
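If changing the shell environment is not convenient, a similar effect can usually be had from inside the script. This is a sketch assuming Python 3.7 or later, where text streams expose `reconfigure`; the helper name `ensure_utf8_stdout` is hypothetical and not part of reddit-rss-reader.

```python
import sys


def ensure_utf8_stdout() -> str:
    """Switch stdout to UTF-8 where supported and return the encoding in use."""
    reconfigure = getattr(sys.stdout, "reconfigure", None)
    if callable(reconfigure):
        # Available on the standard text stream wrapper since Python 3.7.
        reconfigure(encoding="utf-8")
    # Some replacement streams report no encoding at all.
    return (getattr(sys.stdout, "encoding", None) or "unknown").lower()


print(ensure_utf8_stdout())
```

Call it once at startup, before printing any fetched content.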
