Reddit2Text
reddit2text is a Python library designed to effortlessly transform any Reddit thread into clean, readable text data.
Perfect for feeding to an LLM, performing textual/data analysis, or simply archiving for offline use, reddit2text offers a straightforward interface to access and convert content from Reddit.
Features
- Convert any Reddit thread (the post + all its comments) into structured text.
- Include all comments, with the ability to specify the maximum comment depth.
- Configure a custom comment delimiter, for visual separation of nested comments.
Have a Feature Idea?
Simply open an issue on GitHub and tell us what should be added to the next release!
Installation
Install with pip:
pip3 install reddit2text
Quickstart
First, you need to create a Reddit app to get your client_id and client_secret. Follow the instructions in Reddit's API documentation to set up your application.
Then, replace the client_id, client_secret, and user_agent with your credentials.
The user agent can be anything you like, but we recommend following this convention according to Reddit's guidelines: '<app type>:<app name>:<version> (by <your username>)'
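As a minimal sketch of the convention above, the user agent string can be assembled from its parts. The helper name `build_user_agent` is hypothetical and is not part of reddit2text; it only formats a string.

```python
def build_user_agent(app_type: str, app_name: str, version: str, username: str) -> str:
    """Format '<app type>:<app name>:<version> (by <your username>)'.

    Hypothetical helper for illustration; not provided by reddit2text.
    """
    return f"{app_type}:{app_name}:{version} (by {username})"


user_agent = build_user_agent("script", "my_app", "v1.0", "u/reddit2text")
print(user_agent)  # script:my_app:v1.0 (by u/reddit2text)
```

The resulting string can be passed as the user_agent argument when constructing Reddit2Text.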
Here's an example:
from reddit2text import Reddit2Text

r2t = Reddit2Text(
    client_id='123abc',
    client_secret='123abc',
    user_agent='script:my_app:v1.0 (by u/reddit2text)'
)

URL = 'https://www.reddit.com/r/MadeMeSmile/comments/1buyr0g/ryan_reynolds_being_wholesome/'

output = r2t.textualize_post(URL)
print(output)
Here is a truncated example of the output from the above code:
https://pastebin.com/mmHFJtcc
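The Quickstart example hardcodes its credentials for brevity. One possible alternative, sketched here under assumptions, is to read them from environment variables. The variable names (`REDDIT_CLIENT_ID`, `REDDIT_CLIENT_SECRET`, `REDDIT_USER_AGENT`) and the helper `load_credentials` are hypothetical choices, not something reddit2text defines.

```python
import os


def load_credentials(env=None):
    """Collect Reddit2Text keyword arguments from environment variables.

    The environment variable names are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    variables = {
        "client_id": "REDDIT_CLIENT_ID",
        "client_secret": "REDDIT_CLIENT_SECRET",
        "user_agent": "REDDIT_USER_AGENT",
    }
    missing = [var for var in variables.values() if var not in env]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[var] for name, var in variables.items()}


# Usage, once the variables are set in your shell:
# from reddit2text import Reddit2Text
# r2t = Reddit2Text(**load_credentials())
```

This keeps secrets out of source control; the keyword names match the constructor arguments used in the Quickstart.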
- max_comment_depth: Maximum depth of comments to output. Includes the top-most comment. Defaults to None; either None or -1 includes all comments.
- comment_delim: String/character used to indent comments according to their nesting level. Defaults to | to mimic Reddit.
r2t = Reddit2Text(
    # client_id, client_secret, and user_agent go here as in the Quickstart
    max_comment_depth=3,
    comment_delim='#'
)
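To make the effect of these two options concrete, here is a rough, standalone sketch of how a depth limit and a delimiter could shape nested comments. It does not use reddit2text and is not its implementation; the function `render_comments`, the dictionary layout, and the exact output format are all assumptions for illustration.

```python
from typing import Optional


def render_comments(comments, delim: str = "|", max_depth: Optional[int] = None, depth: int = 1):
    """Return one line per comment, prefixed by the delimiter repeated per nesting level.

    Illustrative only: top-level comments count as depth 1, and a max_depth
    of None or -1 means no limit.
    """
    unlimited = max_depth is None or max_depth == -1
    if not unlimited and depth > max_depth:
        return []
    lines = []
    for comment in comments:
        lines.append(f"{delim * depth} {comment['body']}")
        # Recurse into replies one level deeper.
        lines.extend(render_comments(comment.get("replies", []), delim, max_depth, depth + 1))
    return lines


thread = [
    {"body": "top", "replies": [
        {"body": "reply", "replies": [
            {"body": "deep", "replies": []},
        ]},
    ]},
]

print("\n".join(render_comments(thread, delim="#", max_depth=2)))
# # top
# ## reply
```

With max_depth=2 the third-level comment is dropped, while None or -1 would keep the whole tree.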
Contributions
Contributions to reddit2text are welcome. Please submit pull requests or issues to our GitHub repository.
License
reddit2text is released under the MIT License. See the LICENSE file for more details.