Readme
rssarchive is a library for fetching multiple RSS sources into an SQLite database. It can also scrape the full text of each article via the newspaper3k library.
To install rssarchive, use pip:
pip install rssarchive
You can use rssarchive from the console or call it as a library.
From the console, simply run:
rssarchive
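To use it as a library:
#!/usr/bin/env python
import rssarchive as ra

# Test mode on, full-text scraping off: a quick trial run.
newra = ra.RssArchive(CONFIG_TEST_MODE=True, CONFIG_FULL_TEXT_MODE=False)
# Fetch the RSS sources and save the entries into the SQLite database.
newra.batch_save_rss()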
When you run the batch_save_rss() command, the library creates two files in the current directory: the SQLite database and the RSS source list.
After the code finishes its task, you can view or edit the SQLite file with the SQLiteBrowser app.
You can modify the rsslist.csv file to add your own sources and then re-run the code.
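If you prefer to inspect the database from Python rather than a GUI, the standard sqlite3 module is enough. The following is a minimal sketch, not part of rssarchive: the helper name summarize_sqlite is hypothetical, and it makes no assumption about the table or column layout, it only lists whatever tables exist and counts their rows.

```python
import sqlite3
from contextlib import closing


def summarize_sqlite(db_path):
    """Return a dict mapping each table name in the SQLite file to its row count.

    Hypothetical helper for illustration; it is not part of rssarchive.
    Note that sqlite3.connect() creates an empty file if db_path does not exist.
    """
    with closing(sqlite3.connect(db_path)) as conn:
        # sqlite_master lists every table stored in the database file.
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # Table names come from sqlite_master itself, so quoting them is enough.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in tables
        }
```

For example, calling summarize_sqlite("rssarchive.sqlite") after a run should report the tables the library created and how many rows each holds, assuming the database was written under that file name.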
In the code above, you may notice the constructor call:
newra = ra.RssArchive(CONFIG_TEST_MODE=True, CONFIG_FULL_TEXT_MODE=False)
All available parameters are defined here:
CONFIG_DEFAULT_TABLE_NAME = "tab_headline",
CONFIG_SQLITEDB_URL = "rssarchive.sqlite",
CONFIG_RSS_LIST = "rss_list.csv",
CONFIG_SINGLE_RSS_SOURCE_URL = "https://www.sabah.com.tr/rss/anasayfa.xml",
CONFIG_EASY_DEBUG = True,
CONFIG_TEST_VAR = "suatatan",
CONFIG_TEST_MODE = False,
CONFIG_FULL_TEXT_MODE = True,
Among these parameters, three are critical:
CONFIG_EASY_DEBUG: If True, the code shows all of its messages; if False, it shows none.
CONFIG_FULL_TEXT_MODE: If True, the library fetches the full text of each URL (this takes time); if False, the library fetches the RSS entries only.
CONFIG_TEST_MODE: If True, the library fetches just two sample sources; if False, the code processes all RSS sources in the list (keep it False for your real projects, since True limits the run to the two samples).
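The three flags can be collected in one place and passed to the constructor as keyword arguments, mirroring the call shown earlier. This is a minimal sketch under stated assumptions: build_config and run_archive are hypothetical helper names, not part of rssarchive, and only the RssArchive constructor, the three CONFIG_ parameters, and batch_save_rss() come from this README.

```python
def build_config(full_text=False, test_run=True, verbose=True):
    """Return keyword arguments for ra.RssArchive (hypothetical helper).

    full_text -> CONFIG_FULL_TEXT_MODE: also scrape each article's full text (slow).
    test_run  -> CONFIG_TEST_MODE: fetch only two sample sources.
    verbose   -> CONFIG_EASY_DEBUG: show the library's messages.
    """
    return {
        "CONFIG_FULL_TEXT_MODE": full_text,
        "CONFIG_TEST_MODE": test_run,
        "CONFIG_EASY_DEBUG": verbose,
    }


def run_archive(**config):
    """Create an archive with the given settings and save all feeds."""
    # Imported lazily so build_config can be used without rssarchive installed.
    import rssarchive as ra

    archive = ra.RssArchive(**config)
    archive.batch_save_rss()
    return archive
```

A quick trial would then be run_archive(**build_config()), while run_archive(**build_config(full_text=True, test_run=False)) would request every source with full-text scraping.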
This library is an open-source library developed within the turnusol.org project, a social entrepreneurship initiative for detecting hate speech and fake news in Turkish. If you want to contribute to this library or to our project, please contact us via turnusol.org.
For maintainers, the package is built and uploaded to TestPyPI with:
python setup.py sdist bdist_wheel
python -m twine upload --skip-existing --repository testpypi dist/* -u suatatan -p password