
Python library to work with ARC and WARC files, with fixes for ClueWeb09
Note: This is a fork of the original (now dead) warc repository, updated to handle problems with the ClueWeb09_ files.
.. _ClueWeb09: https://lemurproject.org/clueweb09/
Changes are based on this repository_, which supports only Python 2.
.. _repository: https://github.com/cdegroc/warc-clueweb/blob/clueweb09/warc/warc.py
WARC (Web ARChive) is a file format for storing web crawls.
This ``warc`` library makes it very easy to work with WARC files::

    import warc
    with warc.open("test.warc") as f:
        for record in f:
            print(record['WARC-Target-URI'], record['Content-Length'])

And WET files::

    import warc
    with warc.open("test.warc.wet") as f:
        for record in f:
            print(record['WARC-Target-URI'], record['Content-Length'])

The documentation of the warc library is available at http://warc.readthedocs.org/.
Apart from the pip installation instructions, which do not work for this warc3 version, the interface described there is unchanged.
This software is licensed under GPL v2. See the LICENSE_ file for details.

.. _LICENSE: http://github.com/internetarchive/warc/blob/master/LICENSE
Original Python2 Versions:
Python3 Port:
Modification
0.2.5 replace utf8 errors in headers
0.2.4 support ClueWeb09
0.2.3 Support seeking in WARC/WET
0.2.2 Allow WET parse
older... see https://github.com/internetarchive/warc
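
The 0.2.5 entry above concerns invalid UTF-8 bytes in record headers. The sketch below illustrates the general idea using only the standard library. It is not the code used in this package: ``parse_header_lines`` is a hypothetical helper, and the sample bytes are made up for the illustration.

```python
def parse_header_lines(raw: bytes) -> dict:
    """Parse 'Name: value' header lines, tolerating invalid UTF-8.

    Hypothetical helper for illustration; not part of the warc API.
    """
    headers = {}
    # errors="replace" substitutes U+FFFD for undecodable bytes
    # instead of raising UnicodeDecodeError.
    text = raw.decode("utf-8", errors="replace")
    for line in text.splitlines():
        if ":" not in line:
            continue  # skips the version line ("WARC/1.0") and blank lines
        name, value = line.split(":", 1)
        headers[name.strip()] = value.strip()
    return headers


# Made-up header block; \xe9 is a Latin-1 byte that is not valid UTF-8 here.
raw = (
    b"WARC/1.0\r\n"
    b"WARC-Type: response\r\n"
    b"WARC-Target-URI: http://example.com/caf\xe9\r\n"
    b"Content-Length: 42\r\n"
    b"\r\n"
)
headers = parse_header_lines(raw)
print(headers["WARC-Target-URI"])
```

With strict decoding the same input would raise ``UnicodeDecodeError``. Replacing the bad byte lets the remaining headers, such as ``Content-Length``, still be read.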