pip3 install simple-crawler
Set the environment variable AUTO_CHARSET=1 to pass bytes to beautifulsoup4 and let it detect the charset.
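The variable can also be set from Python rather than from the shell. The snippet below is a minimal sketch: it assumes the library reads AUTO_CHARSET from the process environment, and since the README does not say when it is read, the flag is set before the import to be safe.

```python
import os

# Set the flag before importing the crawler. When exactly the library
# reads it is an assumption, so setting it first covers both import-time
# and request-time lookups.
os.environ["AUTO_CHARSET"] = "1"

# from simple_crawler import *  # import only after the flag is set
```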
- URL: define a URL
- URLExt: class to handle URL
- Page: define a request result of a URL
  - url: type URL
  - content, text, json: response content properties from the requests library
  - type: the response body type, an enum which allows BYTES, TEXT, HTML, JSON
  - is_html: check whether the response is HTML according to the Content-Type response header
  - soup: BeautifulSoup, contains the HTML if is_html
- Crawler: schedule the crawler by calling handler_page() recursively

from simple_crawler import *
class MyCrawler(Crawler):
    name = 'output.txt'

    async def custom_handle_page(self, page):
        print(page.url)
        tags = page.soup.select("#container")
        # Skip pages that have no #container element
        if tags:
            with open(self.name, 'a') as f:
                f.write(tags[0].text)
        # do some async call

    def filter_url(self, url: URL) -> bool:
        # Only follow URLs under this prefix
        return url.url.startswith("https://xxx.com/xxx")


loop = get_event_loop(True)
c = MyCrawler("https://xxx.com/xxx", loop, concurrency=10)
schedule_future_in_loop(c.start(), loop=loop)
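To illustrate the idea of a crawler that handles pages recursively, filters URLs, and caps concurrency, here is a standalone sketch using only asyncio. It is not the library's implementation: FAKE_SITE, fetch_links, and crawl are hypothetical names invented for the example, and the "site" is an in-memory dict instead of real HTTP requests.

```python
import asyncio

# Hypothetical in-memory "site": URL -> list of linked URLs.
FAKE_SITE = {
    "https://xxx.com/xxx": ["https://xxx.com/xxx/a", "https://other.com/"],
    "https://xxx.com/xxx/a": ["https://xxx.com/xxx", "https://xxx.com/xxx/b"],
    "https://xxx.com/xxx/b": [],
}


async def fetch_links(url):
    """Stand-in for an HTTP request; returns the links found on a page."""
    await asyncio.sleep(0)  # yield control as real I/O would
    return FAKE_SITE.get(url, [])


async def crawl(start_url, accept, concurrency=10):
    """Visit start_url and, recursively, every accepted link, once each."""
    limit = asyncio.Semaphore(concurrency)  # caps in-flight fetches
    seen = {start_url}
    visited = []

    async def visit(url):
        async with limit:
            links = await fetch_links(url)
        visited.append(url)
        # Mark new URLs as seen before scheduling them, so no URL is
        # visited twice even when several pages link to it.
        new = []
        for link in links:
            if link not in seen and accept(link):
                seen.add(link)
                new.append(link)
        await asyncio.gather(*(visit(link) for link in new))

    await visit(start_url)
    return visited


def filter_url(url):
    # Same prefix rule as the filter_url method in the example above
    return url.startswith("https://xxx.com/xxx")


result = asyncio.run(crawl("https://xxx.com/xxx", filter_url, concurrency=2))
print(result)
```

The semaphore is held only while fetching, so a page waiting on its children does not block other fetches from starting.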
FAQs
my simple crawler
We found that simple-crawler demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.