
A Python package that makes parsing robots.txt files faster and simpler.
Basic usage, such as getting robots contents:
import robotsparse
# NOTE: With `find_url=True`, the URL is redirected to the site's default robots location.
robots = robotsparse.getRobots("https://github.com/", find_url=True)
print(list(robots)) # output: ['user-agents']
The user-agents key contains each user-agent found in the robots file, along with the information associated with it.
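As a minimal sketch of working with that result, the snippet below lists the user-agent names. It assumes the value under "user-agents" is a mapping keyed by user-agent name; that shape, the `sample_robots` data, and the `list_user_agents` helper are illustrative assumptions, not part of robotsparse.

```python
# Hypothetical sample standing in for the getRobots() result.
# The nested layout is an assumption made for illustration only.
sample_robots = {
    "user-agents": {
        "*": {"disallow": ["/private/"]},
        "examplebot": {"disallow": ["/"]},
    }
}


def list_user_agents(robots):
    """Return the names found under the 'user-agents' key, sorted."""
    # Fall back to an empty mapping if the key is missing.
    return sorted(robots.get("user-agents", {}))


print(list_user_agents(sample_robots))
```

Running it against the real return value first with `print(robots["user-agents"])` is a quick way to confirm the actual layout before relying on this helper.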
Alternatively, we can load the robots contents as an object, which gives quicker access to its fields:
import robotsparse
# This function returns a class.
robots = robotsparse.getRobotsObject("https://duckduckgo.com/", find_url=True)
assert isinstance(robots, object)
print(robots.allow) # Prints allowed locations
print(robots.disallow) # Prints disallowed locations
print(robots.crawl_delay) # Prints found crawl-delays
print(robots.robots) # This output is equivalent to the above example
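The disallowed locations can then be used to decide whether a path should be skipped. The sketch below assumes `robots.disallow` can be treated as a list of path prefixes; that assumption, the `is_disallowed` helper, and the sample data are hypothetical. It uses plain prefix matching only and does not handle wildcards or per-user-agent rules.

```python
def is_disallowed(path, disallow_rules):
    """Return True if `path` starts with any non-empty rule in `disallow_rules`."""
    # Empty rules are skipped so they never match every path.
    return any(bool(rule) and path.startswith(rule) for rule in disallow_rules)


# Hypothetical stand-in for robots.disallow (assumed to be a list of prefixes).
sample_disallow = ["/private/", "/tmp/"]

print(is_disallowed("/private/data.html", sample_disallow))
print(is_disallowed("/public/index.html", sample_disallow))
```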
When parsing robots files, it can also be useful to parse sitemap files:
import robotsparse
sitemap = robotsparse.getSitemap("https://pypi.org/", find_url=True)
The sitemap variable in the code above holds a list of entries that look like this:
[{"url": "", "lastModified": ""}]
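Given that shape, a short sketch for pulling out the URLs might look like the following. The `sitemap_urls` helper and the sample entries are hypothetical; only the "url" and "lastModified" keys come from the structure shown above.

```python
def sitemap_urls(entries):
    """Return the non-empty 'url' values from a list of sitemap entries."""
    return [entry["url"] for entry in entries if entry.get("url")]


# Hypothetical entries in the same shape as the getSitemap() result.
sample_sitemap = [
    {"url": "https://example.com/a", "lastModified": "2024-01-01"},
    {"url": "", "lastModified": ""},
]

print(sitemap_urls(sample_sitemap))
```

Entries with an empty or missing "url" are skipped rather than raising an error.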