
redis-web-crawler
Web Crawler to create directed graph of links among connected sites. Runs with Node.js and stores data with Redis
Read the blog post.
Web crawler built with Node.js. It fetches site data from a given URL and recursively follows links across the web.
Search the sites with either breadth-first search or depth-first search.
Every URL is saved to a graph (using an adjacency list). The graph is stored in Redis.
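To make the data structure and the two search orders concrete, here is a minimal in-memory sketch. It is not the package's implementation: the `LinkGraph` class and the `traverse` function are hypothetical names, a `Map` stands in for Redis, and the `depthLimit` argument only mirrors the idea behind `searchDepthLimit`.

```javascript
// Illustrative sketch only; not the package's actual implementation.
// A directed graph of links kept as an adjacency list: URL -> Set of linked URLs.
class LinkGraph {
  constructor() {
    this.adjacency = new Map();
  }

  // Register a page, even if it has no outgoing links yet.
  addNode(url) {
    if (!this.adjacency.has(url)) {
      this.adjacency.set(url, new Set());
    }
  }

  // Record a directed edge: the page at `from` links to `to`.
  addEdge(from, to) {
    this.addNode(from);
    this.addNode(to);
    this.adjacency.get(from).add(to);
  }

  neighbors(url) {
    return [...(this.adjacency.get(url) ?? [])];
  }
}

// Visit pages reachable from `start`. A `depthLimit` of null means no limit.
function traverse(graph, start, algorithm = 'breadthFirstSearch', depthLimit = null) {
  const visited = new Set();
  const order = [];
  const frontier = [{ url: start, depth: 0 }];

  while (frontier.length > 0) {
    // BFS takes the oldest entry (queue); DFS takes the newest (stack).
    const { url, depth } =
      algorithm === 'breadthFirstSearch' ? frontier.shift() : frontier.pop();
    if (visited.has(url)) continue;
    visited.add(url);
    order.push(url);

    // Stop expanding once the depth limit is reached.
    if (depthLimit !== null && depth >= depthLimit) continue;
    for (const next of graph.neighbors(url)) {
      if (!visited.has(next)) frontier.push({ url: next, depth: depth + 1 });
    }
  }
  return order;
}

const graph = new LinkGraph();
graph.addEdge('a.com', 'b.com');
graph.addEdge('a.com', 'c.com');
graph.addEdge('b.com', 'd.com');
graph.addEdge('c.com', 'd.com');

const bfsOrder = traverse(graph, 'a.com', 'breadthFirstSearch');
const dfsOrder = traverse(graph, 'a.com', 'depthFirstSearch');
const shallow = traverse(graph, 'a.com', 'breadthFirstSearch', 1);
console.log(bfsOrder, dfsOrder, shallow);
```

The only difference between the two algorithms in this sketch is which end of the frontier is consumed, which is why a single string option can switch between them.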
npm install --save redis-web-crawler
Run a local Redis server to store the output:
$ redis-server
Create a new crawler instance and pass in a configuration object. Call the run
method to begin crawling.
import WebCrawler from 'redis-web-crawler';

const crawlerSettings = {
  startUrl: 'https://en.wikipedia.org/wiki/Main_Page',
  followInternalLinks: false,
  searchDepthLimit: null,
  searchAlgorithm: 'breadthFirstSearch',
};

const crawler = new WebCrawler(crawlerSettings);
crawler.run();
Name | Type | Description |
---|---|---|
startUrl | string | A valid URL of a page with links. |
followInternalLinks | boolean | Toggle searching through internal site links |
searchDepthLimit | integer | Set a limit on the recursive URL requests |
searchAlgorithm | string | "breadthFirstSearch" or "depthFirstSearch" |
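As a rough illustration of what the `followInternalLinks` toggle might mean in practice, the sketch below treats a link as internal when it resolves to the same hostname as the page it was found on. That definition is an assumption, and `isInternalLink` and `filterLinks` are hypothetical helpers, not part of the redis-web-crawler API. Only the standard `URL` class is used.

```javascript
// Hypothetical helpers; the package may define "internal" differently.
function isInternalLink(pageUrl, href) {
  try {
    const page = new URL(pageUrl);
    // Relative hrefs such as "/wiki/Redis" are resolved against the page URL.
    const link = new URL(href, page);
    return link.hostname === page.hostname;
  } catch {
    // Unparseable URLs are treated as not internal.
    return false;
  }
}

// Keep every link when followInternalLinks is true; otherwise drop internal ones.
function filterLinks(pageUrl, hrefs, followInternalLinks) {
  return hrefs.filter((href) => followInternalLinks || !isInternalLink(pageUrl, href));
}

const page = 'https://en.wikipedia.org/wiki/Main_Page';
const hrefs = [
  '/wiki/Node.js',
  'https://en.wikipedia.org/wiki/Redis',
  'https://redis.io/docs',
];

const externalOnly = filterLinks(page, hrefs, false);
const everything = filterLinks(page, hrefs, true);
console.log(externalOnly, everything);
```

With the toggle off, a crawl in this sketch would leave the starting site quickly instead of exhausting its own pages first.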
To export the stored data with redis-dump, specify the slave and port of the redis-server (e.g. 6371):
./bin/redis-dump -u 127.0.0.1:6371 > db_full.json
The output is written to db_full.json.
spencerlepine.com · GitHub @spencerlepine · Twitter @spencerlepine