Crawler is a ready-to-use web spider with support for proxies, asynchronous requests, rate limiting, configurable request pools, jQuery, and HTTP/2.
Search for anything on the web.
A library for efficiently walking a directory recursively
This repository contains a list of HTTP user-agents used by robots, crawlers, and spiders, as a single JSON file.
🤖/👨‍🦰 Recognise bots/crawlers/spiders using the user agent string.
Utilities to build Storybook crawling tools with Puppeteer
Pure JavaScript cross-platform module to extract text from PDFs.
HTTP request module customized for crawlers.
A very straightforward, event-driven web crawler. Features a flexible queue interface and a basic cache mechanism with an extensible backend.
A library to recursively retrieve and serialize Notion pages with customization for machine learning applications.
Stealth mode: Applies various techniques to make detection of headless puppeteer harder.
Pure Node.js OPC UA SDK - module client-crawler (deprecated - use @sterfive/crawler module instead)
suplaser.cn download tool
This is an ES6 adaptation of the original PHP library CrawlerDetect. It helps you detect bots/crawlers/spiders via the user agent.
JavaScript SDK for Firecrawl API
Apify API client for JavaScript
A web crawler that works with prember to discover URLs in your app
A set of shared utilities that can be used by crawlers
The scalable web crawling and scraping library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.
Templates for Crawlee projects
Used to run a web crawler that checks for errors on specified pages.
TypeScript definitions for crawler
A plugin-based web crawler made for SEO. Please wait or contribute ... still in beta
TypeScript definitions for simplecrawler
Promptbook: Run AI apps in plain human language across multiple models and platforms
Parser for XML Sitemaps to be used with Robots.txt and web crawlers
JavaScript module detecting bots/crawlers/spiders via user-agent
MCP server for Firecrawl web scraping integration. Supports both cloud and self-hosted instances. Features include web scraping, search, batch processing, structured data extraction, and LLM-powered content analysis.
A `URL` parser for crawling purposes.
Parse robot directives within HTML meta and/or HTTP headers.
JS client for WebcrawlerAPI
TypeScript definitions for x-ray-crawler
This plugin links your Netlify site with Algolia's Crawler. It will trigger a crawl on each successful build.