Search for anything on the web.
A library for efficiently walking a directory recursively
Stealth mode: Applies various techniques to make detection of headless puppeteer harder.
Pure JavaScript cross-platform module to extract text from PDFs.
Yet another Node torrent scraper based on x-ray. (Supports iptorrents, torrentleech, torrent9, Yyggtorrent, ThePiratebay, torrentz2, 1337x, KickassTorrent, Rarbg, TorrentProject, Yts, Limetorrents, Eztv)
A library to test whether a URL (request) has already been crawled, typically used in a web crawler. Compatible with `request` and `node-crawler`.
Crawler is a web spider written with Node.js. It gives you the full power of jQuery on the server to parse a large number of pages as they are downloaded, asynchronously.
🤖 Fake fingerprints to bypass anti-bot systems. Simulate mouse and keyboard operations to behave like a real person.
Very straightforward, event driven web crawler. Features a flexible queue interface and a basic cache mechanism with extensible backend.
TypeScript definitions for crawler
Parse robot directives within HTML meta and/or HTTP headers.
Pure Node.js OPCUA SDK - module client-crawler
Automatically extracts structured information from webpages
This repository contains a list of HTTP user-agents used by robots, crawlers, and spiders, in a single JSON file.
HTTP request module customized for crawlers.
Parser for XML Sitemaps to be used with Robots.txt and web crawlers
Easily create XML sitemaps for your website.
This is an ES6 adaptation of the original PHP library CrawlerDetect. It helps you detect bots/crawlers/spiders via the user agent.
A tiny Node module that detects spiders/crawlers quickly and comes with optional middleware for ExpressJS
A web crawler that works with prember to discover URLs in your app
JavaScript module detecting bots/crawlers/spiders via user-agent
Pure JavaScript cross-platform module to extract page count from PDFs, based on pdf-parser.
HTML metadata scraper and parser for Node.js
A blazing fast recursive directory crawler with lazy sync and async iterator support.
Pure JavaScript cross-platform module to extract text from PDFs.
Crawler (spider) for a website's pages, by domain name
Headless Chrome abstraction to simplify interaction with the browser. It may be used for crawling sites, test automation, etc.
gRPC, Tokio-based web crawler