Search for anything on the web.
A library for efficiently walking a directory recursively
This repository contains a list of HTTP user-agents used by robots, crawlers, and spiders, as a single JSON file.
Utilities to build Storybook crawling tools with Puppeteer
HTTP request module customized for crawlers.
Very straightforward, event-driven web crawler. Features a flexible queue interface and a basic cache mechanism with an extensible backend.
Pure JavaScript cross-platform module to extract text from PDFs.
A library to recursively retrieve and serialize Notion pages with customization for machine learning applications.
pure nodejs OPCUA SDK - module client-crawler (deprecated - use @sterfive/crawler module instead)
Stealth mode: Applies various techniques to make detection of headless Puppeteer harder.
[![npm](https://img.shields.io/npm/v/recrawl-sync.svg)](https://www.npmjs.com/package/recrawl-sync) [![ci](https://github.com/aleclarson/recrawl/actions/workflows/release.yml/badge.svg)](https://github.com/aleclarson/recrawl/actions/workflows/release.yml)
A `URL` parser for crawling purposes.
A web crawler that works with prember to discover URLs in your app
This is an ES6 adaptation of the original PHP library CrawlerDetect. It helps you detect bots/crawlers/spiders via the user agent.
Apify API client for JavaScript
Parser for XML sitemaps, to be used with robots.txt and web crawlers
Used to run a web crawler that checks for errors on specified pages.
TypeScript definitions for npm-license-crawler
JavaScript SDK for Firecrawl API
TypeScript definitions for crawler
TypeScript definitions for simplecrawler
TypeScript definitions for x-ray-crawler
Parse robot directives within HTML meta and/or HTTP headers.
Supercharge your use of large language models
JavaScript module detecting bots/crawlers/spiders via user-agent
SDK to interact with the web-crawler service
Automatically extracts structured information from webpages
This plugin links your Netlify site with Algolia's Crawler. It will trigger a crawl on each successful build.