HTTP request module customized for crawlers.
A module for interacting with the crawler tables stored in HBase.
A simple command-line app for controlling a GitHub Crawler
Extracts the required data from HTML documents.
WordPress data import plugin.
Lightweight crawler written in TypeScript using ES6 generators.
Extracts the required data from HTML documents.
A web spider for Hangzhou.
Crawler scheduler.
Lightweight crawler written in TypeScript using ES6 generators.
A web crawler capable of traversing any site with custom environment variables.
Device detection cloud services for the 51Degrees Pipeline API
[![Main](https://github.com/ahm-monash/crawler/actions/workflows/main.yml/badge.svg)](https://github.com/ahm-monash/crawler/actions/workflows/main.yml)
Crawl orderbook and trade messages from crypto exchanges.
A simple email extractor for obfuscated emails.
Pure JavaScript cross-platform module to extract text from PDFs.
[![NPM](https://nodei.co/npm/botium-crawler.png?downloads=true&downloadRank=true&stars=true)](https://nodei.co/npm/botium-crawler/)
A simplified directed web crawler that is easy to use for scraping pages and downloading page resources.
Yet another Node torrent scraper based on x-ray. (Supports iptorrents, torrentleech, torrent9, Yyggtorrent, ThePiratebay, torrentz2, 1337x, KickassTorrent, Rarbg, TorrentProject, Yts, Limetorrents, Eztv.)
Crawl 100% JS single-page apps with PhantomJS and Node.
For the PTT Beauty Board only.
Headless Chrome abstraction to simplify interaction with the browser. It may be used for crawling sites, test automation, etc.