@axetroy/crawler

crawler framework for nodejs

Version: 0.1.0

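A crawler framework for Node.js

A fast, simple, and easy-to-use crawler framework for Node.js.

WARNING: This project is still under development. Any existing feature may be changed or removed.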

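Features

  • Request rate limiting
  • Pause and resume crawling
  • Resume crawling from the previous run's record
  • Built-in jQuery selectors
  • Built-in resource download tool
  • IP proxy configuration
  • Statistics
  • Custom HTTP methods and request bodies
  • Retry on error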
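
This README does not document how request rate limiting is configured, so the following is only a minimal sketch of the general idea and not the library's implementation. The class name `IntervalLimiter` and its `reserve` method are hypothetical. The clock value is passed in as an argument so the logic can be checked without real timers.

```typescript
// Hypothetical helper for illustration; NOT part of @axetroy/crawler.
// Enforces a minimum interval between consecutive requests.
class IntervalLimiter {
  private nextFreeAt = 0;
  private readonly minIntervalMs: number;

  constructor(minIntervalMs: number) {
    this.minIntervalMs = minIntervalMs;
  }

  // Reserves the next request slot and returns how many milliseconds
  // the caller should wait before sending the request.
  reserve(now: number): number {
    const startAt = Math.max(now, this.nextFreeAt);
    this.nextFreeAt = startAt + this.minIntervalMs;
    return startAt - now;
  }
}

// Four requests arriving at t = 0, 100, 100 and 5000 ms,
// with at most one request allowed per second.
const limiter = new IntervalLimiter(1000);
const delays = [
  limiter.reserve(0),
  limiter.reserve(100),
  limiter.reserve(100),
  limiter.reserve(5000),
];
```

In a real crawler the returned delay would typically be passed to `setTimeout` before the request is issued. Requests that arrive after a quiet period are not delayed at all.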

Quick start

npm install @axetroy/crawler

import { Crawler, Provider, Response } from "@axetroy/crawler";

class ScrapinghubProvider implements Provider {
  name = "scrapinghub";
  urls = ["https://blog.scrapinghub.com"];
  async parse($: Response) {
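    // Locate the "next page" link, if there is one.
    const $nextPage = $("a.next-posts-link").eq(0);

    // A jQuery-style selection is truthy even when empty, so check its length.
    if ($nextPage.length > 0) {
      $.follow($nextPage.prop("href"));
    }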

    return $(".post-header>h2")
      .map((_, el) => $(el).text())
      .get();
  }
}

const spider = new Crawler(ScrapinghubProvider, {
  timeout: 1000 * 5,
  retry: 3
});

spider.on("data", (articles: string[]) => {
  for (const article of articles) {
    process.stdout.write(article + "\n");
  }
});

spider.on("error", (err, task) => {
  console.log(`request fail on ${task.url}: ${err.message}`);
});

spider.on("finish", () => {
  process.stdout.write("finish...\n");
});

spider.start();
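
The quick start above relies on a pattern where `parse` both returns data and queues further pages with `follow`. How the library schedules those pages internally is not described here, so the following is a rough, self-contained sketch of that general pattern under stated assumptions: the `site` map stands in for real HTTP responses, and `Page`, `site`, and `crawl` are hypothetical names that do not exist in `@axetroy/crawler`.

```typescript
// Hypothetical in-memory stand-in for fetched pages; not the library's types.
type Page = { titles: string[]; next?: string };

const site: Record<string, Page> = {
  "/page/1": { titles: ["A", "B"], next: "/page/2" },
  "/page/2": { titles: ["C"], next: "/page/3" },
  "/page/3": { titles: ["D"], next: "/page/1" }, // links back to the start on purpose
};

// Breadth-first crawl loop: a queue of pending URLs plus a visited set.
function crawl(
  startUrls: string[],
  onData: (titles: string[]) => void
): string[] {
  const queue = [...startUrls];
  const visited = new Set<string>();
  const order: string[] = [];

  while (queue.length > 0) {
    const url = queue.shift() as string;
    if (visited.has(url)) continue; // never crawl the same URL twice
    visited.add(url);

    const page = site[url];
    if (!page) continue; // a real crawler would report an error here
    order.push(url);

    // Similar in spirit to calling follow() inside parse().
    if (page.next) queue.push(page.next);

    // Similar in spirit to the "data" event receiving parse()'s return value.
    onData(page.titles);
  }

  return order;
}

const collected: string[] = [];
const visitedOrder = crawl(["/page/1"], (titles) => {
  collected.push(...titles);
});
```

The visited set is what keeps the loop from running forever when the last page links back to the first one.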

API Reference


Example

How do I run the demo?

> npx ts-node examples/basic.ts

Related examples are also available.

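Donate

If you find this project helpful, consider supporting me by scanning the Alipay QR code (or searching for 511118132) to claim a red packet.

You could even buy me a cup of coffee ☕️

Payment options: WeChat, Alipay, Alipay red packet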

License

The Anti 996 License


Package last updated on 08 May 2019
