Comparing version 3.2.11 to 3.2.12
{
"name": "x-crawl",
"version": "3.2.11",
"version": "3.2.12",
"author": "coderHXL",
@@ -5,0 +5,0 @@ "description": "x-crawl is a flexible nodejs crawler library.",
@@ -44,16 +44,16 @@ # x-crawl [![npm](https://img.shields.io/npm/v/x-crawl.svg)](https://www.npmjs.com/package/x-crawl) [![GitHub license](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/coder-hxl/x-crawl/blob/main/LICENSE)
- [API](#API)
* [x-crawl](#x-crawl-2)
+ [Type](#Type-1)
* [xCrawl](#xCrawl)
+ [Type](#Type)
+ [Example](#Example-1)
* [crawlPage](#crawlPage)
+ [Type](#Type-2)
+ [Type](#Type-1)
+ [Example](#Example-2)
* [crawlData](#crawlData)
+ [Type](#Type-3)
+ [Type](#Type-2)
+ [Example](#Example-3)
* [crawlFile](#crawlFile)
+ [Type](#Type-4)
+ [Type](#Type-3)
+ [Example](#Example-4)
* [crawlPolling](#crawlPolling)
+ [Type](#Type-5)
+ [Type](#Type-4)
+ [Example](#Example-5)
@@ -68,3 +68,3 @@ - [Types](#Types)
* [XCrawlBaseConfig](#XCrawlBaseConfig)
* [CrawlPageConfig](#CrawlPageConfig )
* [CrawlPageConfig](#CrawlPageConfig)
* [CrawlBaseConfigV1](#CrawlBaseConfigV1)
@@ -76,3 +76,3 @@ * [CrawlDataConfig](#CrawlDataConfig)
* [CrawlResCommonArrV1](#CrawlResCommonArrV1)
* [CrawlPage](#CrawlPage-2)
* [CrawlPage](#CrawlPage-1)
* [FileInfo](#FileInfo)
@@ -104,19 +104,21 @@ - [More](#More)
// 3.Set the crawling task
// Call the startPolling API to start the polling function, and the callback function will be called every other day
myXCrawl.startPolling({ d: 1 }, (count, stopPolling) => {
myXCrawl.crawlPage('https://zh.airbnb.com/s/*/plus_homes').then((res) => {
const { jsdom } = res // By default, the JSDOM library is used to parse Page
/*
Call the startPolling API to start the polling function,
and the callback function will be called every other day
*/
myXCrawl.startPolling({ d: 1 }, async (count, stopPolling) => {
// Call crawlPage API to crawl Page
const { jsdom } = await myXCrawl.crawlPage('https://zh.airbnb.com/s/*/plus_homes')
// Get the cover image elements for Plus listings
const imgEls = jsdom.window.document
.querySelector('.a1stauiv')
?.querySelectorAll('picture img')
// Get the cover image elements for Plus listings
const imgEls = jsdom.window.document
.querySelector('.a1stauiv')
?.querySelectorAll('picture img')
// set request configuration
const requestConfig: string[] = []
imgEls?.forEach((item) => requestConfig.push(item.src))
// set request configuration
const requestConfig: string[] = []
imgEls?.forEach((item) => requestConfig.push(item.src))
// Call the crawlFile API to crawl pictures
myXCrawl.crawlFile({ requestConfig, fileConfig: { storeDir: './upload' } })
})
// Call the crawlFile API to crawl pictures
myXCrawl.crawlFile({ requestConfig, fileConfig: { storeDir: './upload' } })
})
@@ -143,3 +145,3 @@ ```
Create a new **application instance** via [xCrawl()](#x-crawl-2):
Create a new **application instance** via [xCrawl()](#xCrawl):
@@ -329,9 +331,6 @@ ```js
myXCrawl. startPolling({ h: 2, m: 30 }, (count, stopPolling) => {
myXCrawl. startPolling({ h: 2, m: 30 }, async (count, stopPolling) => {
// will be executed every two and a half hours
// crawlPage/crawlData/crawlFile
myXCrawl.crawlPage('https://xxx.com').then(res => {
const { jsdom, browser, page } = res
})
const { jsdom, browser, page } = await myXCrawl.crawlPage('https://xxx.com')
})
@@ -485,3 +484,3 @@ ```
### x-crawl
### xCrawl
@@ -525,3 +524,3 @@ Create a crawler instance via call xCrawl. The request queue is maintained by the instance method itself, not by the instance itself.
- Look at the [CrawlPageConfig](#CrawlPageConfig) type
- Look at the [CrawlPage](#CrawlPage-2) type
- Look at the [CrawlPage](#CrawlPage-1) type
@@ -528,0 +527,0 @@ ```ts
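
Review note: the README hunks above replace `.then()` chains inside the `startPolling` callback with an `async` callback that uses `await`. The sketch below illustrates that the two styles resolve to the same value. It does not use x-crawl itself: `fakeCrawlPage`, `FakePageResult`, `withThen`, and `withAwait` are hypothetical names introduced only for this illustration, and the stub simply returns a resolved promise.

```ts
// Hypothetical stand-in for a crawl result; not the real x-crawl CrawlPage type.
interface FakePageResult {
  title: string
}

// Hypothetical stub that mimics a promise-returning crawl call.
function fakeCrawlPage(url: string): Promise<FakePageResult> {
  return Promise.resolve({ title: `page at ${url}` })
}

// Style used in the 3.2.11 README: chain .then() on the returned promise.
function withThen(url: string): Promise<string> {
  return fakeCrawlPage(url).then((res) => res.title)
}

// Style used in the 3.2.12 README: async function with await and destructuring.
async function withAwait(url: string): Promise<string> {
  const { title } = await fakeCrawlPage(url)
  return title
}
```

One difference worth keeping in mind: an `async` callback returns a promise to its caller, while the older callback returned nothing. Whether `startPolling` waits on that promise is not shown in this diff.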