Comparing version 3.2.2 to 3.2.3
 {
   "name": "x-crawl",
-  "version": "3.2.2",
+  "version": "3.2.3",
   "author": "coderHXL",
@@ -5,0 +5,0 @@ "description": "x-crawl is a flexible nodejs crawler library.",
@@ -5,3 +5,3 @@ # x-crawl
-x-crawl is a flexible nodejs crawler library. It is used to batch crawl data, network requests and download file resources. Support crawling data asynchronously or synchronously. Since it runs on nodejs, it is friendly to JS/TS developers.
+x-crawl is a flexible Nodejs crawler library. It is used to crawl pages, batch network requests, and batch download file resources. RequestConfig can be written in 5 ways, results can be obtained in 3 ways, and data can be crawled in asynchronous or synchronous mode. It runs on Nodejs and is friendly to JS/TS developers.
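For context on the API exercised by the hunks below, here is a minimal sketch of creating a crawler instance. The `timeout` and `intervalTime` option names come from the surrounding README; the values are illustrative assumptions, not part of this diff.

```js
// Minimal sketch: create a crawler instance (option values are illustrative).
import xCrawl from 'x-crawl'

const myXCrawl = xCrawl({
  timeout: 10000, // assumed per-request timeout in ms
  intervalTime: { max: 3000, min: 1000 } // anthropomorphic random interval
})
```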
@@ -12,9 +12,10 @@ If you feel good, you can support [x-crawl repository](https://github.com/coder-hxl/x-crawl) with a Star.
-- Support asynchronous/synchronous way to crawl data.
-- Support Promise/Callback method to get the result.
-- Anthropomorphic request interval.
-- Crawl pages, JSON, file resources, etc. with simple configuration.
-- Polling function, timing crawling.
-- The built-in puppeteer crawls the page and uses the jsdom library to parse the page.
-- Written in TypeScript, has type hints, and provides generics.
+- Crawl data asynchronously or synchronously.
+- Obtain results in three ways: Promise, Callback, and Promise + Callback (sketched below).
+- RequestConfig can be written in 5 ways.
+- Anthropomorphic request interval time.
+- Crawl pages, JSON, file resources, and more with a simple configuration.
+- Polling function for timed crawling.
+- The built-in puppeteer crawls the page and uses the JSDOM library to parse it, or you can parse it yourself.
+- Written in TypeScript, with type hints and generics.
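A minimal sketch of the three result modes listed above, assuming `crawlData` accepts an optional per-request callback alongside the returned Promise; the URL is a placeholder and the mixed-mode semantics are an assumption from the feature description.

```js
// Sketch of the three ways to obtain results (URL is a placeholder).
import xCrawl from 'x-crawl'

const myXCrawl = xCrawl({ timeout: 10000 })
const requestConfig = 'https://example.com/api' // placeholder URL

// 1. Promise
myXCrawl.crawlData({ requestConfig }).then((res) => {
  console.log('promise:', res)
})

// 2. Callback
myXCrawl.crawlData({ requestConfig }, (res) => {
  console.log('callback:', res)
})

// 3. Promise + Callback: the callback observes each result and the
// Promise resolves when everything finishes (assumed semantics).
myXCrawl
  .crawlData({ requestConfig }, (res) => console.log('per request:', res))
  .then((allRes) => console.log('all done:', allRes))
```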
@@ -100,3 +101,2 @@ ## Relationship with puppeteer
 // 1. Import module ES/CJS
-import path from 'node:path'
 import xCrawl from 'x-crawl'
@@ -131,9 +131,3 @@
 // Call the crawlFile API to crawl pictures
-myXCrawl.crawlFile({
-  requestConfig,
-  fileConfig: { storeDir: path.resolve(__dirname, './upload') }
-})
-// Close the browser
-browser.close()
+myXCrawl.crawlFile({ requestConfig, fileConfig: { storeDir: './upload' } })
 })
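Taken together, the two hunks above shorten the example to a single `crawlFile` call with a relative `storeDir`. A self-contained sketch of that flow, assuming `myXCrawl` was created as in the first sketch, that the selector and URL are placeholders, and that relative paths resolve against the working directory:

```js
// Sketch: crawl a page, collect image URLs, then batch-download them.
// Selector and URL are illustrative placeholders, not from this diff.
myXCrawl.crawlPage('https://example.com').then((res) => {
  const { jsdom } = res
  const requestConfig = Array.from(jsdom.window.document.querySelectorAll('img'))
    .map((el) => el.src)
    .filter(Boolean)

  // 3.2.3 example style: relative storeDir instead of path.resolve(__dirname, ...)
  myXCrawl.crawlFile({ requestConfig, fileConfig: { storeDir: './upload' } })
})
```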
@@ -267,3 +261,2 @@ })
 ```js
-import path from 'node:path'
 import xCrawl from 'x-crawl'
@@ -282,3 +275,3 @@
   fileConfig: {
-    storeDir: path.resolve(__dirname, './upload') // storage folder
+    storeDir: './upload' // storage folder
   }
@@ -308,5 +301,3 @@ })
   const { jsdom, browser, page } = res
-  // Close the browser
-  browser.close()
 })
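This hunk removes the manual `browser.close()` from the `crawlPage` example. A minimal sketch of consuming the result object, with the URL as a placeholder:

```js
// Sketch: destructure the crawlPage result (URL is a placeholder).
myXCrawl.crawlPage('https://example.com').then((res) => {
  const { jsdom, browser, page } = res // JSDOM instance plus puppeteer handles
  console.log(jsdom.window.document.title)
  // The updated example no longer calls browser.close() here; whether the
  // library now manages the browser lifecycle itself is an assumption.
})
```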
@@ -424,3 +415,3 @@ })
   requestConfig,
-  fileConfig: { storeDir: path.resolve(__dirname, './upload') }
+  fileConfig: { storeDir: './upload' }
 })
@@ -435,3 +426,3 @@ .then((fileInfos) => {
   requestConfig,
-  fileConfig: { storeDir: path.resolve(__dirname, './upload') }
+  fileConfig: { storeDir: './upload' }
 },
@@ -448,3 +439,3 @@ (fileInfo) => {
   requestConfig,
-  fileConfig: { storeDir: path.resolve(__dirname, './upload') }
+  fileConfig: { storeDir: './upload' }
 },
@@ -602,3 +593,3 @@ (fileInfo) => {
   fileConfig: {
-    storeDir: path.resolve(__dirname, './upload') // storage folder
+    storeDir: './upload' // storage folder
   }
@@ -605,0 +596,0 @@ })