x-crawl - npm Package Compare versions

Comparing version 7.1.3 to 8.0.0

dist/index.d.ts

@@ -101,3 +101,3 @@ /// <reference types="node" />

 crawlPage?: {
-  launchBrowser?: PuppeteerLaunchOptions
+  puppeteerLaunch?: PuppeteerLaunchOptions
 }

@@ -104,0 +104,0 @@ }
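
The rename in the type declaration above means a 7.x config that sets `crawlPage.launchBrowser` has to carry that value under `crawlPage.puppeteerLaunch` when it is passed to `xCrawl()` in 8.0.0. A minimal sketch of that migration follows. `migrateCrawlPageConfig` is a hypothetical helper, not part of x-crawl, and it assumes every other config key is unchanged between the two versions.

```js
// Hypothetical helper (not part of x-crawl): returns a copy of a 7.x-style
// config with crawlPage.launchBrowser renamed to crawlPage.puppeteerLaunch.
function migrateCrawlPageConfig(config = {}) {
  const { crawlPage } = config
  // Nothing to migrate if the old key is absent
  if (!crawlPage || !('launchBrowser' in crawlPage)) return config

  const { launchBrowser, ...restCrawlPage } = crawlPage
  return {
    ...config,
    crawlPage: {
      ...restCrawlPage,
      // Keep an explicit 8.x key if both happen to be present
      puppeteerLaunch: restCrawlPage.puppeteerLaunch ?? launchBrowser
    }
  }
}

// Usage: a 7.x-style config becomes an 8.x-style one
const v7Config = { maxRetry: 3, crawlPage: { launchBrowser: { headless: false } } }
const v8Config = migrateCrawlPageConfig(v7Config)
console.log(JSON.stringify(v8Config))
// prints {"maxRetry":3,"crawlPage":{"puppeteerLaunch":{"headless":false}}}
```

The helper copies rather than mutates, so the original object can still be used against a 7.x install while an upgrade is being tested.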

package.json

 {
   "name": "x-crawl",
-  "version": "7.1.3",
+  "version": "8.0.0",
   "author": "coderHXL",

@@ -34,6 +34,6 @@ "description": "x-crawl is a flexible Node.js multifunctional crawler library.",

     "chalk": "4.1.2",
-    "https-proxy-agent": "^5.0.1",
-    "puppeteer": "19.10.0"
+    "https-proxy-agent": "^7.0.1",
+    "puppeteer": "21.1.0"
   },
   "devDependencies": {}
 }

README.md

@@ -138,9 +138,9 @@ # x-crawl · [![npm](https://img.shields.io/npm/v/x-crawl.svg)](https://www.npmjs.com/package/x-crawl) [![GitHub license](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/coder-hxl/x-crawl/blob/main/LICENSE)

 ```js
-// 1.Import module ES/CJS
+// 1. Import module ES/CJS
 import xCrawl from 'x-crawl'
-// 2.Create a crawler instance
-const myXCrawl = xCrawl({ maxRetry: 3, intervalTime: { max: 3000, min: 2000 } })
+// 2. Create a crawler instance
+const myXCrawl = xCrawl({ maxRetry: 3, intervalTime: { max: 2000, min: 1000 } })
-// 3.Set the crawling task
+// 3. Set the crawling task
 /*

@@ -151,6 +151,6 @@ Call the startPolling API to start the polling function,

 myXCrawl.startPolling({ d: 1 }, async (count, stopPolling) => {
-  // Call crawlPage API to crawl Page
-  const res = await myXCrawl.crawlPage({
+  // Call the crawlPage API to crawl the page
+  const pageResults = await myXCrawl.crawlPage({
     targets: [
-      'https://www.airbnb.cn/s/experiences',
+      'https://www.airbnb.cn/s/*/experiences',
       'https://www.airbnb.cn/s/plus_homes'

@@ -161,24 +161,24 @@ ],

-  // Store the image URL to targets
-  const targets = []
-  const elSelectorMap = ['._fig15y', '._aov0j6']
-  for (const item of res) {
+  // Obtain the image URL by traversing the crawled page results
+  const imgUrls = []
+  for (const item of pageResults) {
     const { id } = item
     const { page } = item.data
+    const elSelector = id === 1 ? '.i9cqrtb' : '.c4mnd7m'
-    // Wait for the page to load
-    await new Promise((r) => setTimeout(r, 300))
+    // wait for the page element to appear
+    await page.waitForSelector(elSelector)
-    // Gets the URL of the page image
-    const urls = await page.$$eval(`${elSelectorMap[id - 1]} img`, (imgEls) => {
-      return imgEls.map((item) => item.src)
-    })
-    targets.push(...urls)
+    // Get the URL of the page image
+    const urls = await page.$$eval(`${elSelector} picture img`, (imgEls) =>
+      imgEls.map((item) => item.src)
+    )
+    imgUrls.push(...urls.slice(0, 8))
-    // Close page
+    // close the page
     page.close()
   }
-  // Call the crawlFile API to crawl pictures
-  myXCrawl.crawlFile({ targets, storeDirs: './upload' })
+  // Call crawlFile API to crawl pictures
+  await myXCrawl.crawlFile({ targets: imgUrls, storeDirs: './upload' })
 })

@@ -190,11 +190,7 @@ ```

 <div align="center">
-  <img src="https://raw.githubusercontent.com/coder-hxl/x-crawl/main/assets/en/crawler.png" />
+  <img src="https://raw.githubusercontent.com/coder-hxl/x-crawl/main/assets/example.gif" />
 </div>
-<div align="center">
-  <img src="https://raw.githubusercontent.com/coder-hxl/x-crawl/main/assets/en/crawler-result.png" />
-</div>
-**Note:** Please do not crawl randomly, you can check the **robots.txt** protocol before crawling. The class name of the website may change, this is just to demonstrate how to use x-crawl.
+**Note:** Do not crawl at will, you can check the **robots.txt** protocol before crawling. This is just to demonstrate how to use x-crawl.
 ## Core Concepts

@@ -342,3 +338,3 @@

   // Cancel running the browser in headless mode
-  crawlPage: { launchBrowser: { headless: false } }
+  crawlPage: { puppeteerLaunch: { headless: false } }
 })

@@ -1310,3 +1306,3 @@

 crawlPage?: {
-  launchBrowser?: PuppeteerLaunchOptions // puppeteer
+  puppeteerLaunch?: PuppeteerLaunchOptions // puppeteer
 }

@@ -1313,0 +1309,0 @@ }
