puppeteer-infinite-scroll

A helper for scraping data from sites that use infinite scroll.

npm install puppeteer-infinite-scroll

The problem

Most of the solutions I found scroll the page on a fixed timer and then evaluate whatever you need. If a request or the network slows down and takes longer than the delay defined in the code, the scraper simply fails.
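
One way around a fixed delay is to wait for the network response that carries the new content before scrolling again. The sketch below assumes a Puppeteer `page` and relies on `page.waitForResponse`. The helper name `scrollUntilQuiet` and its options are hypothetical and are not taken from this package's source.

```javascript
// Hypothetical helper, not part of this package: scroll to the bottom and wait
// for the content request to finish instead of sleeping for a fixed time.
// `page` is assumed to be a Puppeteer Page; `endpoint` is the URL prefix of the
// request that loads more items.
async function scrollUntilQuiet(page, endpoint, { maxScrolls = 10, timeout = 30000 } = {}) {
  let scrollCount = 0
  while (scrollCount < maxScrolls) {
    // Start waiting before scrolling so a fast response is not missed.
    // A timeout resolves to false instead of throwing.
    const arrived = page
      .waitForResponse((res) => res.url().startsWith(endpoint), { timeout })
      .then(() => true, () => false)

    await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight))

    if (!(await arrived)) break // nothing new loaded: assume the end of the list
    scrollCount += 1
  }
  return scrollCount
}
```

With this shape a slow network only delays the next scroll, up to `timeout`, rather than breaking the scrape.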

See it working

npm run test

How to use

const puppeteerInfiniteScroll = require('./src/puppeter-infinite-scroll')

;(async () => {
  try {
    const browser = new puppeteerInfiniteScroll()
    await browser.start()
    await browser.open({
      url: 'https://medium.com/search?q=python',
      endpoint: 'https://medium.com/search/posts?q',
      loadImages: false,
      // Handle each response here
      onResponse: (res) => {
        //console.log(res)
      },
      // Runs after every scroll
      onScroll: () => {
        console.log(`onScroll ${browser.scrollCount}`)
      }
    })
  } catch (e) {
    console.error(e)
  }
})()

async browser.start() = puppeteer.launch(opts)

    //params(opts)
    //default: { headless: false, devtools: true }
    await browser.start()
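
As a rough illustration of how the defaults listed above could combine with options you pass in, here is a small sketch. The function name `resolveLaunchOptions` is hypothetical, and key-by-key overriding of the defaults is an assumption; the package's actual merging behavior may differ.

```javascript
// Hypothetical sketch: merge caller options over the documented defaults.
// Assumption: options passed to start() override the defaults key by key.
const DEFAULT_LAUNCH_OPTIONS = { headless: false, devtools: true }

function resolveLaunchOptions(opts = {}) {
  return { ...DEFAULT_LAUNCH_OPTIONS, ...opts }
}

// Example: ask for a headless run. devtools is switched off as well because
// Puppeteer forces a visible window whenever devtools is true.
const launchOptions = resolveLaunchOptions({ headless: true, devtools: false })
console.log(launchOptions)
```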

async browser.open()

This method creates a new page, then calls setViewport({ width: 1280, height: 926 }) and setRequestInterception(true).

    //params(opts)
    //default: { url, onResponse, onScroll, loadImages = true, endpoint }
    //url = 'https://medium.com/search?q=python' - URL of the page to load
    //endpoint = 'https://medium.com/search/posts?q' - endpoint which loads content into the page
    //loadImages = true - set to false to prevent images from loading
    //onResponse = (response)=>{ } - use this to do something with the response object
    //onScroll = ()=>{} - triggered after every scroll

    await browser.open({
      url: 'https://medium.com/search?q=python',
      endpoint: 'https://medium.com/search/posts?q',
      loadImages: false,
      onResponse: (res) => {
        //console.log(res)
      },
      onScroll: () => {
        console.log(`onScroll ${browser.scrollCount}`)
      }
    })
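
The page setup described for browser.open() can be approximated with plain Puppeteer calls. This is only a sketch under stated assumptions: `preparePage` is a hypothetical name, `page` is assumed to be a Puppeteer Page, and matching responses by endpoint prefix is a guess at how `endpoint` is used, not this package's confirmed behavior.

```javascript
// Hypothetical sketch of the page setup described for browser.open().
// The function name and the endpoint-prefix matching are assumptions.
async function preparePage(page, { url, endpoint, loadImages = true, onResponse = () => {} }) {
  await page.setViewport({ width: 1280, height: 926 })
  await page.setRequestInterception(true)

  // With interception enabled, every request must be continued or aborted.
  page.on('request', (req) => {
    if (!loadImages && req.resourceType() === 'image') {
      req.abort() // skip images to save bandwidth
    } else {
      req.continue()
    }
  })

  // Forward only the responses that come from the content endpoint.
  page.on('response', (res) => {
    if (res.url().startsWith(endpoint)) onResponse(res)
  })

  await page.goto(url)
}
```

Registering the handlers before `page.goto` means the first batch of content responses also reaches `onResponse`.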

Last updated on 06 Jun 2018
