puppeteer-infinite-scroller
Puppeteer-Infinite-Scroller provides a simple and efficient solution for scraping data loaded through infinite scrolling on web pages using Puppeteer.
You can install the package using npm:
npm install puppeteer-infinite-scroller
Import the puppeteerInfiniteScroller function from the package and use it to scrape data from infinite scrolling web pages.
const puppeteer = require("puppeteer");
const puppeteerInfiniteScroller = require("puppeteer-infinite-scroller");

(async () => {
  const pageUrl = "https://infiniteajaxscroll.com/examples/blocks/";
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setViewport({
    width: 1200,
    height: 800,
  });
  await page.goto(pageUrl);
  await page.waitForSelector(".blocks .blocks__block");

  const options = {
    scrollDelay: 1000, // Milliseconds between scrolls
    itemCount: 50, // Number of items to scrape
    selector: ".blocks .blocks__block", // CSS selector for items
    // OR
    // pageFunction: () => { /* Custom page function for scraping */ }
  };

  const scrapedData = await puppeteerInfiniteScroller(page, options);
  console.log(scrapedData);
  await browser.close();
})();
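For readers curious about what a scroll-and-collect loop generally involves, here is a minimal sketch. It is not the package's actual source: `scrollAndCollect` is a hypothetical name, and the stop condition (give up when the page height stops growing) is an assumption about one reasonable approach. The only Puppeteer call it relies on is `page.evaluate`, and the defaults mirror the ones this README documents.

```javascript
// Hypothetical helper for illustration only; not part of puppeteer-infinite-scroller.
// `page` is anything exposing an async evaluate(fn), such as a Puppeteer Page.
async function scrollAndCollect(page, options = {}) {
  const { pageFunction, itemCount = 10, scrollDelay = 1000 } = options;
  if (typeof pageFunction !== "function") {
    throw new TypeError("pageFunction must be a function");
  }

  let items = [];
  let previousHeight = -1;

  while (true) {
    // Run the extraction function in the page and read back what is loaded so far.
    items = await page.evaluate(pageFunction);
    if (items.length >= itemCount) break;

    // Assumed stop condition: if the height did not grow after the last scroll,
    // treat the feed as exhausted.
    const height = await page.evaluate(() => document.body.scrollHeight);
    if (height === previousHeight) break;
    previousHeight = height;

    // Scroll to the bottom to trigger the next batch, then wait for it to load.
    await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
    await new Promise((resolve) => setTimeout(resolve, scrollDelay));
  }

  return items.slice(0, itemCount);
}
```

With a real Puppeteer page, a call would look like `await scrollAndCollect(page, { pageFunction, itemCount: 50 })`. A fixed delay is the simplest way to wait for new content; on slow connections a larger `scrollDelay` may be needed before the height check is reliable.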
The following options can be configured when using the puppeteerInfiniteScroller function:
scrollDelay (optional): Milliseconds between scrolls. Default is 1000 ms.
itemCount (optional): Number of items to scrape. Default is 10.
selector (optional): CSS selector for the items. Either this or pageFunction must be provided.
pageFunction (optional): Custom function for scraping data from the page. Either this or selector must be provided.

To scrape with a custom pageFunction instead of a selector:
const puppeteer = require("puppeteer");
const puppeteerInfiniteScroller = require("puppeteer-infinite-scroller");

(async () => {
  const pageUrl = "https://infiniteajaxscroll.com/examples/blocks/";
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setViewport({
    width: 1200,
    height: 800,
  });
  await page.goto(pageUrl);
  await page.waitForSelector(".blocks .blocks__block");

  // Collects basic attributes of every block currently in the DOM.
  function extractElements() {
    const items = [];
    const extractedElements = document.querySelectorAll(".blocks .blocks__block");
    for (const element of extractedElements) {
      items.push({
        class: element.getAttribute("class"),
        id: element.getAttribute("id"),
        tagName: element.tagName,
      });
    }
    return items;
  }

  const options = {
    scrollDelay: 1000, // Milliseconds between scrolls
    itemCount: 50, // Number of items to scrape
    pageFunction: extractElements,
  };

  const scrapedData = await puppeteerInfiniteScroller(page, options);
  console.log(scrapedData);
  await browser.close();
})();
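The pageFunction above returns attributes only. As a variation, the sketch below also returns each block's visible text. It assumes the package runs pageFunction inside the page in the same way Puppeteer's `page.evaluate` does, which this README does not state; under that assumption the function cannot use variables from the surrounding Node.js scope and should return plain, serializable data. `extractBlockSummaries` is a hypothetical name.

```javascript
// Hypothetical pageFunction for illustration; intended to run in the browser
// context, so it only touches `document` and returns plain objects.
function extractBlockSummaries() {
  const blocks = document.querySelectorAll(".blocks .blocks__block");
  return Array.from(blocks, (element, index) => ({
    index, // position of the block in document order
    id: element.getAttribute("id"), // null when the block has no id
    text: (element.textContent || "").trim(),
  }));
}
```

It would be passed in the same way as `extractElements`, for example `{ itemCount: 50, pageFunction: extractBlockSummaries }`.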
License
This project is licensed under the MIT License. See the LICENSE file for details.