
universal-pdp-scrapper
This is a universal PDP (product detail page) scraper that can scrape any product detail page and extract the following information using AI and custom logic:
product_name?: string;
images?: string[];
height?: string;
width?: string;
depth?: string;
material?: string;
price?: string;
sku?: string;
artist?: string;
type?: Types;
product_url?: string;
source?: string;
glbs?: string[];
glb_to_use?: string;
description?: string;
tags?: string;
'supporting-surface'?: 'floor' | 'wall';
Under the hood, it uses OpenAI to extract estimated information from the product page. Images are gathered through Google Image Search (via the Google Custom Search JSON API or SerpAPI), combined with IKEA's product search API and custom scrapers for some popular furniture websites.
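To make the image-search step more concrete, here is a minimal sketch of how a request to the Google Custom Search JSON API could be assembled. It is not taken from this library's source: the function names `buildImageSearchUrl` and `extractImageLinks` are hypothetical, and only the endpoint and the `key`, `cx`, `q`, `searchType` and `num` query parameters come from Google's public API.

```typescript
// Sketch only: builds the request URL for an image search through the
// Google Custom Search JSON API. This is an illustration, not the
// library's internal code; both function names are hypothetical.
const CSE_ENDPOINT = 'https://www.googleapis.com/customsearch/v1';

function buildImageSearchUrl(
  apiKey: string,
  cseId: string,
  query: string,
  count = 5,
): string {
  const params = new URLSearchParams({
    key: apiKey,
    cx: cseId,
    q: query,
    searchType: 'image',
    // The API returns between 1 and 10 results per request.
    num: String(Math.min(Math.max(count, 1), 10)),
  });
  return `${CSE_ENDPOINT}?${params.toString()}`;
}

// Minimal shape of the response fields used below.
interface CseResponse {
  items?: { link?: string }[];
}

// Collects the image URLs from a parsed response body, skipping
// entries that carry no link.
function extractImageLinks(body: CseResponse): string[] {
  return (body.items ?? [])
    .map((item) => item.link)
    .filter((link): link is string => typeof link === 'string');
}
```

The resulting URL can be passed to any HTTP client. The keys correspond to the `googleApiKey` and `googleCseId` options shown in the usage example.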
npm i universal-pdp-scrapper
import { UniversalPDPScrapper } from 'universal-pdp-scrapper';
// Initialize the client and set the API keys
// You can set API Keys using environment variables as well: check [.env.sample](./.env.sample)
const client = new UniversalPDPScrapper({
  openaiApiKey: '',
  openaiOrgId: '',
  openaiModelId: '',
  googleApiKey: '',
  googleCseId: ''
});
const result = await client.scrape('https://www.ikea.com/us/en/p/jokkmokk-table-and-4-chairs-antique-stain-50211104/');
console.log(result);
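Because every field in the list above is optional, callers may want to check a result before using it. The following is a minimal sketch under that assumption. `ProductFields` mirrors only a subset of the listed fields, and `missingFields` is a hypothetical helper, not part of the library.

```typescript
// Subset of the fields listed above; any of them may be absent.
// This interface is an illustration, not the library's exported type.
interface ProductFields {
  product_name?: string;
  images?: string[];
  price?: string;
  sku?: string;
}

// Hypothetical helper: returns the names of required fields that are
// undefined, empty strings, or empty arrays in the given result.
function missingFields(
  result: ProductFields,
  required: (keyof ProductFields)[],
): string[] {
  return required.filter((key) => {
    const value = result[key];
    if (value === undefined) return true;
    return Array.isArray(value) ? value.length === 0 : value.trim() === '';
  });
}
```

For example, `missingFields(result, ['product_name', 'images', 'price'])` lists which of those three fields the scrape did not fill in, so the caller can decide whether to retry or fall back to manual entry.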
Here's a demo of running the scraper in a server environment integrated with a React app.
Check here.
NOTE: This library doesn't solve CORS issues for images or GLB files. If you run into CORS errors, it's better to use this library in a server environment, download the images and GLB files to your own server, and serve them from there.
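The server-side workaround described in the note could look roughly like the sketch below. It assumes Node.js 18 or later (for the global `fetch`); `localFileName` and `mirrorAsset` are hypothetical names, and nothing here is part of the library itself.

```typescript
import { createHash } from 'node:crypto';
import { mkdir, writeFile } from 'node:fs/promises';
import * as path from 'node:path';

// Derives a stable local file name from the remote URL: a short hash
// of the full URL plus the original extension ('.bin' if there is none).
function localFileName(assetUrl: string): string {
  const ext = path.extname(new URL(assetUrl).pathname) || '.bin';
  const hash = createHash('sha256').update(assetUrl).digest('hex').slice(0, 16);
  return `${hash}${ext}`;
}

// Downloads one image or GLB file into destDir and returns the local
// path, which your own server can then expose from its own origin.
async function mirrorAsset(assetUrl: string, destDir: string): Promise<string> {
  const response = await fetch(assetUrl);
  if (!response.ok) {
    throw new Error(`Download failed (${response.status}) for ${assetUrl}`);
  }
  await mkdir(destDir, { recursive: true });
  const filePath = path.join(destDir, localFileName(assetUrl));
  await writeFile(filePath, Buffer.from(await response.arrayBuffer()));
  return filePath;
}
```

Hashing the URL keeps file names unique and avoids writing attacker-controlled path segments to disk. The browser then loads the mirrored copy from your own origin, so the remote site's CORS policy no longer applies.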
FAQs
A universal pdp scrapper using cheerio & ChatGPT
The npm package universal-pdp-scrapper receives a total of 67 weekly downloads. As such, its popularity is classified as not popular.
We found that universal-pdp-scrapper demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 0 open source maintainers collaborating on the project.