What is puppeteer-core?
The puppeteer-core package is a version of Puppeteer, a Node library which provides a high-level API to control headless Chrome or Chromium over the DevTools Protocol. It is intended to be a lightweight version that can be used when you want to bring your own browser. It does not download any browsers by default, unlike the full puppeteer package.
What are puppeteer-core's main functionalities?
Page Automation
Automate and control a web page, including navigation, screenshot taking, and DOM manipulation.
const puppeteer = require('puppeteer-core');
(async () => {
  // puppeteer-core does not bundle a browser, so point it at a local Chrome.
  const browser = await puppeteer.launch({executablePath: '/path/to/your/Chrome'});
  const page = await browser.newPage();
  await page.goto('https://example.com');
  // Save a screenshot of the rendered page.
  await page.screenshot({path: 'example.png'});
  await browser.close();
})();
Form Submission
Automate form submissions by typing into fields and clicking buttons.
const puppeteer = require('puppeteer-core');
(async () => {
  const browser = await puppeteer.launch({executablePath: '/path/to/your/Chrome'});
  const page = await browser.newPage();
  await page.goto('https://example.com/login');
  await page.type('#username', 'myUsername');
  await page.type('#password', 'myPassword');
  // Start waiting for the navigation before clicking, so a fast
  // redirect cannot finish before the wait is registered.
  await Promise.all([
    page.waitForNavigation(),
    page.click('#submit'),
  ]);
  await browser.close();
})();
Web Scraping
Extract data from web pages by running JavaScript in the context of the page.
const puppeteer = require('puppeteer-core');
(async () => {
  const browser = await puppeteer.launch({executablePath: '/path/to/your/Chrome'});
  const page = await browser.newPage();
  await page.goto('https://example.com');
  // The callback runs inside the page, so it can use the DOM directly.
  const data = await page.evaluate(() => {
    return document.querySelector('h1').textContent;
  });
  console.log(data);
  await browser.close();
})();
PDF Generation
Generate PDFs of web pages for offline viewing or archiving.
const puppeteer = require('puppeteer-core');
(async () => {
  const browser = await puppeteer.launch({executablePath: '/path/to/your/Chrome'});
  const page = await browser.newPage();
  // Wait until the network is idle so late-loading content is included.
  await page.goto('https://example.com', {waitUntil: 'networkidle0'});
  await page.pdf({path: 'example.pdf', format: 'A4'});
  await browser.close();
})();
Automated Testing
Perform automated testing on web applications, including end-to-end tests, performance testing, and more.
const puppeteer = require('puppeteer-core');
(async () => {
  const browser = await puppeteer.launch({executablePath: '/path/to/your/Chrome', headless: false});
  const page = await browser.newPage();
  await page.goto('https://example.com');
  // Perform various tests, like checking if a button exists.
  // page.$ resolves to null when no element matches the selector.
  const buttonExists = (await page.$('button')) !== null;
  console.assert(buttonExists, 'Button should exist on the page');
  await browser.close();
})();
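For more than one check, the existence test above can be wrapped in a small helper. The following is only a sketch: `findMissingSelectors` is a hypothetical name, not part of Puppeteer, and it assumes `page` is a Puppeteer `Page` (or any object with a compatible `$` method).

```javascript
// Hypothetical helper: returns the selectors that match nothing on the page.
async function findMissingSelectors(page, selectors) {
  const missing = [];
  for (const selector of selectors) {
    // page.$ resolves to an ElementHandle, or null when nothing matches.
    const handle = await page.$(selector);
    if (handle === null) {
      missing.push(selector);
    } else {
      // Release the handle once we know the element exists.
      await handle.dispose();
    }
  }
  return missing;
}
```

A test could then assert that `await findMissingSelectors(page, ['button', 'h1'])` is an empty array.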
Other packages similar to puppeteer-core
playwright
Playwright is a Node library to automate the Chromium, WebKit, and Firefox browsers with a single API. It is similar to puppeteer-core but provides support for multiple browsers out of the box. It also offers additional features like network interception and emulation capabilities.
selenium-webdriver
Selenium WebDriver is one of the most well-known tools for automated web testing. It supports multiple browsers and languages, making it a versatile choice for web automation. Compared to puppeteer-core, Selenium is more mature and has a larger community but can be slower and more complex to set up.
nightmare
Nightmare is a high-level browser automation library. It is built on top of Electron, which is a framework for creating native applications with web technologies. Nightmare is designed to be simpler and more approachable than Puppeteer, but it is less powerful and only works with Electron's version of Chromium.
cypress
Cypress is a front-end testing tool built for the modern web. It is both a library for writing automated tests and a test runner that can execute them. Cypress is more focused on testing than general browser automation and provides a rich interactive interface for developing tests.
Puppeteer
Puppeteer is a Node.js library which provides a high-level API to control Chrome/Chromium over the DevTools Protocol. Puppeteer runs in headless mode by default, but can be configured to run in full (non-headless) Chrome/Chromium.
What can I do?
Most things that you can do manually in the browser can be done using Puppeteer! Here are a few examples to get you started:
- Generate screenshots and PDFs of pages.
- Crawl a SPA (Single-Page Application) and generate pre-rendered content (i.e. "SSR" (Server-Side Rendering)).
- Automate form submission, UI testing, keyboard input, etc.
- Create an automated testing environment using the latest JavaScript and browser features.
- Capture a timeline trace of your site to help diagnose performance issues.
- Test Chrome Extensions.
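As a rough illustration of two items from the list, the sketch below pre-renders a page to an HTML string and optionally records a timeline trace. `prerender` and `tracePath` are illustrative names, and `browser` is assumed to be an already launched or connected Puppeteer `Browser`.

```javascript
// Sketch: return the rendered HTML of `url`, optionally writing a trace file.
async function prerender(browser, url, tracePath) {
  const page = await browser.newPage();
  try {
    if (tracePath) {
      // Start recording a timeline trace that is written to `tracePath`.
      await page.tracing.start({path: tracePath});
    }
    // Wait for the network to go idle so client-side rendering can finish.
    await page.goto(url, {waitUntil: 'networkidle0'});
    // Serialize the DOM after scripts have run.
    const html = await page.content();
    if (tracePath) {
      await page.tracing.stop();
    }
    return html;
  } finally {
    // Close the page even if navigation failed.
    await page.close();
  }
}
```

The resulting trace file can be loaded into the Performance panel of Chrome DevTools.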
Getting Started
Installation
To use Puppeteer in your project, run:
npm i puppeteer
When you install Puppeteer, it automatically downloads a recent version of Chromium (~170MB macOS, ~282MB Linux, ~280MB Windows) that is guaranteed to work with Puppeteer. For a version of Puppeteer that does not download a browser, see puppeteer-core.
Environment Variables
Puppeteer looks for certain environment variables for customizing behavior.
If Puppeteer doesn't find them in the environment during the installation step, a lowercased variant of these variables will be used from the npm config.
- HTTP_PROXY, HTTPS_PROXY, NO_PROXY - define HTTP proxy settings that are used to download and run the browser.
- PUPPETEER_CACHE_DIR - defines the directory to be used by Puppeteer for caching. Defaults to os.homedir()/.cache/puppeteer.
- PUPPETEER_SKIP_CHROMIUM_DOWNLOAD - do not download bundled Chromium during the installation step.
- PUPPETEER_TMP_DIR - defines the directory to be used by Puppeteer for creating temporary files. Defaults to os.tmpdir().
- PUPPETEER_DOWNLOAD_HOST - specifies the URL prefix that is used to download Chromium. Note: this includes the protocol and might even include a path prefix. Defaults to https://storage.googleapis.com.
- PUPPETEER_DOWNLOAD_PATH - specifies the path for the downloads folder. Defaults to <cache>/chromium, where <cache> is Puppeteer's cache directory.
- PUPPETEER_BROWSER_REVISION - specifies a certain version of the browser you'd like Puppeteer to use. See puppeteer.launch on how the executable path is inferred.
- PUPPETEER_EXECUTABLE_PATH - specifies an executable path to be used in puppeteer.launch.
- PUPPETEER_PRODUCT - specifies which browser you'd like Puppeteer to use. Must be either chrome or firefox. This can also be used during installation to fetch the recommended browser binary. Setting product programmatically in puppeteer.launch supersedes this environment variable.
- PUPPETEER_EXPERIMENTAL_CHROMIUM_MAC_ARM - tells Puppeteer to download Chromium for Apple M1. On Apple M1 devices, Puppeteer by default downloads the version for Intel processors, which runs via Rosetta. That works without any problems, but with this option you should get more efficient resource usage (CPU and RAM), which could lead to faster execution.
Environment variables except for PUPPETEER_CACHE_DIR are not used for puppeteer-core since core does not automatically handle browser downloading.
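Because puppeteer-core does not act on these variables itself, an application that still wants an environment-driven setup has to read them and pass the values on explicitly. The sketch below shows one way to do that. `launchOptionsFromEnv` and `launchFromEnv` are hypothetical helpers, and falling back to the `chrome` channel is an assumption made for illustration, not documented behavior.

```javascript
// Hypothetical helper: build launch options from the environment by hand.
function launchOptionsFromEnv(env = process.env) {
  const executablePath = env.PUPPETEER_EXECUTABLE_PATH;
  if (executablePath) {
    // Use the explicitly configured browser binary.
    return {executablePath};
  }
  // Assumption: fall back to a locally installed stable Chrome.
  return {channel: 'chrome'};
}

// Hypothetical wrapper: the require is deferred so that the helper above
// can be used and tested without puppeteer-core installed.
async function launchFromEnv() {
  const puppeteer = require('puppeteer-core');
  return puppeteer.launch(launchOptionsFromEnv());
}
```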
puppeteer-core
Since v1.7.0, every release publishes two packages:
- puppeteer is a product for browser automation. When installed, it downloads a version of Chromium, which it then drives using puppeteer-core. Being an end-user product, puppeteer supports a bunch of convenient PUPPETEER_* env variables to tweak its behavior.
- puppeteer-core is a library to help drive anything that supports the DevTools protocol. puppeteer-core doesn't download Chromium when installed. Being a library, puppeteer-core is fully driven through its programmatic interface.
You should only use puppeteer-core if you are connecting to a remote browser or managing browsers yourself. If you are managing browsers yourself, you will need to call puppeteer.launch with an explicit executablePath or channel.
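For the remote-browser case, puppeteer.connect attaches to a browser that is already running instead of launching one. The following is a minimal sketch, not a definitive setup: `connectOptions` and `withRemoteBrowser` are hypothetical helpers, and the endpoint is whatever your remote browser exposes.

```javascript
// Hypothetical helper: choose the connect option that fits the endpoint.
function connectOptions(endpoint) {
  const {protocol} = new URL(endpoint);
  if (protocol === 'ws:' || protocol === 'wss:') {
    // A WebSocket debugger URL is passed as browserWSEndpoint.
    return {browserWSEndpoint: endpoint};
  }
  if (protocol === 'http:' || protocol === 'https:') {
    // An HTTP debugging address is passed as browserURL.
    return {browserURL: endpoint};
  }
  throw new Error(`Unsupported endpoint protocol: ${protocol}`);
}

// Hypothetical wrapper: run `task` against the remote browser, then detach.
async function withRemoteBrowser(endpoint, task) {
  const puppeteer = require('puppeteer-core');
  const browser = await puppeteer.connect(connectOptions(endpoint));
  try {
    return await task(browser);
  } finally {
    // disconnect() detaches without closing the remote browser.
    await browser.disconnect();
  }
}
```

Using disconnect rather than close leaves the remote browser running for other clients.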
When using puppeteer-core, remember to change the import:
import puppeteer from 'puppeteer-core';
Usage
Puppeteer follows the latest maintenance LTS version of Node.
Puppeteer will be familiar to people using other browser testing frameworks. You launch/connect a browser, create some pages, and then manipulate them with Puppeteer's API.
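The launch, create, manipulate cycle can be captured in a small wrapper that always closes the browser, even when a step fails. This is a sketch under assumptions: `withPage` is a hypothetical helper, and the Puppeteer module is passed in as an argument so the same code works with either puppeteer or puppeteer-core.

```javascript
// Hypothetical helper: launch a browser, hand a fresh page to `task`,
// and close the browser whether or not `task` succeeds.
async function withPage(puppeteer, launchOptions, task) {
  const browser = await puppeteer.launch(launchOptions);
  try {
    const page = await browser.newPage();
    return await task(page);
  } finally {
    await browser.close();
  }
}
```

A call might look like `withPage(puppeteer, {}, page => page.goto('https://example.com'))`.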
For more in-depth usage, check our guides and examples.
Example
The following example searches developers.google.com/web for articles tagged "Headless Chrome" and scrapes the results from the results page.
import puppeteer from 'puppeteer';
(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://developers.google.com/web/');
  // Type into the search box.
  await page.type('.devsite-search-field', 'Headless Chrome');
  // Wait for the suggestion overlay to appear and click "show all results".
  const allResultsSelector = '.devsite-suggest-all-results';
  await page.waitForSelector(allResultsSelector);
  await page.click(allResultsSelector);
  // Wait for the results page to load and display the results.
  const resultsSelector = '.gsc-results .gs-title';
  await page.waitForSelector(resultsSelector);
  // Extract the results from the page.
  const links = await page.evaluate(resultsSelector => {
    return [...document.querySelectorAll(resultsSelector)].map(anchor => {
      const title = anchor.textContent.split('|')[0].trim();
      return `${title} - ${anchor.href}`;
    });
  }, resultsSelector);
  console.log(links.join('\n'));
  await browser.close();
})();
Default runtime settings
1. Uses Headless mode
Puppeteer launches Chromium in headless mode. To launch a full version of Chromium, set the headless option to false when launching a browser:
const browser = await puppeteer.launch({headless: false});
2. Runs a bundled version of Chromium
By default, Puppeteer downloads and uses a specific version of Chromium so its API is guaranteed to work out of the box. To use Puppeteer with a different version of Chrome or Chromium, pass in the executable's path when creating a Browser instance:
const browser = await puppeteer.launch({executablePath: '/path/to/Chrome'});
You can also use Puppeteer with Firefox Nightly (experimental support). See Puppeteer.launch for more information.
See this article for a description of the differences between Chromium and Chrome. This article describes some differences for Linux users.
3. Creates a fresh user profile
Puppeteer creates its own browser user profile which it cleans up on every run.
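If a profile needs to survive between runs (for example to keep cookies), the userDataDir launch option points the browser at a directory of your choosing instead of a temporary one. The sketch below only builds the options object. `profileLaunchOptions` and the default `.my-app-profiles` directory are made-up names for illustration.

```javascript
const path = require('path');
const os = require('os');

// Hypothetical helper: map a profile name to a persistent profile directory.
function profileLaunchOptions(
  profileName,
  baseDir = path.join(os.homedir(), '.my-app-profiles') // assumed location
) {
  return {userDataDir: path.join(baseDir, profileName)};
}
```

The result can be passed straight to the launch call, as in `puppeteer.launch(profileLaunchOptions('qa'))`.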
Using Docker
See our guide on using Docker.
Using Chrome Extensions
See our guide on using Chrome extensions.
Resources
Contributing
Check out our contributing guide to get an overview of Puppeteer development.
FAQ
Our FAQ has migrated to our site.