puppeteer-proxy 🎎
Proxies Puppeteer Page requests.
Motivation
This package addresses several issues with Puppeteer:
- It allows setting a proxy per Page and per Request (#678)
- It allows authenticating against a proxy when making HTTPS requests (#3253)
A side benefit of this implementation is that all traffic can be routed through Node.js, i.e. you can use an externally hosted Chrome instance (such as Browserless.io) to render the DOM and evaluate JavaScript, while all HTTP traffic goes through your Node.js instance.
The downside of this implementation is additional latency, i.e. requests take longer to execute because every request and response must be exchanged between Puppeteer and Node.js.
Implementation
puppeteer-proxy intercepts each request after it receives the request metadata from Puppeteer. It then uses Node.js to make the HTTP request and returns the response to the browser. When using puppeteer-proxy, the browser never makes outbound HTTP requests.
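The flow described above can be sketched roughly as follows. This is not the package's actual source, only an illustration under stated assumptions: describeRequest, toRespondPayload, handleIntercepted and the sendViaNode callback are hypothetical names, and the nodeResponse shape ({statusCode, headers, body}) is assumed. The Puppeteer calls used (request.url(), request.method(), request.headers(), request.postData(), request.respond()) are part of Puppeteer's request API.

```javascript
// Collect the metadata Puppeteer exposes for an intercepted request.
const describeRequest = (request) => ({
  url: request.url(),
  method: request.method(),
  headers: request.headers(),
  body: request.postData(),
});

// Shape a response obtained on the Node.js side for request.respond().
// The {statusCode, headers, body} input shape is an assumption of this sketch.
const toRespondPayload = (nodeResponse) => ({
  status: nodeResponse.statusCode,
  headers: nodeResponse.headers,
  body: nodeResponse.body,
});

// Hypothetical glue: `sendViaNode` stands in for whatever HTTP client
// performs the request in Node.js (optionally through a proxy).
const handleIntercepted = async (request, sendViaNode) => {
  const nodeResponse = await sendViaNode(describeRequest(request));

  // The browser receives the response without ever opening a connection.
  await request.respond(toRespondPayload(nodeResponse));
};
```

Because the browser only ever sees the payload passed to request.respond(), the HTTP client on the Node.js side is free to use any proxy or agent.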
Setup
You must call page.setRequestInterception(true) before using pageProxy.proxyRequest.
API
import {
  Agent as HttpAgent,
} from 'http';
import {
  Agent as HttpsAgent,
} from 'https';
import type {
  Page,
  Request,
} from 'puppeteer';
import {
  proxyRequest,
} from 'puppeteer-proxy';

type ProxyRequestConfigurationType = {|
  +agent?: HttpAgent | HttpsAgent,
  +page: Page,
  +proxyUrl?: string | { http: string, https: string },
  +request: Request,
|};

proxyRequest(configuration: ProxyRequestConfigurationType): PageProxyType;
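The proxyUrl option accepts either a single string or an object with http and https keys. A minimal sketch of how such a value might be resolved for a given request URL, assuming the object form selects a proxy by the request's protocol; resolveProxyUrl is a hypothetical helper for illustration and is not part of the package's API.

```javascript
// Hypothetical helper: pick the proxy URL that applies to `requestUrl`.
// Assumes the object form of `proxyUrl` is keyed by request protocol.
const resolveProxyUrl = (proxyUrl, requestUrl) => {
  // No proxy configured.
  if (!proxyUrl) {
    return undefined;
  }

  // A plain string applies to every request.
  if (typeof proxyUrl === 'string') {
    return proxyUrl;
  }

  // URL#protocol includes the trailing colon, e.g. 'https:'.
  const { protocol } = new URL(requestUrl);

  return protocol === 'https:' ? proxyUrl.https : proxyUrl.http;
};
```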
Usage
Making a GET request using proxy
import puppeteer from 'puppeteer';
import {
  proxyRequest,
} from 'puppeteer-proxy';

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.setRequestInterception(true);

  page.on('request', async (request) => {
    await proxyRequest({
      page,
      proxyUrl: 'http://127.0.0.1:3000',
      request,
    });
  });

  await page.goto('http://gajus.com');
})();
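Since proxyRequest is called once per intercepted request, the proxy can in principle vary from request to request. A sketch of that idea, under assumptions: chooseProxyUrl and attachProxy are hypothetical helpers, the host name and the second proxy address (port 3001) are placeholders, and page and proxyRequest are passed in as arguments so the snippet stays self-contained.

```javascript
// Hypothetical rule: requests to example.com (and its subdomains) use a
// different proxy than everything else. Both addresses are placeholders.
const chooseProxyUrl = (requestUrl) => {
  const { hostname } = new URL(requestUrl);
  const isExampleHost =
    hostname === 'example.com' || hostname.endsWith('.example.com');

  return isExampleHost ? 'http://127.0.0.1:3001' : 'http://127.0.0.1:3000';
};

// Registers a request handler that picks a proxy for every request.
// `page` is a Puppeteer Page with request interception already enabled;
// `proxyRequest` is the function imported from 'puppeteer-proxy'.
const attachProxy = (page, proxyRequest) => {
  page.on('request', async (request) => {
    await proxyRequest({
      page,
      proxyUrl: chooseProxyUrl(request.url()),
      request,
    });
  });
};
```

In the example above, the page.on('request', ...) block would be replaced with a call to attachProxy(page, proxyRequest) after page.setRequestInterception(true).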