puppeteer-page-proxy

Additional Node.js module for use with puppeteer that sets proxies on a per-page basis.

Version: 1.2.4

It forwards intercepted requests from the browser to Node.js, which makes each request through the chosen proxy and returns the response to the browser. This changes the proxy used for that page or request.

Features

  • Proxy per page and per request
  • Supports http, https, socks4 and socks5 proxies
  • Authentication
  • Internal cookie handling

Installation

npm i puppeteer-page-proxy

API

PageProxy(pageOrReq, proxy)
  • pageOrReq <object> 'Page' or 'Request' object to set a proxy for.
  • proxy <string> Proxy to use in the current page.
    • Begins with a protocol (e.g. http://, https://, socks://)
PageProxy.lookup(page[, lookupService, isJSON, timeout])
  • page <object> 'Page' object to execute the request on.
  • lookupService <string> External lookup service to request data from.
    • Fetches data from api.ipify.org by default.
  • isJSON <boolean> Whether to JSON.parse the received response.
    • Defaults to true.
  • timeout <number|string> Time in milliseconds after which the request times out.
    • Defaults to 30000.
  • returns: <Promise> Promise which resolves to the response of the lookup request.

NOTE: By default this method expects a response in JSON format and parses it with JSON.parse into a usable JavaScript object. To disable this behavior, set isJSON to false.
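As an illustration of the isJSON flag, the following is a minimal sketch of a lookup against a service that answers in plain text. The helper name lookupPlainTextIp is hypothetical, and the sketch assumes that https://api.ipify.org returns the bare IP address as text when no format is requested. The useProxy module and the page are passed in by the caller.

```javascript
// Hypothetical helper, not part of puppeteer-page-proxy.
// `useProxy` is the module exported by puppeteer-page-proxy and `page` is a
// puppeteer Page that already has a proxy applied.
async function lookupPlainTextIp(useProxy, page, timeout = 10000) {
    // Assumption: this endpoint answers with the bare IP address as plain text.
    const service = 'https://api.ipify.org';
    // Third argument is isJSON: false skips JSON.parse on the response.
    const body = await useProxy.lookup(page, service, false, timeout);
    return String(body).trim();
}
```

With isJSON set to false the resolved value is the raw response, so the caller is responsible for any trimming or parsing.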

Examples

Proxy per page:
const puppeteer = require('puppeteer');
const useProxy = require('puppeteer-page-proxy');

(async () => {
    const site = 'https://example.com';
    const proxy = 'http://host:port';
    const proxy2 = 'https://host:port';
    
    const browser = await puppeteer.launch({headless: false});

    const page = await browser.newPage();
    await useProxy(page, proxy);
    await page.goto(site);

    const page2 = await browser.newPage();
    await useProxy(page2, proxy2);
    await page2.goto(site);
})();

To remove a proxy set this way, pass a falsy value (e.g. null) instead of the proxy:

await useProxy(page, null);
Proxy per request:
const puppeteer = require('puppeteer');
const useProxy = require('puppeteer-page-proxy');

(async () => {
    const site = 'https://example.com';
    const proxy = 'socks://host:port';

    const browser = await puppeteer.launch({headless: false});
    const page = await browser.newPage();

    await page.setRequestInterception(true);
    page.on('request', req => {
        useProxy(req, proxy);
    });
    await page.goto(site);
})();

The request object itself is passed as the first argument, so the proxy can now be changed on every request. Leaving it unchanged has the same effect as applying a proxy to the whole page by passing in the page object: the same proxy is used for all requests within the page.
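One way to change the proxy on every request is to rotate through a list. The sketch below is only an illustration: createProxyRotator is a hypothetical helper, and the proxy addresses are placeholders in the same style as the examples above.

```javascript
// Hypothetical helper, not part of puppeteer-page-proxy:
// cycles through a list of proxy URLs in round-robin order.
function createProxyRotator(proxies) {
    if (!Array.isArray(proxies) || proxies.length === 0) {
        throw new Error('At least one proxy is required');
    }
    let index = 0;
    return function nextProxy() {
        const proxy = proxies[index];
        index = (index + 1) % proxies.length; // wrap around at the end
        return proxy;
    };
}

// Placeholder addresses; replace them with real proxies.
const nextProxy = createProxyRotator([
    'http://host1:port',
    'socks5://host2:port',
]);

// In a script that has required puppeteer-page-proxy as useProxy and has
// enabled request interception on `page`:
// page.on('request', req => {
//     useProxy(req, nextProxy());
// });
```

Round-robin is just one possible policy; the same shape works for picking a proxy by URL or resource type.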

Using it with other interception methods is straightforward as well:

await page.setRequestInterception(true);
page.on('request', req => {
    if (req.resourceType() === 'image') {
        req.abort();
    } else {
        useProxy(req, proxy);
    }
});

Each request can be handled only once, so a request cannot be intercepted again after a proxy has been applied to it. Calling request.abort or request.continue, for example, on the same request produces a 'Request is already handled!' error. This is because puppeteer-page-proxy internally calls request.respond, which fulfills the request.
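A handler can be structured so that every request takes exactly one path. The following is a minimal sketch under that constraint; createRequestHandler and its blockedTypes parameter are hypothetical names, and useProxy is injected so the sketch does not depend on an installed package.

```javascript
// Hypothetical factory, not part of puppeteer-page-proxy: returns a 'request'
// handler that performs exactly one action per request.
function createRequestHandler(useProxy, proxy, blockedTypes = ['image']) {
    return function handleRequest(req) {
        if (blockedTypes.includes(req.resourceType())) {
            // Path 1: the request is aborted and never reaches useProxy.
            return req.abort();
        }
        // Path 2: useProxy fulfills the request itself, so no abort(),
        // continue() or respond() call may follow for this request.
        return useProxy(req, proxy);
    };
}

// Usage, assuming `page` has interception enabled:
// page.on('request', createRequestHandler(useProxy, 'http://host:port'));
```

Returning from each branch makes it harder to fall through into a second action on the same request by accident.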

NOTE: Request interception must be enabled with page.setRequestInterception(true) when setting proxies this way, otherwise the function will fail.

Authentication:
const proxy = 'https://login:pass@host:port';
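Credentials that contain reserved characters such as @ or : would make a URL of this form ambiguous. The sketch below shows one way to assemble the string with percent-encoded credentials. buildProxyUrl is a hypothetical helper, and whether the underlying proxy agent decodes percent-encoded credentials is an assumption worth verifying against your proxy.

```javascript
// Hypothetical helper, not part of puppeteer-page-proxy: builds a proxy URL
// string, percent-encoding the credentials so reserved characters survive.
function buildProxyUrl({ protocol, host, port, username, password }) {
    const auth = username
        ? `${encodeURIComponent(username)}:${encodeURIComponent(password || '')}@`
        : '';
    return `${protocol}://${auth}${host}:${port}`;
}

const authProxy = buildProxyUrl({
    protocol: 'http',
    host: '127.0.0.1',
    port: 8080,
    username: 'login',
    password: 'p@ss:word',
});
console.log(authProxy); // http://login:p%40ss%3Aword@127.0.0.1:8080
```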
Lookup IP used by proxy:
const puppeteer = require('puppeteer');
const useProxy = require('puppeteer-page-proxy');

(async () => {
    const site = 'https://example.com';
    const proxy1 = 'http://host:port';
    const proxy2 = 'https://host:port';
    
    const browser = await puppeteer.launch({headless: false});

    // 1
    const page1 = await browser.newPage();
    await useProxy(page1, proxy1);
    const data = await useProxy.lookup(page1); // Waits until the lookup is done, then continues
    console.log(data.ip);
    await page1.goto(site);
    
    // 2
    const page2 = await browser.newPage();
    await useProxy(page2, proxy2);
    useProxy.lookup(page2).then(data => {   // Executes and 'comes back' once done
        console.log(data.ip);
    });
    await page2.goto(site);
})();

FAQ

How does puppeteer-page-proxy work?

It takes over the task of requesting resources from the browser and performs it internally. Requests that the browser would normally make directly are instead intercepted and made indirectly by Node using a request library, so Node also receives the responses that the browser would normally have received. To change the proxy, the requests are routed through the specified proxy server using *-proxy-agent packages. The responses are then forwarded back to the browser as mock responses using the request.respond method, making the browser think that a response has been received from the server, which fulfills the request and renders the content of the response on the screen.
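The flow described above can be sketched in a few lines. This is a simplified illustration, not the package's source: forwardThroughNode is a hypothetical name, and fetchViaProxy is a hypothetical function supplied by the caller that performs the HTTP request in Node through the given proxy and resolves to an object with status, headers and body.

```javascript
// Simplified sketch of intercept -> request in Node -> mock response.
// `request` is an intercepted puppeteer request; `fetchViaProxy` is a
// hypothetical caller-supplied function (see the paragraph above).
async function forwardThroughNode(request, proxy, fetchViaProxy) {
    // 1. Read what the browser intended to send.
    const outgoing = {
        url: request.url(),
        method: request.method(),
        headers: request.headers(),
        body: request.postData(),
    };
    // 2. Node, not the browser, makes the request through the proxy.
    const response = await fetchViaProxy(outgoing, proxy);
    // 3. Hand the result back to the browser as a mock response,
    //    which fulfills the intercepted request.
    await request.respond({
        status: response.status,
        headers: response.headers,
        body: response.body,
    });
}
```

The real module also has to deal with details this sketch leaves out, such as cookies and error handling.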

Why does the browser show "Your connection to this site is not secure" when connecting to https sites?

This is simply because the server and the browser are unable to perform the secure handshakes for the connections, since the requests are intercepted and effectively blocked by Node when responses are forwarded to the browser. However, although the browser warns of an insecure connection, the requests are in fact made securely through Node, as the connection property of the response object shows:

connection: TLSSocket {
    _tlsOptions: {
        secureContext: [SecureContext],
        requestCert: true,
        rejectUnauthorized: true,
    },
    _secureEstablished: true,
    authorized: true,
    encrypted: true,
}

While a proxy is applied, the browser is just an empty drawing board used for rendering content on the screen. All the network requests and responses, both secure and non-secure, are made by Node. Because of this, it makes no difference whether the site in the browser is shown as insecure or not.

Package last updated on 18 May 2020
