puppeteer-extra-plugin
The puppeteer-extra-plugin package provides the base class for plugins of puppeteer-extra, a modular plugin framework for Puppeteer. Plugins built on it extend Puppeteer's capabilities with features such as stealth mode, ad blocking, and more.
Stealth Mode
The Stealth Mode feature applies a set of evasion techniques that make an automated browser harder for anti-bot systems to detect. This is useful for web scraping and automation tasks where detection can be an issue.
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());
(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  // Perform actions as a stealthy browser
  await browser.close();
})();
Ad Blocking
The Ad Blocking feature allows Puppeteer to block ads on web pages, making the browsing experience cleaner and faster. This is particularly useful for scraping content without being interrupted by ads.
const puppeteer = require('puppeteer-extra');
const AdblockerPlugin = require('puppeteer-extra-plugin-adblocker');
puppeteer.use(AdblockerPlugin());
(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  // Ads will be blocked on the page
  await browser.close();
})();
User Data
The User Data feature allows Puppeteer to save and reuse user data, such as cookies and local storage. This is useful for maintaining sessions and state across different browsing sessions.
const puppeteer = require('puppeteer-extra');
const UserDataPlugin = require('puppeteer-extra-plugin-user-data');
puppeteer.use(UserDataPlugin({
  userDataDir: './user_data'
}));
(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  // User data will be saved in the specified directory
  await browser.close();
})();
puppeteer-cluster is a package that provides a simple and efficient way to manage multiple Puppeteer instances. It allows you to run multiple browser instances in parallel, making it ideal for large-scale web scraping and automation tasks. Compared to puppeteer-extra-plugin, puppeteer-cluster focuses more on parallelism and resource management.
puppeteer-core is a lightweight version of Puppeteer that does not include the bundled Chromium browser. It allows you to use Puppeteer with any existing browser installation. While puppeteer-core does not offer plugins like puppeteer-extra-plugin, it provides more flexibility in terms of browser choice and version management.
Playwright is a Node.js library developed by Microsoft for browser automation. It supports multiple browsers (Chromium, Firefox, and WebKit) and offers features like auto-waiting, network interception, and more. Playwright is similar to Puppeteer but provides broader browser support and additional features, making it a strong alternative to puppeteer-extra-plugin.
yarn add puppeteer-extra-plugin
Base class for puppeteer-extra plugins.
Provides convenience methods to avoid boilerplate.
All common puppeteer browser events will be bound to the plugin instance, if a respectively named class member is found.
Please refer to the puppeteer API documentation as well.
Type: function (opts)
opts (optional, default {})
Example:
// hello-world-plugin.js
const PuppeteerExtraPlugin = require('puppeteer-extra-plugin')
class Plugin extends PuppeteerExtraPlugin {
  constructor (opts = { }) { super(opts) }
  get name () { return 'hello-world' }
  async onPageCreated (page) {
    this.debug('page created', page.url())
    const ua = await page.browser().userAgent()
    this.debug('user agent', ua)
  }
}
module.exports = function (pluginConfig) { return new Plugin(pluginConfig) }
// foo.js
const puppeteer = require('puppeteer-extra')
puppeteer.use(require('./hello-world-plugin')())
;(async () => {
  const browser = await puppeteer.launch({headless: false})
  const page = await browser.newPage()
  await page.goto('http://example.com', {waitUntil: 'domcontentloaded'})
  await browser.close()
})()
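The event binding described above (browser events are bound to the plugin only when a matching class member exists) can be pictured with a small sketch. This is not the base class's actual code: the mapping table, `bindBrowserEvents`, `DemoPlugin`, and the `EventEmitter` standing in for a browser are all assumptions made for illustration.

```javascript
const { EventEmitter } = require('events')

// Assumed mapping from Puppeteer browser event names to plugin method names.
const EVENT_HANDLERS = {
  targetcreated: 'onTargetCreated',
  targetchanged: 'onTargetChanged',
  targetdestroyed: 'onTargetDestroyed',
  disconnected: 'onDisconnected'
}

// Attach a listener only for the handlers the plugin actually defines.
function bindBrowserEvents (plugin, browser) {
  const bound = []
  for (const [event, method] of Object.entries(EVENT_HANDLERS)) {
    if (typeof plugin[method] === 'function') {
      browser.on(event, plugin[method].bind(plugin))
      bound.push(event)
    }
  }
  return bound
}

// Hypothetical plugin that only cares about new targets.
class DemoPlugin {
  constructor () { this.seen = [] }
  onTargetCreated (target) { this.seen.push(target) }
}

// A plain EventEmitter stands in for a real Puppeteer browser here.
const fakeBrowser = new EventEmitter()
const demoPlugin = new DemoPlugin()
const boundEvents = bindBrowserEvents(demoPlugin, fakeBrowser)

fakeBrowser.emit('targetcreated', { url: 'about:blank' })
fakeBrowser.emit('disconnected') // no listener: DemoPlugin has no onDisconnected
```

Only `targetcreated` ends up with a listener, which is why a plugin pays nothing for hooks it does not implement.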
Plugin name (required).
Convention:
Package: puppeteer-extra-plugin-anonymize-ua
Name: anonymize-ua
Type: string
Example:
get name () { return 'anonymize-ua' }
Plugin defaults (optional).
If defined, will be (deep-)merged with the (optional) user-supplied options (supplied during plugin instantiation).
The result of merging defaults with user-supplied options can be accessed through this.opts.
Type: Object
Example:
get defaults () {
  return {
    stripHeadless: true,
    makeWindows: true,
    customFn: null
  }
}
// Users can overwrite plugin defaults during instantiation:
puppeteer.use(require('puppeteer-extra-plugin-foobar')({ makeWindows: false }))
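To make the "(deep-)merged" behaviour concrete, here is a minimal sketch of a recursive merge in which user-supplied values win. It is an approximation for illustration only; the base class may use a different merge routine, and the nested `viewport` option is a hypothetical addition that shows why a deep merge matters.

```javascript
function isPlainObject (value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value)
}

// Recursively merge plain objects; values from `overrides` take precedence.
function deepMerge (defaults, overrides) {
  const result = { ...defaults }
  for (const [key, value] of Object.entries(overrides)) {
    result[key] = isPlainObject(value) && isPlainObject(result[key])
      ? deepMerge(result[key], value)
      : value
  }
  return result
}

// `viewport` is a made-up nested option, not part of the original example.
const pluginDefaults = {
  stripHeadless: true,
  makeWindows: true,
  customFn: null,
  viewport: { width: 800, height: 600 }
}
const userOpts = { makeWindows: false, viewport: { width: 1280 } }

const mergedOpts = deepMerge(pluginDefaults, userOpts)
console.log(mergedOpts)
```

With a shallow merge the user's `viewport` would replace the default wholesale and `height` would be lost; the deep merge keeps it.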
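Plugin requirements (optional).
Signal certain plugin requirements to the base class and the user.
Currently supported:
headful: if the plugin does not work in headless: true mode, will output a warning to the user.
dataFromPlugins: if the plugin needs data from other plugins; enables use of this.getDataFromPlugins().
runLast: if the plugin prefers to run after the others.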
Example:
get requirements () {
  return new Set(['runLast', 'dataFromPlugins'])
}
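One way a host could act on these requirements is sketched below. It assumes that runLast means "run after plugins that do not request it" and that headful produces a warning when launching with headless: true. The helper names and the stand-in plugin objects are hypothetical, not puppeteer-extra's internals.

```javascript
// Stand-in plugins: only `name` and `requirements` matter for this sketch.
const registered = [
  { name: 'consumer', requirements: new Set(['runLast', 'dataFromPlugins']) },
  { name: 'needs-window', requirements: new Set(['headful']) },
  { name: 'plain', requirements: new Set() }
]

// Keep registration order, but move plugins that request `runLast` to the end.
function orderPlugins (plugins) {
  const wantsLast = plugin => plugin.requirements.has('runLast')
  return [...plugins.filter(p => !wantsLast(p)), ...plugins.filter(wantsLast)]
}

// Collect a warning for every `headful` plugin when launching headless.
function headfulWarnings (plugins, launchOptions) {
  if (launchOptions.headless !== true) return []
  return plugins
    .filter(plugin => plugin.requirements.has('headful'))
    .map(plugin => `Plugin '${plugin.name}' may not work in headless mode.`)
}

const orderedNames = orderPlugins(registered).map(plugin => plugin.name)
const warnings = headfulWarnings(registered, { headless: true })
console.log(orderedNames, warnings)
```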
Plugin dependencies (optional).
Missing plugins will be required() by puppeteer-extra.
Example:
get dependencies () {
  return new Set(['user-preferences'])
}
// Will ensure the 'puppeteer-extra-plugin-user-preferences' plugin is loaded.
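The naming convention ties dependencies to package names: a dependency called user-preferences maps to the package puppeteer-extra-plugin-user-preferences. The sketch below shows how missing dependencies could be turned into package names to load. `missingDependencies` and the stand-in plugin objects are illustrative assumptions, not the library's own resolver.

```javascript
const PACKAGE_PREFIX = 'puppeteer-extra-plugin-'

// Return the package names of dependencies that no registered plugin provides.
function missingDependencies (plugins) {
  const loaded = new Set(plugins.map(plugin => plugin.name))
  const missing = new Set()
  for (const plugin of plugins) {
    for (const dependency of plugin.dependencies || []) {
      if (!loaded.has(dependency)) missing.add(PACKAGE_PREFIX + dependency)
    }
  }
  return [...missing]
}

// Hypothetical registry: 'foobar' depends on a plugin that is not registered yet.
const plugins = [
  { name: 'foobar', dependencies: new Set(['user-preferences']) },
  { name: 'anonymize-ua' }
]

const toRequire = missingDependencies(plugins)
console.log(toRequire)
```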
Plugin data (optional).
Plugins can expose data (an array of objects), which in turn can be consumed by other plugins that list the dataFromPlugins requirement (by using this.getDataFromPlugins()).
Convention: [ {name: 'Any name', value: 'Any value'} ]
Type: function ()
Example:
// plugin1.js
get data () {
  return [
    {
      name: 'userPreferences',
      value: { foo: 'bar' }
    },
    {
      name: 'userPreferences',
      value: { hello: 'world' }
    }
  ]
}
// plugin2.js
get requirements () { return new Set(['dataFromPlugins']) }
async beforeLaunch () {
  const prefs = this.getDataFromPlugins('userPreferences').map(d => d.value)
  this.debug(prefs) // => [ { foo: 'bar' }, { hello: 'world' } ]
}
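The data exchange between plugins amounts to concatenating every plugin's data array and optionally filtering by name. A self-contained sketch, using plain objects in place of real plugin instances (the free function `getDataFromPlugins` here is a stand-in for the method, not its implementation):

```javascript
// Collect the `data` arrays of all plugins, optionally filtered by `name`.
function getDataFromPlugins (plugins, name = null) {
  const all = plugins.flatMap(plugin => plugin.data || [])
  return name === null ? all : all.filter(entry => entry.name === name)
}

// Stand-ins for registered plugins; only their `data` is relevant here.
const plugin1 = {
  data: [
    { name: 'userPreferences', value: { foo: 'bar' } },
    { name: 'userPreferences', value: { hello: 'world' } }
  ]
}
const plugin3 = { data: [{ name: 'somethingElse', value: 42 }] }
const pluginWithoutData = {}

const prefs = getDataFromPlugins([plugin1, plugin3, pluginWithoutData], 'userPreferences')
  .map(entry => entry.value)
console.log(prefs)
```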
Access the plugin options (usually the defaults merged with user-defined options).
To skip the auto-merging of defaults with user-supplied opts, don't define a defaults property and set the this._opts Object in your plugin constructor directly.
Type: Object
Example:
get defaults () { return { foo: "bar" } }
async onPageCreated (page) {
  this.debug(this.opts.foo) // => bar
}
Convenience debug logger based on the debug module. Will automatically namespace the logging output to the plugin package name.
# toggle output using environment variables
DEBUG=puppeteer-extra-plugin:<plugin_name> node foo.js
# to debug all the things:
DEBUG=puppeteer-extra,puppeteer-extra-plugin:* node foo.js
Type: Function
Example:
this.debug('hello world')
// will output e.g. 'puppeteer-extra-plugin:anonymize-ua hello world'
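To see which namespaces a given DEBUG value switches on, the following sketch approximates the wildcard matching. It is a simplified stand-in, not the debug module's code: it handles comma-separated patterns and `*` only, and ignores exclusions. `isEnabled` is a hypothetical helper.

```javascript
// Return true if `namespace` matches any comma-separated pattern in `debugEnv`.
function isEnabled (namespace, debugEnv) {
  const patterns = debugEnv.split(',').map(part => part.trim()).filter(Boolean)
  return patterns.some(pattern => {
    // Escape regex metacharacters, then turn `*` into a "match anything" group.
    const escaped = pattern
      .replace(/[.+?^${}()|[\]\\]/g, '\\$&')
      .replace(/\*/g, '.*')
    return new RegExp(`^${escaped}$`).test(namespace)
  })
}

const everything = 'puppeteer-extra,puppeteer-extra-plugin:*'
const checks = {
  pluginWithWildcard: isEnabled('puppeteer-extra-plugin:anonymize-ua', everything),
  coreWithWildcard: isEnabled('puppeteer-extra', everything),
  otherPlugin: isEnabled('puppeteer-extra-plugin:stealth', 'puppeteer-extra-plugin:anonymize-ua')
}
console.log(checks)
```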
beforeLaunch
Can be used to modify the puppeteer launch options by modifying or returning them.
Plugins using this method will be called in sequence to each be able to update the launch options.
Type: function (options)
options Object Puppeteer launch options
Example:
async beforeLaunch (options) {
  if (this.opts.flashPluginPath) {
    options.args = options.args || [] // guard in case launch() was called without args
    options.args.push(`--ppapi-flash-path=${this.opts.flashPluginPath}`)
  }
}
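"Called in sequence" and "modifying or returning" can be read as a simple pipeline: each plugin receives the options produced by the previous one, and a returned value replaces them while an in-place mutation is kept. The sketch below is an assumption about that flow; `runBeforeLaunch` and both demo plugins are hypothetical.

```javascript
// Pass the launch options through every plugin's beforeLaunch hook in order.
async function runBeforeLaunch (plugins, options) {
  let current = options
  for (const plugin of plugins) {
    if (typeof plugin.beforeLaunch !== 'function') continue
    const returned = await plugin.beforeLaunch(current)
    // A plugin may mutate in place (returns undefined) or return new options.
    if (returned !== undefined) current = returned
  }
  return current
}

// Mutates the options object it receives.
const mutatingPlugin = {
  async beforeLaunch (options) { options.args.push('--no-first-run') }
}
// Returns a fresh options object instead.
const returningPlugin = {
  async beforeLaunch (options) { return { ...options, headless: false } }
}

const launchOptionsPromise = runBeforeLaunch(
  [mutatingPlugin, returningPlugin],
  { args: [] }
)
launchOptionsPromise.then(options => console.log(options))
```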
afterLaunch
After the browser has launched.
Note: Don't assume that there will only be a single browser instance during the lifecycle of a plugin. It's possible that puppeteer.launch will be called multiple times and more than one browser created. In order to make the plugins as stateless as possible, don't store a reference to the browser instance in the plugin but rather consider alternatives.
E.g. when using onPageCreated you can get a browser reference by using page.browser().
Alternatively you could expose a class method that takes a browser instance as a parameter to work with:
const fancyPlugin = require('puppeteer-extra-plugin-fancy')()
puppeteer.use(fancyPlugin)
const browser = await puppeteer.launch()
await fancyPlugin.killBrowser(browser)
Type: function (browser, options)
browser Puppeteer.Browser The puppeteer browser instance.
options Object? The launch options used. (optional, default {})
Example:
async afterLaunch (browser, options) {
  this.debug('browser has been launched', options)
}
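The stateless advice above (pass the browser in rather than storing it) might look like this on the plugin side. `FancyPlugin` and `killBrowser` mirror the hypothetical names in the usage snippet; a real plugin would extend the base class, and the fake browsers are stand-ins so the sketch runs without Puppeteer.

```javascript
// Plain class for illustration; a real plugin would extend PuppeteerExtraPlugin.
class FancyPlugin {
  get name () { return 'fancy' }

  // Works on whichever browser the caller passes in; nothing is kept on `this`.
  async killBrowser (browser) {
    await browser.close()
  }
}

// Minimal stand-in for a Puppeteer browser with a close() method.
function makeFakeBrowser (id) {
  return {
    id,
    closed: false,
    async close () { this.closed = true }
  }
}

const fancyPlugin = new FancyPlugin()
const browserA = makeFakeBrowser('a')
const browserB = makeFakeBrowser('b')

// The same plugin instance can serve any number of browsers.
const allClosed = Promise.all([
  fancyPlugin.killBrowser(browserA),
  fancyPlugin.killBrowser(browserB)
])
allClosed.then(() => console.log(browserA.closed, browserB.closed))
```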
onTargetCreated
Called when a target is created, for example when a new page is opened by window.open or browser.newPage.
Note: This includes target creations in incognito browser contexts.
Type: function (target)
target Puppeteer.Target
onPageCreated
Same as onTargetCreated but prefiltered to only contain Pages, for convenience.
Note: This includes page creations in incognito browser contexts.
Type: function (page)
page Puppeteer.Page
Example:
async onPageCreated (page) {
  let ua = await page.browser().userAgent()
  if (this.opts.stripHeadless) {
    ua = ua.replace('HeadlessChrome/', 'Chrome/')
  }
  this.debug('new ua', ua)
  await page.setUserAgent(ua)
}
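The "prefiltered to only contain Pages" relationship between the two hooks can be sketched as a dispatcher that always forwards the target and additionally forwards the page when the target's type is 'page'. `dispatchTargetCreated`, the recording plugin, and the fake targets are assumptions for illustration; `target.type()` and `target.page()` follow Puppeteer's Target API.

```javascript
// Forward every target to onTargetCreated, and page targets to onPageCreated.
async function dispatchTargetCreated (plugin, target) {
  if (typeof plugin.onTargetCreated === 'function') {
    await plugin.onTargetCreated(target)
  }
  if (target.type() === 'page' && typeof plugin.onPageCreated === 'function') {
    const page = await target.page()
    if (page) await plugin.onPageCreated(page)
  }
}

// Hypothetical plugin that records which hooks fired.
const calls = []
const recordingPlugin = {
  async onTargetCreated (target) { calls.push(`target:${target.type()}`) },
  async onPageCreated (page) { calls.push(`page:${page.url()}`) }
}

// Fake targets so the sketch runs without a browser.
const fakePage = { url: () => 'about:blank' }
const pageTarget = { type: () => 'page', page: async () => fakePage }
const workerTarget = { type: () => 'service_worker', page: async () => null }

const dispatched = (async () => {
  await dispatchTargetCreated(recordingPlugin, pageTarget)
  await dispatchTargetCreated(recordingPlugin, workerTarget)
  return calls
})()
dispatched.then(result => console.log(result))
```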
onTargetChanged
Called when the url of a target changes.
Note: This includes target changes in incognito browser contexts.
Type: function (target)
target Puppeteer.Target
onTargetDestroyed
Called when a target is destroyed, for example when a page is closed.
Note: This includes target destructions in incognito browser contexts.
Type: function (target)
target Puppeteer.Target
onDisconnected
Called when Puppeteer gets disconnected from the Chromium instance. This might happen because of one of the following:
Chromium is closed or crashed
The browser.disconnect method was called
Type: function ()
onClose
Sometimes onDisconnected does not catch all exit scenarios.
In order for plugins to clean up properly (e.g. deleting temporary files), the onClose method can be used.
Note: Might be called multiple times on exit.
Type: function ()
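Because onClose might be called multiple times, cleanup code should be safe to repeat. A minimal sketch of an idempotent cleanup, assuming a plugin that owns a temporary directory; `TempDirPlugin` and the directory prefix are hypothetical, and a real plugin would extend the base class.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Plain class for illustration; a real plugin would extend PuppeteerExtraPlugin.
class TempDirPlugin {
  constructor () {
    this.tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'plugin-demo-'))
    this.cleanedUp = false
  }

  // Safe to call repeatedly: the work happens only on the first call.
  onClose () {
    if (this.cleanedUp) return
    this.cleanedUp = true
    fs.rmSync(this.tempDir, { recursive: true, force: true })
  }
}

const tempPlugin = new TempDirPlugin()
const existedBefore = fs.existsSync(tempPlugin.tempDir)
tempPlugin.onClose()
tempPlugin.onClose() // second call is a no-op
const existsAfter = fs.existsSync(tempPlugin.tempDir)
console.log(existedBefore, existsAfter)
```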
onPluginRegistered
After the plugin has been registered in puppeteer-extra.
Normally right after puppeteer.use(plugin) is called.
Type: function ()
getDataFromPlugins
Helper method to retrieve data objects from other plugins.
A plugin needs to state the dataFromPlugins requirement in order to use this method. Will be mapped to puppeteer.getPluginData.
Type: function (name)
name string? Filter data by name property (optional, default null)