
If you would like to overhaul this code to bring it up to date, please contact me
Note: this is a continuation of the pyppeteer project
Unofficial Python port of puppeteer JavaScript (headless) chrome/chromium browser automation library.
pyppeteer requires Python >= 3.8
Install with pip
from PyPI:
pip install pyppeteer
Or install the latest version from this GitHub repo:
pip install -U git+https://github.com/pyppeteer/pyppeteer@dev
Note: When you run pyppeteer for the first time, it downloads the latest version of Chromium (~150MB) if it is not found on your system. If you would rather avoid this behavior, ensure that a suitable Chrome binary is installed. One way to do this is to run the pyppeteer-install command before using this library.
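If a Chrome or Chromium build is already installed, another option is to point launch at it through the executablePath option. The sketch below is one way to do that, not a recommended recipe: the helper name find_system_chrome and the list of candidate binary names are assumptions made for illustration and are not part of pyppeteer.

```python
import asyncio
import shutil
from typing import Optional, Sequence

# Assumption: common binary names on Linux/macOS; adjust for your platform.
CANDIDATES = (
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
    "chrome",
)


def find_system_chrome(candidates: Sequence[str] = CANDIDATES) -> Optional[str]:
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


async def main() -> None:
    # Imported lazily so the helper above can be used without pyppeteer installed.
    from pyppeteer import launch

    chrome = find_system_chrome()
    # With no system browser found, fall back to pyppeteer's default Chromium.
    options = {"executablePath": chrome} if chrome else {}
    browser = await launch(options)
    page = await browser.newPage()
    await page.goto("https://example.com")
    print(await page.title())
    await browser.close()


# To try it out, uncomment the next line:
# asyncio.run(main())
```

Whether a given system browser is compatible with pyppeteer depends on its version, so treat this as a starting point.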
Full documentation can be found here. Puppeteer's documentation and its troubleshooting guide are also great resources for pyppeteer users.
Open web page and take a screenshot:
import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('https://example.com')
    await page.screenshot({'path': 'example.png'})
    await browser.close()

asyncio.run(main())
Evaluate javascript on a page:
import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('https://example.com')
    await page.screenshot({'path': 'example.png'})

    dimensions = await page.evaluate('''() => {
        return {
            width: document.documentElement.clientWidth,
            height: document.documentElement.clientHeight,
            deviceScaleFactor: window.devicePixelRatio,
        }
    }''')

    print(dimensions)
    # >>> {'width': 800, 'height': 600, 'deviceScaleFactor': 1}
    await browser.close()

asyncio.run(main())
pyppeteer strives to replicate the puppeteer API as closely as possible; however, fundamental differences between JavaScript and Python make this difficult to do precisely. More information on specifics can be found in the documentation.
puppeteer uses an object for passing options to functions/methods. pyppeteer methods/functions accept both a dictionary (the Python equivalent of a JavaScript object) and keyword arguments for options.
Dictionary style options (similar to puppeteer):
browser = await launch({'headless': True})
Keyword argument style options (more pythonic, isn't it?):
browser = await launch(headless=True)
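Both call styles describe the same set of options. The sketch below illustrates the idea with a hypothetical merge_options helper; it is not pyppeteer's actual implementation, and the precedence it shows (keyword arguments override dictionary entries) is an assumption of this sketch.

```python
from typing import Any, Dict, Optional


def merge_options(options: Optional[Dict[str, Any]] = None, **kwargs: Any) -> Dict[str, Any]:
    """Combine dictionary-style and keyword-style options into one dict."""
    merged: Dict[str, Any] = dict(options or {})
    merged.update(kwargs)  # assumption: keyword arguments win on conflict
    return merged


print(merge_options({"headless": True}))   # {'headless': True}
print(merge_options(headless=True))        # {'headless': True}
print(merge_options({"headless": True}, headless=False, slowMo=50))
# {'headless': False, 'slowMo': 50}
```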
In Python, $ is not a valid identifier. The equivalent methods to Puppeteer's $, $$, and $x methods are listed below, along with some shorthand methods for your convenience:
puppeteer | pyppeteer | pyppeteer shorthand |
---|---|---|
Page.$() | Page.querySelector() | Page.J() |
Page.$$() | Page.querySelectorAll() | Page.JJ() |
Page.$x() | Page.xpath() | Page.Jx() |
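As a brief illustration of the table, the sketch below uses the long method names and notes the shorthand in comments. The function name summarize_page and the selectors are made up for this example, and page is assumed to be a pyppeteer Page that has already navigated somewhere.

```python
async def summarize_page(page) -> dict:
    """Count a few kinds of elements using the methods from the table."""
    heading = await page.querySelector("h1")      # shorthand: page.J("h1")
    links = await page.querySelectorAll("a")      # shorthand: page.JJ("a")
    paragraphs = await page.xpath("//p")          # shorthand: page.Jx("//p")
    return {
        # querySelector gives None when nothing matches the selector
        "has_heading": heading is not None,
        "link_count": len(links),
        "paragraph_count": len(paragraphs),
    }
```

Call it from inside a coroutine, for example result = await summarize_page(page) after page.goto(...).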
Page.evaluate() and Page.querySelectorEval()
puppeteer's version of evaluate() takes a JavaScript function or a string representation of a JavaScript expression. pyppeteer takes a string representation of a JavaScript expression or function. pyppeteer tries to detect automatically whether the string is a function or an expression, but this detection sometimes fails. If an expression is erroneously treated as a function and an error is raised, try setting force_expr to True to force pyppeteer to treat the string as an expression.
Get a page's textContent:
content = await page.evaluate('document.body.textContent', force_expr=True)
Get an element's textContent:
element = await page.querySelector('h1')
title = await page.evaluate('(element) => element.textContent', element)
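The two-step pattern above can be wrapped in a small helper, and Page.querySelectorEval() offers a one-call alternative. The sketch below shows both; the function names text_of and text_of_first are hypothetical, and page is assumed to be a pyppeteer Page.

```python
from typing import Optional

# JavaScript function string, evaluated in the page with the element as argument
GET_TEXT = "(element) => element.textContent"


async def text_of(page, selector: str) -> Optional[str]:
    """Return the textContent of the first match, or None if nothing matches."""
    element = await page.querySelector(selector)
    if element is None:
        return None
    return await page.evaluate(GET_TEXT, element)


async def text_of_first(page, selector: str) -> str:
    """Same lookup in one call; querySelectorEval passes the match to the function."""
    return await page.querySelectorEval(selector, GET_TEXT)
```

The first helper checks for a missing element explicitly, while the second leaves the missing-element case to querySelectorEval itself.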