Note: this is a work-in-progress continuation of the pyppeteer project.
An unofficial Python port of puppeteer, the JavaScript (headless) Chrome/Chromium browser automation library.
Pyppeteer requires Python 3.6+ (Python 3.5 is supported experimentally).
Install with pip from PyPI:
python3 -m pip install pyppeteer
Or install the latest version from GitHub:
python3 -m pip install -U git+https://github.com/miyakogi/pyppeteer.git@dev
Note: The first time you run pyppeteer, it downloads a recent version of Chromium (~100MB). If you would rather not download it at run time, run the
pyppeteer-install
command before running scripts that use pyppeteer.
Example: open web page and take a screenshot.
import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('http://example.com')
    await page.screenshot({'path': 'example.png'})
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())
Example: evaluate script on the page.
import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('http://example.com')
    await page.screenshot({'path': 'example.png'})

    dimensions = await page.evaluate('''() => {
        return {
            width: document.documentElement.clientWidth,
            height: document.documentElement.clientHeight,
            deviceScaleFactor: window.devicePixelRatio,
        }
    }''')

    print(dimensions)
    # >>> {'width': 800, 'height': 600, 'deviceScaleFactor': 1}
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())
Pyppeteer has almost the same API as puppeteer. More APIs are listed in the documentation.
Puppeteer's documentation and troubleshooting guide are also useful for pyppeteer users.
Pyppeteer aims to be as similar to puppeteer as possible, but some differences between Python and JavaScript make this difficult.
The differences between puppeteer and pyppeteer are listed below.
Puppeteer uses an object (a dictionary in Python) to pass options to functions and methods. Pyppeteer accepts both dictionaries and keyword arguments for options.
Dictionary-style options (similar to puppeteer):
browser = await launch({'headless': True})
Keyword-argument-style options (more Pythonic, isn't it?):
browser = await launch(headless=True)
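One way a function can accept both calling styles is to merge an optional dictionary with its keyword arguments. The following is a minimal sketch of that idea, not pyppeteer's implementation: `merge_options` and `fake_launch` are hypothetical names, and the rule that keyword arguments win on a conflict is an assumption of this sketch.

```python
from typing import Any, Dict, Optional


def merge_options(options: Optional[Dict[str, Any]] = None,
                  **kwargs: Any) -> Dict[str, Any]:
    """Combine a dictionary of options with keyword arguments.

    Hypothetical helper for illustration only. In this sketch, keyword
    arguments override dictionary entries that use the same key.
    """
    merged: Dict[str, Any] = dict(options or {})
    merged.update(kwargs)
    return merged


def fake_launch(options: Optional[Dict[str, Any]] = None,
                **kwargs: Any) -> Dict[str, Any]:
    """Stand-in for a launch-like function; returns the resolved options."""
    return merge_options(options, **kwargs)


dict_style = fake_launch({'headless': True})
kwarg_style = fake_launch(headless=True)
print(dict_style == kwarg_style)  # True: both styles resolve to the same options
```

With this design, callers porting code from puppeteer can keep passing a single dictionary, while new Python code can use keyword arguments.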
Element selector method name ($ -> querySelector)
In Python, $ cannot be used in a method name, so pyppeteer uses Page.querySelector(), Page.querySelectorAll(), and Page.xpath() instead of Page.$(), Page.$$(), and Page.$x(). Pyppeteer also provides shorthands for these methods: Page.J(), Page.JJ(), and Page.Jx().
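The naming constraint can be checked directly in Python, and shorthand names can be provided as aliases of the longer methods. The sketch below assumes a hypothetical `FakePage` class backed by a plain dictionary. It is synchronous and does not drive a browser, whereas the real pyppeteer methods are awaited.

```python
# '$' is not a legal character in a Python identifier, but 'J' is.
print('$'.isidentifier())  # False
print('J'.isidentifier())  # True


class FakePage:
    """Hypothetical stand-in for a page object; not pyppeteer's Page."""

    def __init__(self, elements):
        # Maps a selector string to the list of matching "elements".
        self._elements = elements

    def querySelector(self, selector):
        """Return the first match for the selector, or None."""
        matches = self._elements.get(selector, [])
        return matches[0] if matches else None

    def querySelectorAll(self, selector):
        """Return all matches for the selector as a list."""
        return list(self._elements.get(selector, []))

    # Shorthand aliases bound to the same functions as the long names.
    J = querySelector
    JJ = querySelectorAll


page = FakePage({'h1': ['<h1>Title</h1>'], 'p': ['<p>a</p>', '<p>b</p>']})
print(page.J('h1') == page.querySelector('h1'))  # True
```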
Page.evaluate() and Page.querySelectorEval()
Puppeteer's version of evaluate() takes a raw JavaScript function or a string containing a JavaScript expression, but pyppeteer takes only a string of JavaScript. The string can be either a function or an expression. Pyppeteer tries to detect automatically whether the string is a function or an expression, but this detection sometimes fails. If an expression string is treated as a function and an error is raised, add the force_expr=True option, which forces pyppeteer to treat the string as an expression.
Example to get page content:
content = await page.evaluate('document.body.textContent', force_expr=True)
Example to get an element's inner text:
element = await page.querySelector('h1')
title = await page.evaluate('(element) => element.textContent', element)
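To see why automatic detection of functions versus expressions can fail, consider a naive text-based heuristic. The sketch below is an assumption for illustration and is not presented as pyppeteer's actual detection logic. `looks_like_function` is a hypothetical name.

```python
def looks_like_function(source: str) -> bool:
    """Naively guess whether a JavaScript string is a function.

    Hypothetical heuristic for illustration only: treat the string as a
    function if it starts with 'function' or 'async ', or if it contains
    an arrow ('=>') anywhere.
    """
    text = source.strip()
    if text.startswith(('function', 'async ')):
        return True
    return '=>' in text


# An arrow function is recognized as a function.
print(looks_like_function('() => document.title'))        # True
# A plain expression is recognized as an expression.
print(looks_like_function('document.body.textContent'))   # False
# An expression that merely contains an arrow is misclassified.
print(looks_like_function('[1, 2, 3].map(x => x * 2)'))   # True
```

The last case is an expression, yet a text-based check flags it as a function. An explicit override such as force_expr=True covers this kind of ambiguity.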
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.