pywebcapture

A package that allows users to capture full-page screenshots of websites using Selenium and Chrome webdriver.

  • 0.0.3
  • PyPI

Tested with Python version 3.8.3

Installation

  1. Download the latest version of Chrome webdriver
  2. Add the Chrome webdriver path to your system PATH (it's also possible to pass the absolute path of your driver to the Driver instance)
  3. Run pip install pywebcapture
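
Before running the package, it can help to confirm that step 2 worked. The following is a minimal sketch using only the Python standard library; the helper name find_chromedriver is hypothetical and is not part of pywebcapture:

```python
import shutil


def find_chromedriver(name="chromedriver"):
    """Return the full path of the driver executable found on PATH, or None."""
    # shutil.which searches the directories listed in the PATH environment variable.
    return shutil.which(name)


if __name__ == "__main__":
    path = find_chromedriver()
    if path is None:
        print("chromedriver was not found on PATH; pass its absolute path to Driver instead")
    else:
        print(f"chromedriver found at {path}")
```

If the lookup returns None, either fix your PATH or pass the absolute driver path when creating the Driver instance.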

Basic Usage

Import the modules:

from pywebcapture import loader, driver

Use the CSVLoader to load your CSV file containing the URLs and optional file names:

Options:

  • input_filepath - The absolute path to your CSV file (str)
  • has_header - Whether your CSV has a header row or not (bool)
  • uri_column - The column that contains the URIs, given either as the column name (str) or the index position (int)
  • filename_column - The column that contains the desired file names (str); if set to None, the driver uses the URI's netloc as the file name

csv_file = loader.CSVLoader("example.csv", True, 3, None)

Call the get_uri_dict() method on the CSVLoader instance. This parses the CSV into a Python dictionary:

uri_dict = csv_file.get_uri_dict()
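
The exact shape of the returned dictionary is not documented here. As a rough illustration of the idea only, the sketch below builds a mapping with the standard library and falls back to the URI's netloc when no file name column is given. The function build_uri_dict and the "file name -> URI" layout are assumptions, not the package's actual implementation:

```python
import csv
import io
from urllib.parse import urlparse


def build_uri_dict(csv_text, has_header, uri_column, filename_column=None):
    """Hypothetical stand-in for CSVLoader.get_uri_dict(); maps file name -> URI."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return {}
    header = rows[0] if has_header else None
    data = rows[1:] if has_header else rows

    def index_of(column):
        # A column may be given as an index (int) or as a name (needs a header row).
        if isinstance(column, int):
            return column
        if header is None:
            raise ValueError("column names require has_header=True")
        return header.index(column)

    uri_idx = index_of(uri_column)
    name_idx = None if filename_column is None else index_of(filename_column)

    uri_dict = {}
    for row in data:
        # Skip blank or short rows that have no URI cell.
        if len(row) <= uri_idx or not row[uri_idx].strip():
            continue
        uri = row[uri_idx].strip()
        if name_idx is not None:
            name = row[name_idx].strip()
        else:
            # No file name column: use the host part of the URI, e.g. "example.com".
            name = urlparse(uri).netloc
        uri_dict[name] = uri
    return uri_dict
```

Note that in this sketch two URIs on the same host would share a netloc-derived name, so the later one would overwrite the earlier one; supplying a file name column avoids that.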

Create an instance of the web driver:

Options:

  • driver_path - The absolute path to the Chrome webdriver; if None or "chromedriver", it will attempt to search your system PATH
  • output_path - The path where you want to save screenshots (str)
  • delay - The delay in seconds between each page request; the minimum is 2 seconds, please crawl pages respectfully :)
  • uri_dict - The Python dictionary containing your file names and URIs

d = driver.Driver("path/to/chrome/webdriver", None, 3, uri_dict)

Run the driver. This loops through all URIs, gets the maximum scroll height of each page, and then takes a screenshot:

d.run()
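
For readers curious about the technique, here is a minimal sketch of a scroll-height-based full-page capture written directly against Selenium. It is not pywebcapture's source: the function names, the fixed 1920-pixel width, the headless flag, and the "file name -> URI" dictionary layout are all assumptions made for illustration:

```python
import os
import time

MIN_DELAY = 2  # seconds; mirrors the minimum delay described in the options


def capture_full_page(browser, uri, output_file, width=1920):
    """Load uri, resize the window to the page's scroll height, then save a PNG."""
    browser.get(uri)
    height = browser.execute_script("return document.body.scrollHeight")
    browser.set_window_size(width, height)
    return browser.save_screenshot(output_file)


def capture_all(browser, uri_dict, output_path=".", delay=MIN_DELAY, sleep=time.sleep):
    """Capture every URI in uri_dict (assumed file name -> URI), pausing between requests."""
    delay = max(delay, MIN_DELAY)  # never go below the polite minimum
    saved = []
    for name, uri in uri_dict.items():
        target = os.path.join(output_path, f"{name}.png")
        capture_full_page(browser, uri, target)
        saved.append(target)
        sleep(delay)
    return saved


def main():
    # Requires the selenium package and a chromedriver reachable on PATH.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    # Headless mode typically lets the window grow taller than the physical screen.
    options.add_argument("--headless")
    browser = webdriver.Chrome(options=options)
    try:
        capture_all(browser, {"example.com": "https://example.com/"})
    finally:
        browser.quit()
```

Calling main() would save example.com.png in the current directory. The browser object is passed in rather than created inside the helpers, which keeps the capture logic easy to exercise with a stand-in object.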
