Browserlify Python SDK


The Browserlify API Python library: PDF generation, screenshots, and web scraping with headless Chrome from browserlify.com.

Usage

Requirements

You should sign up as a developer on browserlify.com so that you can create and manage your API token. Signing up is free.

Installation

To easily install or upgrade to the latest release, use pip.

pip install --upgrade browserlify


Getting Started

  1. First create a new API key in the Dashboard and retrieve your API token.
  2. Then supply this token to the browserlify Option class so that it knows how to authenticate.

from browserlify import pdf, Option

opt = Option(YOUR_TOKEN)
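
If you would rather not hard-code the token, you can read it from the environment before building the Option. This is only a sketch: the variable name BROWSERLIFY_TOKEN is an arbitrary choice for illustration, not something the SDK requires.

import os

from browserlify import Option

# BROWSERLIFY_TOKEN is an arbitrary variable name used for illustration.
token = os.environ.get("BROWSERLIFY_TOKEN")
if not token:
    raise RuntimeError("Set BROWSERLIFY_TOKEN to your browserlify.com API token")

opt = Option(token)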

PDF generation

from browserlify import pdf, Option

opt = Option(YOUR_TOKEN)
opt.paper = 'A4'
opt.full_page = True
opt.wait_load = 5000  # wait up to 5,000 ms for the document to load

try:
    content = pdf('https://example.org', opt)
    with open('example.org.pdf', 'wb') as f:
        f.write(content)
except Exception as bre:
    print('pdf fail', bre)
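
If a request occasionally fails, for example on a slow network, you may want to retry before giving up. A minimal sketch that wraps the pdf() helper shown above; the retry count and delay are arbitrary choices, not SDK features.

import time

from browserlify import pdf, Option

opt = Option(YOUR_TOKEN)
opt.paper = 'A4'

def pdf_with_retries(url, opt, attempts=3, delay=2.0):
    # Retry a few times before re-raising the last error; the retry policy
    # here is illustrative, not part of the SDK.
    last_error = None
    for attempt in range(attempts):
        try:
            return pdf(url, opt)
        except Exception as exc:
            last_error = exc
            time.sleep(delay * (attempt + 1))
    raise last_error

with open('example.org.pdf', 'wb') as f:
    f.write(pdf_with_retries('https://example.org', opt))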

Screenshot

from browserlify import screenshot, Option

opt = Option(YOUR_TOKEN)
opt.full_page = True
opt.wait_load = 5000  # wait up to 5,000 ms for the document to load

try:
    content = screenshot('https://example.org', opt)
    with open('example.org.png', 'wb') as f:
        f.write(content)
except Exception as bre:
    print('screenshot fail', bre)
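
To capture several pages, the same screenshot() helper can be called in a loop. A minimal sketch using only the API shown above; the URL list and the hostname-based file names are made up for illustration.

from urllib.parse import urlparse

from browserlify import screenshot, Option

opt = Option(YOUR_TOKEN)
opt.full_page = True
opt.wait_load = 5000

# Replace with the pages you want to capture.
urls = ['https://example.org', 'https://example.com']

for url in urls:
    try:
        content = screenshot(url, opt)
        filename = urlparse(url).netloc + '.png'  # e.g. example.org.png
        with open(filename, 'wb') as f:
            f.write(content)
    except Exception as exc:
        print('screenshot fail', url, exc)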

Web Scraping

from browserlify import scrape, Option, Flow

opt = Option(YOUR_TOKEN)
opt.flows = [
    Flow(action="waitload", timeout=5000),  # wait up to 5,000 ms for the document to load
    Flow(name="title", action="text", selector="h1")
]

try:
    content = scrape('https://example.org', opt)
    print(content)
    # output:
    # {"page":1,"data":{"title":"Example Domain"}}
except Exception as bre:
    print('scrape fail', bre)
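
You can collect several fields in one request by adding more Flow entries. A minimal sketch that sticks to the waitload and text actions shown above; the second selector is an example, and the result handling assumes the {"page": ..., "data": ...} shape printed in the previous snippet.

import json

from browserlify import scrape, Option, Flow

opt = Option(YOUR_TOKEN)
opt.flows = [
    Flow(action="waitload", timeout=5000),           # wait for the document to load
    Flow(name="title", action="text", selector="h1"),
    Flow(name="body", action="text", selector="p"),  # a second extracted field (example)
]

try:
    result = scrape('https://example.org', opt)
    # The example above prints {"page":1,"data":{...}}; if the SDK returns a
    # JSON string rather than a dict, decode it first.
    if isinstance(result, str):
        result = json.loads(result)
    data = result.get("data", {})
    print(data.get("title"), data.get("body"))
except Exception as exc:
    print('scrape fail', exc)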

CLI

The scripts/browserlify CLI tool has a free token, cli_oss_free_token. It supports the following commands:

  • pdf: PDF generation
  • screenshot: take a screenshot
  • content: get website content
  • scrape: scrape website content using flows
browserlify cli tool

positional arguments:
  {pdf,screenshot,content,scrape}
                        commands help
    pdf                 pdf generation help
    screenshot          take screenshot help
    content             get content help
    scrape              web scrape help

optional arguments:
  -h, --help            show this help message and exit
  --version, -v         show program's version number and exit
Convert URL to PDF
browserlify pdf -t YOUR_TOKEN -o browserlify.com.pdf -w 5000 --paper A3 https://example.org
Take Screenshot
browserlify screenshot -t YOUR_TOKEN -o browserlify.com.png -w 5000 --fullpage https://example.org
Get Page Content
browserlify content -t YOUR_TOKEN -o browserlify.com.json -w 5000 https://example.org
Scrape Page

flows.json is created in browserlify's IDE; it lets you extract web page content without writing any code.

[
    {
        "action": "waitload",
        "timeout": 5000
    },
    {
        "action": "text",
        "name": "title",
        "selector": "title"
    }
]
browserlify scrape -t YOUR_TOKEN -o example.com.json  -f flows.json https://example.org
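
The same CLI commands can also be driven from a script, for example with Python's subprocess module. A minimal sketch that reuses only the flags shown above (-t, -o, -f); the output name and flows.json path are taken from the command above, so adjust them for your setup.

import subprocess

# Runs the scrape command shown above, reusing the free cli_oss_free_token
# mentioned in this README.
cmd = [
    "browserlify", "scrape",
    "-t", "cli_oss_free_token",
    "-o", "example.com.json",
    "-f", "flows.json",
    "https://example.org",
]
completed = subprocess.run(cmd)
print("exit code:", completed.returncode)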
