Dhalang

  • 0.7.1
  • Rubygems

Dhalang is a Ruby wrapper for Google's Puppeteer.

Features

  • Generate PDFs from webpages
  • Generate PDFs from HTML (external images/stylesheets supported)
  • Capture screenshots from webpages
  • Scrape HTML from webpages

Prerequisites

  • Node ≥ 18
  • Puppeteer ≥ 22
  • Unix shell (Dhalang will not work on Windows shells)

Installation

Add this line to your application's Gemfile:

gem 'Dhalang'

And then execute:

$ bundle update

Install puppeteer or puppeteer-core in your application's root directory:

$ npm install puppeteer 
or
$ npm install puppeteer-core

Usage

PDF of a website URL

Dhalang::PDF.get_from_url("https://www.google.com")

It is important to pass the complete URL. Leaving out https://, http:// or www. will result in an error.
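Because an incomplete URL results in an error, it can help to check the URL before handing it to Dhalang. The following is a minimal sketch using only Ruby's standard library. The helper name complete_url? is hypothetical and not part of Dhalang, and it only checks for an http/https scheme and a host, not for the www. prefix.

```ruby
require 'uri'

# Hypothetical guard (not part of Dhalang): returns true only when the
# URL parses as http:// or https:// and contains a host.
def complete_url?(url)
  uri = URI.parse(url)
  # URI::HTTPS is a subclass of URI::HTTP, so this covers both schemes.
  uri.is_a?(URI::HTTP) && !uri.host.nil? && !uri.host.empty?
rescue URI::Error
  false
end

# Usage sketch:
#   url = "https://www.google.com"
#   pdf = Dhalang::PDF.get_from_url(url) if complete_url?(url)
```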

PDF of an HTML string

Dhalang::PDF.get_from_html("<html><head></head><body><h1>examplestring</h1></body></html>") 

PNG screenshot of a website

Dhalang::Screenshot.get_from_url("https://www.google.com", :png)  

JPEG screenshot of a website

Dhalang::Screenshot.get_from_url("https://www.google.com", :jpeg)  

WEBP screenshot of a website

Dhalang::Screenshot.get_from_url("https://www.google.com", :webp)  

HTML of a website

Dhalang::Scraper.html("https://www.google.com")  

The methods above return either a string containing the PDF/JPEG/PNG/WEBP as binary data or the scraped HTML.
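Since the return value is a binary string, it should be written to disk in binary mode so no newline or encoding conversion is applied. A minimal sketch, assuming the string holds the raw file bytes; save_binary is a hypothetical helper, not part of Dhalang:

```ruby
# Hypothetical helper (not part of Dhalang): writes a binary string to
# disk unchanged and returns the number of bytes written.
def save_binary(path, data)
  File.binwrite(path, data)
end

# Usage sketch:
#   pdf = Dhalang::PDF.get_from_url("https://www.google.com")
#   save_binary("google.pdf", pdf)
```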

Custom options

To override the default options set by Dhalang, pass a hash with the custom options you want to set as the last argument.

For example to set custom margins for PDFs:

Dhalang::PDF.get_from_url("https://www.google.com", {margin: { top: 100, right: 100, bottom: 100, left: 100}})

For example to only take a screenshot of the visible part of the page:

Dhalang::Screenshot.get_from_url("https://www.google.com", :webp, {fullPage: false})

A list of all possible PDF options can be found at: https://github.com/puppeteer/puppeteer/blob/main/docs/api.md#pagepdfoptions

A list of all possible screenshot options can be found at: https://github.com/puppeteer/puppeteer/blob/main/docs/api.md#pagescreenshotoptions

The default Puppeteer options contain the options headerTemplate and footerTemplate, which Puppeteer expects to be HTML strings. By default, the Dhalang gem passes all options as arguments in a node ... shell command. If the HTML strings are too long, they may exceed the maximum argument length of the host. For example, on Linux a single argument is limited to 128 kB (MAX_ARG_STRLEN). To avoid this, you can pass the headers and footers as file paths using the options headerTemplateFile and footerTemplateFile. These non-Puppeteer options are used to populate the Puppeteer options headerTemplate and footerTemplate.

For example:

Dhalang::PDF.get_from_url("https://www.google.com", {headerTemplateFile: '/tmp/header.html', footerTemplateFile: '/tmp/footer.html'})
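When the header and footer HTML are generated at runtime, they first have to be written to files. The sketch below does this with temporary files that are removed once the block finishes. with_template_files is a hypothetical helper, not part of Dhalang; only the option names headerTemplateFile and footerTemplateFile come from the text above.

```ruby
require 'tempfile'

# Hypothetical helper (not part of Dhalang): writes the header and footer
# HTML to temporary files, yields an options hash pointing at them, and
# deletes the files afterwards. Returns the block's result.
def with_template_files(header_html, footer_html)
  header = Tempfile.new(['header', '.html'])
  footer = Tempfile.new(['footer', '.html'])
  header.write(header_html)
  footer.write(footer_html)
  # Flush so another process (the node command) sees the full content.
  header.flush
  footer.flush
  yield({ headerTemplateFile: header.path, footerTemplateFile: footer.path })
ensure
  # close! closes and unlinks; &. skips a file that was never created.
  [header, footer].each { |file| file&.close! }
end

# Usage sketch:
#   pdf = with_template_files(header_html, footer_html) do |options|
#     Dhalang::PDF.get_from_url("https://www.google.com", options)
#   end
```

Keeping the Tempfile objects referenced until the block returns matters: a Tempfile that is garbage collected is deleted, which could happen before rendering completes.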

The table below lists more configuration parameters that can be set:

Key | Description | Default
browserWebsocketUrl | WebSocket URL of the remote Chromium browser to use | None
navigationTimeout | Number of milliseconds until Puppeteer will time out when navigating to the given page | 10000
printToPDFTimeout | Number of milliseconds until Puppeteer will time out when calling Page.printToPDF | 0 (unlimited)
navigationWaitForSelector | If set, Dhalang will wait for the specified selector to appear before creating the screenshot or PDF | None
navigationWaitForXPath | If set, Dhalang will wait for the specified XPath to appear before creating the screenshot or PDF | None
userAgent | User agent to send with the request | Default Puppeteer one
isHeadless | Indicates if Chromium should be launched headless | true
isAutoHeight | When set to true, the height of generated PDFs will be based on the scrollHeight property of the document body | false
viewPort | Custom viewport to use for the request | Default Puppeteer one
httpAuthenticationCredentials | Custom HTTP authentication credentials to use for the request | None
chromeOptions | An array of options that can be passed to Puppeteer in addition to the mandatory ['--no-sandbox', '--disable-setuid-sandbox'] | []
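If several calls should share the same configuration, the options hash can be built in one place. The sketch below assumes that the configuration parameters listed above are passed in the same options hash as the Puppeteer options, which the text does not state explicitly. APP_DEFAULTS and dhalang_options are hypothetical names, and the values are illustrative.

```ruby
# Hypothetical application-wide defaults (not part of Dhalang); the keys
# come from the configuration table, the values are examples only.
APP_DEFAULTS = {
  navigationTimeout: 20_000,
  isAutoHeight: false
}.freeze

# Returns a new options hash in which per-call overrides win over the
# application defaults. The merge is shallow: nested hashes are replaced.
def dhalang_options(overrides = {})
  APP_DEFAULTS.merge(overrides)
end

# Usage sketch (assumes config keys share the options hash):
#   Dhalang::PDF.get_from_url("https://www.google.com",
#                             dhalang_options(navigationWaitForSelector: "#content"))
```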

Examples of using Dhalang

To return a PDF from a Rails controller you can do the following:

def example_controller_method
  # Render the page to a binary PDF string, then send it to the client.
  binary_pdf = Dhalang::PDF.get_from_url("https://www.google.com")
  send_data(binary_pdf, filename: 'pdfofgoogle.pdf', type: 'application/pdf')
end

To return a screenshot from a Rails controller you can do the following:

def example_controller_method
  # Capture the page as a binary PNG string, then send it to the client.
  binary_png = Dhalang::Screenshot.get_from_url("https://www.google.com", :png)
  send_data(binary_png, filename: 'screenshotofgoogle.png', type: 'image/png')
end
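A controller that serves more than one screenshot format needs a content type that matches the format symbol passed to Dhalang. A minimal sketch of such a lookup follows. SCREENSHOT_MIME_TYPES and screenshot_mime_type are hypothetical names, not part of Dhalang; the MIME types are the standard ones for PNG, JPEG and WEBP.

```ruby
# Hypothetical lookup (not part of Dhalang): maps the screenshot format
# symbols used above to their standard MIME types.
SCREENSHOT_MIME_TYPES = {
  png: 'image/png',
  jpeg: 'image/jpeg',
  webp: 'image/webp'
}.freeze

# Accepts a symbol or string; raises ArgumentError for unknown formats.
def screenshot_mime_type(format)
  SCREENSHOT_MIME_TYPES.fetch(format.to_sym) do
    raise ArgumentError, "unsupported screenshot format: #{format}"
  end
end

# Usage sketch inside a Rails controller action:
#   format = :webp
#   image = Dhalang::Screenshot.get_from_url("https://www.google.com", format)
#   send_data(image, filename: "screenshot.#{format}", type: screenshot_mime_type(format))
```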

Package last updated on 11 Aug 2024
