readable-proxy

Node service attempting to fetch readable contents from any URL.

Latest version on npm: 1.6.1
Maintainers: 1

Proxy server to retrieve a readable version of any provided url, powered by Node, PhantomJS and Readability.js.

Installation

$ git clone https://github.com/n1k0/readable-proxy
$ cd readable-proxy
$ npm install

Run

Start the server on localhost:3000:

$ npm start

Note about CORS: by design, the server will allow any origin to access it, so browsers can consume it from pages hosted on a different domain.

Configuration

By default, the proxy server will use the Readability.js version it ships with; to override this, you can set the READABILITY_LIB_PATH environment variable to the absolute path to the library file on your local system:

$ READABILITY_LIB_PATH=/path/to/my/own/version/of/Readability.js npm start
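
As a rough illustration of how such an override might be resolved at startup, here is a minimal sketch. It is not the package's actual code: the function name resolveReadabilityPath and the default path are hypothetical, and rejecting relative paths is an assumption based on the "absolute path" wording above.

```javascript
const path = require("path");

// Hypothetical default location; the path of the bundled library is not documented here.
const DEFAULT_LIB_PATH = path.resolve("vendor", "Readability.js");

// Return the Readability.js path to load: the env override when set, else the default.
function resolveReadabilityPath(env, defaultPath) {
  const override = env.READABILITY_LIB_PATH;
  if (!override) {
    return defaultPath;
  }
  // Assumption: a relative override is treated as a configuration error.
  if (!path.isAbsolute(override)) {
    throw new Error("READABILITY_LIB_PATH must be an absolute path: " + override);
  }
  return override;
}

// With no override set, the default path is used.
console.log(resolveReadabilityPath({}, DEFAULT_LIB_PATH) === DEFAULT_LIB_PATH); // true
```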

Usage

Web UI

Just head to http://localhost:3000/, enter some URL and start enjoying both original and readable renderings side by side.

REST/JSON API

The HTTP REST API is available under /api.

Disclaimer: this is probably far from a truly RESTful implementation.

GET /api/get

Required parameters
  • url: The URL to retrieve readable contents from, e.g. https://nicolas.perriault.net/code/2013/get-your-frontend-javascript-code-covered/.
Optional parameters
  • sanitize: A boolean string to enable HTML sanitization (valid truthy boolean strings: "1", "on", "true", "yes", "y"; everything else is considered falsy).
  • userAgent: A custom User Agent string. By default, it will use the PhantomJS one.

Note: Enabling content sanitization strips the HTML semantics specific to Readability.js, but is probably safer for users if you plan to publish the retrieved contents on a public website.
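
The truthy-string rule described for the sanitize parameter can be sketched as follows. This is an illustration rather than the server's actual parsing code: the helper name isTruthyParam is hypothetical, and exact, case-sensitive matching is an assumption.

```javascript
// Truthy strings listed in the sanitize parameter description.
const TRUTHY_STRINGS = new Set(["1", "on", "true", "yes", "y"]);

// Assumption: matching is exact and case-sensitive; anything else is falsy.
function isTruthyParam(value) {
  return typeof value === "string" && TRUTHY_STRINGS.has(value);
}

console.log(isTruthyParam("y"));       // true
console.log(isTruthyParam("false"));   // false
console.log(isTruthyParam(undefined)); // false (parameter omitted)
```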

Example

Content sanitization enabled:

$ curl "http://0.0.0.0:3000/api/get?sanitize=y&url=https://nicolas.perriault.net/code/2013/get-your-frontend-javascript-code-covered/"
{
  "byline":"Nicolas Perriault —",
  "content":"<p><strong>So finally you&#39;re <a href=\"https://nicolas.perriault.net/code/2013/testing-frontend-javascript-code-using-mocha-chai-and-sinon/\">testing",
  "length":2867,
  "title":"Get your Frontend JavaScript Code Covered | Code",
  "uri":"https://nicolas.perriault.net/code/2013/get-your-frontend-javascript-code-covered/",
  "isProbablyReaderable": true
}

Content sanitization disabled (default):

$ curl "http://0.0.0.0:3000/api/get?url=https://nicolas.perriault.net/code/2013/get-your-frontend-javascript-code-covered/"
{
  "byline":"Nicolas Perriault —",
  "content":"<div id=\"readability-page-1\" class=\"page\"><section class=\"\">\n<p><strong>So finally you're…",
  "length":3851,
  "title":"Get your Frontend JavaScript Code Covered | Code",
  "uri":"https://nicolas.perriault.net/code/2013/get-your-frontend-javascript-code-covered/",
  "isProbablyReaderable": true
}

Note: the isProbablyReaderable property indicates whether Readability determined that the page contents were parseable.
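
For JavaScript clients, the same request can be built and consumed programmatically. The sketch below is one possible approach, not part of the package: the helper names buildApiUrl and fetchReadable are hypothetical, and a global fetch (a browser or a recent Node version) is assumed.

```javascript
// Build a GET /api/get URL with properly encoded query parameters.
function buildApiUrl(baseUrl, targetUrl, options = {}) {
  const apiUrl = new URL("/api/get", baseUrl);
  apiUrl.searchParams.set("url", targetUrl);
  if (options.sanitize) {
    apiUrl.searchParams.set("sanitize", "yes"); // one of the documented truthy strings
  }
  if (options.userAgent) {
    apiUrl.searchParams.set("userAgent", options.userAgent);
  }
  return apiUrl.toString();
}

// Fetch the readable version of a page and warn when Readability is unsure about it.
// Assumption: a global fetch is available; the server's error format is not documented,
// so only the HTTP status is checked.
async function fetchReadable(baseUrl, targetUrl, options) {
  const response = await fetch(buildApiUrl(baseUrl, targetUrl, options));
  if (!response.ok) {
    throw new Error("Request failed with status " + response.status);
  }
  const result = await response.json();
  if (!result.isProbablyReaderable) {
    console.warn("Page may not have been parsed reliably:", result.uri);
  }
  return result;
}

console.log(buildApiUrl("http://localhost:3000", "https://example.com/post", { sanitize: true }));
```

Using URL and searchParams avoids the shell and query-string escaping pitfalls of hand-built URLs.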

Usage from node

scrape() function

The scrape function scrapes a URL and returns a Promise with the JSON result object described above:

var scrape = require("readable-proxy").scrape;
var url = "https://nicolas.perriault.net/code/2013/get-your-frontend-javascript-code-covered/";

// scrape() returns a Promise for the JSON result object.
scrape(url, {sanitize: true, userAgent: "My custom User-Agent string"})
  .then(console.log.bind(console))     // print the result on success
  .catch(console.error.bind(console)); // print the error on failure
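
When several pages need scraping, the Promise returned by scrape can be combined with Promise.allSettled so that one failure does not discard the other results. The sketch below is an assumption-laden illustration: scrapeAll is a hypothetical helper, and scrape is passed in as an argument so the snippet does not depend on the package being installed. Running many scrapes concurrently may be costly, since each one presumably drives PhantomJS.

```javascript
// Scrape several URLs concurrently and summarize each outcome.
// `scrape` is expected to behave like require("readable-proxy").scrape:
// it takes (url, options) and returns a Promise for the result object.
async function scrapeAll(scrape, urls, options) {
  const settled = await Promise.allSettled(urls.map((url) => scrape(url, options)));
  return settled.map((outcome, i) => {
    if (outcome.status === "fulfilled") {
      return { url: urls[i], ok: true, title: outcome.value.title, length: outcome.value.length };
    }
    const reason = outcome.reason;
    return { url: urls[i], ok: false, error: String((reason && reason.message) || reason) };
  });
}

// Possible usage with the real package (not executed here):
//   var scrape = require("readable-proxy").scrape;
//   scrapeAll(scrape, [urlA, urlB], {sanitize: true}).then(console.log);
```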

Tests

$ npm test

License

MPL 2.0.

Keywords

readable

Package last updated on 08 Jan 2017
