url-reader

Convert your URLs to JSON/Markdown/Text format.

Version: 1.0.2 (latest), published on npm
Maintainers: 1
Weekly downloads: 1

URL READER

This project reads the content of one or more URLs and returns each page's title, length, HTML, text, Markdown, and excerpt.

Requires Node.js 20.11.0 or later:

"node": ">=20.11.0"

Installation

yarn add url-reader
# or npm install url-reader

Usage

import URLReader from 'url-reader';

const reader = new URLReader();
await reader.init();

const results = await reader.read({
  urls: ['https://www.google.com'],
  timeout: 10000, // ms, default: 60000
  enableMarkdown: false, // default: true
  runScripts: 'dangerously', // execute scripts embedded in the HTML and fetch remote resources; disabled by default
});
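
The defaults mentioned in the comments above can be collected in a small helper. This is only a sketch: `ReadOptions` and `resolveReadOptions` are hypothetical names that are not part of url-reader, and the option shape is inferred from the usage example rather than taken from the package's type definitions.

```typescript
// Hypothetical option shape, inferred from the usage comments above.
// The package's real type names and fields may differ.
interface ReadOptions {
  urls: string[];
  timeout?: number; // milliseconds
  enableMarkdown?: boolean;
  runScripts?: 'dangerously'; // when omitted, scripts are not executed
}

// Fill in the defaults stated in the usage comments
// (timeout 60000 ms, Markdown enabled) and reject obviously bad input
// before handing the options to the reader.
function resolveReadOptions(options: ReadOptions): ReadOptions {
  if (options.urls.length === 0) {
    throw new Error('urls must contain at least one URL');
  }
  for (const url of options.urls) {
    new URL(url); // throws a TypeError if the URL is malformed
  }
  return {
    ...options,
    timeout: options.timeout ?? 60000,
    enableMarkdown: options.enableMarkdown ?? true,
  };
}
```

The resolved object could then be passed to `reader.read(...)`, assuming that method accepts exactly the fields shown in the usage example.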

Parsed Result:

interface IReaderResult {
  title: string;
  length: number;
  html: string;
  text: string;
  markdown?: string;
  excerpt: string;
}
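
A minimal sketch of consuming results of this shape. It assumes that `reader.read(...)` resolves to an array of `IReaderResult` objects, which the README does not state explicitly. `preferredBody` and `summarize` are hypothetical helpers, not part of the package.

```typescript
// Local copy of the result shape documented above.
interface IReaderResult {
  title: string;
  length: number;
  html: string;
  text: string;
  markdown?: string;
  excerpt: string;
}

// markdown is optional (for example when enableMarkdown is false),
// so fall back to the plain text when it is absent.
function preferredBody(result: IReaderResult): string {
  return result.markdown ?? result.text;
}

// Build one short summary line per parsed page.
function summarize(results: IReaderResult[]): string[] {
  return results.map((r) => `${r.title} [length=${r.length}]: ${r.excerpt}`);
}
```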

Server

  • start server
git clone https://github.com/yokingma/url-reader.git
cd url-reader

# default listen on port 3030
yarn install && yarn run start
  • api
GET /reader?url=https://www.google.com

POST /reader
Body:
{
  "urls": ["https://www.google.com", "https://www.bing.com"]
}
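
The two endpoints above could be called from a client roughly as follows. This is a sketch under assumptions: the server is reachable at `http://localhost:3030`, it accepts a JSON request body, and it replies with JSON. The README does not document the request headers or the response format, and `buildGetUrl`, `buildPostRequest`, and `readMany` are hypothetical helper names.

```typescript
// Assumed base URL: the server started as described above, on its default port.
const BASE_URL = 'http://localhost:3030';

// Build the GET endpoint. The target URL is percent-encoded so that its own
// ':' and '/' characters do not get confused with the request URL.
function buildGetUrl(target: string, base: string = BASE_URL): string {
  const endpoint = new URL('/reader', base);
  endpoint.searchParams.set('url', target);
  return endpoint.toString();
}

// Build the POST request options. The JSON content type is an assumption.
function buildPostRequest(urls: string[]): {
  method: string;
  headers: Record<string, string>;
  body: string;
} {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ urls }),
  };
}

// Send several URLs in one request. The response is typed as unknown
// because its shape is not documented.
async function readMany(urls: string[], base: string = BASE_URL): Promise<unknown> {
  const response = await fetch(new URL('/reader', base), buildPostRequest(urls));
  if (!response.ok) {
    throw new Error(`Reader request failed with status ${response.status}`);
  }
  return response.json();
}
```

Using `URL` and `searchParams` rather than string concatenation keeps the `url` query parameter correctly encoded even when the target URL has its own query string.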

Docker

docker build -t urlreader . # urlreader is your image's tag name

The service will listen on port 3030.

Tips

  • Puppeteer: when you install Puppeteer, it automatically downloads a recent version of Chrome for Testing (~170MB macOS, ~282MB Linux, ~280MB Windows) and a chrome-headless-shell binary.

Troubleshooting

  • Install error with Puppeteer
Error [ERR_TLS_CERT_ALTNAME_INVALID]: Hostname/IP does not match certificate's altnames...

Remove the .npmrc file and reinstall.

Keywords

puppeteer

Package last updated on 02 May 2024
