page-text-parser


A function that returns an array of text content for a webpage given a jQuery-like selector and a URL.

Latest version: 1.1.6 (npm). Maintainers: 1.

TypeScript Usage:

import { pageTextParser } from 'page-text-parser';

async function run() {
    const texts = await pageTextParser('https://google.com', 'a');

    // Prints the text content of every anchor tag on google.com.
    // If nothing was found, or the page could not be retrieved, texts is an empty array.
    texts.forEach(text => {
        console.log(text);
    });
}

run();
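
Because an empty array covers both "no match" and "retrieval failed", callers that need at least one result may want to fail loudly rather than silently continue. The following is a minimal sketch of that idea; `requireTexts` is a hypothetical helper and is not part of page-text-parser, and the sample array stands in for a real `pageTextParser` result.

```typescript
// Hypothetical helper, NOT part of page-text-parser.
// Throws when the result array is empty, since the package reports
// both "nothing matched" and "page could not be retrieved" that way.
function requireTexts<T>(texts: T[], url: string, selector: string): T[] {
    if (texts.length === 0) {
        throw new Error(
            `No results for selector "${selector}" at ${url}: ` +
            'either nothing matched or the page could not be retrieved'
        );
    }
    return texts;
}

// Stand-in values used here in place of a real pageTextParser call.
const sampleTexts = requireTexts(['About', 'Store'], 'https://example.com', 'a');
console.log(sampleTexts.length);
```

In real use, the first argument would be the awaited result of `pageTextParser`; the helper cannot tell the two failure causes apart, it only makes the empty case visible.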

Extended usage, with an optional attribute name whose value is returned alongside the text:

import { pageTextParser } from 'page-text-parser';

async function run() {
    const texts = await pageTextParser('https://google.com', 'a', 'href');

    // Prints one object per anchor tag on google.com, with text: and attributeValue: keys
    // holding the tag's text content and its href value.
    // If nothing was found, or the page could not be retrieved, texts is an empty array.
    texts.forEach(text => {
        console.log(JSON.stringify(text));
    });
}

run();
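
The two call forms above return differently shaped items: plain strings, or objects with `text` and `attributeValue` keys. As a sketch only, code that handles both can narrow the shape with a type guard. The names `TextWithAttribute`, `ParsedItem`, `isTextWithAttribute` and `formatItem` are hypothetical, and treating `attributeValue` as always being a string is an assumption; the package's own typings may differ.

```typescript
// Hypothetical types, NOT exported by page-text-parser.
// Assumption: attributeValue is always a string when present.
interface TextWithAttribute {
    text: string;
    attributeValue: string;
}

type ParsedItem = string | TextWithAttribute;

// Type guard: true for the object form returned when an attribute name is passed.
function isTextWithAttribute(item: ParsedItem): item is TextWithAttribute {
    return typeof item === 'object' && item !== null && 'attributeValue' in item;
}

// Renders either result shape as a single line of text.
function formatItem(item: ParsedItem): string {
    return isTextWithAttribute(item)
        ? `${item.text} -> ${item.attributeValue}`
        : item;
}

// Stand-in data in place of real pageTextParser results.
const sampleItems: ParsedItem[] = ['About', { text: 'Store', attributeValue: '/store' }];
sampleItems.forEach(item => console.log(formatItem(item)));
```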

Why is the selector only 'jQuery-like'?

I am using Cheerio, which is a Node implementation of core jQuery, and its documentation says this about the selector:

Like jQuery, it’s the primary method for selecting elements in the document, but unlike jQuery it’s built on top of the CSSSelect library, which implements most of the Sizzle selectors.

Read the Cheerio docs for more detailed information on how the selector syntax differs from actual jQuery.

Keywords

cheerio

Package last updated on 25 Jun 2020
