scraped-core

Serves to scrape the core of the website.

Version: 1.0.0 (latest)
Source: npm
Weekly downloads: 0
Maintainers: 1

:cloud: Installation

# Using npm
npm install --save scraped-core

# Using yarn
yarn add scraped-core

:clipboard: Example

const scrapedCore = require("scraped-core")

// Synchronous call: pass the HTML string and the scraping schema
const data = scrapedCore(`
    <h1 class='heading'>Hello World</h1>
    <img src="test.jpg" />
    <p>Test</p>
    <ul>
        <li>1</li>
        <li>2</li>
        <li>3</li>
        <li>4</li>
        <li>5</li>
        <li><i>6</i></li>
    </ul>
`, {
    title: ".heading"
  , desc: "p"
  , avatar: {
        selector: "img"
      , attr: "src"
    }
  , items: {
        listItem: "ul > li"
      , data: {
            content: {
                how: "text"
            }
        }
    }
})

console.log(data)
// { title: 'Hello World',
//   desc: 'Test',
//   avatar: 'test.jpg',
//   items:
//    [ { content: '1' },
//      { content: '2' },
//      { content: '3' },
//      { content: '4' },
//      { content: '5' },
//      { content: '6' } ] }

:question: Get Help

There are a few ways to get help:

  • Please post questions on Stack Overflow. You can open issues with questions, as long as you add a link to your Stack Overflow question.
  • For bug reports and feature requests, open issues. :bug:

:memo: Documentation

scrapedCore($, opts)

Scrapes the data in the provided element.

For the format of the selector, please refer to the Selectors section of the Cheerio library documentation.

Params

  • Cheerio|String $: The input element or the HTML as a string.

  • Object opts: An object containing the scraping information. If you want to scrape a list, you have to use the listItem selector:

    • listItem (String): The list item selector.
    • data (Object): The fields to include in the list objects:
      • <fieldName> (Object|String): The selector or an object containing:
        • selector (String): The selector.
        • convert (Function): An optional function to change the value.
        • how (Function|String): A function or function name to access the value.
        • attr (String): If provided, the value will be taken based on the attribute name.
        • trim (Boolean): If false, the value will not be trimmed (default: true).
        • closest (String): If provided, returns the first ancestor of the given element that matches this selector.
        • eq (Number): If provided, selects the nth matched element.
        • texteq (Number): If provided, it will select the nth direct text child. Deep text child selection is not possible yet. Overwrites the how key.
        • listItem (Object): An object, keeping the recursive schema of the listItem object. This can be used to create nested lists.

    Example:

    {
       articles: {
           listItem: ".article"
         , data: {
               createdAt: {
                   selector: ".date"
                 , convert: x => new Date(x)
               }
             , title: "a.article-title"
             , tags: {
                   listItem: ".tags > span"
               }
             , content: {
                   selector: ".article-content"
                 , how: "html"
               }
             , traverseOtherNode: {
                   selector: ".upperNode"
                 , closest: "div"
                 , convert: x => x.length
               }
           }
       }
    }
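
    The `trim` and `convert` options in the schema above both act on the raw value pulled from an element. As a rough illustration only, the sketch below applies them to plain strings without any HTML parsing. `applyFieldOptions` is a hypothetical helper, not part of scraped-core, and the order (trim first, then convert) is an assumption that this README does not confirm.

```javascript
// Hypothetical helper: NOT part of scraped-core. It mimics how the documented
// `trim` and `convert` options could be applied to a raw extracted string.
// Assumption: trimming happens before `convert` runs.
function applyFieldOptions(rawValue, fieldOpts = {}) {
    let value = rawValue

    // `trim` defaults to true; only an explicit `false` disables it.
    if (fieldOpts.trim !== false && typeof value === "string") {
        value = value.trim()
    }

    // `convert` is optional and receives the (possibly trimmed) value.
    if (typeof fieldOpts.convert === "function") {
        value = fieldOpts.convert(value)
    }

    return value
}

// Converters like the ones in the schema above, applied to sample raw values.
const createdAt = applyFieldOptions("  2023-10-29  ", { convert: x => new Date(x) })
const untrimmed = applyFieldOptions("  Test  ", { trim: false })
const length = applyFieldOptions("  abc  ", { convert: x => x.length })

console.log(createdAt.toISOString()) // 2023-10-29T00:00:00.000Z
console.log(JSON.stringify(untrimmed)) // "  Test  "
console.log(length) // 3
```

    Keeping the converter a pure function of the extracted string, as here, makes it easy to check on sample values before running a real scrape.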
    

    If you want to collect specific data from the page, just use the same schema used for the data field.

    Example:

    {
         title: ".header h1"
       , desc: ".header h2"
       , avatar: {
             selector: ".header img"
           , attr: "src"
         }
    }
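
    In both schemas, a field can be a bare selector string or an object with a `selector` key. One way to read this, sketched below under assumptions, is that the string form is shorthand for `{ selector: "..." }`. `normalizeField` is a hypothetical helper for illustration and is not exported by scraped-core.

```javascript
// Hypothetical helper: NOT part of scraped-core. Treats a bare selector string
// as shorthand for the object form described in the Params list.
function normalizeField(field) {
    return typeof field === "string" ? { selector: field } : { ...field }
}

const schema = {
    title: ".header h1"
  , desc: ".header h2"
  , avatar: {
        selector: ".header img"
      , attr: "src"
    }
}

// Expand every field of the schema into its object form.
const normalized = Object.fromEntries(
    Object.entries(schema).map(([name, field]) => [name, normalizeField(field)])
)

console.log(normalized.title)  // { selector: '.header h1' }
console.log(normalized.avatar) // { selector: '.header img', attr: 'src' }
```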
    

Return

  • Object The scraped data.

:yum: How to contribute

Have an idea? Found a bug? See how to contribute.

:sparkling_heart: Support my projects

I open-source almost everything I can, and I try to reply to everyone who needs help using these projects. Obviously, this takes time. You can integrate and use these projects in your applications for free! You can even change the source code and redistribute it (or even resell it).

Thanks!

License

MIT © KazeDevID

Keywords

scraped

FAQs

Package last updated on 29 Oct 2023
