link-scraper

This npm package provides an easy way to scrape the links from a page into an array. It can also return only the internal links or only the external links of a page.

About:

Link Scraper aims to support every type of link scraping in the future to help developers.

Installation:

  • To install and save the dependency in package.json

      npm install --save link-scraper
    
  • To install without saving

      npm install link-scraper
    

Usage:

Initialization:
    var linkScraper = require("link-scraper")
  • Parameters:

    1. url
    2. callback function (optional)
  • Results: array of objects whose keys are the link text (blank if the link has none) and whose values are the URLs

    linkScraper.getAllLinks(url, [callback])

Example:

linkScraper.getAllLinks("https://www.npmjs.com/", function(urls){
    //write your code here
})
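
The callback receives the array described above. A minimal sketch of flattening that array into { text, url } pairs, assuming each object maps link text to URL; `toLinkList` is a hypothetical helper, not part of link-scraper, and the sample array only stands in for a real result:

```javascript
// Hypothetical helper (not part of link-scraper): flattens the documented
// result shape into { text, url } pairs.
function toLinkList(urls) {
    var list = [];
    urls.forEach(function (entry) {
        // Each entry is assumed to map link text to URL.
        Object.keys(entry).forEach(function (text) {
            list.push({ text: text, url: entry[text] });
        });
    });
    return list;
}

// Sample data standing in for what the callback would receive.
var sample = [
    { "npm": "https://www.npmjs.com/" },
    { "": "https://www.npmjs.com/npm/enterprise" }
];
var links = toLinkList(sample);
console.log(links);
```

Inside a real callback, the same helper could be applied as `toLinkList(urls)`.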
  • Parameters:

    1. urls (array of objects whose values are links)
  • Results: array of objects whose keys are the link text (blank if the link has none) and whose values are the URLs (the input array with the same-page links removed)

    linkScraper.removeSamePageLinks(urls)

Example:

var urls = [
    {"test1": "https://www.npmjs.com/"},
    {"test2": "https://www.npmjs.com/npm/enterprise"},
    {"test3": "https://www.npmjs.com/#test"}
];
console.log(linkScraper.removeSamePageLinks(urls, "https://www.npmjs.com"))

Output:

[{"test1": "https://www.npmjs.com/"},
{"test2": "https://www.npmjs.com/npm/enterprise"}]
    
  • Parameters:

    1. url
    2. callback function (optional)
  • Results: array of objects whose keys are the link text (blank if the link has none) and whose values are the URLs (all links on the page, with same-page links removed)

    linkScraper.getAllLinksExcludeSamePage(url, [callback])

Example:

linkScraper.getAllLinksExcludeSamePage("https://www.npmjs.com/", function (urls) {
    //your code here
});
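
Same-page filtering, as offered by removeSamePageLinks and getAllLinksExcludeSamePage, can be pictured roughly as follows. This is only a sketch under assumptions, not the package's own code: it treats a link as "same page" when it is the page URL plus a `#fragment`, which matches the removeSamePageLinks example output. `dropSamePageLinks` is a hypothetical name, and the sample data is modeled on the values in that example.

```javascript
// Hypothetical sketch (not link-scraper's own code): drops entries whose URL
// is the page itself plus a "#fragment".
function dropSamePageLinks(urls, pageUrl) {
    var page = new URL(pageUrl);
    return urls.filter(function (entry) {
        return Object.keys(entry).every(function (text) {
            var link = new URL(entry[text], page); // resolves relative hrefs
            var hadFragment = link.hash !== "";
            link.hash = "";
            // Keep the entry unless it is a fragment link to this page.
            return !(hadFragment && link.href === page.href);
        });
    });
}

var sampleUrls = [
    { "test1": "https://www.npmjs.com/" },
    { "test2": "https://www.npmjs.com/npm/enterprise" },
    { "test3": "https://www.npmjs.com/#test" }
];
var kept = dropSamePageLinks(sampleUrls, "https://www.npmjs.com");
console.log(JSON.stringify(kept));
```

The sketch relies on the global `URL` class available in current Node.js versions; it does not guard against strings that `URL` cannot parse.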
  • Parameters:

    1. url
    2. callback function (optional)
  • Results: array of objects whose keys are the link text (blank if the link has none) and whose values are the URLs of all internal links on the requested page

    linkScraper.getInternalLinks(url, [callback])

Example:

linkScraper.getInternalLinks("https://www.npmjs.com/", function (urls) {
    //your code here
});
  • Parameters:

    1. url
    2. callback function (optional)
  • Results: array of objects whose keys are the link text (blank if the link has none) and whose values are the URLs of all external links on the requested page

    linkScraper.getExternalLinks(url, [callback])

Example:

linkScraper.getExternalLinks("https://www.npmjs.com/", function (urls) {
    //your code here
});
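
To illustrate the difference between internal and external links, here is a minimal sketch that splits a link array by host. It assumes "internal" means "same host as the page", which the package may define differently; `splitByHost` and the sample data are hypothetical and not part of link-scraper.

```javascript
// Hypothetical sketch (not link-scraper's own code): a link is treated as
// internal when its host matches the page's host.
function splitByHost(urls, pageUrl) {
    var page = new URL(pageUrl);
    var result = { internal: [], external: [] };
    urls.forEach(function (entry) {
        Object.keys(entry).forEach(function (text) {
            var link = new URL(entry[text], page); // resolves relative hrefs
            var bucket = link.host === page.host ? "internal" : "external";
            var item = {};
            item[text] = link.href;
            result[bucket].push(item);
        });
    });
    return result;
}

var pageLinks = [
    { "Enterprise": "/npm/enterprise" },
    { "GitHub": "https://github.com/JDchauhan/link-scraper" }
];
var split = splitByHost(pageLinks, "https://www.npmjs.com/");
console.log(split.internal, split.external);
```

Comparing hosts is a deliberately simple rule; subdomains of the same site would count as external under it.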

Dependencies:

  • request
  • cheerio

Core Contributors:

  • Jagdish Singh

Please submit bug reports to https://github.com/JDchauhan/link-scraper.

Pull requests are welcome.

Last updated on 06 Aug 2018
