New Research: Supply Chain Attack on Axios Pulls Malicious Dependency from npm.Details →
Socket
Book a DemoSign in
Socket

html-urls

Package Overview
Dependencies
Maintainers
1
Versions
103
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

html-urls

Get all links from a HTML markup

latest
Source
npmnpm
Version
2.4.66
Version published
Maintainers
1
Created
Source

html-urls

Last version Coverage Status NPM Status

Get all URLs from a HTML markup. It's based on W3C link checker.

Install

$ npm install html-urls --save

Usage

const got = require('got')
const htmlUrls = require('html-urls')

;(async () => {
  const url = process.argv[2]
  if (!url) throw new TypeError('Need to provide an url as first argument.')
  const { body: html } = await got(url)
  const links = htmlUrls({ html, url })

  links.forEach(({ url }) => console.log(url))

  // => [
  //   'https://microlink.io/component---src-layouts-index-js-86b5f94dfa48cb04ae41.js',
  //   'https://microlink.io/component---src-pages-index-js-a302027ab59365471b7d.js',
  //   'https://microlink.io/path---index-709b6cf5b986a710cc3a.js',
  //   'https://microlink.io/app-8b4269e1fadd08e6ea1e.js',
  //   'https://microlink.io/commons-8b286eac293678e1c98c.js',
  //   'https://microlink.io',
  //   ...
  // ]
})()

It returns the following structure per every value detect on the HTML markup:

value

Type: <string>

The original value.

url

Type: <string|undefined>

The normalized URL, if the value can be considered an URL.

uri

Type: <string|undefined>

The normalized value as URI.


See examples for more!

API

htmlUrls([options])

options

html

Type: string
Default: ''

The HTML markup.

url

Type: string
Default: ''

The URL associated with the HTML markup.

It is used for resolve relative links that can be present in the HTML markup.

whitelist

Type: array
Default: []

A list of links to be excluded from the final output. It supports regex patterns.

See matcher for know more.

removeDuplicates

Type: boolean
Default: true

Remove duplicated links detected over all the HTML tags.

  • xml-urls – Get all urls from a Feed/Atom/RSS/Sitemap xml markup.
  • css-urls – Get all URLs referenced from stylesheet files.

License

html-urls © Kiko Beats, released under the MIT License.
Authored and maintained by Kiko Beats with help from contributors.

kikobeats.com · GitHub @Kiko Beats · X @Kikobeats

Keywords

href

FAQs

Package last updated on 23 Jun 2025

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts