github.com/rhodeon/html-link-parser


v1.0.0
Go

A Go library for parsing HTML hyperlink elements into native Go objects.

Useful for creating sitemap generators, web crawlers and web scrapers.

Install

go get -u github.com/rhodeon/html-link-parser

Usage

A Link struct represents an HTML hyperlink element:

type Link struct {
	Url  string // href
	Text string // content
}

To parse HTML from a reader into a list of Link objects:

html := `<html>
<body>
  <h1>Hello!</h1>
  <a href="github.com">share repository</a>
  <a href="/another-page">A link to yet another page</a>
</body>
</html>
`
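// Assuming the second return value is an error, check it rather than discarding it.
links, err := BuildLinks(strings.NewReader(html))
if err != nil {
	log.Fatal(err)
}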
fmt.Printf("%+v", links)
// Output: [{Url:github.com Text:share repository} {Url:/another-page Text:A link to yet another page}]
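
The README does not show how BuildLinks works internally. As a rough illustration of the idea only, the sketch below collects anchor elements with the standard library's encoding/xml decoder in its lenient mode. The names extractLinks and extractLinksFromString are hypothetical and not part of this package, and a real implementation would more likely sit on a proper HTML parser such as golang.org/x/net/html.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// Link mirrors the struct shown above.
type Link struct {
	Url  string // href
	Text string // content
}

// extractLinks is a hypothetical stand-in for BuildLinks, not the library's code.
// It walks the token stream and records the href and text of every <a> element.
func extractLinks(r io.Reader) ([]Link, error) {
	dec := xml.NewDecoder(r)
	// Lenient settings that the encoding/xml documentation suggests for typical HTML.
	dec.Strict = false
	dec.AutoClose = xml.HTMLAutoClose
	dec.Entity = xml.HTMLEntity

	var (
		links []Link
		cur   *Link // link currently being read; nil outside an <a> element
		text  strings.Builder
	)
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return links, nil
		}
		if err != nil {
			return nil, err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			if t.Name.Local == "a" && cur == nil {
				cur = &Link{}
				text.Reset()
				for _, attr := range t.Attr {
					if attr.Name.Local == "href" {
						cur.Url = attr.Value
					}
				}
			}
		case xml.CharData:
			if cur != nil {
				text.Write(t) // copy now: token bytes are only valid until the next Token call
			}
		case xml.EndElement:
			if t.Name.Local == "a" && cur != nil {
				cur.Text = strings.TrimSpace(text.String())
				links = append(links, *cur)
				cur = nil
			}
		}
	}
}

// extractLinksFromString is a small convenience wrapper for string input.
func extractLinksFromString(s string) ([]Link, error) {
	return extractLinks(strings.NewReader(s))
}

func main() {
	page := `<html><body><a href="/another-page">A link to yet another page</a></body></html>`
	links, err := extractLinksFromString(page)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%+v\n", links)
}
```

Trimming whitespace around the link text is a choice made in this sketch; the package may treat whitespace differently.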

To map links to a list of URLs:

urls := links.GetUrls()
fmt.Printf("%#v", urls)
// Output: []string{"github.com", "/another-page"}
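
The URLs come back exactly as written in each href, so a crawler or sitemap generator would normally resolve them against the address of the page they came from. A minimal sketch using the standard net/url package follows; resolveAll is a hypothetical helper, not part of this package, and the base URL is made up for the example.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveAll resolves each href against base, following the usual
// relative-reference rules. Hypothetical helper, not part of html-link-parser.
func resolveAll(base string, hrefs []string) ([]string, error) {
	baseURL, err := url.Parse(base)
	if err != nil {
		return nil, err
	}
	resolved := make([]string, 0, len(hrefs))
	for _, href := range hrefs {
		ref, err := url.Parse(href)
		if err != nil {
			return nil, err
		}
		resolved = append(resolved, baseURL.ResolveReference(ref).String())
	}
	return resolved, nil
}

func main() {
	// Same values as the GetUrls output above; the base URL is an assumption.
	urls := []string{"github.com", "/another-page"}
	abs, err := resolveAll("https://example.com/", urls)
	if err != nil {
		fmt.Println("resolve error:", err)
		return
	}
	fmt.Printf("%#v\n", abs)
	// Output: []string{"https://example.com/github.com", "https://example.com/another-page"}
}
```

Note that an href of "github.com" has no scheme, so it resolves as a relative path on the base host rather than as an external site.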

To map links to a list of link texts:

texts := links.GetTexts()
fmt.Printf("%#v", texts)
// Output: []string{"share repository", "A link to yet another page"}
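
For readers wondering how links can carry GetUrls and GetTexts methods, one plausible arrangement is a named slice type with methods on it. The sketch below assumes a type called Links; the README only shows that the value returned by BuildLinks has these two methods, so the type name and method bodies here are guesses.

```go
package main

import "fmt"

// Link mirrors the struct shown above.
type Link struct {
	Url  string // href
	Text string // content
}

// Links is an assumed named slice type; the actual type name in the
// package is not shown in the README.
type Links []Link

// GetUrls returns the Url field of every link, in order.
func (ls Links) GetUrls() []string {
	urls := make([]string, len(ls))
	for i, l := range ls {
		urls[i] = l.Url
	}
	return urls
}

// GetTexts returns the Text field of every link, in order.
func (ls Links) GetTexts() []string {
	texts := make([]string, len(ls))
	for i, l := range ls {
		texts[i] = l.Text
	}
	return texts
}

func main() {
	links := Links{
		{Url: "github.com", Text: "share repository"},
		{Url: "/another-page", Text: "A link to yet another page"},
	}
	fmt.Printf("%#v\n", links.GetUrls())
	fmt.Printf("%#v\n", links.GetTexts())
}
```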

Package last updated on 28 Feb 2022
