
link-checker
Link checker for HTML pages which checks href attributes, including the anchor in the target.
The Command Line Interface expects a directory on your local file system which will be scanned.
Why did I write this tool?
I was using a nice CLI called html-proofer, but needed a preprocessing step to get Javadoc and Scaladoc working because of their iframe setup. At some point it didn't scale anymore: checking the Scaladoc links with html-proofer took 5 minutes.
link-checker uses cheerio for parsing HTML, which in turn uses the fastest HTML parser for Node.js: htmlparser2. The same Scaladoc that took 5 minutes with html-proofer now takes 5 seconds with link-checker. URL transformation for iframes can also be turned on on the fly via --javadoc. In this mode, a link like /index.html#com.org.company.product.library.Main@init is checked against the HTML file at the path com/org/company/product/library/Main.html and the anchor init.
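To make the --javadoc mapping concrete, here is a minimal sketch of the transformation described above. The function name transformJavadocLink is hypothetical and this is not link-checker's actual code; it assumes the part of the fragment before the @ is a fully qualified class name and the part after it is the anchor.

```javascript
// Hypothetical helper, not taken from link-checker's source: maps a
// Javadoc/Scaladoc iframe deep link to the HTML file and anchor to check.
function transformJavadocLink(href) {
  const hashIndex = href.indexOf('#');
  if (hashIndex === -1) {
    return null; // no fragment, nothing to transform
  }
  const fragment = href.slice(hashIndex + 1);
  // Assumption: '@' separates the qualified class name from the member anchor.
  const atIndex = fragment.indexOf('@');
  const className = atIndex === -1 ? fragment : fragment.slice(0, atIndex);
  const anchor = atIndex === -1 ? null : fragment.slice(atIndex + 1);
  if (className === '') {
    return null;
  }
  // Dots in the qualified name become directory separators.
  const path = className.split('.').join('/') + '.html';
  return { path, anchor };
}

const result = transformJavadocLink('/index.html#com.org.company.product.library.Main@init');
console.log(JSON.stringify(result));
// {"path":"com/org/company/product/library/Main.html","anchor":"init"}
```

The returned path is relative to the scanned directory, so a checker would still have to resolve it against the root before looking for the file and the anchor.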
Just use a website-scraper and download all the pages to your file system.
I've used the module with these options:
{
  urls: [urlToScrape],
  directory: outputDirectory,
  recursive: true,
  filenameGenerator: 'bySiteStructure',
  urlFilter: function(url) {
    return url.indexOf(urlToScrape) != -1;
  }
}
You can install it via npm:
npm install -g link-checker
You can also install it without -g, but then you need to add the binary, located in node_modules/.bin/link-checker, to your $PATH.
https://hub.docker.com/r/timaschew/link-checker/
docker pull timaschew/link-checker
You need to pass exactly one path where to check links
Usage: link-checker path [options]

Options:
  --version             Show version number  [boolean]
  --allow-hash-href     If `true`, ignores the `href` `#`  [boolean]
  --disable-external    disable checks HTTP links  [boolean]
  --external-only       check HTTP links only  [boolean]
  --file-ignore         RegExp to ignore files to scan  [array]
  --url-ignore          RegExp to ignore URLs  [array]
  --url-swap            RegExp for URLs which can be replaced on the fly  [array]
  --limit-scope         forbid to follow URLs which are out of provided path, like ../somewhere  [boolean]
  --mkdocs              transforming URLS from foo/#bar to foo/index.html#bar  [boolean]
  --javadoc             Enable special URL transforming which allows to check iframe deeplinks for local javadoc and scaladoc  [boolean]
  --javadoc-external    Domain or base URL to do URL transformation to check iframe deeplinks  [array]
  --http-status-ignore  pass HTTP status code which will be ignore, by default only 2xx are allowed  [array]
  --json                print errors as JSON  [boolean]
  --http-redirects      Amount of allowed HTTP redirects  [default: 0]
  --http-timeout        HTTP timeout in milliseconds  [default: 5000]
  --http-always-get     Use always HTTP GET requests, by default HEAD is used for pages without any anchors  [boolean]
  --warn-name-attr      show warning if name attribute instead of id was used for an anchor  [boolean]
  --http-cache          Directory to store the non failing HTTP responses. If none is specified responses won't be cached.  [string]
  --http-cache-max-age  Invalidate the cache after the given period. Allowed values: https://www.npmjs.com/package/ms  [default: "1w"]
  -h, --help            Show help  [boolean]

Examples:
  link-checker path/to/html/files  checks directory with HTML files for broken links and anchors
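As an illustration of what the --mkdocs option describes (foo/#bar becomes foo/index.html#bar), here is a small sketch. The function name transformMkdocsLink is hypothetical, and the rule that only paths ending in a slash are rewritten is my assumption, not something confirmed by link-checker's source.

```javascript
// Hypothetical sketch of the --mkdocs rewrite; not link-checker's actual code.
function transformMkdocsLink(href) {
  const hashIndex = href.indexOf('#');
  const path = hashIndex === -1 ? href : href.slice(0, hashIndex);
  const fragment = hashIndex === -1 ? '' : href.slice(hashIndex);
  // Assumption: only directory-style paths (ending in '/') are rewritten.
  if (!path.endsWith('/')) {
    return href;
  }
  return path + 'index.html' + fragment;
}

console.log(transformMkdocsLink('foo/#bar'));          // foo/index.html#bar
console.log(transformMkdocsLink('foo/page.html#bar')); // foo/page.html#bar
```

Links that already point at a file, or that consist only of a fragment, pass through unchanged in this sketch.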
.linkcheckerrc configuration
The above configuration can, alternatively or in addition, be provided by a .linkcheckerrc in the project root:
{
  "allow-hash-href": true,
  "disable-external": true,
  ...
}
In addition, this format also provides means to override these settings based on URL regular expression matching:
{
  "overrides": {
    "https://www\\.google.com/#": {
      "allow-hash-href": true,
      "http-status-ignore": [403, 404]
    },
    "marketplace\\.visualstudio\\.com": {
      "http-always-get": true
    }
  }
}
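To show how such overrides could be applied per URL, here is a minimal sketch. The function resolveOptions is hypothetical; it assumes each key under overrides is compiled as a regular expression and that all matching overrides are merged on top of the base settings in declaration order. How link-checker itself resolves several matches may differ.

```javascript
// Hypothetical sketch, not link-checker's implementation: compute the
// effective settings for one URL from a .linkcheckerrc-style object.
function resolveOptions(config, url) {
  const { overrides = {}, ...base } = config;
  const effective = { ...base };
  for (const [pattern, settings] of Object.entries(overrides)) {
    // Assumption: every override key is a regular expression source string.
    if (new RegExp(pattern).test(url)) {
      Object.assign(effective, settings);
    }
  }
  return effective;
}

const config = {
  'disable-external': false,
  overrides: {
    'https://www\\.google.com/#': {
      'allow-hash-href': true,
      'http-status-ignore': [403, 404]
    },
    'marketplace\\.visualstudio\\.com': {
      'http-always-get': true
    }
  }
};

const marketplace = resolveOptions(config, 'https://marketplace.visualstudio.com/items');
console.log(marketplace['http-always-get']); // true
```

A URL that matches no override simply gets the base settings back.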
FAQs
CLI which is testing existence of linked pages and anchors
The npm package link-checker receives a total of 198 weekly downloads. As such, link-checker popularity was classified as not popular.
We found that link-checker demonstrated a not healthy version release cadence and project activity because the last version was released a year ago. It has 2 open source maintainers collaborating on the project.