CLI that tests for the existence of linked pages and anchors


Link checker for HTML pages that checks href attributes, including the anchor in the target page.
The command line interface expects a directory on your local file system, which it scans.

Why did I write this tool?

I was using a nice CLI called html-proofer, but it needed a preprocessing step to get Javadoc and Scaladoc working because of their iframe setup. At some point this no longer scaled: checking the Scaladoc links with html-proofer took 5 minutes.

link-checker uses cheerio for parsing HTML, which in turn uses htmlparser2, the fastest HTML parser for Node.js. The same Scaladoc that took 5 minutes with html-proofer now takes 5 seconds with link-checker. URL transformation for iframes can also be enabled on the fly via --javadoc. In this mode, a link like /index.html#com.org.company.product.library.Main@init is checked by looking for an HTML file at the path com/org/company/product/library/Main.html and for the anchor init inside it.
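The --javadoc transformation described above can be illustrated with a small sketch. The function name transformJavadocLink is hypothetical and this is not link-checker's actual implementation; it only assumes the mapping given in the example (fully qualified class name before the @, anchor after it).

```javascript
// Hypothetical helper, not link-checker's actual code.
// Splits a Javadoc/Scaladoc iframe deep link into a file path and an anchor.
function transformJavadocLink(href) {
  const hashIndex = href.indexOf('#');
  if (hashIndex === -1) {
    return null; // no fragment, nothing to transform
  }
  const fragment = href.slice(hashIndex + 1);
  // The part before "@" is the fully qualified class name,
  // the part after it (if present) is the anchor inside that class page.
  const atIndex = fragment.indexOf('@');
  const className = atIndex === -1 ? fragment : fragment.slice(0, atIndex);
  const anchor = atIndex === -1 ? null : fragment.slice(atIndex + 1);
  if (className === '') {
    return null;
  }
  return {
    path: className.split('.').join('/') + '.html',
    anchor: anchor,
  };
}

const result = transformJavadocLink('/index.html#com.org.company.product.library.Main@init');
console.log(result.path);   // com/org/company/product/library/Main.html
console.log(result.anchor); // init
```

A checker would then verify that the file at the returned path exists and contains an element with that anchor.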


To check pages that are not yet on your local file system, just use website-scraper to download all of them first.

I've used the module with these options:

{
  urls: [urlToScrape],
  directory: outputDirectory,
  recursive: true,
  filenameGenerator: 'bySiteStructure',
  urlFilter: function(url) {
    return url.indexOf(urlToScrape) != -1;
  }
}



You can install it via npm:

npm install -g link-checker

You can also install it without -g, but then you need to add the binary, located at node_modules/.bin/link-checker, to your $PATH.



docker pull timaschew/link-checker


You need to pass exactly one path where to check links

Usage: link-checker path [options]

Options:
  --version             Show version number  [boolean]
  --allow-hash-href     If `true`, ignores the `href` `#`  [boolean]
  --disable-external    disable checks HTTP links  [boolean]
  --external-only       check HTTP links only  [boolean]
  --file-ignore         RegExp to ignore files to scan  [array]
  --url-ignore          RegExp to ignore URLs  [array]
  --url-swap            RegExp for URLs which can be replaced on the fly  [array]
  --limit-scope         forbid to follow URLs which are out of provided path, like ../somewhere  [boolean]
  --mkdocs              transforming URLS from foo/#bar to foo/index.html#bar  [boolean]
  --javadoc             Enable special URL transforming which allows to check iframe deeplinks for local javadoc and scaladoc  [boolean]
  --javadoc-external    Domain or base URL to do URL transformation to check iframe deeplinks  [array]
  --http-status-ignore  pass HTTP status code which will be ignore, by default only 2xx are allowed  [array]
  --json                print errors as JSON  [boolean]
  --http-redirects      Amount of allowed HTTP redirects  [default: 0]
  --http-timeout        HTTP timeout in milliseconds  [default: 5000]
  --http-always-get     Use always HTTP GET requests, by default HEAD is used for pages without any anchors  [boolean]
  --warn-name-attr      show warning if name attribute instead of id was used for an anchor  [boolean]
  --http-cache          Directory to store the non failing HTTP responses. If none is specified responses won't be cached.  [string]
  --http-cache-max-age  Invalidate the cache after the given period. Allowed values: https://www.npmjs.com/package/ms  [default: "1w"]
  -h, --help            Show help  [boolean]

Examples:
  link-checker path/to/html/files  checks directory with HTML files for broken links and anchors
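As an illustration of the --mkdocs option, here is a minimal sketch of the rewrite from foo/#bar to foo/index.html#bar. The function name transformMkdocsLink is hypothetical, and the rule that only URLs whose path ends in a slash are rewritten is an assumption; link-checker's real logic may differ.

```javascript
// Hypothetical helper illustrating the --mkdocs rewrite; not link-checker's actual code.
function transformMkdocsLink(href) {
  const hashIndex = href.indexOf('#');
  const path = hashIndex === -1 ? href : href.slice(0, hashIndex);
  const fragment = hashIndex === -1 ? '' : href.slice(hashIndex);
  // Assumption: only directory-style URLs (path ending in "/") get index.html appended.
  if (!path.endsWith('/')) {
    return href;
  }
  return path + 'index.html' + fragment;
}

console.log(transformMkdocsLink('foo/#bar'));          // foo/index.html#bar
console.log(transformMkdocsLink('foo/page.html#bar')); // foo/page.html#bar
```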

linkcheckerrc configuration

The above configuration can, alternatively or in addition, be provided by a .linkcheckerrc file in the project root:

{ "allow-hash-href": true, "disable-external": true, ... }

In addition, this format also provides means to override these settings based on URL regular expression matching:

{
  "overrides": {
    "https://www\\.google.com/#": {
      "allow-hash-href": true,
      "http-status-ignore": [403, 404]
    },
    "marketplace\\.visualstudio\\.com": {
      "http-always-get": true
    }
  }
}
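One way to picture how such overrides could apply is the sketch below. It assumes that each key under overrides is used as a regular expression, tested against the URL being checked, and that matching entries are merged over the base settings in order. The function resolveSettings and the merge order are hypothetical and not taken from link-checker's source.

```javascript
// Hypothetical sketch of URL-based override resolution; not link-checker's actual code.
function resolveSettings(config, url) {
  const { overrides = {}, ...base } = config;
  const settings = { ...base };
  for (const [pattern, override] of Object.entries(overrides)) {
    // Each key is treated as a regular expression source and tested against the URL.
    if (new RegExp(pattern).test(url)) {
      Object.assign(settings, override);
    }
  }
  return settings;
}

const config = {
  'allow-hash-href': false,
  'disable-external': false,
  overrides: {
    'https://www\\.google.com/#': {
      'allow-hash-href': true,
      'http-status-ignore': [403, 404],
    },
    'marketplace\\.visualstudio\\.com': {
      'http-always-get': true,
    },
  },
};

const googleSettings = resolveSettings(config, 'https://www.google.com/#top');
const otherSettings = resolveSettings(config, 'https://example.org/');
console.log(googleSettings['allow-hash-href']); // true
console.log(otherSettings['allow-hash-href']);  // false
```

With this reading, a URL that matches no pattern simply keeps the base settings.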



Last updated on 15 Feb 2021

