URL Sanitizer

URL sanitizer for Node.js (>=18), browsers and web sites.
Experimental
Install
npm i url-sanitizer
For browsers and web sites, standalone ESM builds are available in the dist/ directory:
- node_modules/url-sanitizer/dist/url-sanitizer.js
- node_modules/url-sanitizer/dist/url-sanitizer.min.js
Or, download them from Releases.
Usage
import urlSanitizer, {
isURI, isURISync, parseURL, parseURLSync, sanitizeURL, sanitizeURLSync
} from 'url-sanitizer';
sanitizeURL(url, opt)
Sanitize the given URL.
The data and file schemes must be explicitly allowed.
Parameters
- url string URL input.
- opt object Options.
- opt.allow Array<string> Array of allowed schemes, e.g. ['data'].
- opt.deny Array<string> Array of denied schemes, e.g. ['web+foo'].
- opt.only Array<string> Array of specific schemes to allow, e.g. ['git', 'https']. only takes precedence over allow and deny.
- opt.remove boolean Remove tag or quote and the rest following it.
Returns Promise<string?> Sanitized URL, nullable.
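The interaction of allow, deny and only can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: isSchemePermitted and RESTRICTED_BY_DEFAULT are hypothetical names, the rule that deny wins over allow is an assumption, and the check against the list of registered schemes is left out entirely.

```javascript
// Hypothetical helper, not part of url-sanitizer.
// Schemes that need an explicit opt-in via `allow` (from the text above).
const RESTRICTED_BY_DEFAULT = new Set(['data', 'file']);

function isSchemePermitted(scheme, opt = {}) {
  const { allow = [], deny = [], only = [] } = opt;
  // Normalize 'HTTPS:' and 'https' to the same name.
  const name = scheme.toLowerCase().replace(/:$/, '');

  // `only` takes precedence: when given, allow and deny are ignored.
  if (only.length > 0) {
    return only.includes(name);
  }
  // Assumption: an explicit deny wins over an explicit allow.
  if (deny.includes(name)) {
    return false;
  }
  // data and file pass only when explicitly allowed.
  if (RESTRICTED_BY_DEFAULT.has(name)) {
    return allow.includes(name);
  }
  return true;
}

console.log(isSchemePermitted('https'));                                    // true
console.log(isSchemePermitted('data'));                                     // false
console.log(isSchemePermitted('data', { allow: ['data'] }));                // true
console.log(isSchemePermitted('web+foo', { deny: ['web+foo'] }));           // false
console.log(isSchemePermitted('http', { only: ['data', 'git', 'https'] })); // false
```

Checking only first keeps the precedence rule in one place, so the remaining branches need to reason about allow and deny alone.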
const res1 = await sanitizeURL('http://example.com/"onmouseover="alert(1)"?<script>alert(\'XSS\');</script>');
console.log(decodeURIComponent(res1));
const res2 = await sanitizeURL('data:text/html,<div><script>alert(1);</script></div><p onclick="alert(2)"></p>', {
  allow: ['data']
});
console.log(decodeURIComponent(res2));
const base64data3 = btoa('<div><script>alert(1);</script></div>');
const res3 = await sanitizeURL(`data:text/html;base64,${base64data3}`, {
  allow: ['data']
});
console.log(decodeURIComponent(res3));
const base64data4 = btoa('<div><img src="javascript:alert(1)"></div>');
const res4 = await sanitizeURL(`data:text/html;base64,${base64data4}`);
console.log(decodeURIComponent(res4));
const res5 = await sanitizeURL('web+foo://example.com', {
  deny: ['web+foo']
});
const res6 = await sanitizeURL('http://example.com', {
  only: ['data', 'git', 'https']
});
const res7 = await sanitizeURL('https://example.com/"onmouseover="alert(1)"', {
  only: ['data', 'git', 'https']
});
console.log(decodeURIComponent(res7));
const res8 = await sanitizeURL('git+https://example.com/foo.git?<script>alert(1)</script>', {
  only: ['data', 'git', 'https']
});
console.log(decodeURIComponent(res8));
const res9 = await sanitizeURL('https://example.com/" onclick="alert(1)"', {
  remove: true
});
const res10 = await sanitizeURL('https://example.com/?<script>alert(1)</script>', {
  remove: true
});
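To make the remove option more concrete, here is a small sketch of "remove tag or quote and the rest following it". It is only an approximation under stated assumptions: truncateAtTagOrQuote is a hypothetical name, and treating <, >, " and ' as the trigger characters is a guess at what "tag or quote" covers, not something taken from the library's source.

```javascript
// Hypothetical helper, not the library's implementation.
// Assumption: "tag or quote" means the characters <, >, " and '.
function truncateAtTagOrQuote(url) {
  // Find the first trigger character; -1 means none is present.
  const index = url.search(/[<>"']/);
  // Keep everything before it, or the whole string when nothing matches.
  return index === -1 ? url : url.slice(0, index);
}

console.log(truncateAtTagOrQuote('https://example.com/" onclick="alert(1)"'));
// https://example.com/
console.log(truncateAtTagOrQuote('https://example.com/?<script>alert(1)</script>'));
// https://example.com/?
console.log(truncateAtTagOrQuote('https://example.com/foo'));
// https://example.com/foo
```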
sanitizeURLSync(url, opt)
Synchronous version of sanitizeURL().
parseURL(url)
Parse the given URL.
Parameters
- url string URL input.
Returns Promise<ParsedURL> Result.
ParsedURL
Object with additional properties based on URL API.
Type: object
Properties
- input string URL input.
- valid boolean Is valid URI.
- data object Parsed result of data URL, nullable.
- data.mime string MIME type.
- data.base64 boolean true if base64 encoded.
- data.data string Data part of the data URL.
- href string Sanitized URL input.
- origin string Scheme, domain and port of the sanitized URL.
- protocol string Protocol scheme of the sanitized URL.
- username string Username specified before the domain name.
- password string Password specified before the domain name.
- host string Domain and port of the sanitized URL.
- hostname string Domain of the sanitized URL.
- port string Port number of the sanitized URL.
- pathname string Path of the sanitized URL.
- search string Query string of the sanitized URL.
- hash string Fragment identifier of the sanitized URL.
const res1 = await parseURL('javascript:alert(1)');
const res2 = await parseURL('https://www.example.com/?foo=bar#baz');
const res3 = await parseURL('');
const res4 = await parseURL('');
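The shape of ParsedURL can be illustrated without the library. The sketch below shows the standard URL API fields that the href to hash properties mirror, plus one possible way to split a data URL into mime, base64 and data. parseDataURL is a hypothetical helper, it performs no sanitization, and leaving the data part undecoded is an assumption about what data.data holds.

```javascript
// The href ... hash properties follow the standard URL API.
const url = new URL('https://www.example.com/?foo=bar#baz');
console.log(url.origin);   // https://www.example.com
console.log(url.hostname); // www.example.com
console.log(url.search);   // ?foo=bar
console.log(url.hash);     // #baz

// Hypothetical helper, not the library's implementation.
// Splits data:[<mime>][;base64],<data> into its three parts.
function parseDataURL(input) {
  const match = /^data:([^,]*),(.*)$/is.exec(input);
  if (!match) {
    return null; // not a data URL
  }
  const [, header, data] = match;
  const parts = header.split(';');
  // The base64 marker, when present, is the last header segment.
  const base64 = parts[parts.length - 1].trim().toLowerCase() === 'base64';
  if (base64) {
    parts.pop();
  }
  // Assumption: the data part is returned as-is, without decoding.
  return { mime: parts.join(';'), base64, data };
}

console.log(parseDataURL('data:text/html;base64,PGRpdj4='));
// { mime: 'text/html', base64: true, data: 'PGRpdj4=' }
console.log(parseDataURL('https://example.com/'));
// null
```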
parseURLSync(url)
Synchronous version of parseURL().
isURI(uri)
Verify if the given URI is valid.
Parameters
- uri string URI input.
Returns Promise<boolean> Result.
- Always true for web+* and ext+* schemes, except web+javascript, web+vbscript, ext+javascript, and ext+vbscript.
const res1 = await isURI('https://example.com/foo');
const res2 = await isURI('javascript:alert(1)');
const res3 = await isURI('mailto:foo@example.com');
const res4 = await isURI('foo:bar');
const res5 = await isURI('web+foo:bar');
const res6 = await isURI('web+javascript:alert(1)');
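The web+* and ext+* rule can be sketched as a small scheme check. This is an illustration only: looksLikeValidURI, sampleRegistered and BLOCKED are hypothetical names, the two-entry scheme list stands in for the real registry, and the check looks only at the scheme rather than validating the rest of the URI.

```javascript
// Hypothetical sketch, not the library's implementation.
// A tiny stand-in for the list of registered schemes.
const sampleRegistered = new Set(['https', 'mailto']);
// Script schemes that stay rejected even behind a web+ or ext+ prefix.
const BLOCKED = new Set(['javascript', 'vbscript']);

function looksLikeValidURI(uri) {
  // RFC 3986 scheme syntax: a letter, then letters, digits, "+", "-" or ".".
  const match = /^([a-z][a-z0-9+.-]*):/i.exec(uri);
  if (!match) {
    return false;
  }
  const scheme = match[1].toLowerCase();
  // web+* and ext+* are accepted unless the suffix is a blocked scheme.
  const prefixed = /^(?:web|ext)\+([a-z]+)$/.exec(scheme);
  if (prefixed) {
    return !BLOCKED.has(prefixed[1]);
  }
  // Everything else must be in the registered list.
  return sampleRegistered.has(scheme);
}

console.log(looksLikeValidURI('https://example.com/foo'));  // true
console.log(looksLikeValidURI('javascript:alert(1)'));      // false
console.log(looksLikeValidURI('mailto:foo@example.com'));   // true
console.log(looksLikeValidURI('foo:bar'));                  // false
console.log(looksLikeValidURI('web+foo:bar'));              // true
console.log(looksLikeValidURI('web+javascript:alert(1)'));  // false
```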
isURISync(uri)
Synchronous version of isURI().
urlSanitizer
Instance of the sanitizer.
urlSanitizer.get()
Get a list of registered URI schemes.
Returns Array<string> Array of registered URI schemes.
- Includes schemes registered at iana.org by default.
- Historical schemes omitted.
- moz-extension scheme added.
- Also includes custom schemes added via urlSanitizer.add().
const schemes = urlSanitizer.get();
urlSanitizer.has(scheme)
Check if the given scheme is registered.
Parameters
- scheme string Scheme.
Returns boolean Result.
const res1 = urlSanitizer.has('https');
const res2 = urlSanitizer.has('foo');
urlSanitizer.add(scheme)
Add a scheme to the list of registered URI schemes.
The javascript and vbscript schemes cannot be registered; attempting to add them throws.
Parameters
- scheme string Scheme.
Returns Array<string> Array of registered URI schemes.
console.log(urlSanitizer.has('foo'));
const res = urlSanitizer.add('foo');
console.log(urlSanitizer.has('foo'));
urlSanitizer.remove(scheme)
Remove a scheme from the list of registered URI schemes.
Parameters
- scheme string Scheme.
Returns boolean Result. true if the scheme is successfully removed, false otherwise.
console.log(urlSanitizer.has('aaa'));
const res1 = urlSanitizer.remove('aaa');
console.log(urlSanitizer.has('aaa'));
const res2 = urlSanitizer.remove('foo');
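The get(), has(), add() and remove() semantics described above can be modelled with a Set. The class below is a hypothetical stand-in (SchemeRegistry is not a name from the library) that mirrors only the documented return values and the rule that javascript and vbscript cannot be added. The error type and message are assumptions.

```javascript
// Hypothetical model of the scheme registry, not the library's class.
class SchemeRegistry {
  #schemes;

  constructor(initial = []) {
    this.#schemes = new Set(initial);
  }

  // Returns the registered schemes as an array.
  get() {
    return [...this.#schemes];
  }

  // Returns true when the scheme is registered.
  has(scheme) {
    return this.#schemes.has(scheme);
  }

  // Registers a scheme and returns the updated list.
  add(scheme) {
    // Assumption: a plain Error is thrown; the real error type may differ.
    if (/^(?:javascript|vbscript)$/i.test(scheme)) {
      throw new Error(`${scheme} can not be registered.`);
    }
    this.#schemes.add(scheme);
    return this.get();
  }

  // Returns true if the scheme was removed, false if it was not registered.
  remove(scheme) {
    return this.#schemes.delete(scheme);
  }
}

const registry = new SchemeRegistry(['https', 'aaa']);
console.log(registry.has('foo'));    // false
registry.add('foo');
console.log(registry.has('foo'));    // true
console.log(registry.remove('aaa')); // true
console.log(registry.remove('bar')); // false
```

Set.prototype.delete already reports whether an entry existed, which maps directly onto the boolean result described for remove().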
Acknowledgments
The following resources have been of great help in the development of the URL Sanitizer.
Copyright (c) 2023 asamuzaK (Kazz)