url-regex-safe
Regular expression matching for URLs. A maintained, safe, and browser-friendly version of url-regex. Resolves CVE-2020-7661 for Node.js servers. Works in Node v10.12.0+ and browsers.
After discovering CVE-2020-7661 and disclosing it publicly (through my work on Spam Scanner and Forward Email), I used an implementation of url-regex with some extra glue on top to filter out bad URL matches.
However, after using it on Forward Email in production (which processes hundreds of thousands of emails per week), I found and documented several more core issues with url-regex.
Realizing that url-regex is no longer actively maintained, has 9 open pull requests as of this writing, and lacks browser support, I decided to write this package for everyone and merge all the open pull requests.
This package should more closely resemble real-world intended usage of a URL regular expression, while also allowing users to configure it as they wish. Please check out Forward Email if this package helped you, and explore our source code on GitHub, which shows how we use this package.
npm:
npm install url-regex-safe
yarn:
yarn add url-regex-safe
We've resolved CVE-2020-7661 by including RE2 for Node.js usage. You no longer have to manually wrap your URL regular expressions with new RE2(urlRegex()); url-regex-safe does it for you automatically.
const urlRegexSafe = require('url-regex-safe');
const str = 'some long string with url.com in it';
// Collect every URL-like substring found in the input
const matches = str.match(urlRegexSafe());
for (const match of matches) {
  console.log('match', match);
}
// Use exact: true to test whether an entire string is a URL
console.log(urlRegexSafe({ exact: true }).test('github.com'));
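One caveat with the loop above: `String.prototype.match` returns `null` rather than an empty array when nothing matches, so iterating the result directly throws on text that contains no URLs. Below is a minimal sketch of a null-safe wrapper. The helper name `extractUrls` is hypothetical, and `standInRegex` is a deliberately simplified placeholder so the snippet runs without dependencies; in real code you would pass `urlRegexSafe()` instead.

```javascript
// Hypothetical helper (not part of url-regex-safe): always returns an array.
function extractUrls(text, regex) {
  // String.prototype.match returns null when nothing matches,
  // so fall back to an empty array to keep for...of loops safe.
  return text.match(regex) || [];
}

// Simplified stand-in pattern so this sketch runs without dependencies.
// ASSUMPTION: in real code, pass urlRegexSafe() from the package instead.
const standInRegex = /\b[a-z0-9-]+\.(?:com|org|net)\b/gi;

for (const match of extractUrls('some long string with url.com in it', standInRegex)) {
  console.log('match', match);
}

// No URLs in the input: the loop body simply never runs.
for (const match of extractUrls('no links here', standInRegex)) {
  console.log('match', match);
}
```

The fallback keeps calling code free of explicit null checks, which matters most when the input is untrusted text such as email bodies.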
Since RE2 is not made for the browser, it is not used there, and CVE-2020-7661 therefore remains an issue on the client side. However, it is less severe: the worst case is a crashed browser tab, whereas on Node.js it would crash the entire process with an out-of-memory exception.
This is the solution for you if you're just using <script> tags everywhere!
<script src="https://unpkg.com/url-regex-safe"></script>
<script type="text/javascript">
  (function() {
    var str = 'some long string with url.com in it';
    var matches = str.match(urlRegexSafe());
    for (var i = 0; i < matches.length; i++) {
      console.log('match', matches[i]);
    }
    console.log(urlRegexSafe({ exact: true }).test('github.com'));
  })();
</script>
Assuming you are using browserify, webpack, rollup, or another bundler, you can simply follow Node usage above.
Property | Type | Default Value | Description |
---|---|---|---|
exact | Boolean | false | Only match an exact String. Useful with regex.test(str) to check if a String is a URL. We set this to false by default in order to match String values such as github.com (as opposed to requiring a protocol or www subdomain). We feel this more closely resembles real-world intended usage of this package. |
strict | Boolean | false | Force URLs to start with a valid protocol or www if set to true. If true, any TLD is allowed as long as it has at least 2 valid characters. If false, the TLD is matched against the list of valid TLDs from tlds. |
auth | Boolean | false | Match against Basic Authentication headers. We set this to false by default since it was deprecated in Chromium, and otherwise it leaves the user with unwanted URL matches (a false default more closely resembles real-world intended usage of this package). |
localhost | Boolean | true | Allows localhost in the URL hostname portion. See test/test.js for more insight into the localhost test and how it may return an unwanted value. A pull request would be considered to resolve the "pic.jp" vs. "pic.jpg" issue. |
parens | Boolean | false | Match against Markdown-style trailing parentheses. We set this to false because it should be up to the user to parse Markdown URLs. |
apostrophes | Boolean | false | Match against apostrophes. We set this to false because we don't want the String background: url('http://example.com/pic.jpg'); to result in http://example.com/pic.jpg'. See this issue for more information. |
trailingPeriod | Boolean | false | Match against trailing periods. We set this to false by default since real-world usage would want example.com rather than example.com. as the match (this differs from url-regex, which matches the trailing period). |
ipv4 | Boolean | true | Match against IPv4 URLs. |
ipv6 | Boolean | true | Match against IPv6 URLs. |
tlds | Array | tlds | Match against a specific list of TLDs, or the default list provided by tlds. |
returnString | Boolean | false | Return the RegExp as a String instead of a RegExp (useful for custom logic, such as we did with Spam Scanner). |
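
As a quick reference, the defaults in the table can be transcribed into a plain object. This is only an illustrative summary, not the package's internal code: the names `DOCUMENTED_DEFAULTS` and `resolveOptions` are hypothetical, and `tlds` is left out because its default is the list supplied by the tlds package.

```javascript
// Defaults transcribed from the options table; illustrative only.
// `tlds` is omitted because its default is the list from the tlds package.
const DOCUMENTED_DEFAULTS = Object.freeze({
  exact: false,
  strict: false,
  auth: false,
  localhost: true,
  parens: false,
  apostrophes: false,
  trailingPeriod: false,
  ipv4: true,
  ipv6: true,
  returnString: false
});

// Hypothetical helper: layer caller overrides on top of the documented defaults.
function resolveOptions(overrides = {}) {
  return { ...DOCUMENTED_DEFAULTS, ...overrides };
}

// Example: validate whole strings and require a protocol or www prefix.
const validatorOptions = resolveOptions({ exact: true, strict: true });
console.log(validatorOptions);

// With the package installed, passing only the overrides is enough:
//   const urlRegexSafe = require('url-regex-safe');
//   urlRegexSafe({ exact: true, strict: true }).test('https://github.com');
```

Seeing the full option set in one place makes it easier to spot which defaults you are relying on when you pass only one or two overrides.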
You must override the default and set strict: true if you do not wish to match github.com by itself (though www.github.com will work if strict: false).
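To make the effect of `strict` concrete, here is a simplified illustration of how a flag can switch a protocol-or-www prefix between optional and required, and switch the TLD check between a known list and "any two or more letters". This is not the package's actual pattern: the function `buildToyUrlRegex`, the three-entry TLD list, and the reduced hostname rule are all assumptions chosen to keep the sketch short.

```javascript
// Toy illustration only (NOT the pattern used by url-regex-safe).
function buildToyUrlRegex({ strict = false } = {}) {
  const prefix = '(?:https?:\\/\\/|www\\.)';
  const host = '(?:[a-z0-9-]+\\.)+';
  // strict: prefix required, any TLD of two or more letters.
  // non-strict: prefix optional, TLD must appear in a known list.
  const tld = strict ? '[a-z]{2,}' : '(?:com|org|net)';
  const optional = strict ? '' : '?';
  return new RegExp(`^${prefix}${optional}${host}${tld}$`, 'i');
}

const loose = buildToyUrlRegex();
const strictRegex = buildToyUrlRegex({ strict: true });

console.log(loose.test('github.com'));           // true: bare domain with a listed TLD
console.log(strictRegex.test('github.com'));     // false: no protocol or www prefix
console.log(strictRegex.test('www.github.com')); // true: www prefix satisfies strict mode
```

The trade-off mirrors the table: the loose mode catches bare domains in free text but needs a TLD list to avoid false positives, while the strict mode can accept any TLD because the required prefix already signals a URL.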
Unlike the deprecated and unmaintained package url-regex, we do a few things differently:
- We set strict to false by default (url-regex had this set to true).
- We added an auth option, which is set to false by default (url-regex matches against Basic Authentication and had this set to true; however, this is deprecated behavior in Chromium).
- We added parens and ipv6 options, which are set to false and true by default, respectively (url-regex had parens set to true, and ipv6 was non-existent or effectively false).
- We added an apostrophes option, which is set to false by default (url-regex had this set to true).
- We added a trailingPeriod option, which is set to false by default (which means matches won't contain trailing periods, whereas url-regex had this set to true).

Since we cannot use regular expression "negative lookbehind" functionality (due to RE2 limitations), we could not merge the logic from this pull request. It would have ensured that example.jpeg is not matched and that only a standalone example.jp is; currently, however, passing example.jpeg extracts example.jp from it (since .jp is a TLD). An alternative solution may exist, and we welcome community contributions regarding this issue.
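
One possible direction for such an alternative is to filter matches after the fact instead of inside the pattern. The sketch below is hypothetical and is not something the package does: `dropTruncatedMatches` discards any match that is immediately followed by a letter or digit, which suggests the TLD was cut out of a longer token. It assumes a runtime with `String.prototype.matchAll` (Node 12+) and uses a simplified stand-in pattern; whether it composes with the RE2-backed regex the package returns on Node.js is untested here.

```javascript
// Hypothetical post-filter (not part of url-regex-safe).
// The regex passed in must have the global flag, as matchAll requires it.
function dropTruncatedMatches(text, globalRegex) {
  const kept = [];
  for (const match of text.matchAll(globalRegex)) {
    // Character directly after the match ('' at end of input).
    const nextChar = text.charAt(match.index + match[0].length);
    // A trailing letter or digit means the match stopped mid-token,
    // e.g. "example.jp" extracted from "example.jpeg".
    if (!/[a-z0-9]/i.test(nextChar)) {
      kept.push(match[0]);
    }
  }
  return kept;
}

// Simplified stand-in pattern so the sketch runs without dependencies.
const toyRegex = /[a-z0-9-]+\.(?:com|org|jp)/gi;

console.log(dropTruncatedMatches('see example.jpeg and example.jp here', toyRegex));
// → [ 'example.jp' ]  (only the standalone occurrence survives)
```

This avoids lookbehind and lookahead entirely, at the cost of a second pass over the matches and of rejecting the rare case where a URL is legitimately glued to following text.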
Name | Website |
---|---|
Nick Baugh | http://niftylettuce.com/ |
Kevin Mårtensson | |
Diego Perini | |
FAQs
The npm package url-regex-safe receives a total of 113,946 weekly downloads. As such, url-regex-safe is classified as popular.
We found that url-regex-safe has an unhealthy version release cadence and level of project activity, because the last version was released a year ago. It has 3 open source maintainers collaborating on the project.