regexpu-core
regexpu’s core functionality (i.e. `rewritePattern(pattern, flag)`), capable of translating ES6 Unicode regular expressions to ES5.
The regexpu-core package is a utility that helps in transforming Unicode-aware regular expressions into ES5-compatible versions. This is particularly useful when you want to ensure your regular expressions work across environments that may not fully support ES6 Unicode features.
Transform Unicode regular expressions
This feature allows you to transform a Unicode regular expression into an ES5-compatible version. The example shows how to transform a pattern that matches any letter using Unicode property escapes.
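const rewritePattern = require('regexpu-core');
// Property escapes are only rewritten when `unicodePropertyEscapes` is set to 'transform'.
const pattern = rewritePattern('\\p{L}', 'u', { unicodePropertyEscapes: 'transform' });
console.log(pattern); // Character class matching any letter; it still requires the `u` flag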
Use with flags
This feature controls whether the transformed pattern still relies on the `u` flag. In the example, the `unicodeFlag` option is left at false (its default), so the output keeps `\u{...}` escapes and must still be used with the Unicode flag; setting it to 'transform' instead produces a pattern that works without the flag.
const rewritePattern = require('regexpu-core');
const pattern = rewritePattern('\\p{L}', 'u', { unicodePropertyEscapes: 'transform', unicodeFlag: false });
console.log(pattern); // Transformed pattern that still requires the Unicode flag
XRegExp provides augmented, extensible regular expressions. It includes support for additional syntax, flags, and methods beyond what is available in native JavaScript RegExp. It is similar to regexpu-core in that it enhances regular expressions, but it offers a broader set of features and an extended syntax.
regexp-tree is a regular expression processor, which includes a parser, a regexp transformer, and an optimizer. It is similar to regexpu-core in its ability to transform regular expressions, but it also provides optimization capabilities and a more comprehensive analysis and manipulation API.
regexpu is a source code transpiler that enables the use of ES2015 Unicode regular expressions in JavaScript-of-today (ES5).
regexpu-core contains regexpu’s core functionality, i.e. `rewritePattern(pattern, flag)`, which enables rewriting regular expressions that make use of the ES2015 `u` flag into equivalent ES5-compatible regular expression patterns.
To use regexpu-core programmatically, install it as a dependency via npm:
npm install regexpu-core --save
Then, `require` it:
const rewritePattern = require('regexpu-core');
This module exports a single function named `rewritePattern`.
rewritePattern(pattern, flags, options)
This function takes a string that represents a regular expression pattern as well as a string representing its flags, and returns an ES5-compatible version of the pattern.
rewritePattern('foo.bar', 'u', { unicodeFlag: "transform" });
// → 'foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uD7FF\\uDC00-\\uFFFF]|[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF])bar'
rewritePattern('[\\u{1D306}-\\u{1D308}a-z]', 'u', { unicodeFlag: "transform" });
// → '(?:[a-z]|\\uD834[\\uDF06-\\uDF08])'
rewritePattern('[\\u{1D306}-\\u{1D308}a-z]', 'ui', { unicodeFlag: "transform" });
// → '(?:[a-z\\u017F\\u212A]|\\uD834[\\uDF06-\\uDF08])'
regexpu-core can rewrite non-ES6 regular expressions too, which is useful to demonstrate how their behavior changes once the `u` and `i` flags are added:
// In ES5, the dot operator only matches BMP symbols:
rewritePattern('foo.bar', '', { unicodeFlag: "transform" });
// → 'foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uFFFF])bar'
// But with the ES2015 `u` flag, it matches astral symbols too:
rewritePattern('foo.bar', 'u', { unicodeFlag: "transform" });
// → 'foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uD7FF\\uDC00-\\uFFFF]|[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF])bar'
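To see the difference at run time, the two rewritten patterns above can be compiled as ordinary regular expressions (without the `u` flag) and compared with the native behavior. This is a minimal sketch: the pattern strings are copied from the output shown above, while the variable names and the test string are illustrative assumptions.

```javascript
// Pattern sources copied from the rewritePattern() output above.
const bmpOnlySource =
  'foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uFFFF])bar';
const astralAwareSource =
  'foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uD7FF\\uDC00-\\uFFFF]' +
  '|[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF])bar';

// Neither RegExp is created with the `u` flag.
const bmpOnly = new RegExp('^' + bmpOnlySource + '$');
const astralAware = new RegExp('^' + astralAwareSource + '$');

// U+1D306 is an astral symbol, stored as two UTF-16 code units.
const astralInput = 'foo\u{1D306}bar';

const dotResults = {
  bmpOnly: bmpOnly.test(astralInput),          // false: the class consumes one code unit only
  astralAware: astralAware.test(astralInput),  // true: the surrogate pair is matched as a unit
  nativeWithoutU: /^foo.bar$/.test(astralInput), // false
  nativeWithU: /^foo.bar$/u.test(astralInput),   // true
};
console.log(dotResults);
```

The rewritten `u`-flag pattern agrees with the native `/u` behavior because it spells out the surrogate-pair alternative explicitly.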
The optional `options` argument recognizes the following properties:
These options can be set to `false` or `'transform'`. When using `'transform'`, the corresponding features are compiled to older syntax that can run in older browsers. When using `false` (the default), they are not compiled and they can be relied upon to compile more modern features.
unicodeFlag - The `u` flag, enabling support for Unicode code point escapes in the form `\u{...}`.
rewritePattern('\\u{ab}', '', {
unicodeFlag: 'transform'
});
// → '\\u{ab}'
rewritePattern('\\u{ab}', 'u', {
unicodeFlag: 'transform'
});
// → '\\xAB'
dotAllFlag - The `s` (`dotAll`) flag.
rewritePattern('.', '', {
dotAllFlag: 'transform'
});
// → '[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uFFFF]'
rewritePattern('.', 's', {
dotAllFlag: 'transform'
});
// → '[\\0-\\uFFFF]'
rewritePattern('.', 'su', {
dotAllFlag: 'transform'
});
// → '(?:[\\0-\\uD7FF\\uE000-\\uFFFF]|[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF](?![\\uDC00-\\uDFFF])|(?:[^\\uD800-\\uDBFF]|^)[\\uDC00-\\uDFFF])'
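The effect of the `s` flag is easiest to see on line terminators. The sketch below compiles the first two outputs shown above and tests them against the four characters that a plain dot skips; the variable names are illustrative assumptions, and only the pattern strings come from the examples.

```javascript
// Pattern sources copied from the rewritePattern() output above.
const plainDotSource = '[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uFFFF]';
const dotAllSource = '[\\0-\\uFFFF]';

const plainDot = new RegExp('^' + plainDotSource + '$');
const dotAll = new RegExp('^' + dotAllSource + '$');

// The four line terminators that `.` excludes unless the `s` flag is set.
const lineTerminators = ['\n', '\r', '\u2028', '\u2029'];

const dotAllChecks = lineTerminators.map((ch) => ({
  codePoint: ch.codePointAt(0).toString(16),
  plainDot: plainDot.test(ch),     // false for every line terminator
  dotAll: dotAll.test(ch),         // true for every line terminator
  nativeDotAll: /^.$/s.test(ch),   // true, for comparison with the native flag
}));
console.log(dotAllChecks);
```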
unicodePropertyEscapes - Unicode property escapes.
By default they are compiled to Unicode code point escapes of the form `\u{...}`. If the `unicodeFlag` option is set to `'transform'` they often result in larger output, although there are cases (such as `\p{Lu}`) where it actually decreases the output size.
rewritePattern('\\p{Script_Extensions=Anatolian_Hieroglyphs}', 'u', {
unicodePropertyEscapes: 'transform'
});
// → '[\\u{14400}-\\u{14646}]'
rewritePattern('\\p{Script_Extensions=Anatolian_Hieroglyphs}', 'u', {
unicodeFlag: 'transform',
unicodePropertyEscapes: 'transform'
});
// → '(?:\\uD811[\\uDC00-\\uDE46])'
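The second output follows from how UTF-16 encodes astral code points as surrogate pairs. The sketch below is not regexpu-core's implementation; `toSurrogatePair` is a hypothetical helper that only illustrates the arithmetic and checks that both outputs above match the same character.

```javascript
// Hypothetical helper: split an astral code point (U+10000 and above)
// into its UTF-16 high and low surrogates.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;
  const high = 0xD800 + (offset >> 10);
  const low = 0xDC00 + (offset & 0x3FF);
  return [high, low];
}

const toHex = (n) => n.toString(16).toUpperCase();
const rangeStart = toSurrogatePair(0x14400).map(toHex);
const rangeEnd = toSurrogatePair(0x14646).map(toHex);
// rangeStart is ['D811', 'DC00'] and rangeEnd is ['D811', 'DE46'],
// which corresponds to \uD811[\uDC00-\uDE46] in the output above.
console.log(rangeStart, rangeEnd);

// Both outputs shown above accept the first code point of the range.
const withUFlag = new RegExp('^[\\u{14400}-\\u{14646}]$', 'u');
const withoutUFlag = new RegExp('^(?:\\uD811[\\uDC00-\\uDE46])$');
const sampleSymbol = String.fromCodePoint(0x14400);
const propertyResults = [withUFlag.test(sampleSymbol), withoutUFlag.test(sampleSymbol)];
console.log(propertyResults);
```

Because the whole range shares one high surrogate, the ES5 form stays compact here; ranges spanning many high surrogates expand into more alternatives.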
namedGroups - Named capture groups.
rewritePattern('(?<name>.)\\k<name>', '', {
namedGroups: 'transform'
});
// → '(.)\\1'
unicodeSetsFlag - The `v` (`unicodeSets`) flag.
rewritePattern('[\\p{Emoji}&&\\p{ASCII}]', 'v', {
unicodeSetsFlag: 'transform'
});
// → '[#\\*0-9]'
By default, patterns with the `v` flag are transformed to patterns with the `u` flag. If you want to downlevel them more you can set the `unicodeFlag: 'transform'` option.
rewritePattern('[^[a-h]&&[f-z]]', 'v', {
unicodeSetsFlag: 'transform'
});
// → '[^f-h]' (to be used with /u)
rewritePattern('[^[a-h]&&[f-z]]', 'v', {
unicodeSetsFlag: 'transform',
unicodeFlag: 'transform'
});
// → '(?:(?![f-h])[\\s\\S])' (to be used without /u)
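As a quick sanity check, the two downleveled forms above can be compared on a few sample characters. The original class `[^[a-h]&&[f-z]]` is the complement of the intersection of `a-h` and `f-z`, that is, everything except `f`, `g` and `h`. This sketch only exercises BMP characters, and the variable names are illustrative assumptions.

```javascript
// The `u`-flag output and the ES5 output from the examples above.
const uFlagForm = new RegExp('^[^f-h]$', 'u');
const es5Form = new RegExp('^(?:(?![f-h])[\\s\\S])$');

const sampleChars = ['a', 'e', 'f', 'g', 'h', 'i', 'z'];
const setChecks = sampleChars.map((ch) => ({
  ch,
  uFlagForm: uFlagForm.test(ch),
  es5Form: es5Form.test(ch),
}));
// Both forms reject only 'f', 'g' and 'h' among these samples.
console.log(setChecks);
```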
modifiers - Inline `i`/`m`/`s` modifiers.
rewritePattern('(?i:[a-z])[a-z]', '', {
modifiers: 'transform'
});
// → '(?:[a-zA-Z])([a-z])'
These options can be set to `false`, `'parse'` and `'transform'`. When using `'transform'`, the corresponding features are compiled to older syntax that can run in older browsers. When using `'parse'`, they are parsed and left as-is in the output pattern. When using `false` (the default), they result in a syntax error if used.
Once these features become stable (when the proposals are accepted as part of ECMAScript), they will be parsed by default and thus `'parse'` will behave like `false`.
onNamedGroup
This option is a function that gets called when a named capture group is found. It receives two parameters: the name of the group, and its index.
rewritePattern('(?<name>.)\\k<name>', '', {
onNamedGroup(name, index) {
console.log(name, index);
// → 'name', 1
}
});
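The names and indices reported through `onNamedGroup` are enough to rebuild a `groups` object at run time once the names have been compiled away. The following is a minimal sketch under stated assumptions: `execWithGroups` and the `collectedIndices` map are hypothetical, the map is assumed to have been filled from `onNamedGroup` calls, and this is not the wrapper that Babel generates.

```javascript
// Hypothetical helper: run a rewritten (name-free) pattern and attach a
// `groups` object built from a name → index map.
function execWithGroups(regex, groupIndices, input) {
  const match = regex.exec(input);
  if (match === null) return null;
  const groups = Object.create(null);
  for (const name of Object.keys(groupIndices)) {
    groups[name] = match[groupIndices[name]];
  }
  match.groups = groups;
  return match;
}

// '(?<name>.)\\k<name>' rewritten with namedGroups: 'transform' (see above).
const rewrittenNamed = new RegExp('(.)\\1');
// Assumed to have been collected via onNamedGroup(name, index).
const collectedIndices = { name: 1 };

const namedMatch = execWithGroups(rewrittenNamed, collectedIndices, 'xaay');
console.log(namedMatch[0], namedMatch.groups.name);
// logs: aa a
```

Keeping the map separate from the regular expression keeps the sketch small; a fuller wrapper would also need to cover methods such as `String.prototype.replace()`.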
onNewFlags
This option is a function that gets called to pass the flags that the resulting pattern must be interpreted with.
rewritePattern('abc', 'um', {
unicodeFlag: 'transform',
onNewFlags(flags) {
console.log(flags);
// → 'm'
}
})
When using `namedGroups: 'transform'`, regexpu-core only takes care of the syntax: you will still need a runtime wrapper around the regular expression to populate the `.groups` property of `RegExp.prototype.match()`'s result. If you are using regexpu-core via Babel, it's handled automatically.
On the `main` branch, bump the version number in `package.json`:
npm version patch -m 'Release v%s'
Instead of `patch`, use `minor` or `major` as needed.
Note that this produces a Git commit + tag.
Push the release commit and tag:
git push --follow-tags
Our CI then automatically publishes the new release to npm.
Once the release has been published to npm, update regexpu to make use of it, and cut a new release of regexpu as well.
Mathias Bynens
regexpu-core is available under the MIT license.
FAQs
The npm package regexpu-core receives a total of 23,213,332 weekly downloads. As such, regexpu-core popularity was classified as popular.
We found that regexpu-core demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 4 open source maintainers collaborating on the project.