Xeger
More expressive regular expressions for JavaScript.
Pronounced "zeeger"
Install
Use npm. If you're targeting the browser, use Browserify.
npm install xeger
Usage
var xeger = require('xeger');
var boringRegex = /(https?)\:\/\/([^\/]+)(.+)\?(.*)/;
var coolRegex = xeger(function (x) {
  x.group(function (x) {
    x.literal('http');
    x.literal('s', { optional: true });
  });
  x.literal('://');
  x.group(function (x) {
    x.not('/', { multiple: true });
  });
  x.group(function (x) {
    x.any({ multiple: true });
  });
  x.literal('?');
  x.group(function (x) {
    x.any({ multiple: true, optional: true });
  });
});
var matched = coolRegex.exec('https://www.google.com/search?q=my_search');
matched:
[ 'https://www.google.com/search?q=my_search',
'https',
'www.google.com',
'/search',
'q=my_search',
index: 0,
input: 'https://www.google.com/search?q=my_search' ]
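For comparison, the hand-written boringRegex from the example can be run on its own, with no xeger dependency. The sketch below names each capture group; the parts object and its keys are illustrative names, not part of xeger.

```javascript
// Plain RegExp version of the example; xeger is not required here.
var boringRegex = /(https?)\:\/\/([^\/]+)(.+)\?(.*)/;
var matched = boringRegex.exec('https://www.google.com/search?q=my_search');

// matched[0] is the full match; capture groups start at index 1.
// "parts" and its keys are hypothetical names used for illustration.
var parts = {
  protocol: matched[1],
  host: matched[2],
  path: matched[3],
  query: matched[4]
};

console.log(parts);
```

Each x.group call in coolRegex lines up with one index in the match array, which is why the builder version returns the same four groups.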
API
xeger([function], [options])
Call this to start the construction of the regex, passing in a callback function. It returns a RegExp object.
Use the rest of the functions in this section (the rule functions) to construct a regex by calling them within the callback.
The callback function will be called with one parameter, the xeger object. The rest of the functions here should be called on the xeger object. The callback is also called with the xeger object assigned to this.
The options object passed here is different from the options object used in the rest of the API. This one takes the following keys:
- global: [boolean] - Will attempt to match the regex multiple times.
- multiline: [boolean] - Makes start and end match at line boundaries, not only at the ends of the whole string.
- insensitive: [boolean] - Case insensitive matching.
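These keys read like the standard JavaScript regex flags. Assuming they correspond to g, m, and i respectively (an assumption; the mapping is not spelled out here), the native behavior can be sketched without xeger:

```javascript
var text = 'Hi\nhi';

// g: collect every match instead of stopping at the first one.
var globalOnly = text.match(/hi/g);

// i: ignore case, so 'Hi' is matched as well.
var insensitive = text.match(/hi/gi);

// m: ^ and $ also match at line breaks, so the second line counts as a start.
var multiline = text.match(/^hi/gm);

// Without m, ^ only matches at position 0, where the text reads 'Hi'.
var singleLine = text.match(/^hi/g);

console.log(globalOnly, insensitive, multiline, singleLine);
```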
x.literal([string], [options])
Matches the exact string passed in. x.literal will escape any non-alphanumeric character.
xeger(function (x) {
  x.literal('exact?!');
});
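To see why the escaping matters, here is a minimal sketch of the same idea in plain JavaScript. escapeLiteral is a hypothetical helper written for this illustration, not xeger's actual implementation.

```javascript
// Hypothetical helper (not xeger's code): backslash-escape every character
// that is not an ASCII letter or digit.
function escapeLiteral(str) {
  return str.replace(/[^A-Za-z0-9]/g, '\\$&');
}

var escaped = escapeLiteral('exact?!');
var strict = new RegExp(escaped);

// Without escaping, "?" is a quantifier that makes the "t" optional.
var unescaped = new RegExp('exact?!');

console.log(escaped);
console.log(strict.test('exact?!'), strict.test('exac!'));
console.log(unescaped.test('exact?!'), unescaped.test('exac!'));
```

The unescaped pattern matches 'exac!' but not the literal text 'exact?!', which is the surprise that escaping avoids.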
x.any([string|function|optional], [options])
Without a parameter, matches any single character. If you pass in a string, it will match any one of the characters in the string. If you pass in a function, the rules called inside it define the character set (see x.to).
xeger(function (x) {
  x.literal('abc');
  x.any();
  x.literal('123');
});
xeger(function (x) {
  x.any('abc');
  x.any(function () {
    x.literal('A');
    x.to();
    x.literal('Z');
  });
});
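As a rough guide, the two builders above should behave like the hand-written patterns below. This assumes x.any() with no argument becomes "." and that a string or callback becomes a character class; the generated source is not shown here, so treat it as a sketch.

```javascript
// Assumed native equivalents of the two x.any examples.
var anyChar = /abc.123/;    // x.any() with no argument
var anyOf = /[abc][A-Z]/;   // x.any('abc'), then an A-to-Z range

console.log(anyChar.test('abc-123')); // one character sits between abc and 123
console.log(anyChar.test('abc123'));  // nothing sits between them
console.log(anyOf.test('bQ'));
console.log(anyOf.test('dQ'));
```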
x.not([string|function], [options])
The inverse of any. Creates a set of characters to not match against.
xeger(function (x) {
  x.literal('abc');
  x.not('xyz');
  x.not(function () {
    x.literal('0');
    x.to();
    x.literal('9');
  });
});
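Assuming x.not produces a negated character class (an assumption based on the description above), the example should behave like this hand-written pattern:

```javascript
// Assumed native equivalent: abc, then one character that is not x, y, or z,
// then one character that is not a digit.
var notExample = /abc[^xyz][^0-9]/;

console.log(notExample.test('abcqr'));
console.log(notExample.test('abcxr')); // x is in the excluded set
console.log(notExample.test('abcq5')); // 5 is in the excluded range
```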
x.to()
Used to create the '-' inside any and not functions (see examples for any and not).
If you were to just do x.any('A-Z'), the - would be escaped: /[A\-Z]/
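The difference between the escaped and unescaped hyphen is easy to check with plain regex literals:

```javascript
// What x.any('A-Z') yields according to the text above: a three-character set.
var escapedSet = /[A\-Z]/;

// What a real range looks like, which is what x.to() is for.
var rangeSet = /[A-Z]/;

console.log(escapedSet.test('M'), escapedSet.test('-'));
console.log(rangeSet.test('M'), rangeSet.test('-'));
```

The escaped form only matches the characters A, -, and Z, while the range form matches every uppercase letter and rejects the hyphen.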
x.alphanumeric([options])
Matches any single alpha-numeric character (includes letters, numbers, and the underscore).
xeger(function (x) {
  x.alphanumeric();
});
x.number([options])
Matches a single number character.
xeger(function (x) {
  x.number();
});
x.whitespace([options])
Matches a white-space character (e.g. tab, newline, and space).
xeger(function (x) {
  x.whitespace();
});
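The descriptions of alphanumeric, number, and whitespace match the standard shorthand classes \w, \d, and \s. Assuming that is what xeger emits (an assumption, not confirmed here), the native behavior is:

```javascript
// Assumed native equivalents of the three shorthand rules.
var alphanumeric = /^\w$/; // letters, digits, and underscore
var number = /^\d$/;       // a single digit
var whitespace = /^\s$/;   // tab, newline, space, and similar

console.log(alphanumeric.test('_'), alphanumeric.test('-'));
console.log(number.test('7'), number.test('x'));
console.log(whitespace.test('\t'), whitespace.test('\n'), whitespace.test('a'));
```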
x.newline([options])
Matches a newline character.
xeger(function (x) {
  x.newline();
});
x.start()
Matches the start of the string.
xeger(function (x) {
  x.start();
  x.literal('hi');
});
x.end()
Matches the end of the string.
xeger(function (x) {
  x.start();
  x.literal('hi');
  x.end();
});
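Assuming x.start() and x.end() correspond to the ^ and $ anchors (an assumption), the two examples above should behave like these patterns:

```javascript
var startsWithHi = /^hi/; // x.start(); x.literal('hi');
var exactlyHi = /^hi$/;   // the same, followed by x.end();

console.log(startsWithHi.test('hi there'), startsWithHi.test('oh hi'));
console.log(exactlyHi.test('hi'), exactlyHi.test('hi there'));
```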
x.group([function], [options])
Creates a capture group for all the rules declared within the passed in callback function.
Use the ignore option to create a non-capture group.
xeger(function (x) {
  x.group(function () {
    x.literal('abc');
  });
  x.group(function (x) {
    x.any('123');
  }, { ignore: true });
});
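The effect of ignore shows up in the match array. Assuming an ignored group becomes a (?:...) non-capturing group (an assumption), the example above is equivalent to:

```javascript
// First group captures; the second is non-capturing.
var grouped = /(abc)(?:[123])/;
var result = grouped.exec('abc2');

// result holds the full match plus one captured group, not two.
console.log(result.length, result[0], result[1]);
```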
options
You can pass in a few options to the above rule functions.
- multiple: [boolean] - Will try to continuously apply the rule.
- optional: [boolean] - Will try to match but will skip over if it cannot match.
- repeat: [integer] - Applies the specified rule the specified number of times.
- from: [integer] - Similar to repeat, but specifies the minimum.
- to: [integer] - Similar to from, but specifies the maximum. Often used with from.
xeger(function (x) {
  x.literal('a', { optional: true });
  x.any({ multiple: true, optional: true });
  x.literal('123', { multiple: true });
  x.literal('!', { from: 2, to: 3 });
  x.literal('$', { repeat: 5 });
});
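These options read like the standard regex quantifiers. The sketch below assumes optional maps to ?, multiple to +, multiple with optional to *, repeat to {n}, and from/to to {min,max}; that mapping is an assumption, so check the generated source property of the returned RegExp if the exact output matters.

```javascript
// Assumed native quantifier for each option.
var optional = /^a?b$/;     // { optional: true }
var multiple = /^\d+$/;     // { multiple: true }
var zeroOrMore = /^\d*$/;   // { multiple: true, optional: true }
var fromTo = /^!{2,3}$/;    // { from: 2, to: 3 }
var repeat = /^\${5}$/;     // { repeat: 5 }

console.log(optional.test('b'), optional.test('ab'));
console.log(multiple.test(''), zeroOrMore.test(''));
console.log(fromTo.test('!'), fromTo.test('!!'), fromTo.test('!!!!'));
console.log(repeat.test('$$$$$'), repeat.test('$$$$'));
```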