snapdragon-lexer
![Linux Build Status](https://img.shields.io/travis/here-be-snapdragons/snapdragon-lexer.svg?style=flat&label=Travis)
Converts a string into an array of tokens, with useful methods for looking ahead and behind, capturing, matching, et cetera.
Please consider following this project's author, Jon Schlinkert, and consider starring the project to show your :heart: and support.
Install
Install with npm:
$ npm install --save snapdragon-lexer
Usage
const Lexer = require('snapdragon-lexer');
const lexer = new Lexer();
lexer.capture('slash', /^\//);
lexer.capture('text', /^\w+/);
lexer.capture('star', /^\*/);
console.log(lexer.tokenize('foo/*'));
API
Create a new Lexer with the given options.
Params
input {String|Object}: (optional) Input string or options. You can also set input directly on lexer.input after initializing.
options {Object}
Example
const Lexer = require('snapdragon-lexer');
const lexer = new Lexer('foo/bar');
Create a new Token with the given type and val.
Params
type {String|Object}: (required) The type of token to create.
val {String}: (optional) The captured string.
match {Array}: (optional) Match arguments returned from String.match or RegExp.exec.
returns {Object}: Returns an instance of snapdragon-token.
Events
Example
console.log(lexer.token({type: 'star', val: '*'}));
console.log(lexer.token('star', '*'));
console.log(lexer.token('star'));
Returns true if the given value is a snapdragon-token instance.
Params
token {Object}
returns {Boolean}
Example
const Token = require('snapdragon-token');
lexer.isToken({}); // false: a plain object is not a Token instance
lexer.isToken(new Token({type: 'star', val: '*'})); // true
Register a lexer handler function for creating tokens by matching substrings of the given type.
Params
type {String}
fn {Function}: The handler function to register.
Example
lexer.set('star', function() {
  // match a leading "*" against the remaining input
  const match = this.match(/^\*/);
  if (match) {
    return this.token('star', match[0], match);
  }
});
Get the registered lexer handler function of the given type. If a handler is not found, an error is thrown.
Params
type {String}
returns {Function}: The registered handler function.
Example
const handler = lexer.get('text');
Removes the given string from the beginning of lexer.string and adds it to the end of lexer.consumed, then updates lexer.line and lexer.column with the current cursor position.
Params
string {String}
returns {Object}: Returns the instance for chaining.
Example
lexer.consume('*');
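The bookkeeping described above can be sketched without the library. This is only an illustration: the consume function and the plain state object below are hypothetical stand-ins, 1-based columns are assumed (as in the position output shown under Plugins), and the startsWith check is an addition of the sketch, not a documented behavior.

```javascript
// Hypothetical stand-alone sketch; not snapdragon-lexer's implementation.
// `state` mirrors the property names used in the description above.
function consume(state, str) {
  // Guard added by this sketch: only consume what is actually at the front.
  if (!state.string.startsWith(str)) {
    throw new Error(`expected input to start with "${str}"`);
  }
  state.string = state.string.slice(str.length);
  state.consumed += str;

  // Update the cursor: each newline starts a new line at column 1 (assumed 1-based).
  const lines = str.split('\n');
  if (lines.length > 1) {
    state.line += lines.length - 1;
    state.column = 1 + lines[lines.length - 1].length;
  } else {
    state.column += str.length;
  }
  return state;
}

const state = { string: 'a\nbc/d', consumed: '', line: 1, column: 1 };
consume(state, 'a\nbc');
console.log(state);
// { string: '/d', consumed: 'a\nbc', line: 2, column: 3 }
```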
Capture a substring from lexer.string with the given regex. Also validates the regex to ensure that it starts with ^, since matching should always be against the beginning of the string, and throws if the regex matches an empty string, which can cause catastrophic backtracking in some cases.
Params
regex {RegExp}: (required)
returns {Array}: Returns the match arguments from RegExp.exec or null.
Example
const lexer = new Lexer('foo/bar');
const match = lexer.match(/^\w+/);
console.log(match);
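To make the two validation rules concrete, here is a minimal sketch of how such checks could look. The helper names assertValidRegex and match are hypothetical, and the sketch tests for an empty match up front, which may differ from when the library performs its check.

```javascript
// Hypothetical helpers; not part of snapdragon-lexer's API.
function assertValidRegex(regex) {
  if (!(regex instanceof RegExp)) {
    throw new TypeError('expected a regular expression');
  }
  // Rule 1: the pattern must be anchored to the beginning of the string.
  if (!regex.source.startsWith('^')) {
    throw new Error(`expected regex to start with "^": ${regex}`);
  }
  // Rule 2: a pattern that matches the empty string would never consume input.
  if (regex.test('')) {
    throw new Error(`regex should not match an empty string: ${regex}`);
  }
  return regex;
}

// Validate first, then return the RegExp.exec result (an array or null).
function match(input, regex) {
  return assertValidRegex(regex).exec(input);
}

console.log(match('foo/bar', /^\w+/)[0]); // prints foo
console.log(match('/bar', /^\w+/));       // prints null
```

With these rules, /\w+/ is rejected because it is not anchored, and /^\w*/ is rejected because it matches an empty string.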
Scan for a matching substring by calling .match() with the given regex. If a match is found, 1) a token of the specified type is created, 2) match[0] is used as token.val, and 3) the length of match[0] is sliced from lexer.string (by calling .consume()).
Params
regex {RegExp}
type {String}
returns {Object}: Returns a token if a match is found, otherwise undefined.
Events
Example
lexer.string = '/foo/';
console.log(lexer.scan(/^\//, 'slash'));
console.log(lexer.scan(/^\w+/, 'text'));
console.log(lexer.scan(/^\//, 'slash'));
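The three steps can be sketched as a stand-alone function. The scan function and the input object below are hypothetical and use plain objects instead of snapdragon-token instances, so treat this as an illustration of the flow rather than the library's code.

```javascript
// Hypothetical stand-alone version of the three steps described above.
function scan(state, regex, type) {
  const match = regex.exec(state.string);
  if (!match) return undefined;

  // Steps 1 and 2: create a token of the given type, with match[0] as its value.
  const token = { type, val: match[0], match };

  // Step 3: remove the matched substring from the remaining input.
  state.string = state.string.slice(match[0].length);
  state.consumed += match[0];
  return token;
}

const input = { string: '/foo/', consumed: '' };
const tokens = [scan(input, /^\//, 'slash'), scan(input, /^\w+/, 'text')];
console.log(tokens.map(tok => `${tok.type}:${tok.val}`)); // [ 'slash:/', 'text:foo' ]
console.log(scan(input, /^\w+/, 'text')); // undefined
console.log(input); // { string: '/', consumed: '/foo' }
```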
Capture a token of the specified type using the provided regex for scanning and matching substrings. When .tokenize is used, captured tokens are pushed onto the lexer.tokens array.
Params
type {String}: (required) The type of token being captured.
regex {RegExp}: (required) The regex for matching substrings.
fn {Function}: (optional) If supplied, the function will be called on the token before pushing it onto lexer.tokens.
returns {Object}
Example
lexer.capture('text', /^\w+/);
// or pass a function that is called on each captured token
lexer.capture('text', /^\w+/, tok => {
  if (tok.match[0] === 'foo') {
    // inspect or modify the token here
  }
  return tok;
});
Calls handler type on lexer.string.
Params
type {String}: The handler type to call on lexer.string
returns {Object}: Returns a token of the given type or undefined.
Events
Example
const lexer = new Lexer('/a/b');
lexer.capture('slash', /^\//);
lexer.capture('text', /^\w+/);
console.log(lexer.lex('text')); // undefined, since the input starts with "/"
console.log(lexer.lex('slash')); // token of type 'slash'
console.log(lexer.lex('text')); // token of type 'text'
Get the next token by iterating over lexer.handlers and calling each handler on lexer.string until a handler returns a token. If no handlers return a token, an error is thrown with the substring that couldn't be lexed.
returns {Object}: Returns the first token returned by a handler.
Example
const token = lexer.advance();
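The handler loop described above can be sketched as follows. Everything here is a hypothetical miniature (createLexer, the Map of handlers, the error message, and returning undefined on empty input are assumptions of the sketch), intended only to show the "first handler that returns a token wins" idea.

```javascript
// Hypothetical minimal lexer; handlers are tried in registration order.
function createLexer(input) {
  const lexer = { string: input, handlers: new Map() };

  // Register a handler that matches `regex` at the start of the input.
  lexer.capture = (type, regex) => {
    lexer.handlers.set(type, () => {
      const match = regex.exec(lexer.string);
      if (!match) return undefined;
      lexer.string = lexer.string.slice(match[0].length);
      return { type, val: match[0] };
    });
    return lexer;
  };

  // Return the first token produced by any handler, or throw if none matches.
  lexer.advance = () => {
    if (lexer.string === '') return undefined; // assumption: nothing left to lex
    for (const handler of lexer.handlers.values()) {
      const token = handler();
      if (token) return token;
    }
    throw new Error(`unrecognized character: ${lexer.string.slice(0, 10)}`);
  };

  return lexer;
}

const mini = createLexer('a/!');
mini.capture('slash', /^\//).capture('text', /^\w+/);
console.log(mini.advance()); // { type: 'text', val: 'a' }
console.log(mini.advance()); // { type: 'slash', val: '/' }
try {
  mini.advance();
} catch (err) {
  console.log(err.message); // unrecognized character: !
}
```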
Tokenizes a string and returns an array of tokens.
Params
input {String}: The string to tokenize.
returns {Array}: Returns an array of tokens.
Example
lexer.capture('slash', /^\//);
lexer.capture('text', /^\w+/);
const tokens = lexer.tokenize('a/b/c');
console.log(tokens);
Push a token onto the lexer.queue array.
Params
token {Object}
returns {Object}: Returns the given token with updated token.index.
Example
console.log(lexer.queue.length);
lexer.enqueue(new Token('star', '*'));
console.log(lexer.queue.length);
Shift a token from lexer.queue.
returns {Object}: Returns the token that was shifted from lexer.queue.
Example
console.log(lexer.queue.length);
lexer.dequeue();
console.log(lexer.queue.length);
Look behind n tokens.
Params
n {Number}
returns {Object}
Example
const token = lexer.lookbehind(2);
Get the current token.
returns
{Object}: Returns a token.
Example
const token = lexer.current();
Get the previous token.
returns
{Object}: Returns a token.
Example
const token = lexer.prev();
Look ahead n tokens and return the last token. Pushes any intermediate tokens onto lexer.tokens.
To look ahead a single token, use .peek().
Params
n {Number}
returns {Object}
Example
const token = lexer.lookahead(2);
Lookahead a single token.
Example
const token = lexer.peek();
Get the next token, either from the queue or by advancing.
returns
{Object}: Returns a token.
Example
const token = lexer.next();
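One way to picture how lookahead, peek and next fit together is a small buffer in front of the lexer. The sketch below is hypothetical: createStream is not a library function, the token source is a pre-made array rather than real lexing, and it assumes looked-ahead tokens are buffered on a queue, which is what the description of next suggests.

```javascript
// Hypothetical token stream; `advance` hands out pre-made tokens, then undefined.
function createStream(tokens) {
  const pending = tokens.slice();
  const stream = { queue: [] };

  // Stand-in for lexing the next token from the input.
  stream.advance = () => pending.shift();

  // Fill the queue until it holds `n` tokens, then return the n-th one.
  stream.lookahead = n => {
    let fetch = n - stream.queue.length;
    while (fetch-- > 0) {
      const token = stream.advance();
      if (!token) break;
      stream.queue.push(token);
    }
    return stream.queue[n - 1];
  };

  stream.peek = () => stream.lookahead(1);

  // Prefer queued tokens so that looking ahead never changes the token order.
  stream.next = () => (stream.queue.length ? stream.queue.shift() : stream.advance());

  return stream;
}

const stream = createStream(['a', '/', 'b'].map(val => ({ val })));
console.log(stream.lookahead(2)); // { val: '/' }
console.log(stream.peek());       // { val: 'a' }
console.log([stream.next(), stream.next(), stream.next()].map(tok => tok.val));
// [ 'a', '/', 'b' ]
console.log(stream.next());       // undefined
```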
Skip n tokens. Skipped tokens are not enqueued.
Params
n {Number}
returns {Object}: Returns the very last lexed/skipped token.
Example
const token = lexer.skip(1);
Skip the given token types.
Params
types {String|Array}: One or more token types to skip.
returns {Array}: Returns an array of skipped tokens.
Example
lexer.skipType('space');
lexer.skipType(['newline', 'space']);
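As a rough illustration of skipping by type, the sketch below works on a plain array of upcoming tokens. The stand-alone skipType function is hypothetical, and it assumes that skipping stops at the first token whose type is not in the list.

```javascript
// Hypothetical helper operating on a plain array of upcoming tokens.
function skipType(tokens, types) {
  const wanted = [].concat(types); // accept a single type or an array of types
  const skipped = [];
  // Assumption: stop at the first token whose type is not listed.
  while (tokens.length && wanted.includes(tokens[0].type)) {
    skipped.push(tokens.shift());
  }
  return skipped;
}

const remaining = [
  { type: 'space', val: ' ' },
  { type: 'newline', val: '\n' },
  { type: 'text', val: 'foo' }
];
console.log(skipType(remaining, ['newline', 'space']).length); // 2
console.log(remaining); // [ { type: 'text', val: 'foo' } ]
console.log(skipType(remaining, 'space')); // []
```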
Push a token onto lexer.tokens.
Params
token {Object}
returns {Object}: Returns the given token with updated token.index.
Events
Example
console.log(lexer.tokens.length);
lexer.push(new Token('star', '*'));
console.log(lexer.tokens.length);
Returns true when the end-of-string has been reached, and lexer.queue is empty.
Throw a formatted error message with details including the cursor position.
Params
msg {String}: Message to use in the Error.
node {Object}
returns {undefined}
Example
lexer.set('foo', function(tok) {
  if (tok.val !== 'foo') {
    throw this.error('expected token.val to be "foo"', tok);
  }
});
Static method that returns true if the given value is an instance of snapdragon-lexer.
Params
lexer {Object}
returns {Boolean}
Example
const Lexer = require('snapdragon-lexer');
const lexer = new Lexer();
console.log(Lexer.isLexer(lexer));
console.log(Lexer.isLexer({}));
Static method that returns true if the given value is an instance of snapdragon-token. This is a proxy to Token#isToken.
Params
token {Object}
returns {Boolean}
Example
const Token = require('snapdragon-token');
const Lexer = require('snapdragon-lexer');
console.log(Lexer.isToken(new Token({type: 'foo'})));
console.log(Lexer.isToken({}));
Plugins
Pass plugins to the lexer.use() method.
Example
The snapdragon-position plugin adds a position property with line and column to tokens as they're created:
const position = require('snapdragon-position');
const Lexer = require('snapdragon-lexer');
const lexer = new Lexer('foo/*');
lexer.use(position());
lexer.capture('slash', /^\//);
lexer.capture('text', /^\w+/);
lexer.capture('star', /^\*/);
console.log(lexer.advance());
console.log(lexer.advance());
console.log(lexer.advance());
Results in:
Token {
  type: 'text',
  val: 'foo',
  match: [ 'foo', index: 0, input: 'foo/*' ],
  position: {
    start: { index: 0, column: 1, line: 1 },
    end: { index: 3, column: 4, line: 1 }
  }
}
Token {
  type: 'slash',
  val: '/',
  match: [ '/', index: 0, input: '/*' ],
  position: {
    start: { index: 3, column: 4, line: 1 },
    end: { index: 4, column: 5, line: 1 }
  }
}
Token {
  type: 'star',
  val: '*',
  match: [ '*', index: 0, input: '*' ],
  position: {
    start: { index: 4, column: 5, line: 1 },
    end: { index: 5, column: 6, line: 1 }
  }
}
Plugin Conventions
Plugins are just functions that take an instance of snapdragon-lexer. However, it's recommended that you wrap your plugin function in a function that takes an options object, to allow users to pass options when using the plugin. Even if your plugin doesn't take options, it's a best practice for users to always be able to use the same signature.
Example
const Lexer = require('snapdragon-lexer');
const lexer = new Lexer();
function yourPlugin(options) {
  return function(lexer) {
    // use the lexer instance (and options) here
  };
}
lexer.use(yourPlugin());
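To show why the options wrapper is useful, here is a small sketch of a plugin that follows the convention. MiniLexer is a hypothetical stand-in that only models use() (it is assumed that use() calls the plugin with the instance), and the prefix plugin and its label option are invented for illustration.

```javascript
// Hypothetical stand-in for a lexer; only `use` is modeled here.
class MiniLexer {
  use(plugin) {
    plugin.call(this, this);
    return this;
  }
}

// The outer function takes options; the inner function receives the lexer instance.
function prefix(options = {}) {
  const label = options.label || 'token';
  return function(lexer) {
    lexer.describe = val => `${label}: ${val}`;
  };
}

// The call signature is the same with or without options.
const a = new MiniLexer().use(prefix());
const b = new MiniLexer().use(prefix({ label: 'tok' }));
console.log([a.describe('*'), b.describe('*')]); // [ 'token: *', 'tok: *' ]
```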
About
Contributing
Pull requests and stars are always welcome. For bugs and feature requests, please create an issue.
Please read the contributing guide for advice on opening issues, pull requests, and coding standards.
Running Tests
Running and reviewing unit tests is a great way to get familiarized with a library and its API. You can install dependencies and run tests with the following command:
$ npm install && npm test
Building docs
(This project's readme.md is generated by verb, please don't edit the readme directly. Any changes to the readme must be made in the .verb.md readme template.)
To generate the readme, run the following command:
$ npm install -g verbose/verb
Author
Jon Schlinkert
License
Copyright © 2017, Jon Schlinkert.
Released under the MIT License.
This file was generated by verb-generate-readme, v0.6.0, on November 30, 2017.