Changelog


[2.0.0] - 2018-01-08

Breaking changes

The following changes were made in an effort to make the API closer to other popular parsing libraries, such as babel and acorn.

  • Renamed token.val to token.value
  • lexer.loc.column was changed from a 1-index number to a 0-index number
  • .current is now a property set by the .handle() method. The value of lexer.current is whatever is returned by a handler.
  • .prev() now returns the previously lexed token
  • .push()

Added

  • If lexer.options.mode is set to character, lexer.advance() will consume and return a single character each time it's called, instead of iterating over the handlers.
  • The token.match array is now decorated with a .consumed property, which is the value of lexer.consumed before the match was created.
  • Adds lexer.stack for tracking opening/closing structures
  • Adds lexer.stash for storing an array of strings (in addition to lexer.tokens, which stores objects)
  • Adds .append
  • Adds .skipWhile
  • Adds .skipSpaces

Readme


snapdragon-lexer

Converts a string into an array of tokens, with useful methods for looking ahead and behind, capturing, matching, et cetera.

Please consider following this project's author, Jon Schlinkert, and consider starring the project to show your :heart: and support.


Install

Install with npm:

$ npm install --save snapdragon-lexer

Breaking changes in v2.0!

Please see the changelog for details!

Usage

const Lexer = require('snapdragon-lexer');
const lexer = new Lexer();

lexer.capture('slash', /^\//);
lexer.capture('text', /^\w+/);
lexer.capture('star', /^\*/);

console.log(lexer.tokenize('foo/*'));
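// Results in (roughly, given the handlers above):
// [ Token { type: 'text', value: 'foo' },
//   Token { type: 'slash', value: '/' },
//   Token { type: 'star', value: '*' } ]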

API

Lexer

Create a new Lexer with the given options.

Params

  • input {String|Object}: (optional) Input string or options. You can also set input directly on lexer.input after initializing.
  • options {Object}

Example

const Lexer = require('snapdragon-lexer');
const lexer = new Lexer('foo/bar');

.token

Create a new Token with the given type and value.

Params

  • type {String|Object}: (required) The type of token to create
  • value {String}: (optional) The captured string
  • match {Array}: (optional) Match arguments returned from String.match or RegExp.exec
  • returns {Object}: Returns an instance of snapdragon-token

Events

  • emits: token

Example

console.log(lexer.token({type: 'star', value: '*'}));
console.log(lexer.token('star', '*'));
console.log(lexer.token('star'));
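// Each call should return a snapdragon-token instance, roughly:
//=> Token { type: 'star', value: '*' } (the last call has no value)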

.isToken

Returns true if the given value is a snapdragon-token instance.

Params

  • token {Object}
  • returns {Boolean}

Example

const Token = require('snapdragon-token');
lexer.isToken({}); // false
lexer.isToken(new Token({type: 'star', value: '*'})); // true

.consume

Consume the given length from lexer.string. The consumed value is used to update lexer.consumed, as well as the current position.

Params

  • len {Number}
  • value {String}: Optionally pass the value being consumed.
  • returns {String}: Returns the consumed value

Example

lexer.consume(1);
lexer.consume(1, '*');
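
A sketch of the bookkeeping this implies (assuming a freshly constructed lexer, so lexer.string starts out equal to lexer.input):

const lexer = new Lexer('foo/bar');
lexer.consume(3, 'foo');
console.log(lexer.consumed); //=> 'foo'
console.log(lexer.string);   //=> '/bar'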

.match

Capture a substring from lexer.string with the given regex. Also validates the regex to ensure that it starts with ^ since matching should always be against the beginning of the string, and throws if the regex matches an empty string, which can cause catastrophic backtracking in some cases.

Params

  • regex {RegExp}: (required)
  • returns {Array}: Returns the match arguments from RegExp.exec or null.

Example

const lexer = new Lexer('foo/bar');
const match = lexer.match(/^\w+/);
console.log(match);
//=> [ 'foo', index: 0, input: 'foo/bar' ]

.scan

Scan for a matching substring by calling .match() with the given regex. If a match is found, 1) a token of the specified type is created, 2) match[0] is used as token.value, and 3) the length of match[0] is sliced from lexer.string (by calling .consume()).

Params

  • regex {RegExp}
  • type {String}
  • returns {Object}: Returns a token if a match is found, otherwise undefined.

Events

  • emits: scan

Example

lexer.string = '/foo/';
console.log(lexer.scan(/^\//, 'slash'));
//=> Token { type: 'slash', value: '/' }
console.log(lexer.scan(/^\w+/, 'text'));
//=> Token { type: 'text', value: 'foo' }
console.log(lexer.scan(/^\//, 'slash'));
//=> Token { type: 'slash', value: '/' }

.capture

Capture a token of the specified type using the provided regex for scanning and matching substrings. Automatically registers a handler when a function is passed as the last argument.

Params

  • type {String}: (required) The type of token being captured.
  • regex {RegExp}: (required) The regex for matching substrings.
  • fn {Function}: (optional) If supplied, the function will be called on the token before pushing it onto lexer.tokens.
  • returns {Object}

Example

lexer.capture('text', /^\w+/);
lexer.capture('text', /^\w+/, token => {
  if (token.value === 'foo') {
    // do stuff
  }
  return token;
});

.handle

Calls the registered handler of the given type on lexer.string.

Params

  • type {String}: The handler type to call on lexer.string
  • returns {Object}: Returns a token of the given type or undefined.

Events

  • emits: handle

Example

const lexer = new Lexer('/a/b');
lexer.capture('slash', /^\//);
lexer.capture('text', /^\w+/);
console.log(lexer.handle('text'));
//=> undefined
console.log(lexer.handle('slash'));
//=> { type: 'slash', value: '/' }
console.log(lexer.handle('text'));
//=> { type: 'text', value: 'a' }

.advance

Get the next token by iterating over lexer.handlers and calling each handler on lexer.string until a handler returns a token. If no handlers return a token, an error is thrown with the substring that couldn't be lexed.

  • returns {Object}: Returns the first token returned by a handler, or the first character in the remaining string if options.mode is set to character.

Example

const token = lexer.advance();

.tokenize

Tokenizes a string and returns an array of tokens.

Params

  • input {String}: The string to tokenize.
  • returns {Array}: Returns an array of tokens.

Example

lexer.capture('slash', /^\//);
lexer.capture('text', /^\w+/);
const tokens = lexer.tokenize('a/b/c');
console.log(tokens);
// Results in:
// [ Token { type: 'text', value: 'a' },
//   Token { type: 'slash', value: '/' },
//   Token { type: 'text', value: 'b' },
//   Token { type: 'slash', value: '/' },
//   Token { type: 'text', value: 'c' } ]

.enqueue

Push a token onto the lexer.queue array.

Params

  • token {Object}
  • returns {Object}: Returns the given token with updated token.index.

Example

console.log(lexer.queue.length); // 0
lexer.enqueue(new Token('star', '*'));
console.log(lexer.queue.length); // 1

.dequeue

Shift a token from lexer.queue.

  • returns {Object}: Returns the dequeued token.

Example

console.log(lexer.queue.length); // 1
lexer.dequeue();
console.log(lexer.queue.length); // 0

.lookbehind

Lookbehind n tokens.

Params

  • n {Number}
  • returns {Object}

Example

const token = lexer.lookbehind(2);

.prev

Get the previous token.

  • returns {Object}: Returns a token.

Example

const token = lexer.prev();

.lookahead

Lookahead n tokens and return the last token. Pushes any intermediate tokens onto lexer.queue. To lookahead a single token, use .peek().

Params

  • n {Number}
  • returns {Object}

Example

const token = lexer.lookahead(2);
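
To make the queueing behavior concrete, here is a sketch (based on the lexer.queue notes later in this readme; exact token shapes may vary):

const lexer = new Lexer('a/b');
lexer.capture('slash', /^\//);
lexer.capture('text', /^\w+/);

console.log(lexer.lookahead(2)); //=> Token { type: 'slash', value: '/' }
console.log(lexer.queue.length); //=> 2 (lookahead tokens are queued)
console.log(lexer.next());       //=> Token { type: 'text', value: 'a' } (dequeued)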

.peek

Lookahead a single token.

  • returns {Object} token

Example

const token = lexer.peek();

.next

Get the next token, either from the queue or by advancing.

  • returns {Object|String}: Returns a token, or (when options.mode is set to character) the next character, taken from lexer.queue if present or consumed from the string.

Example

const token = lexer.next();

.skip

Skip n tokens or characters in the string. Skipped values are not enqueued.

Params

  • n {Number}
  • returns {Object}: returns the very last lexed/skipped token.

Example

const token = lexer.skip(1);

.skipWhile

Skip tokens while the given function returns true.

Params

  • fn {Function}: Return true if a token should be skipped.
  • returns {Array}: Returns an array of skipped tokens.

Example

lexer.skipWhile(tok => tok.type !== 'space');

.skipType

Skip the given token types.

Params

  • types {String|Array}: One or more token types to skip.
  • returns {Array}: Returns an array of skipped tokens.

Example

lexer.skipType('space');
lexer.skipType(['newline', 'space']);

.skipSpaces

Consume spaces.

  • returns {String}: Returns the skipped string.

Example

lexer.skipSpaces();

.append

Pushes the given value onto lexer.stash.

Params

  • value {any}
  • returns {Object}: Returns the Lexer instance.

Events

  • emits: append

Example

lexer.append('abc');
lexer.append('/');
lexer.append('*');
lexer.append('.');
lexer.append('js');
console.log(lexer.stash);
//=> ['abc', '/', '*', '.', 'js']

.push

Pushes the given token onto lexer.tokens and calls .append() to push token.value onto lexer.stash. Disable pushing onto the stash by setting lexer.options.append or token.append to false.

Params

  • token {Object|String}
  • returns {Object}: Returns the given token.

Events

  • emits: push

Example

console.log(lexer.tokens.length); // 0
lexer.push(new Token('star', '*'));
console.log(lexer.tokens.length); // 1
console.log(lexer.stash) // ['*']

.last

Get the last value in the given array.

Params

  • array {Array}
  • returns {any}

Example

console.log(lexer.last(lexer.tokens));

.isInside

Returns true if a token with the given type is on the stack.

Params

  • type {String}: The type to check for.
  • returns {Boolean}

Example

if (lexer.isInside('bracket') || lexer.isInside('brace')) {
  // do stuff
}

.value

Returns the value of a token using the property defined on lexer.options.value or token.value.

  • returns {String|undefined}

.eos

Returns true if lexer.string and lexer.queue are empty.

  • returns {Boolean}

Creates a new Lexer instance with the given options, and copies the handlers from the current instance to the new instance.

Params

  • options {Object}
  • parent {Object}: Optionally pass a different lexer instance to copy handlers from.
  • returns {Object}: Returns a new Lexer instance

.error

Throw a formatted error message with details including the cursor position.

Params

  • msg {String}: Message to use in the Error.
  • node {Object}
  • returns {undefined}

Example

lexer.set('foo', function(tok) {
  if (tok.value !== 'foo') {
    throw this.error('expected token.value to be "foo"', tok);
  }
});

Lexer#isLexer

Static method that returns true if the given value is an instance of snapdragon-lexer.

Params

  • lexer {Object}
  • returns {Boolean}

Example

const Lexer = require('snapdragon-lexer');
const lexer = new Lexer();
console.log(Lexer.isLexer(lexer)); //=> true
console.log(Lexer.isLexer({})); //=> false

Lexer#Stack

Static method for getting or setting the Stack constructor.

Lexer#Token

Static method for getting or setting the Token constructor, used by lexer.token() to create a new token.

Lexer#isToken

Static method that returns true if the given value is an instance of snapdragon-token. This is a proxy to Token#isToken.

Params

  • token {Object}
  • returns {Boolean}

Example

const Token = require('snapdragon-token');
const Lexer = require('snapdragon-lexer');
console.log(Lexer.isToken(new Token({type: 'foo'}))); //=> true
console.log(Lexer.isToken({})); //=> false

.set

Register a handler function.

Params

  • type {String}
  • fn {Function}: The handler function to register.

Example

lexer.set('star', function(token) {
  // do parser, lexer, or compiler stuff
});

As an alternative to .set, the .capture method will automatically register a handler when a function is passed as the last argument.

.get

Get a registered handler function.

Params

  • type {String}
  • returns {Function}: Returns the registered handler function.

Example

lexer.set('star', function() {
  // do parser, lexer, or compiler stuff
});
const star = lexer.get('star');

Properties

lexer.isLexer

Type: {boolean}

Default: true (constant)

This property is defined as a convenience, to make it easy for plugins to check for an instance of Lexer.

lexer.input

Type: {string}

Default: ''

The unmodified source string provided by the user.

lexer.string

Type: {string}

Default: ''

The source string minus the part of the string that has already been consumed.

lexer.consumed

Type: {string}

Default: ''

The part of the source string that has been consumed.

lexer.tokens

Type: {array}

Default: []

Array of lexed tokens.

lexer.stash

Type: {array}

Default: ['']

Array of captured strings. Similar to the lexer.tokens array, but stores strings instead of token objects.

lexer.stack

Type: {array}

Default: []

LIFO (last in, first out) array. A token is pushed onto the stack when an "opening" character or character sequence needs to be tracked. When the (matching) "closing" character or character sequence is encountered, the (opening) token is popped off of the stack.

The stack is not used by any lexer methods, it's reserved for the user. Stacks are necessary for creating Abstract Syntax Trees (ASTs), but if you require this functionality it would be better to use a parser such as [snapdragon-parser][snapdragon-parser], with methods and other conveniences for creating an AST.
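
For illustration, here is a user-land sketch (not built-in behavior) that tracks brace pairs on lexer.stack, assuming the stack supports the usual array push/pop:

lexer.capture('brace.open', /^\{/, token => {
  lexer.stack.push(token);
  return token;
});
lexer.capture('brace.close', /^\}/, token => {
  lexer.stack.pop();
  return token;
});
// while tokenizing, lexer.isInside('brace.open') reports whether an
// unclosed "{" is still open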

lexer.queue

Type: {array}

Default: []

FIFO (first in, first out) array, for temporarily storing tokens that are created when .lookahead() is called (or by a method that calls .lookahead(), such as .peek()).

Tokens are dequeued when .next() is called.

lexer.loc

Type: {Object}

Default: { index: 0, column: 0, line: 1 }

The updated source string location with the following properties.

  • index - 0-indexed
  • column - 0-indexed
  • line - 1-indexed
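
A rough sketch of how these values advance (hypothetical values, assuming the defaults above and a single-line input):

const lexer = new Lexer();
lexer.capture('text', /^\w+/);
lexer.tokenize('foo');
console.log(lexer.loc); //=> roughly { index: 3, column: 3, line: 1 }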

Plugins are available for automatically updating tokens with the location.

Options

options.source

Type: {string}

Default: undefined

The source of the input string. This is typically a filename or file path, but can also be 'string' if a string or buffer is provided directly.

If lexer.input is undefined, and options.source is a string, the lexer will attempt to set lexer.input by calling fs.readFileSync() on the value provided on options.source.
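
A hypothetical sketch of that behavior (the file path is illustrative only):

const lexer = new Lexer({ source: 'example.txt' });
// with no input string given, the lexer may read the contents of
// 'example.txt' via fs.readFileSync and use them as lexer.input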

options.mode

Type: {string}

Default: undefined

If options.mode is character, instead of calling handlers (which match using regex) the .advance() method will consume and return one character at a time.
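
A minimal sketch of character mode (constructor arguments follow the Lexer params described earlier; output is approximate):

const lexer = new Lexer('abc', { mode: 'character' });
console.log(lexer.advance()); //=> 'a'
console.log(lexer.advance()); //=> 'b'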

options.value

Type: {string}

Default: undefined

Specify the token property to use when the .push method pushes a value onto lexer.stash. The logic works something like this:

lexer.append(token[lexer.options.value || 'value']);

Tokens

See the snapdragon-token documentation for more details.

Plugins

Plugins are registered with the lexer.use() method and use the following conventions.

Plugin Conventions

Plugins are functions that take an instance of snapdragon-lexer.

However, it's recommended that you always wrap your plugin function in another function that takes an options object. This allows users to pass options when using the plugin. Even if your plugin doesn't take options, keeping the same signature means users can always call plugins the same way.

Example

function plugin(options) {
  return function(lexer) {
    // do stuff 
  };
}

lexer.use(plugin());
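
For instance, a hypothetical plugin that registers a whitespace handler (the name and regex are illustrative) could be written and used like this:

function whitespace(options) {
  return function(lexer) {
    lexer.capture('space', /^\s+/);
  };
}

lexer.use(whitespace());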

About

Contributing

Pull requests and stars are always welcome. For bugs and feature requests, please create an issue.

Please read the contributing guide for advice on opening issues, pull requests, and coding standards.

Running Tests

Running and reviewing unit tests is a great way to get familiarized with a library and its API. You can install dependencies and run tests with the following command:

$ npm install && npm test

Building docs

(This project's readme.md is generated by verb, please don't edit the readme directly. Any changes to the readme must be made in the .verb.md readme template.)

To generate the readme, run the following command:

$ npm install -g verbose/verb#dev verb-generate-readme && verb

You might also be interested in these projects:

Author

Jon Schlinkert

License

Copyright © 2018, Jon Schlinkert. Released under the MIT License.


This file was generated by verb-generate-readme, v0.6.0, on January 08, 2018.
