basic-lexer
This basic lexer class is meant to be used within a larger lexing project. It serves as a state container that holds lexing state and handles token extraction.
Get it
npm i @johanneslumpe/basic-lexer
Use it
Below is a contrived example, kept deliberately basic to illustrate how to use the lexer. Although this example could also be solved with a simple regular expression, regular expressions are generally hard to read and debug. A lexer gives you more flexibility while keeping your code readable and easy to debug.
NOTE: This library makes use of ES7 array and string methods and Symbol. To use it within an environment that does not support these, you have to provide your own polyfills.
import { EOS, Lexer } from '@johanneslumpe/basic-lexer';

// The token types this lexer can emit.
const enum IMyTokens {
  WORD = 'WORD',
  SPACE = 'SPACE',
  PERIOD = 'PERIOD',
  LEXING_ERROR = 'ERROR',
}

type MyTokenLexer = Lexer<IMyTokens>;

// A state function does some lexing and returns the next state,
// or undefined once lexing is finished.
type stateFn = (lexer: MyTokenLexer) => stateFn | undefined;

// Only lowercase ASCII letters (char codes 97 to 122) count as word characters.
const validWordChar = (char: string) => {
  const charKeycode = char.charCodeAt(0);
  return charKeycode >= 97 && charKeycode <= 122;
};

// Consume the rest of the current word and emit it as a single token.
const word = (lexer: MyTokenLexer): stateFn => {
  lexer.acceptRun(validWordChar);
  lexer.emit(IMyTokens.WORD);
  return sentence;
};

// Build a state function that emits an error and ends lexing.
const lexError = (message: string) => (lexer: MyTokenLexer): undefined => {
  lexer.emitError(IMyTokens.LEXING_ERROR, message);
  return undefined;
};

// Entry state: inspect the next character and decide which state follows.
const sentence = (lexer: MyTokenLexer): stateFn | undefined => {
  const next = lexer.next();
  switch (next) {
    case '.':
      lexer.emit(IMyTokens.PERIOD);
      return sentence;
    case ' ':
      lexer.emit(IMyTokens.SPACE);
      return sentence;
    case EOS:
      return undefined;
  }
  if (validWordChar(next)) {
    return word;
  }
  return lexError(`Invalid character found: ${next}`);
};

// Run state functions until one of them returns undefined.
export const lexMySentence = (lexer: MyTokenLexer) => {
  let state: stateFn | undefined = sentence;
  while (state !== undefined) {
    state = state(lexer);
  }
  return lexer;
};

// Every character is valid here, so lexing runs until the end of the string.
const myLexer = lexMySentence(new Lexer<IMyTokens>('lexing is fun.'));
console.log(myLexer.emittedTokens);

// '3' is not a valid word character, so lexing ends via lexError.
const myOtherLexer = lexMySentence(new Lexer<IMyTokens>('lexing is l337.'));
console.log(myOtherLexer.emittedTokens);
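To make the state-function pattern used above easier to follow without the library installed, here is a minimal, self-contained sketch. MiniLexer, END, MiniState, lexText and runMiniLexer are hypothetical names invented for this illustration; they only mimic the roles of next, acceptRun and emit from the example and are not the library's implementation or API.

```typescript
// Hypothetical stand-in for illustration only; NOT the library's implementation.
const END = Symbol('END'); // plays the role of EOS in the example above

interface Token<T> {
  type: T;
  value: string;
}

class MiniLexer<T> {
  public tokens: Token<T>[] = [];
  private input: string;
  private start = 0; // index where the pending token begins
  private pos = 0; // current read position

  constructor(input: string) {
    this.input = input;
  }

  // Consume and return the next character, or END when the input is exhausted.
  next(): string | typeof END {
    if (this.pos >= this.input.length) {
      return END;
    }
    return this.input.charAt(this.pos++);
  }

  // Keep consuming while the predicate accepts the upcoming character.
  acceptRun(isValid: (char: string) => boolean): void {
    while (this.pos < this.input.length && isValid(this.input.charAt(this.pos))) {
      this.pos++;
    }
  }

  // Emit everything consumed since the last emit as a single token.
  emit(type: T): void {
    this.tokens.push({ type, value: this.input.slice(this.start, this.pos) });
    this.start = this.pos;
  }
}

// A state does some work and returns the next state, or undefined to stop.
type MiniState = (lexer: MiniLexer<string>) => MiniState | undefined;

const isLower = (char: string): boolean => char >= 'a' && char <= 'z';

const lexText: MiniState = (lexer) => {
  const char = lexer.next();
  if (typeof char !== 'string') {
    return undefined; // reached END
  }
  if (isLower(char)) {
    lexer.acceptRun(isLower);
    lexer.emit('WORD');
  } else {
    lexer.emit('OTHER');
  }
  return lexText;
};

// Drive the state machine until a state returns undefined.
const runMiniLexer = (input: string): Token<string>[] => {
  const lexer = new MiniLexer<string>(input);
  let state: MiniState | undefined = lexText;
  while (state !== undefined) {
    state = state(lexer);
  }
  return lexer.tokens;
};
```

Calling runMiniLexer('hi yo.') in this sketch yields four tokens with the values 'hi', ' ', 'yo' and '.'. The driver loop has the same shape as lexMySentence: each state decides what comes next, so adding a new token type means adding a state rather than growing one large conditional.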
Documentation
Typedocs can be found in the docs folder.