minilex 0.1.0 (Rubygems)

Minilex

A little lexer toolkit, for basic lexing needs.

It's designed for the cases where parsers do the parsing, and all you need from your lexer is an array of simple tokens.

Usage

Expression = Minilex::Lexer.new do
  skip :whitespace, /\s+/
  tok :number, /\d+(?:\.\d+)?/
  tok :operator, /[\+\=\/\*]/
end

Expression.lex('1 + 2.34')
# => [[:number, '1', 1, 0],
#     [:operator, '+', 1, 3],
#     [:number, '2.34', 1, 5],
#     [:eos]]

To create a lexer with Minilex, instantiate a Minilex::Lexer and define rules.

There are two methods for defining rules, skip and tok:

skip takes an id and a pattern. The lexer will ignore all occurrences of the pattern in the input text. The id doesn't appear in the output, but it's a required argument and makes the rule easier to read.

tok also takes an id and a pattern. The lexer will turn all occurrences of the pattern into a token of the form:

[id, value, line, offset]

# id     - the id you provided
# value  - the matched value
# line   - line number
# offset - character position in the line
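To make the line and offset fields concrete, here's a hypothetical pure-Ruby sketch of the kind of scanning loop that produces such tokens (this is an illustration built on StringScanner, not Minilex's actual implementation; offsets here are simple 0-based column positions and may not match Minilex's exact numbering):

```ruby
require "strscan"

# Rules are tried in order at the current position; :skip rules consume
# input without emitting a token. (Hypothetical names, for illustration.)
RULES = [
  [:whitespace, /\s+/, :skip],
  [:number,     /\d+(?:\.\d+)?/],
  [:operator,   %r{[+\-*/=]}]
]

def lex(input)
  scanner = StringScanner.new(input)
  tokens  = []
  line, offset = 1, 0

  until scanner.eos?
    id, pattern, action = RULES.find { |_, p, _| scanner.match?(p) }
    raise "unexpected input at line #{line}, offset #{offset}" unless id

    value = scanner.scan(pattern)
    tokens << [id, value, line, offset] unless action == :skip

    # Advance the position counters past the matched text.
    newlines = value.count("\n")
    if newlines > 0
      line  += newlines
      offset = value.length - value.rindex("\n") - 1
    else
      offset += value.length
    end
  end

  tokens << [:eos]
end

lex("1 + 2.34")
# => [[:number, "1", 1, 0], [:operator, "+", 1, 2], [:number, "2.34", 1, 4], [:eos]]
```

The key idea is that every rule's pattern is anchored at the scanner's current position, so the lexer never searches ahead; whatever matches first wins, and unmatched input is an error rather than being silently dropped.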

Overriding the token format

If you'd like to customize the token format, override append_token:

Digits = Minilex::Lexer.new do
  skip :whitespace, /\s+/
  tok :digit, /\d/

  # id    - the id of the matched rule
  # value - the value that was matched
  #
  # You have access to the array of tokens via `tokens` and the current
  # token's position information via `pos`.
  def append_token(id, value)
    tokens << Integer(value)
  end

  # By default, the lexer will append an end-of-stream token to the end of
  # the tokens array. You can override what the eos token is or even suppress
  # it altogether with the append_eos callback.
  #
  # Here we'll suppress it by doing nothing
  def append_eos
  end
end

Digits.lex('1 2 3 4')
# => [1, 2, 3, 4]

Processing values

There's one more thing you can do. It's just for convenience, though I'm not sure it really belongs in something that's supposed to do as little as possible. I might remove it.

The tok method accepts a third optional processor argument, which should name a method on the lexer (you'll have to write the method, of course).

This gives you a chance to transform the matched text before it gets stuffed into a token:

DigitsConverter = Minilex::Lexer.new do
  skip :whitespace, /\s+/
  tok :digit, /\d/, :integer

  def integer(str)
    Integer(str)
  end
end

DigitsConverter.lex('123')
# => [[:digit, 1, 1, 0], [:digit, 2, 1, 1], [:digit, 3, 1, 2], [:eos]]
#              ^                  ^                  ^
#            These are Integers (would have been Strings)

Package last updated on 30 Apr 2012
