Compser
Compser is a parser library for Ruby inspired by elm-parser.
Take a look at the JSON parser and Calculator to get a glimpse of the syntax
and the building blocks you can use to compose more complex and sophisticated parsers.
Installation
Add the following line to your Gemfile:
gem 'compser', '~> 0.2'
See more details at https://rubygems.org/gems/compser.
drop
Discard any result or chomped string produced by the parser.
parser = drop(:token, '[').take(:integer).drop(:token, ']')
parser.parse('[150]')
parser.parse('[0]')
parser.parse('1234')
parser.parse('[]')
parser.parse('[900')
integer
Parse integers.
parser = take(:integer)
parser.parse('1')
parser.parse('1234')
parser.parse('-500')
parser.parse('1.34')
parser.parse('1e31')
parser.parse('123a')
parser.parse('0x1A')
def my_integer
  take(:one_of, [
    map(->(x) { x * -1 }).drop(:token, '-').take(:integer),
    take(:integer)
  ])
end
decimal
Parse floating-point numbers as BigDecimal.
parser = take(:decimal)
parser.parse('0.00009')
parser.parse('-0.00009')
parser.parse('bad')
parser.parse('1e31')
parser.parse('123a')
token
Parses the token from source.
parser = take(:token, 'module')
parser.parse('module')
parser.parse('modules')
parser.parse('modu')
parser.parse('Module')
keyword
Parses the keyword from source. The next character after the keyword must be a space, symbol or number.
parser = take(:keyword, 'let')
parser.parse('let')
parser.parse('letter')
parser.parse('Let')
parser.parse('le')
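The boundary rule is what separates keyword from token. A rough plain-Ruby sketch of that check follows. It does not use Compser; `keyword_at?` is a hypothetical helper, and it assumes that end of input is accepted and that "anything except a letter" is a fair reading of "space, symbol or number".

```ruby
# Hypothetical helper (not part of Compser): true when `keyword` appears at
# `offset` and is not immediately followed by a letter.
def keyword_at?(source, keyword, offset = 0)
  return false unless source[offset, keyword.length] == keyword

  following = source[offset + keyword.length] # nil at end of input
  following.nil? || !following.match?(/[[:alpha:]]/)
end

keyword_at?('let', 'let')    # => true  (end of input, assumed to be accepted)
keyword_at?('let x', 'let')  # => true  (followed by a space)
keyword_at?('letter', 'let') # => false (followed by a letter)
keyword_at?('Let', 'let')    # => false (case-sensitive match)
keyword_at?('le', 'let')     # => false (incomplete keyword)
```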
double_quoted_string
Parses a string between double quotes. Line breaks and tabs are supported.
parser = take(:double_quoted_string)
parser.parse('"Hello, world!"')
parser.parse('"line1\nline2"')
parser.parse('"Hello, \\"world\\"!"')
parser.parse('foo')
parser.parse('foo "bar"')
parser.parse('"foo')
map
Calls the map function with the values taken in the current pipeline, if the pipeline succeeds. The output of map becomes the output of the parser,
so any parser with a map can be chained into other parsers.
Important: The arity of the map function must equal the number of values taken in the pipeline.
Sum = ->(a, b) { a + b }
parser = map(Sum).take(:integer).drop(:token, '+').take(:integer)
parser.parse('1+1')
parser.parse('1+')
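The arity rule can be checked in plain Ruby, without Compser. The sketch below assumes the taken values are handed to the map function as positional arguments, which is what the rule implies. `apply_map` is a hypothetical helper, not part of the library.

```ruby
# Hypothetical helper (not part of Compser): calls a map function with the
# values a pipeline would have taken, checking the arity first.
def apply_map(fn, taken)
  # Proc#arity is negative when the lambda accepts a variable number of
  # arguments (for example ->(*xs) {}), so only fixed arities are checked.
  if fn.arity >= 0 && fn.arity != taken.length
    raise ArgumentError, "map expects #{fn.arity} values, pipeline took #{taken.length}"
  end

  fn.call(*taken)
end

sum = ->(a, b) { a + b }

apply_map(sum, [1, 1])  # => 2
# apply_map(sum, [1])   # raises ArgumentError: arity is 2, only 1 value taken
```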
one_of
Attempts to parse each branch in the order they appear in the list. If all branches fail then the parser fails.
Important: one_of will fail on the current branch if it had a partial success before failing. To let the next branch run, a branch has to fail
early, without chomping any character from source.
parser = take(:one_of, [ take(:integer), take(:double_quoted_string) ])
parser.parse('2023')
parser.parse('"Hello, world!"')
parser.parse('true')
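The "no partial success" rule is easiest to see in a toy model. The sketch below is plain Ruby and is not how Compser is implemented. A parser is modeled as a lambda that takes `(source, offset)` and returns `[ok, value, new_offset]`, and `literal`, `pair` and `one_of` are hypothetical helpers written for this illustration.

```ruby
# Matches a literal string; on a mismatch it fails without consuming anything.
def literal(str)
  lambda do |source, offset|
    if source[offset, str.length] == str
      [true, str, offset + str.length]
    else
      [false, nil, offset]
    end
  end
end

# Runs two parsers in order; on failure it reports the offset reached so far.
def pair(first, second)
  lambda do |source, offset|
    ok, a, mid = first.call(source, offset)
    return [false, nil, mid] unless ok

    ok, b, stop = second.call(source, mid)
    ok ? [true, [a, b], stop] : [false, nil, stop]
  end
end

# Ordered choice without backtracking: a branch that fails after consuming
# input makes the whole choice fail, and later branches are not tried.
def one_of(branches, source, offset)
  branches.each do |branch|
    ok, value, stop = branch.call(source, offset)
    return [true, value, stop] if ok
    return [false, nil, stop] if stop > offset # partial success: committed
  end
  [false, nil, offset]
end

let_x = pair(literal('let '), literal('x'))
let_y = pair(literal('let '), literal('y'))

one_of([let_x, let_y], 'let x', 0)           # => [true, ["let ", "x"], 5]
one_of([let_x, let_y], 'let y', 0)           # => [false, nil, 4]
one_of([literal('x'), literal('y')], 'y', 0) # => [true, "y", 1]
```

In the second call `let_x` consumes `"let "` before failing, so `let_y` never runs. In the third call the first branch fails at offset 0, so the second branch gets its turn.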
sequence
Iterates over the parser until done is called. We don't know in advance how many values will be taken,
so the map call should use the single splat operator to receive a list with all values taken in the loop.
ToList = ->(*integers) { integers }
CommaSeparatedInteger = ->(continue, done) do
  take(:integer)
    .drop(:spaces)
    .take(:one_of, [
      drop(:token, ',').drop(:spaces).and_then(continue),
      done
    ])
end
parser = map(ToList).take(:sequence, CommaSeparatedInteger)
parser.parse('12, 23, 34')
parser.parse('123')
parser.parse('12,')
parser.parse(',12')
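To make the continue/done flow concrete, here is the same grammar written as an explicit loop with Ruby's standard `StringScanner`. This is only a plain-Ruby analogy for the CommaSeparatedInteger loop above, not Compser's implementation, and `comma_separated_integers` is a hypothetical name.

```ruby
require 'strscan'

# Returns the list of integers, or nil when a step cannot take an integer.
def comma_separated_integers(source)
  scanner = StringScanner.new(source)
  values = []

  loop do
    digits = scanner.scan(/\d+/)     # take(:integer)
    return nil if digits.nil?

    values << digits.to_i
    scanner.skip(/\s*/)              # drop(:spaces)
    break unless scanner.skip(/,/)   # no comma: the `done` branch
    scanner.skip(/\s*/)              # comma: the `continue` branch
  end

  values
end

comma_separated_integers('12, 23, 34') # => [12, 23, 34]
comma_separated_integers('123')        # => [123]
comma_separated_integers('12,')        # => nil (an integer must follow the comma)
comma_separated_integers(',12')        # => nil (the first step needs an integer)
```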
lazy
Wraps a parser in a lazy-evaluated proc. Use lazy to build recursive parsers.
ToList = ->(*integers) { integers }
CommaSeparatedInteger = -> do
  take(:integer)
    .drop(:spaces)
    .take(:one_of, [
      drop(:token, ',').drop(:spaces).take(:lazy, CommaSeparatedInteger),
      succeed
    ])
end
parser = map(ToList).take(CommaSeparatedInteger.call())
parser.parse('12, 23, 34')
parser.parse('123')
parser.parse('12,')
parser.parse(',12')
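The reason recursion needs deferral can be shown without Compser. In the plain-Ruby sketch below, a definition that refers to itself eagerly would recurse forever while it is being built, whereas wrapping the recursive part in a lambda postpones it until it is actually needed. `lazy_list` is a hypothetical example, unrelated to the library's internals.

```ruby
# Eager version, left commented out: building the value calls itself
# immediately, so calling it raises SystemStackError.
# def eager_list
#   [:item, eager_list]
# end

# Deferred version: the recursive part is wrapped in a lambda and is only
# built when that lambda is called.
def lazy_list(n = 0)
  [n, -> { lazy_list(n + 1) }]
end

head, rest = lazy_list
head            # => 0
rest.call.first # => 1
```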
spaces
Chomps zero or more blank spaces, line breaks and tabs. Always succeeds.
take(:spaces).parse(' \nfoo').state
chomp_if
Chomps a single character from source if predicate returns true. Otherwise, a bad result is pushed to state.
parser = take(:chomp_if, ->(ch) { ch == 'a' })
parser.parse('aaabb').state
parser.parse('cccdd').state
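As a rough model of the behavior described above, the plain-Ruby sketch below tracks only the offset into the source. It is not Compser's code: `chomp_if_offset` is a hypothetical helper, `nil` stands in for the bad result, and treating end of input as a failure is an assumption.

```ruby
# Returns the new offset when the character at `offset` satisfies the
# predicate, or nil when it does not (or when there is no character left).
def chomp_if_offset(source, offset, predicate)
  ch = source[offset]
  ch && predicate.call(ch) ? offset + 1 : nil
end

is_a = ->(ch) { ch == 'a' }

chomp_if_offset('aaabb', 0, is_a) # => 1   (exactly one 'a' chomped)
chomp_if_offset('cccdd', 0, is_a) # => nil (predicate rejected 'c')
chomp_if_offset('', 0, is_a)      # => nil (nothing to chomp)
```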
chomp_while
Chomps characters from source as long as predicate returns true. This parser always succeeds even if predicate
returns false for the first character. It is a zero-or-more loop.
parser = take(:chomp_while, ->(ch) { ch == 'a' })
parser.parse('aaabb').state
parser.parse('cccdd').state
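The zero-or-more behavior can be modeled in plain Ruby by tracking just the offset. This is an illustration rather than Compser's implementation, and `chomp_while_offset` is a hypothetical helper.

```ruby
# Advances past every consecutive character that satisfies the predicate and
# returns the new offset. It never fails: with no match the offset is unchanged.
def chomp_while_offset(source, offset, predicate)
  offset += 1 while offset < source.length && predicate.call(source[offset])
  offset
end

is_a = ->(ch) { ch == 'a' }

chomp_while_offset('aaabb', 0, is_a) # => 3 (three 'a' characters chomped)
chomp_while_offset('cccdd', 0, is_a) # => 0 (nothing chomped, still a success)
```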
Benchmark
The following results are from a benchmark of a JSON parser I implemented
with this library. I ran the benchmark with and without YJIT, and compared the results against JSON.parse
(native C implementation) and Parsby.
The benchmark parses a 1.5 KB payload 100 times.
Implementation | Time | Comparison to JSON.parse |
--- | --- | --- |
JSON.parse | 0.00067s | - |
Compser::Json.parse (with YJIT) | 0.216s | 322x slower |
Compser::Json.parse | 0.268s | 400x slower |
Parsby::Example::JsonParser (with YJIT) | 24.19s | 36100x slower |
Parsby::Example::JsonParser | 27.22s | 40626x slower |
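
A measurement of this shape can be reproduced with Ruby's standard `Benchmark` module. The sketch below is not the script behind the table: the payload is a small placeholder, `time_runs` is a hypothetical helper, and the Compser line is left commented out because it needs the gem to be loaded.

```ruby
require 'benchmark'
require 'json'

# Placeholder payload; the original benchmark document is not reproduced here.
payload = JSON.generate('items' => Array.new(40) { |i| { 'id' => i, 'name' => "item-#{i}" } })

# Hypothetical helper: wall-clock seconds spent running the block `runs` times.
def time_runs(runs = 100)
  Benchmark.realtime { runs.times { yield } }
end

baseline = time_runs { JSON.parse(payload) }
puts format('JSON.parse: %.5fs', baseline)

# With the gem loaded, the same helper can time the other implementation:
# candidate = time_runs { Compser::Json.parse(payload) }
# puts "#{(candidate / baseline).round}x slower"
```

The "Comparison to JSON.parse" column is the ratio of each time to the JSON.parse time, for example 0.216 / 0.00067 ≈ 322.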