New Research: Supply Chain Attack on Axios Pulls Malicious Dependency from npm.Details
Socket
Book a DemoSign in
Socket

stack-expression

Package Overview
Dependencies
Maintainers
1
Versions
15
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

stack-expression

stack expressions are similar to regular expressions, but can parse nested structures.

Source
npmnpm
Version
1.5.0
Version published
Weekly downloads
4
Maintainers
1
Weekly downloads
 
Created
Source

stack-expression

stack expressions are similar to regular expressions, but can parse nested structures.

I had tried out a few parsing libraries. I tried nearley, it was easy to write the grammar but hard to debug the ast. chevotrain looked, good, but then it had a silly DSL and looked like way too much code. Another time I had played around with PEGjs but the parser it generated was way too slow.

How hard could it really be?

(update, looks like this is a Parsing Expression Grammar)

To understand how this really works I recommend reading the explainer (and also the code, of course!)

example: CSV parser (JOIN, GROUP, TEXT)

Here is the simplest useful example that you can't do with regular expressions. You can't write a CSV parser that groups things into cells and lines. csv actually is a regular language, the limitation here is how capture groups work in regular expressions.

var {JOIN,GROUP,TEXT} = require('stack-expression')
var cell = /^[\w ]*/
var CSV = JOIN(GROUP( JOIN(TEXT(cell), ',') ), '\n')
console.log(CSV('a,b,c\nd,e,f', 0).groups)
=> [ [a, b, c], [d, e, f] ]

In this library, the TEXT capture group is used to capture those characters directly. so there is a capture around the cell, TEXT(cell) then around that is a JOIN (to get lines) and then another capture, GROUP, to get the line as a group GROUP(JOIN(CATCH(cell), ','))

A repeating pattern with a separator is a very common pattern, but a pain to describe in a regular expression: pattern (separator pattern)* JOIN is just a built in support that takes a pattern and a seperator and does this for you. (I named it join, because it's similar to the join method on a javascript array)

regular patterns: AND, OR, MAYBE, MANY, MORE

AND(subrules...)

only match if all sub rules match.

OR(subrules...)

match the first of any matching subrules.

MAYBE (subrule)

if subrule matches, return that match, else allow an empty match. The same as OR(subrule, EMPTY) where EMPTY = AND()

MANY (subrule)

match subrule 0 or more times, like * in regular expressions.

MORE (subrule)

match subrule 1 or more times, like + in regular expressions.

It's just a shortcut for AND(subrule, MANY(subrule))

JOIN(item, separator)

join one or more items by separator. shortcut for AND(item, MANY(AND(separator, item))) To allow an empty list, use MAYBE(JOIN(item, separator))

note: might add an option to join to allow empty list.

capturing groups: TEXT, GROUP

TEXT(subrule, map?)

capture the text matched in a subrule. map is an optional function that will be applied to a matched group.

GROUP(subrule, map?)

capture any subgroups into a collection. If there are no subgroups, but the subrule matches, an empty array is returned.

the optional map function will be applied to the groups as a whole.

recursion

RECURSE() => subrule

return a subrule that may refer to itself. It's necessary to declare the recursive rule at the start, so that you can pass the rule to itself as a subrule.

The clauses above all return a function that does they thing. To define a recursive stack expression we need to call the function, and also pass that function to itself as an argument.

The following is a lisp-like parser, that accepts nested lists of printable characters separated by space, surrounded by parens. (the CATCHes have been left out for clarity)

var {RECURSE,AND,MAYBE,JOIN,OR} = require('stack-expression')
var list = RECURSE()
var value = /^\w+/
list(AND('(', MAYBE(JOIN(OR(value | list), space)), ')'))

errors

FAIL(message)

create a subrule that never passes, it instead throws an error.

to use, put inside a OR(expected, FAIL(message)) if expected isn't matched, throw an error. Use this once you have enough matched of a given pattern that the rest must match now. for example a json object or array must have a closing } or ]. Also a object must have a : after the string.

EXPECT(rule, message)

If rule isn't matched, fails with message. useful for patterns that once started, must end. for example AND('(', MANY(value), EXPECT(')')) Once the opening '(' is matched, it will match as many values as it can, then expect a ')'. If it doesn't find a ')' an error will be thrown.

If rule is a string, then message will default to the same value.

LOG(rule, name)

dump output to console.log whenever rule is executed. Useful for debugging. Remember to remove it from your code you ship.

PEEK (rule)

match if a rule comes next, but do not consume any characters. I'd recommend avoiding this if possible.

NOT (rule)

match if the following rule does not match. does not consume any characters. I'd recommend avoiding this if possible.

examples

JSON

A json parser in 50 lines including comments, and uses most stack-expression constructs, including GROUP (with map), RECURSE, and FAIL.

lisp

A compact lisp parser, 20 lines. Reuses js strings and numbers from the json parser.

License

MIT

FAQs

Package last updated on 14 Apr 2020

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts