github.com/fy0/pigeon
The pigeon command generates parsers based on a parsing expression grammar (PEG). Its grammar and syntax are inspired by the PEG.js project, while the implementation is loosely based on the parsing expression grammar for C# 3.0 article. It parses Unicode text encoded in UTF-8.
See the godoc page for detailed usage. Also have a look at the Pigeon Wiki for additional information about Pigeon and PEG in general.
Performance tweak: 5-10x faster than the original version (with some trade-offs).
parser.state is removed, because it is very slow.
parseSeqExpr now collects only non-nil values, mainly to improve performance. For example, e <- Expr __ Plus __ Expr returns [expr, '+', expr]; the original version returns [expr, nil, '+', nil, expr].
Multiple peg files are supported.
    pigeon -o script1.peg.go script1.peg to generate a normal parser.
    pigeon -grammar-only -grammar-name=g2 -run-func-prefix="_s2_" -o script2.peg.go script2.peg to generate grammar-only code in the same package.
    newParser("filename", "expr").parse(g2)
actionExpr is different:
    expr <- [0-9]+ { fmt.Println(expr) } is ok in this fork.
    expr <- "true" { return 1 } if you want to return something.
    expr <- "if" { p.addErr(errors.New("keyword is not allowed")) } is equivalent to expr <- "if" { return nil, errors.New("keyword is not allowed") } in the original pigeon.
andCodeExpr and notCodeExpr:
    Like actionExpr, they return a bool instead of a bool and an error.
    expr <- &{ return c.data.AllowNumber } [0-9]+
String capture:
expr <- val:<anotherExpr> { fmt.Println(val.(string)) }
expr <- val:<(A '=' B)> { fmt.Println(val.(string)) }
Logical and / or match:
    expr <- &&testExpr testExpr
    // If testExpr returns ok but matched nothing (e.g. testExpr <- 'A'*), &&testExpr returns false.
Skip "actionExpr" while looking ahead (issue, branch feat/skip-code-expr-while-looking-ahead).
    *{} / &{} / !{} won't be skipped.
Remove ParseFile and ParseReader, rename Parse and all options to lowercase (issue, branch feat/rename-exported-api).
    ParseReader converts the io.Reader to bytes and then invokes parse, which doesn't make sense.
    Parse and all options (MaxExpressions, Entrypoint, Statistics, Debug, Memoize, AllowInvalidUTF8, Recover, GlobalStore, InitState) are exposed to the module's users. I think exposing them is not a good idea.
ActionExpr refactored (issue, branch refactor/actionExpr).
    expr <- firstPart:[0-9]+ { fmt.Println(firstPart) } secondPart:[a-z]+ { fmt.Println(firstPart, secondPart) } is allowed in this fork.
    expr <- { fmt.Println(p) }
    stateCodeExpr (#{}) was removed.
Provide a struct (ParserCustomData) to embed, to replace the globalStore.
    Define ParserCustomData in your module.
    Access it through c.data, for example: expr <- { fmt.Println(c.data.MyOption) }
    globalState is removed.
position of generated code is removed.
    Set SetRulePos to true and rebuild to restore it.
Added the -optimize-ref-expr-by-index option.
    RefExpr is the most commonly used expr in a parser.
Removed the -support-left-recursion option.
Removed the -optimize-grammar option.
Removed the -optimize-basic-latin option.
charClassMatcher / anyMatcher / litMatcher no longer return a byte, for performance reasons.
    Use c.text instead.

GitHub user @mna created the package in April 2015, and @breml is the package's maintainer as of May 2017.
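The parseSeqExpr change listed among the fork's changes affects how action code indexes into sequence results. The sketch below is only an illustration of that difference, not the fork's actual implementation: collectAll and collectNonNil are hypothetical helpers, and the input slice stands in for the results of e <- Expr __ Plus __ Expr, assuming the __ parts yield nil.

```go
package main

import "fmt"

// collectAll mimics the original behaviour described in the list:
// every sub-expression result is kept, including nil placeholders.
// (Hypothetical helper, for illustration only.)
func collectAll(results []any) []any {
	out := make([]any, 0, len(results))
	out = append(out, results...)
	return out
}

// collectNonNil mimics the behaviour described for this fork:
// nil results are dropped, so the slice is shorter and indices shift.
// (Hypothetical helper, for illustration only.)
func collectNonNil(results []any) []any {
	out := make([]any, 0, len(results))
	for _, r := range results {
		if r != nil {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	// Assumed results for: e <- Expr __ Plus __ Expr
	raw := []any{"expr", nil, "+", nil, "expr"}

	fmt.Println(len(collectAll(raw)), collectAll(raw))       // 5 [expr <nil> + <nil> expr]
	fmt.Println(len(collectNonNil(raw)), collectNonNil(raw)) // 3 [expr + expr]
}
```

Under this assumption, action code ported from the original pigeon that reads the operator at index 2 would need to read index 1 instead.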
Starting in June 2023, the backwards compatibility support for pigeon follows the official Go Security Policy.
Over time, the Go ecosystem is evolving.
On the one hand, packages like golang.org/x/tools, which are critical dependencies of pigeon, follow the official Security Policy. Because pigeon did not follow the same guidelines, it was no longer possible to include recent versions of these dependencies, and therefore no longer possible to include critical bugfixes.
On the other hand, there are changes to what the greater community considers good practice (e.g. the change from interface{} to any). For users following (or even enforcing) these good practices, the code generated by pigeon no longer meets expectations.
Last but not least, following the Go Security Policy over the last years has been a smooth experience, so updating Go on a regular basis feels like a reasonable duty to put on users of pigeon.
These observations led to the decision to follow the same Security Policy as Go.
Provided you have Go correctly installed with the $GOPATH and $GOBIN environment variables set, run:
$ go install github.com/mna/pigeon@latest
This will install or update the package, and the pigeon command will be installed in your $GOBIN directory. Neither this package nor the parsers generated by this command require any third-party dependency, unless such a dependency is used in the code blocks of the grammar.
$ pigeon [options] [PEG_GRAMMAR_FILE]
By default, the input grammar is read from stdin and the generated code is printed to stdout. You may save it in a file using the -o flag.
Given the following grammar:
{
    // part of the initializer code block omitted for brevity

    type ParserCustomData struct {
    }

    var ops = map[string]func(int, int) int {
        "+": func(l, r int) int {
            return l + r
        },
        "-": func(l, r int) int {
            return l - r
        },
        "*": func(l, r int) int {
            return l * r
        },
        "/": func(l, r int) int {
            return l / r
        },
    }

    func toAnySlice(v any) []any {
        if v == nil {
            return nil
        }
        return v.([]any)
    }

    func eval(first, rest any) int {
        l := first.(int)
        restSl := toAnySlice(rest)
        for _, v := range restSl {
            restExpr := toAnySlice(v)
            r := restExpr[3].(int)
            op := restExpr[1].(string)
            l = ops[op](l, r)
        }
        return l
    }
}

Input <- expr:Expr EOF {
    return expr
}

Expr <- _ first:Term rest:( _ AddOp _ Term )* _ {
    return eval(first, rest)
}

Term <- first:Factor rest:( _ MulOp _ Factor )* {
    return eval(first, rest)
}

Factor <- '(' expr:Expr ')' {
    return expr
} / integer:Integer {
    return integer
}

AddOp <- ( '+' / '-' ) {
    return string(c.text)
}

MulOp <- ( '*' / '/' ) {
    return string(c.text)
}

Integer <- '-'? [0-9]+ {
    v, err := strconv.Atoi(string(c.text))
    if err != nil {
        p.addErr(err)
    }
    return v
}

_ "whitespace" <- [ \n\t\r]*

EOF <- !.
The generated parser can parse simple arithmetic operations, e.g.:
18 + 3 - 27 * (-18 / -3)
=> -141
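To see how the result is reached, the eval helper from the grammar's initializer can be run on its own. The following is a minimal sketch, not generated parser output: it copies ops, toAnySlice and eval from the grammar, and uses a hypothetical step helper to hand-build the values a parser might pass in, assuming each repetition is a four-element slice with the operator at index 1 and the operand at index 3 (the layout eval expects).

```go
package main

import "fmt"

// ops, toAnySlice and eval are copied from the grammar's initializer block.
var ops = map[string]func(int, int) int{
	"+": func(l, r int) int { return l + r },
	"-": func(l, r int) int { return l - r },
	"*": func(l, r int) int { return l * r },
	"/": func(l, r int) int { return l / r },
}

func toAnySlice(v any) []any {
	if v == nil {
		return nil
	}
	return v.([]any)
}

// eval folds the operations in rest onto first, left to right.
func eval(first, rest any) int {
	l := first.(int)
	for _, v := range toAnySlice(rest) {
		restExpr := toAnySlice(v)
		r := restExpr[3].(int)
		op := restExpr[1].(string)
		l = ops[op](l, r)
	}
	return l
}

// step is a hypothetical helper that builds one repetition in the assumed
// layout: [whitespace, operator, whitespace, operand].
func step(op string, operand int) any {
	return []any{nil, op, nil, operand}
}

func main() {
	// 18 + 3 - 27 * (-18 / -3), evaluated innermost first, as the
	// Factor, Term and Expr rules would.
	quotient := eval(-18, []any{step("/", -3)})      // Term inside the parentheses
	product := eval(27, []any{step("*", quotient)})  // Term: 27 * (...)
	total := eval(18, []any{step("+", 3), step("-", product)})
	fmt.Println(total) // -141
}
```

Operator precedence comes from the grammar's structure rather than from eval: Term folds the multiplicative operators before Expr folds the additive ones.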
More examples can be found in the examples/ subdirectory.
See the package documentation for detailed usage.
See the CONTRIBUTING.md file.
The BSD 3-Clause license. See the LICENSE file.
parseOneOrMoreExpr/parseZeroOrMoreExpr variants that do not collect results, chosen depending on whether the expression is labeled. A bit faster.
pushV and popV, a bit faster.
parseCharClassMatcher: the variable start can be removed in most cases. Lots of small memory pieces are allocated.
parseExpr?