ruby_parser - rubygems Package Compare versions

Comparing version
3.14.2
to
3.15.0
lib/ruby27_parser.rb

Sorry, the diff of this file is too big to display

Sorry, the diff of this file is not supported yet

+43
-3

@@ -11,2 +11,6 @@ #!/usr/bin/env ruby -w

renames = [
# unquote... wtf?
/`(.+?)'/, proc { $1 },
/"'(.+?)'"/, proc { "\"#{$1}\"" },
"'='", "tEQL",

@@ -104,2 +108,39 @@ "'!'", "tBANG",

# 2.7 changes:
'"global variable"', "tGVAR",
'"operator-assignment"', "tOP_ASGN",
'"back reference"', "tBACK_REF",
'"numbered reference"', "tNTH_REF",
'"local variable or method"', "tIDENTIFIER",
'"constant"', "tCONSTANT",
'"(.."', "tBDOT2",
'"(..."', "tBDOT3",
'"char literal"', "tCHAR",
'"literal content"', "tSTRING_CONTENT",
'"string literal"', "tSTRING_BEG",
'"symbol literal"', "tSYMBEG",
'"backtick literal"', "tXSTRING_BEG",
'"regexp literal"', "tREGEXP_BEG",
'"word list"', "tWORDS_BEG",
'"verbatim word list"', "tQWORDS_BEG",
'"symbol list"', "tSYMBOLS_BEG",
'"verbatim symbol list"', "tQSYMBOLS_BEG",
'"float literal"', "tFLOAT",
'"imaginary literal"', "tIMAGINARY",
'"integer literal"', "tINTEGER",
'"rational literal"', "tRATIONAL",
'"instance variable"', "tIVAR",
'"class variable"', "tCVAR",
'"terminator"', "tSTRING_END", # TODO: switch this?
'"method"', "tFID",
'"}"', "tSTRING_DEND",
'"do for block"', "kDO_BLOCK",
'"do for condition"', "kDO_COND",
'"do for lambda"', "kDO_LAMBDA",
# UGH

@@ -112,3 +153,2 @@ "k_LINE__", "k__LINE__",

'"do (for condition)"', "kDO_COND",

@@ -118,4 +158,4 @@ '"do (for lambda)"', "kDO_LAMBDA",

/\"(\w+) \(modifier\)\"/, proc { |x| "k#{$1.upcase}_MOD" },
/\"(\w+)\"/, proc { |x| "k#{$1.upcase}" },
/\"(\w+) \(?modifier\)?\"/, proc { |x| "k#{$1.upcase}_MOD" },
/\"(\w+)\"/, proc { |x| "k#{$1.upcase}" },

@@ -122,0 +162,0 @@ /@(\d+)(\s+|$)/, "",

# Quick Notes to Help with Debugging
## Reducing
One of the most important steps is reducing the code sample to a
minimal reproduction. For example, one thing I'm debugging right now
was reported as:
```ruby
a, b, c, d, e, f, g, h, i, j = 1, *[p1, p2, p3], *[p1, p2, p3], *[p4, p5, p6]
```
The original sample has 10 items on the left-hand side (LHS) and, on
the right-hand side (RHS), 1 literal + 3 groups of 3 calls + 3 arrays
+ 3 splats. That's a lot.
It's already been reported (perhaps incorrectly) that this has to do
with multiple splats on the RHS, so let's focus on that. At a minimum,
the code can be reduced to 2 splats on the RHS, and some
experimentation shows that it needs a non-splat item to fail:
```ruby
_, _, _ = 1, *[2], *[3]
```
Some intuition then removes the arrays as well:
```ruby
_, _, _ = 1, *2, *3
```
The reduced sample is far smaller than the original, which makes
debugging much easier.
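As a quick sanity check on a reduction like this, the following sketch (not part of the gem) uses Ruby's bundled Ripper to confirm that each reduced sample is still accepted by MRI's own parser, then evaluates the final form to show that the RHS still yields three values. The helper names `valid_ruby?` and `reduced_values` are hypothetical, introduced only for illustration.

```ruby
require "ripper"

# Hypothetical helper: true when MRI's own parser accepts the source.
# Ripper.sexp returns nil on a syntax error.
def valid_ruby?(src)
  !Ripper.sexp(src).nil?
end

# Hypothetical helper: evaluates the final reduced sample.
# Splatting a non-array wraps it, so the RHS still yields three values.
def reduced_values
  a, b, c = 1, *2, *3
  [a, b, c]
end

[
  "_, _, _ = 1, *[2], *[3]",
  "_, _, _ = 1, *2, *3",
].each do |src|
  puts "#{valid_ruby?(src) ? "ok " : "BAD"} #{src}"
end

p reduced_values # => [1, 2, 3]
```

If a reduced sample stops being valid Ruby, it is no longer a reproduction of the original report, so a check like this is worth running after every reduction step.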
## Getting something to compare
```
% rake debug3 F=file.rb
```
TODO
## Comparing against ruby / ripper:

@@ -4,0 +43,0 @@

@@ -0,1 +1,36 @@

=== 3.15.0 / 2020-08-31

* 1 major enhancement:

  * Added tentative 2.7 support.

* 1 minor enhancement:

  * Improved ruby_parse_extract_error's handling of moving slow files out.

* 22 bug fixes:

  * Bumped ruby version to include 3.0 (trunk).
  * Fix an error related to empty ensure bodies. (presidentbeef)
  * Fix handling of bad magic encoding comment.
  * Fixed SystemStackError when parsing a huoooge hash, caused by a splat arg.
  * Fixed a number of errors parsing do blocks in strange edge cases.
  * Fixed a string backslash lexing bug when the string is an invalid encoding. (nijikon, gmcgibbon)
  * Fixed bug assigning line number to some arg nodes.
  * Fixed bug concatenating string literals with differing encodings.
  * Fixed bug lexing heredoc w/ nasty mix of \r\n and \n.
  * Fixed bug lexing multiple codepoints in \u{0000 1111 2222} forms.
  * Fixed bug setting line numbers in empty xstrings in some contexts.
  * Fixed edge case on call w/ begin + do block as an arg.
  * Fixed handling of UTF BOM.
  * Fixed handling of lexer state across string interpolation braces.
  * Fixed infinite loop when lexing backslash+cr+newline (aka dos-files)
  * Fixed lambda + do block edge case.
  * Fixed lexing of some ?\M... and ?\C... edge cases.
  * Fixed more do/brace block edge case failures.
  * Fixed parsing bug where splat was used in the middle of a list.
  * Fixed parsing of interpolation in heredoc-like strings. (presidentbeef)
  * Fixed parsing some esoteric edge cases in op_asgn.
  * Fixed unicode processing in ident chars so now they better mix.

=== 3.14.2 / 2020-02-06

@@ -2,0 +37,0 @@

+72
-40

@@ -28,8 +28,2 @@ # frozen_string_literal: true

IDENT_CHAR = if HAS_ENC then
/[\w\u0080-\u{10ffff}]/u
else
/[\w\x80-\xFF]/n
end
TOKENS = {

@@ -166,3 +160,3 @@ "!" => :tBANG,

eol = last_line && last_line.end_with?("\r\n") ? "\r\n" : "\n"
eos_re = /#{indent}#{Regexp.escape eos}(#{eol}|\z)/
eos_re = /#{indent}#{Regexp.escape eos}(\r*\n|\z)/
err_msg = "can't match #{eos_re.inspect} anywhere in "
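
The hunk above shows two forms of `eos_re`: one that picks a single line ending from the previous line, and one that accepts any run of `\r` before `\n`. The sketch below isolates the second form to show which terminator lines it accepts. The method name `heredoc_eos_re` and its `indent` default are assumptions made for this example; the real lexer builds the pattern inline.

```ruby
# Hypothetical stand-in for the lexer's heredoc terminator pattern.
# `indent` is a regexp fragment (empty here); `eos` is the terminator text.
def heredoc_eos_re(eos, indent = "")
  /#{indent}#{Regexp.escape eos}(\r*\n|\z)/
end

re = heredoc_eos_re("EOS")

# Unix, DOS, doubled-CR, and end-of-input terminators all match;
# a longer identifier that merely starts with EOS does not.
["EOS\n", "EOS\r\n", "EOS\r\r\n", "EOS", "EOSX\n"].each do |line|
  puts "#{line.inspect}: #{re.match?(line)}"
end
```

Because the pattern no longer depends on how the previous line ended, a file that mixes `\r\n` and `\n` can still terminate its heredocs.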

@@ -182,6 +176,11 @@

case
when scan(/#[$@]/) then
ss.pos -= 1 # FIX omg stupid
when scan(/#(?=\$(-.|[a-zA-Z_0-9~\*\$\?!@\/\\;,\.=:<>\"\&\`\'+]))/) then
# TODO: !ISASCII
# ?! see parser_peek_variable_name
return :tSTRING_DVAR, matched
when scan(/#(?=\@\@?[a-zA-Z_])/) then
# TODO: !ISASCII
return :tSTRING_DVAR, matched
when scan(/#[{]/) then
self.command_start = true
return :tSTRING_DBEG, matched

@@ -326,2 +325,7 @@ when scan(/#/) then

def is_local_id id
# maybe just make this false for now
self.parser.env[id.to_sym] == :lvar # HACK: this isn't remotely right
end
def lvar_defined? id

@@ -345,5 +349,5 @@ # TODO: (dyna_in_block? && dvar_defined?(id)) || local_id?(id)

rb_compile_error "unknown type of %string" if ss.matched_size == 2
c, beg, short_hand = matched, ss.getch, false
c, beg, short_hand = matched, getch, false
else # Short-hand (e.g. %{, %., %!, etc)
c, beg, short_hand = "Q", ss.getch, true
c, beg, short_hand = "Q", getch, true
end

@@ -465,3 +469,3 @@

else
content.gsub(/\\\\/, "\\").gsub(/\\'/, "'")
content.gsub(/\\\\/, "\\").gsub(/\\\'/, "'")
end

@@ -504,12 +508,15 @@ end

def process_brace_close text
# matching compare/parse23.y:8561
cond.lexpop
cmdarg.lexpop
case matched
when "}" then
self.brace_nest -= 1
return :tSTRING_DEND, matched if brace_nest < 0
end
# matching compare/parse26.y:8099
cond.pop
cmdarg.pop
case matched
when "}" then
self.brace_nest -= 1
self.lex_state = ruby24minus? ? EXPR_ENDARG : EXPR_END
return :tSTRING_DEND, matched if brace_nest < 0
return :tRCURLY, matched

@@ -615,3 +622,3 @@ when "]" then

def process_label text
symbol = possibly_escape_string text, /^"/
symbol = possibly_escape_string text, /^\"/

@@ -630,3 +637,3 @@ result EXPR_LAB, :tLABEL, [symbol, self.lineno]

result EXPR_END, :tSTRING, text[1..-2].gsub(/\\\\/, "\\").gsub(/\\'/, "'")
result EXPR_END, :tSTRING, text[1..-2].gsub(/\\\\/, "\\").gsub(/\\\'/, "\'")
end

@@ -803,3 +810,3 @@

else
ss.getch
getch
end

@@ -810,2 +817,12 @@

def process_simple_string text
replacement = text[1..-2].gsub(ESC) {
unescape($1).b.force_encoding Encoding::UTF_8
}
replacement = replacement.b unless replacement.valid_encoding?
result EXPR_END, :tSTRING, replacement
end
def process_slash text

@@ -884,3 +901,3 @@ if is_beg? then

self.lex_strterm = nil
self.lex_state = (token_type == :tLABEL_END) ? EXPR_PAR : EXPR_END|EXPR_ENDARG
self.lex_state = (token_type == :tLABEL_END) ? EXPR_PAR : EXPR_LIT
end

@@ -892,5 +909,5 @@

def process_symbol text
symbol = possibly_escape_string text, /^:"/
symbol = possibly_escape_string text, /^:\"/ # stupid emacs
result EXPR_END|EXPR_ENDARG, :tSYMBOL, symbol
result EXPR_LIT, :tSYMBOL, symbol
end

@@ -944,2 +961,4 @@

tok_id = :tIDENTIFIER if tok_id == :tCONSTANT && is_local_id(token)
if last_state !~ EXPR_DOT|EXPR_FNAME and

@@ -968,7 +987,7 @@ (tok_id == :tIDENTIFIER) and # not EXPR_FNAME, not attrasgn

case
when keyword.id0 == :kDO then
when keyword.id0 == :kDO then # parse26.y line 7591
case
when lambda_beginning? then
self.lpar_beg = nil # lambda_beginning? == FALSE in the body of "-> do ... end"
self.paren_nest -= 1
self.paren_nest -= 1 # TODO: question this?
result lex_state, :kDO_LAMBDA, value

@@ -979,4 +998,2 @@ when cond.is_in_state then

result lex_state, :kDO_BLOCK, value
when state =~ EXPR_BEG|EXPR_ENDARG then
result lex_state, :kDO_BLOCK, value
else

@@ -998,5 +1015,5 @@ result lex_state, :kDO, value

if beginning_of_line? && scan(/\__END__(\r?\n|\Z)/) then
return [RubyLexer::EOF, RubyLexer::EOF]
elsif scan(/\_\w*/) then
return process_token matched
[RubyLexer::EOF, RubyLexer::EOF]
elsif scan(/#{IDENT_CHAR}+/) then
process_token matched
end

@@ -1038,3 +1055,3 @@ end

ss[1].to_i(16).chr.force_encoding Encoding::UTF_8
when check(/M-\\[\\MCc]/) then
when check(/M-\\./) then
scan(/M-\\/) # eat it

@@ -1053,2 +1070,7 @@ c = self.read_escape

c
when check(/(C-|c)\\(?!u|\\)/) then
scan(/(C-|c)\\/) # eat it
c = read_escape
c[0] = (c[0].ord & 0x9f).chr
c
when scan(/C-\?|c\?/) then

@@ -1062,13 +1084,21 @@ 127.chr

matched
when scan(/u([0-9a-fA-F]{4}|\{[0-9a-fA-F]{2,6}\})/) then
[ss[1].delete("{}").to_i(16)].pack("U")
when scan(/u([0-9a-fA-F]{1,3})/) then
when scan(/u(\h{4})/) then
[ss[1].to_i(16)].pack("U")
when scan(/u(\h{1,3})/) then
rb_compile_error "Invalid escape character syntax"
when scan(/u\{(\h+(?:\s+\h+)*)\}/) then
ss[1].split.map { |s| s.to_i(16) }.pack("U*")
when scan(/[McCx0-9]/) || end_of_stream? then
rb_compile_error("Invalid escape character syntax")
else
ss.getch
getch
end.dup
end
def getch
c = ss.getch
c = ss.getch if c == "\r" && ss.peek(1) == "\n"
c
end
def regx_options # TODO: rewrite / remove
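
Two pieces of the hunk above are easy to try in isolation: the CRLF-aware `getch` and the multi-codepoint `\u{...}` branch. The sketch below is a standalone approximation, not the gem's code. `getch` takes the scanner as an argument here (the real method reads the lexer's own `ss`), and `decode_codepoints` is a hypothetical helper wrapping the `split.map.pack` expression from the diff.

```ruby
require "strscan"

# Sketch of the CRLF-aware getch: a "\r" immediately followed by "\n"
# is skipped so the caller only ever sees the "\n".
def getch(ss)
  c = ss.getch
  c = ss.getch if c == "\r" && ss.peek(1) == "\n"
  c
end

# Hypothetical helper isolating the \u{...} branch: each
# whitespace-separated hex run is decoded as one codepoint.
def decode_codepoints(hex_runs)
  hex_runs.split.map { |s| s.to_i(16) }.pack("U*")
end

ss = StringScanner.new("a\r\nb")
p [getch(ss), getch(ss), getch(ss)] # => ["a", "\n", "b"]

p decode_codepoints("0041 0042 0043") # => "ABC"
```

A lone `\r` that is not followed by `\n` is returned unchanged, so only DOS-style line endings are collapsed.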

@@ -1297,6 +1327,8 @@ good, bad = [], []

rb_compile_error("Invalid escape character syntax")
when /u([0-9a-fA-F]{4}|\{[0-9a-fA-F]{2,6}\})/ then
when /u(\h{4})/ then
[$1.delete("{}").to_i(16)].pack("U")
when /u([0-9a-fA-F]{1,3})/ then
when /u(\h{1,3})/ then
rb_compile_error("Invalid escape character syntax")
when /u\{(\h+(?:\s+\h+)*)\}/ then
$1.split.map { |s| s.to_i(16) }.pack("U*")
else

@@ -1379,7 +1411,7 @@ s

EXPR_LAB = EXPR_ARG|EXPR_LABELED
EXPR_NUM = EXPR_END|EXPR_ENDARG
EXPR_LIT = EXPR_END|EXPR_ENDARG
EXPR_PAR = EXPR_BEG|EXPR_LABEL
EXPR_PAD = EXPR_BEG|EXPR_LABELED
EXPR_LIT = EXPR_NUM # TODO: migrate to EXPR_LIT
EXPR_NUM = EXPR_LIT

@@ -1386,0 +1418,0 @@ expr_names.merge!(EXPR_NONE => "EXPR_NONE",

# encoding: UTF-8
#--
# This file is automatically generated. Do not modify it.
# Generated by: oedipus_lex version 2.5.1.
# Generated by: oedipus_lex version 2.5.2.
# Source: lib/ruby_lexer.rex

@@ -19,4 +19,4 @@ #++

# :stopdoc:
IDENT = /^#{IDENT_CHAR}+/o
ESC = /\\((?>[0-7]{1,3}|x[0-9a-fA-F]{1,2}|M-[^\\]|(C-|c)[^\\]|u[0-9a-fA-F]{1,4}|u\{[0-9a-fA-F]+\}|[^0-7xMCc]))/
IDENT_CHAR = /[a-zA-Z0-9_[:^ascii:]]/
ESC = /\\((?>[0-7]{1,3}|x\h{1,2}|M-[^\\]|(C-|c)[^\\]|u\h{1,4}|u\{\h+(?:\s+\h+)*\}|[^0-7xMCc]))/
SIMPLE_STRING = /((#{ESC}|\#(#{ESC}|[^\{\#\@\$\"\\])|[^\"\\\#])*)/o

@@ -164,3 +164,3 @@ SSTRING = /((\\.|[^\'])*)/

when text = ss.scan(/\"(#{SIMPLE_STRING})\"/o) then
action { result EXPR_END, :tSTRING, text[1..-2].gsub(ESC) { unescape $1 } }
process_simple_string text
when text = ss.scan(/\"/) then

@@ -333,12 +333,10 @@ action { string STR_DQUOTE; result nil, :tSTRING_BEG, text }

process_gvar text
when text = ss.scan(/\$[^[:ascii:]]+/) then
when text = ss.scan(/\$#{IDENT_CHAR}+/) then
process_gvar text
when text = ss.scan(/\$\W|\$\z/) then
process_gvar_oddity text
when text = ss.scan(/\$\w+/) then
process_gvar text
end # group /\$/
when text = ss.scan(/\_/) then
process_underscore text
when text = ss.scan(/#{IDENT}/o) then
when text = ss.scan(/#{IDENT_CHAR}+/o) then
process_token text

@@ -345,0 +343,0 @@ when ss.skip(/\004|\032|\000|\Z/) then

# encoding: ASCII-8BIT
# TODO: remove

@@ -31,3 +32,3 @@ require "sexp"

module RubyParserStuff
VERSION = "3.14.2"
VERSION = "3.15.0"

@@ -49,2 +50,7 @@ attr_accessor :lexer, :in_def, :in_single, :file

##
# The last token type returned from #next_token
attr_accessor :last_token_type
$good20 = []

@@ -498,2 +504,4 @@

end
rescue ArgumentError # unknown encoding name
# do nothing
rescue Encoding::InvalidByteSequenceError

@@ -538,3 +546,3 @@ # do nothing

first = header.first || ""
encoding, str = "utf-8", str[3..-1] if first =~ /\A\xEF\xBB\xBF/
encoding, str = "utf-8", str.b[3..-1] if first =~ /\A\xEF\xBB\xBF/
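
One plausible reading of the `str.b[3..-1]` change is that the BOM is three bytes but only one character, so slicing by character index can remove the wrong amount when the string is tagged as UTF-8. The sketch below only demonstrates that general Ruby behaviour; whether the input is UTF-8-tagged at this point in the gem is an assumption, and `strip_bom_by_chars` and `strip_bom_by_bytes` are hypothetical names.

```ruby
# Hypothetical helpers contrasting character slicing with byte slicing.
def strip_bom_by_chars(str)
  str[3..-1]   # drops three *characters* on a UTF-8 string
end

def strip_bom_by_bytes(str)
  str.b[3..-1] # drops three *bytes*, regardless of encoding tag
end

src = "\xEF\xBB\xBFputs 1".dup.force_encoding("UTF-8")

p strip_bom_by_chars(src) # => "ts 1"
p strip_bom_by_bytes(src) # => "puts 1"
```

Note that `String#b` returns a binary (ASCII-8BIT) copy, so the result needs its encoding set again afterwards, which the surrounding code does via the `encoding` variable.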

@@ -599,3 +607,5 @@ encoding = $1.strip if header.find { |s|

if htype == :str
head.last << tail.last
a, b = head.last, tail.last
b = b.dup.force_encoding a.encoding unless Encoding.compatible?(a, b)
a << b
elsif htype == :dstr and head.size == 2 then
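
The string-concatenation change above can be tried on its own. The sketch below wraps the three new lines in a hypothetical method, `concat_literals`, and feeds it a UTF-8 string and a binary string that both contain non-ASCII bytes, which is a case where Ruby considers the encodings incompatible.

```ruby
# Hypothetical wrapper around the logic shown in the hunk: if the two
# strings cannot be combined as-is, retag a copy of `b` with `a`'s encoding.
def concat_literals(a, b)
  b = b.dup.force_encoding a.encoding unless Encoding.compatible?(a, b)
  a << b
end

utf8   = "caf\u00e9".dup # UTF-8 with a non-ASCII character
binary = "\xE9".b        # binary with a non-ASCII byte

# A plain `utf8 << binary` would raise Encoding::CompatibilityError here.
result = concat_literals(utf8, binary)

p result.encoding        # => #<Encoding:UTF-8>
p result.bytesize        # => 6
p result.valid_encoding? # => false
```

The result keeps the first literal's encoding even when the appended bytes are not valid in it, so callers that care should still check `valid_encoding?`.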

@@ -704,2 +714,11 @@ head.last << tail.last

def new_begin val
_, lineno, body, _ = val
result = body ? s(:begin, body) : s(:nil)
result.line lineno
result
end
def new_body val

@@ -732,3 +751,6 @@ body, resbody, elsebody, ensurebody = val

result = s(:ensure, result, ensurebody).compact.line result.line if ensurebody
if ensurebody
lineno = (result || ensurebody).line
result = s(:ensure, result, ensurebody).compact.line lineno
end

@@ -853,4 +875,4 @@ result

def new_defs val
recv, (name, _line), args, body = val[1], val[4], val[6], val[7]
line, _ = val[5]
_, recv, _, _, name, (_in_def, line), args, body, _ = val
body ||= s(:nil).line line

@@ -862,2 +884,5 @@

# TODO: remove_begin
# TODO: reduce_nodes
if body then

@@ -887,3 +912,5 @@ if body.sexp_type == :block then

def new_hash val
s(:hash, *val[2].values).line(val[1])
_, line, assocs = val
s(:hash).line(line).concat assocs.values
end

@@ -1147,2 +1174,3 @@

str.force_encoding("UTF-8")
# TODO: remove:
str.force_encoding("ASCII-8BIT") unless str.valid_encoding?

@@ -1242,16 +1270,19 @@ result = s(:str, str).line lexer.lineno

def new_xstring str
if str then
case str.sexp_type
def new_xstring val
_, node = val
node ||= s(:str, "").line lexer.lineno
if node then
case node.sexp_type
when :str
str.sexp_type = :xstr
node.sexp_type = :xstr
when :dstr
str.sexp_type = :dxstr
node.sexp_type = :dxstr
else
str = s(:dxstr, "", str).line str.line
node = s(:dxstr, "", node).line node.line
end
str
else
s(:xstr, "")
end
node
end

@@ -1277,2 +1308,3 @@

if token and token.first != RubyLexer::EOF then
self.last_token_type = token
return token

@@ -1336,2 +1368,3 @@ else

self.comments.clear
self.last_token_type = nil
end

@@ -1338,0 +1371,0 @@

@@ -81,2 +81,3 @@ require "ruby_parser_extras"

require "ruby26_parser"
require "ruby27_parser"

@@ -86,2 +87,3 @@ class RubyParser # HACK

class V27 < ::Ruby27Parser; end
class V26 < ::Ruby26Parser; end

@@ -88,0 +90,0 @@ class V25 < ::Ruby25Parser; end

@@ -27,2 +27,4 @@ .autotest

lib/ruby26_parser.y
lib/ruby27_parser.rb
lib/ruby27_parser.y
lib/ruby_lexer.rb

@@ -29,0 +31,0 @@ lib/ruby_lexer.rex

@@ -11,2 +11,3 @@ # -*- ruby -*-

Hoe.add_include_dirs "lib"
Hoe.add_include_dirs "../../sexp_processor/dev/lib"

@@ -16,3 +17,3 @@ Hoe.add_include_dirs "../../minitest/dev/lib"

V2 = %w[20 21 22 23 24 25 26]
V2 = %w[20 21 22 23 24 25 26 27]
V2.replace [V2.last] if ENV["FAST"] # HACK

@@ -29,3 +30,3 @@

require_ruby_version "~> 2.2"
require_ruby_version [">= 2.1", "< 3.1"]

@@ -62,2 +63,4 @@ if plugin? :perforce then # generated files

task :generate => [:lexer, :parser]
task :clean do

@@ -130,3 +133,10 @@ rm_rf(Dir["**/*~"] +

in_compare do
system "tar yxf #{tarball} #{ruby_dir}/{id.h,parse.y,tool/{id2token.rb,vpath.rb}}"
extract_glob = case version
when /2\.7/
"{id.h,parse.y,tool/{id2token.rb,lib/vpath.rb}}"
else
"{id.h,parse.y,tool/{id2token.rb,vpath.rb}}"
end
system "tar yxf #{tarball} #{ruby_dir}/#{extract_glob}"
Dir.chdir ruby_dir do

@@ -138,2 +148,4 @@ if File.exist? "tool/id2token.rb" then

end
ruby "-pi", "-e", 'gsub(/^%define\s+api\.pure/, "%pure-parser")', "../#{parse_y}"
end

@@ -191,5 +203,6 @@ sh "rm -rf #{ruby_dir}"

ruby_parse "2.3.8"
ruby_parse "2.4.5"
ruby_parse "2.5.3"
ruby_parse "2.6.1"
ruby_parse "2.4.9"
ruby_parse "2.5.8"
ruby_parse "2.6.6"
ruby_parse "2.7.1"

@@ -201,3 +214,3 @@ task :debug => :isolate do

$: << "lib"
$:.unshift "lib"
require "ruby_parser"

@@ -225,4 +238,5 @@ require "pp"

pp parser.process(ruby, file, time)
rescue Racc::ParseError => e
rescue ArgumentError, Racc::ParseError => e
p e
puts e.backtrace.join "\n "
ss = parser.lexer.ss

@@ -244,8 +258,13 @@ src = ss.string

sh "ruby -v"
sh "ruby -y #{file} 2>&1 | #{munge} > tmp/ruby"
sh "./tools/ripper.rb -d #{file} | #{munge} > tmp/rip"
sh "rake debug F=#{file} DEBUG=1 V=25 2>&1 | #{munge} > tmp/rp"
sh "rake debug F=#{file} DEBUG=1 2>&1 | #{munge} > tmp/rp"
sh "diff -U 999 -d tmp/{rip,rp}"
end
task :cmp do
sh %(emacsclient --eval '(ediff-files "tmp/ruby" "tmp/rp")')
end
task :cmp3 do

@@ -252,0 +271,0 @@ sh %(emacsclient --eval '(ediff-files3 "tmp/ruby" "tmp/rip" "tmp/rp")')

@@ -1,2 +0,2 @@

#!/usr/bin/ruby -ws
#!/usr/bin/env ruby -ws

@@ -121,2 +121,4 @@ $v ||= false

/\$?@(\d+) */, "", # TODO: remove?
/_EXPR/, "",
]

@@ -198,6 +200,9 @@

when /^lex_state: :?([\w|]+) -> :?([\w|]+)(?: (?:at|from) (.*))?/ then
if $3 && $v then
puts "lex_state: #{$1.upcase} -> #{$2.upcase} at #{$3}"
a, b, c = $1.upcase, $2.upcase, $3
a.gsub! /EXPR_/, ""
b.gsub! /EXPR_/, ""
if c && $v then
puts "lex_state: #{a} -> #{b} at #{c}"
else
puts "lex_state: #{$1.upcase} -> #{$2.upcase}"
puts "lex_state: #{a} -> #{b}"
end

@@ -204,0 +209,0 @@ when /debug|FUCK/ then
