Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parse_termexpr() reads past comma #14172

Open
p5pRT opened this issue Oct 19, 2014 · 12 comments
Open

parse_termexpr() reads past comma #14172

p5pRT opened this issue Oct 19, 2014 · 12 comments

Comments

@p5pRT
Copy link

p5pRT commented Oct 19, 2014

Migrated from rt.perl.org#123010 (status was 'open')

Searchable as RT123010$

@p5pRT
Copy link
Author

p5pRT commented Oct 19, 2014

From @mauke

Hi,

I've discovered a weird issue with parse_termexpr.

According to the documentation​: "The expression must be followed (and thus terminated) either by a comma or lower-precedence operator or by something that would normally terminate an expression such as semicolon".

So I thought it would be perfect for parsing something like ($x = TERMEXPR, $y, $z). But if TERMEXPR is a list operator like die or return, parse_termexpr reads past the comma and consumes $y and $z. Or at least that's what it looks like​:

$ perl -MO=Deparse -e '($x = return, $y, $z)'
$x = (return), $y, $z;

$ perl -MFunction​::Parameters -MO=Deparse -e 'fun foo($x = return, $y, $z) {}'
sub foo {
# ...
  my($x) = @​_;
  $x = (return, $y, $z) if @​_ < 1;
  ();
}
# ...

For comparison​:
$ perl -MFunction​::Parameters -MO=Deparse -e 'fun foo($x = (return), $y, $z) {}'
sub foo {
# ...
  my($x, $y, $z) = @​_;
  $x = (return) if @​_ < 1;
  ();
}

What is happening here that makes parse_termexpr continue even after "return,"?

@p5pRT
Copy link
Author

p5pRT commented Oct 19, 2014

From @cpansprout

On Sun Oct 19 01​:58​:00 2014, mauke- wrote​:

What is happening here that makes parse_termexpr continue even after
"return,"?

Don’t know yet, but it affects core signatures as well, since they also use the parsing functions.

The attached script crashes on both Function​::Parameters and core signatures when it gets to trying int, so there is another bug. :-) It doesn’t crash if I just try sub($x=int,$y=()) on its own.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Oct 19, 2014

From @cpansprout

signature-test.pl

@p5pRT
Copy link
Author

p5pRT commented Oct 19, 2014

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Oct 19, 2014

From @cpansprout

On Sun Oct 19 08​:19​:04 2014, sprout wrote​:

On Sun Oct 19 01​:58​:00 2014, mauke- wrote​:

What is happening here that makes parse_termexpr continue even after
"return,"?

Don’t know yet, but it affects core signatures as well, since they
also use the parsing functions.

The attached script crashes on both Function​::Parameters and core
signatures when it gets to trying int, so there is another bug. :-)
It doesn’t crash if I just try sub($x=int,$y=()) on its own.

My regular expression was bad. The new attached script shows that this bug affects chmod, chown, die and exec, as well as return. I have to solve the crash before I can find out which other keywords are affected.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Oct 19, 2014

From @cpansprout

signature-test-new.pl

@p5pRT
Copy link
Author

p5pRT commented Oct 19, 2014

From @cpansprout

On Sun Oct 19 10​:47​:25 2014, sprout wrote​:

On Sun Oct 19 08​:19​:04 2014, sprout wrote​:

On Sun Oct 19 01​:58​:00 2014, mauke- wrote​:

What is happening here that makes parse_termexpr continue even
after
"return,"?

Don’t know yet, but it affects core signatures as well, since they
also use the parsing functions.

The attached script crashes on both Function​::Parameters and core
signatures when it gets to trying int, so there is another bug. :-)
It doesn’t crash if I just try sub($x=int,$y=()) on its own.

My regular expression was bad. The new attached script shows that
this bug affects chmod, chown, die and exec, as well as return. I
have to solve the crash before I can find out which other keywords are
affected.

It was eval 'join,' that was crashing. I have a fix that I will push as soon as the tests finish.

The keywords that exhibit the bug with parse_termexpr are​:

chmod
chown
die
exec
kill
return
reverse
select
setpgrp
system
utime
warn

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Oct 19, 2014

From @cpansprout

On Sun Oct 19 11​:09​:57 2014, sprout wrote​:

On Sun Oct 19 10​:47​:25 2014, sprout wrote​:

On Sun Oct 19 08​:19​:04 2014, sprout wrote​:

On Sun Oct 19 01​:58​:00 2014, mauke- wrote​:

What is happening here that makes parse_termexpr continue even
after
"return,"?

Don’t know yet, but it affects core signatures as well, since they
also use the parsing functions.

The attached script crashes on both Function​::Parameters and core
signatures when it gets to trying int, so there is another bug. :-)
It doesn’t crash if I just try sub($x=int,$y=()) on its own.

My regular expression was bad. The new attached script shows that
this bug affects chmod, chown, die and exec, as well as return. I
have to solve the crash before I can find out which other keywords
are
affected.

It was eval 'join,' that was crashing. I have a fix that I will push
as soon as the tests finish.

The keywords that exhibit the bug with parse_termexpr are​:

chmod
chown
die
exec
kill
return
reverse
select
setpgrp
system
utime
warn

My script was still wrong, because it did not account for ‘print,’ becoming ‘print($_),’. But anyway I have found the cause. S_lop and OLDLOP in toke.c do this​:

  if (!PL_lex_allbrackets && PL_lex_fakeeof > LEX_FAKEEOF_LOWLOGIC)
  PL_lex_fakeeof = LEX_FAKEEOF_LOWLOGIC;

So if we have a list-precedence op, then it acts as a bracketing construct, and that is why the ‘fake eof’ changes from, say, LEX_FAKEEOF_COMMA to LEX_FAKEEOF_LOWLOGIC, meaning it will swallow up everything up to the next and/or/xor or anything of lower precedence.

That fails of course with a comma immediately after the list operator, and also with​:

sub ($x=print $a,&& $y) {}

Fixing this will be very tricky I imagine. I wonder whether this FAKEEOF logic really makes sense. (The lexer emits a 0 token [i.e., EOF] when it sees that the next token would be of lower precedence than we are currently parsing.) It would make more sense to me to continue parsing until the next token does not fit whatever grammar rule it is that we are currently parsing. And that may require some sort of error recovery in perly.y or perly.c.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Jul 25, 2016

From @cpansprout

On Sun Oct 19 11​:09​:57 2014, sprout wrote​:

It was eval 'join,' that was crashing. I have a fix that I will push
as soon as the tests finish.

I forgot to mention at the time that it was commit a454544 and that fixed the crash.

Also, in commit 1ccc3f3 I added to-do tests for the main issue in this ticket, which is still unfixed.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Jul 25, 2016

From @iabyn

On Sun, Jul 24, 2016 at 05​:49​:25PM -0700, Father Chrysostomos via RT wrote​:

On Sun Oct 19 11​:09​:57 2014, sprout wrote​:

It was eval 'join,' that was crashing. I have a fix that I will push
as soon as the tests finish.

I forgot to mention at the time that it was commit a454544 and that fixed the crash.

Also, in commit 1ccc3f3 I added to-do tests for the main issue in this ticket, which is still unfixed.

Note that my sig branch no longer uses parse_termexpr(), that for example

  sub f ($a = return , $c=3) { }

is now seen as a two-element signature.

--
A power surge on the Bridge is rapidly and correctly diagnosed as a faulty
capacitor by the highly-trained and competent engineering staff.
  -- Things That Never Happen in "Star Trek" #9

@p5pRT
Copy link
Author

p5pRT commented Jul 25, 2016

From @cpansprout

On Mon Jul 25 15​:03​:40 2016, davem wrote​:

On Sun, Jul 24, 2016 at 05​:49​:25PM -0700, Father Chrysostomos via RT
wrote​:

On Sun Oct 19 11​:09​:57 2014, sprout wrote​:

It was eval 'join,' that was crashing. I have a fix that I will
push
as soon as the tests finish.

I forgot to mention at the time that it was commit a454544 and
that fixed the crash.

Also, in commit 1ccc3f3 I added to-do tests for the main issue in
this ticket, which is still unfixed.

Note that my sig branch no longer uses parse_termexpr(), that for
example

sub f ($a = return , $c=3) { }

is now seen as a two-element signature.

True, but the original report applies specifically to parse_termexpr(), which is still unfixed.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Jul 27, 2016

From @cpansprout

On Mon Jul 25 16​:02​:42 2016, sprout wrote​:

On Mon Jul 25 15​:03​:40 2016, davem wrote​:

On Sun, Jul 24, 2016 at 05​:49​:25PM -0700, Father Chrysostomos via RT
wrote​:

Also, in commit 1ccc3f3 I added to-do tests for the main issue
in
this ticket, which is still unfixed.

Note that my sig branch no longer uses parse_termexpr(), that for
example

sub f ($a = return , $c=3) { }

is now seen as a two-element signature.

True, but the original report applies specifically to
parse_termexpr(), which is still unfixed.

That does mean that we no longer have to-do tests for the bug. Such tests would have to use XS​::APItest.

--

Father Chrysostomos

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants