Skip Menu |
Report information
Id: 129903
Status: open
Priority: 0/
Queue: perl5

Owner: Nobody
Requestors: brian.carpenter [at] gmail.com
Cc:
AdminCc:

Operating System: (no value)
PatchStatus: (no value)
Severity: low
Type: unknown
Perl Version: (no value)
Fixed In: (no value)



Date: Mon, 17 Oct 2016 12:25:24 -0500
From: "Brian 'geeknik' Carpenter" <brian.carpenter [...] gmail.com>
Subject: regexec.c stack overflow
To: perlbug [...] perl.org
Download (untitled) / with headers
text/plain 2.3k
I'm not convinced that this is an actual bug, but #p5p was silent when I asked about it. Affects Perl back to 5.20.2 including v5.25.6 (v5.25.5-114-g87af8d5). Valgrind fails pretty quickly but gdb just goes on forever.

perl -e '/(?{m}(0)},s\/\/\/})//0'

==6615==ERROR: AddressSanitizer: stack-overflow on address 0x7ffc78f88ca8 (pc 0x0000004c0d4c bp 0x7ffc78f89500 sp 0x7ffc78f88cb0 T0)
    #0 0x4c0d4b in calloc (/root/perl/perl+0x4c0d4b)
    #1 0x7f9301 in Perl_safesyscalloc /root/perl/util.c:442:18
    #2 0xb45aed in Perl_regexec_flags /root/perl/regexec.c:3128:9
    #3 0x8cf9c5 in Perl_pp_subst /root/perl/pp_hot.c:2981:10
    #4 0x7f4483 in Perl_runops_debug /root/perl/dump.c:2246:23
    #5 0xb984b6 in S_regmatch /root/perl/regexec.c:6888:3
    #6 0xb7337c in S_regtry /root/perl/regexec.c:3622:14
...
...
    #247 0xb4b82b in Perl_regexec_flags /root/perl/regexec.c:3489:7
    #248 0x8cf9c5 in Perl_pp_subst /root/perl/pp_hot.c:2981:10
    #249 0x7f4483 in Perl_runops_debug /root/perl/dump.c:2246:23
    #250 0xb984b6 in S_regmatch /root/perl/regexec.c:6888:3
    #251 0xb7337c in S_regtry /root/perl/regexec.c:3622:14

SUMMARY: AddressSanitizer: stack-overflow ??:0 calloc
==6615==ABORTING

==19424== Stack overflow in thread 1: can't grow stack to 0xffe801f90
==19424==
==19424== Process terminating with default action of signal 11 (SIGSEGV)
==19424==  Access not within mapped region at address 0xFFE801F90
==19424==    at 0x5B4137: S_regtry (regexec.c:3578)
==19424==  If you believe this happened as a result of a stack
==19424==  overflow in your program's main thread (unlikely but
==19424==  possible), you can try to increase the size of the
==19424==  main thread stack using the --main-stacksize= flag.
==19424==  The main thread stack size used in this run was 8388608.
==19424== Stack overflow in thread 1: can't grow stack to 0xffe801f88
==19424==
==19424== Process terminating with default action of signal 11 (SIGSEGV)
==19424==  Access not within mapped region at address 0xFFE801F88
==19424==    at 0x4A236C0: _vgnU_freeres (vg_preloaded.c:58)
==19424==  If you believe this happened as a result of a stack
==19424==  overflow in your program's main thread (unlikely but
==19424==  possible), you can try to increase the size of the
==19424==  main thread stack using the --main-stacksize= flag.
==19424==  The main thread stack size used in this run was 8388608.
Segmentation fault




RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 1.3k
On Mon Oct 17 10:26:19 2016, brian.carpenter@gmail.com wrote: Show quoted text
> I'm not convinced that this is an actual bug,
I think it is. Show quoted text
> but #p5p was silent when I > asked about it. Affects Perl back to 5.20.2 including v5.25.6 > (v5.25.5-114-g87af8d5). Valgrind fails pretty quickly but gdb just goes on > forever. > > perl -e '/(?{m}(0)},s\/\/\/})//0'
That is nonsensical code. $ perl5.18.3 -e '/(?{m}(0)},s\/\/\/})//0' Segmentation fault: 11 $ perl5.14.4 -e '/(?{m}(0)},s\/\/\/})//0' Sequence (?{...}) not terminated or not {}-balanced in regex; marked by <-- HERE in m/(?{ <-- HERE m}(0)},s///})/ at -e line 1. I do not have 5.16 handy. The output from 5.14 is what I would expect. Show quoted text
> ==6615==ERROR: AddressSanitizer: stack-overflow on address 0x7ffc78f88ca8 > (pc 0x0000004c0d4c bp 0x7ffc78f89500 sp 0x7ffc78f88cb0 T0) > #0 0x4c0d4b in calloc (/root/perl/perl+0x4c0d4b) > #1 0x7f9301 in Perl_safesyscalloc /root/perl/util.c:442:18 > #2 0xb45aed in Perl_regexec_flags /root/perl/regexec.c:3128:9 > #3 0x8cf9c5 in Perl_pp_subst /root/perl/pp_hot.c:2981:10 > #4 0x7f4483 in Perl_runops_debug /root/perl/dump.c:2246:23 > #5 0xb984b6 in S_regmatch /root/perl/regexec.c:6888:3 > #6 0xb7337c in S_regtry /root/perl/regexec.c:3622:14
It should not even be getting that far. It should fail at compile time. -- Father Chrysostomos
Date: Mon, 17 Oct 2016 22:05:39 -0400
To: Perl RT Bug Tracker <perlbug-followup [...] perl.org>
Subject: Re: [perl #129903] regexec.c stack overflow
From: Dan Collins <dcollinsn [...] gmail.com>
$ perl5.14.0 -e '/(?{m}(0)},s\/\/\/})//0'
Sequence (?{...}) not terminated or not {}-balanced in regex; marked by <-- HERE in m/(?{ <-- HERE m}(0)},s///})/ at -e line 1.
$ perl5.18.0 -e '/(?{m}(0)},s\/\/\/})//0'
Segmentation fault
$ perl5.16.0 -e '/(?{m}(0)},s\/\/\/})//0'
Sequence (?{...}) not terminated or not {}-balanced in regex; marked by <-- HERE in m/(?{ <-- HERE m}(0)},s///})/ at -e line 1.
$ perl5.17.0 -e '/(?{m}(0)},s\/\/\/})//0'
Sequence (?{...}) not terminated or not {}-balanced in regex; marked by <-- HERE in m/(?{ <-- HERE m}(0)},s///})/ at -e line 1.
$ perl5.17.5 -e '/(?{m}(0)},s\/\/\/})//0'
Segmentation fault
$ perl5.17.3 -e '/(?{m}(0)},s\/\/\/})//0'
Segmentation fault
$ perl5.17.1 -e '/(?{m}(0)},s\/\/\/})//0'
Segmentation fault
$ perl Porting/bisect.pl --start=v5.17.0 --end=v5.17.1 --crash -- ./perl -Ilib -e '/(?{m}(0)},s\/\/\/})//0'
68e2671bec1b01022978d5d5eb6eee8742396e13 is the first bad commit
commit 68e2671bec1b01022978d5d5eb6eee8742396e13
Author: David Mitchell <davem@iabyn.com>
Date:   Thu Aug 25 11:41:49 2011 +0100

    Mostly complete fix for literal /(?{..})/ blocks

    Change the way that code blocks in patterns are parsed and executed,
    especially as regards lexical and scoping behaviour.

    (Note that this fix only applies to literal code blocks appearing within
    patterns: run-time patterns, and literals within qr//, are still done the
    old broken way for now).

    This change means that for literal /(?{..})/ and /(??{..})/:

    * the code block is now fully parsed in the same pass as the surrounding
      code, which means that the compiler no longer just does a simplistic
      count of balancing {} to find the limits of the code block;
      i.e. stuff like /(?{  $x = "{" })/ now works (in the same way
      that subscripts in double quoted strings always have: "$a{'{'}" )

    * Error and warning messages will now appear to emanate from the main body
      rather than an re_eval; e.g. the output from

        #!/usr/bin/perl
        /(?{ warn "boo" })/

    has changed from

        boo at (re_eval 1) line 1.

    to

        boo at /tmp/p line 2.

    * scope and closures now behave as you might expect; for example

            for my $x (qw(a b c)) { "" =~ /(?{ print $x })/ }

      now prints "abc" rather than ""

    * with recursion, it now finds the lexical within the appropriate depth
      of pad: this code now prints "012" rather than "000":

        sub recurse {
            my ($n) = @_;
            return if $n > 2;
            "" =~ /^(?{print $n})/;
            recurse($n+1);
        }
        recurse(0);

    * an earlier fix that stopped 'my' declarations within code blocks causing
      crashes, required the accumulating of two SAVECOMPPADs on the stack for
      each iteration of the code block; this is no longer needed;

    * UNITCHECK blocks within literal code blocks are now run as part of the
      main body of code (run-time code blocks still trigger an immediate
      call to the UNITCHECK block though)

    This is all achieved by building upon the efforts of the commits which led
    up to this; those altered the parser to parse literal code blocks
    directly, but up until now those code blocks were discarded by
    Perl_pmruntime and the block re-compiled using the original re_eval
    mechanism. As of this commit, for the non-qr and non-runtime variants,
    those code blocks are no longer thrown away. Instead:

    * the LISTOP generated by the parser, which contains all the code
      blocks plus OP_CONSTs that collectively make up the literal pattern,
      is now stored in a new field in PMOPs, called op_code_list. For example
      in /A(?{BLOCK})C/, the listop stored in op_code_list looks like

        LIST
            PUSHMARK
            CONST['A']
            NULL/special (aka a DO block)
                BLOCK
            CONST['(?{BLOCK})']
            CONST['B']

    * each of the code blocks has its last op set to null and is individually
      run through the peephole optimiser, so each one becomes a little
      self-contained block of code, rather than a list of blocks that run into
      each other;

    * then in re_op_compile(), we concatenate the list of CONSTs to produce a
      string to be compiled, but at the same time we note any DO blocks and
      note the start and end positions of the corresponding CONST['(?{BLOCK})'];

    * (if the current regex engine isn't the built-in perl one, then we just
      throw away the code blocks and pass the concatenated string to the engine)

    * then during regex compilation, whenever we encounter a '(?{', we see if
      it matches the index of one of the pre-compiled blocks, and if so, we
      store a pointer to that block in an 'l' data slot, and use the end index
      to skip over the text of the code body. Conversely, if the index doesn't
      match, then we know that it's a run-time pattern and (for now), compile
      it in the old way.

    * During execution, when an EVAL op is encountered, if data->what is 'l',
      then we just use the pad that was in effect when the pattern was called;
      i.e. we use the current pad slot of the currently executing CV that the
      pattern is embedded within.

:100644 100644 75ee7e7bae9366e089980b8a7d346198645cdfe9 8f2fa766c6cc393d413d59b55ad6f896ec5c7010 M      dump.c
:100644 100644 5cc98874ae03eae36d4207071ed587d51e80ebed 4d82c7cc5797c4a97f2251f4f1487aa090efda99 M      op.c
:100644 100644 6aa16f5725db2ae0a6a01b0d5b34b6446e8b5ebf f267da2c9c752b765151eca4c7009230c97b82ed M      op.h
:040000 040000 3454e32e87872db5fef604d816c7ca37dc487166 f7b4543ded1cbaf299521b31150d1c38e5b11509 M      pod
:100644 100644 394502b314a65eea36bca2f52e31f50048f0060d b45122c1a79df44e6094c1001fc7b3456543b740 M      regcomp.c
:100644 100644 0fdb0058b8b5fce8763f7d0da05b187d460838b6 a9da0c97e12a8038ef0bc49870737fe62a7c01e6 M      regcomp.h
:100644 100644 bb845a79216c447b2fa7ec08cee46602b05d3461 f384c4d32c11fe1917e7e8fbcd998b60dc1355c3 M      regexec.c
:040000 040000 a2c61b299bd88c2308af219684388ae3deea371d 4ae03f25b1ebcee1518e0dc6ccc7ae23b1433f61 M      t
bisect run success
That took 772 seconds.

--
Dan Collins
CC: Perl5 Porteros <perl5-porters [...] perl.org>
Subject: Re: [perl #129903] regexec.c stack overflow
To: Perl RT Bug Tracker <perlbug-followup [...] perl.org>
From: demerphq <demerphq [...] gmail.com>
Date: Tue, 18 Oct 2016 10:00:40 +0200
Download (untitled) / with headers
text/plain 2.2k
On 18 October 2016 at 03:17, Father Chrysostomos via RT <perlbug-followup@perl.org> wrote: Show quoted text
> On Mon Oct 17 10:26:19 2016, brian.carpenter@gmail.com wrote:
>> I'm not convinced that this is an actual bug,
> > I think it is. >
>> but #p5p was silent when I >> asked about it. Affects Perl back to 5.20.2 including v5.25.6 >> (v5.25.5-114-g87af8d5). Valgrind fails pretty quickly but gdb just goes on >> forever. >> >> perl -e '/0'
> > That is nonsensical code.
Do you mean just the '/0' or the full pattern? With some editing I read the original code as being the same as (m/(?{ m!(0)!, s!!! })/) / 0 Which is strange but IMO valid code. Show quoted text
> $ perl5.18.3 -e '/(?{m}(0)},s\/\/\/})//0' > Segmentation fault: 11 > $ perl5.14.4 -e '/(?{m}(0)},s\/\/\/})//0' > Sequence (?{...}) not terminated or not {}-balanced in regex; marked by <-- HERE in m/(?{ <-- HERE m}(0)},s///})/ at -e line 1. > > I do not have 5.16 handy. The output from 5.14 is what I would expect.
I am not sure why. Is it because you expect the first } to close the (?{ , while I dont? The inside of the (?{ ... }) is code, and should follow nesting rules, so m}(0)} should be parsed and extracted as a balanced construct, just as it would if it were q}foo}, and then the s/// gets parsed out, to find the trailing }) which finish the (?{}) construct. I would say the 5.14.4 behavior is a misparse. If this is legal: perl -le'print q}foo}' foo then I would expect this: m}(0)} to be legal too, and i would expect it to be legal inside of a (?{ ... }) construct in a pattern. Show quoted text
>> ==6615==ERROR: AddressSanitizer: stack-overflow on address 0x7ffc78f88ca8 >> (pc 0x0000004c0d4c bp 0x7ffc78f89500 sp 0x7ffc78f88cb0 T0) >> #0 0x4c0d4b in calloc (/root/perl/perl+0x4c0d4b) >> #1 0x7f9301 in Perl_safesyscalloc /root/perl/util.c:442:18 >> #2 0xb45aed in Perl_regexec_flags /root/perl/regexec.c:3128:9 >> #3 0x8cf9c5 in Perl_pp_subst /root/perl/pp_hot.c:2981:10 >> #4 0x7f4483 in Perl_runops_debug /root/perl/dump.c:2246:23 >> #5 0xb984b6 in S_regmatch /root/perl/regexec.c:6888:3 >> #6 0xb7337c in S_regtry /root/perl/regexec.c:3622:14
> > It should not even be getting that far. It should fail at compile time.
Looking forward to hearing more on why. Yves -- perl -Mre=debug -e "/just|another|perl|hacker/"
CC: Perl5 Porteros <perl5-porters [...] perl.org>
From: demerphq <demerphq [...] gmail.com>
Date: Tue, 18 Oct 2016 10:17:25 +0200
Subject: Re: [perl #129903] regexec.c stack overflow
To: Perl RT Bug Tracker <perlbug-followup [...] perl.org>
Download (untitled) / with headers
text/plain 5.4k
On 18 October 2016 at 10:00, demerphq <demerphq@gmail.com> wrote: Show quoted text
> On 18 October 2016 at 03:17, Father Chrysostomos via RT > <perlbug-followup@perl.org> wrote:
>> On Mon Oct 17 10:26:19 2016, brian.carpenter@gmail.com wrote:
>>> I'm not convinced that this is an actual bug,
>> >> I think it is. >>
>>> but #p5p was silent when I >>> asked about it. Affects Perl back to 5.20.2 including v5.25.6 >>> (v5.25.5-114-g87af8d5). Valgrind fails pretty quickly but gdb just goes on >>> forever. >>> >>> perl -e '/0'
>> >> That is nonsensical code.
> > Do you mean just the '/0' or the full pattern? > > With some editing I read the original code as being the same as > > > (m/(?{ > m!(0)!, > s!!! > })/) / 0 > > Which is strange but IMO valid code. >
>> $ perl5.18.3 -e '/(?{m}(0)},s\/\/\/})//0' >> Segmentation fault: 11 >> $ perl5.14.4 -e '/(?{m}(0)},s\/\/\/})//0' >> Sequence (?{...}) not terminated or not {}-balanced in regex; marked by <-- HERE in m/(?{ <-- HERE m}(0)},s///})/ at -e line 1. >> >> I do not have 5.16 handy. The output from 5.14 is what I would expect.
> > I am not sure why. Is it because you expect the first } to close the > (?{ , while I dont? The inside of the (?{ ... }) is code, and should > follow nesting rules, so m}(0)} should be parsed and extracted as a > balanced construct, just as it would if it were q}foo}, and then the > s/// gets parsed out, to find the trailing }) which finish the (?{}) > construct. > > I would say the 5.14.4 behavior is a misparse. > > If this is legal: > > perl -le'print q}foo}' > foo > > then I would expect this: > > m}(0)} > > to be legal too, and i would expect it to be legal inside of a (?{ ... > }) construct in a pattern. >
>>> ==6615==ERROR: AddressSanitizer: stack-overflow on address 0x7ffc78f88ca8 >>> (pc 0x0000004c0d4c bp 0x7ffc78f89500 sp 0x7ffc78f88cb0 T0) >>> #0 0x4c0d4b in calloc (/root/perl/perl+0x4c0d4b) >>> #1 0x7f9301 in Perl_safesyscalloc /root/perl/util.c:442:18 >>> #2 0xb45aed in Perl_regexec_flags /root/perl/regexec.c:3128:9 >>> #3 0x8cf9c5 in Perl_pp_subst /root/perl/pp_hot.c:2981:10 >>> #4 0x7f4483 in Perl_runops_debug /root/perl/dump.c:2246:23 >>> #5 0xb984b6 in S_regmatch /root/perl/regexec.c:6888:3 >>> #6 0xb7337c in S_regtry /root/perl/regexec.c:3622:14
>> >> It should not even be getting that far. It should fail at compile time.
> > Looking forward to hearing more on why.
I dont think the simplified version of the code should infinite loop on the stack. For this code (because $_ is the empty string) we should perform an outer match once, then a match using qr/(0)/ twice, once via the "empty pattern special case" in the s///, both of which should fail, and then the outer match should stop. We should not loop. Apparently the problem is related to the empty-pattern special case behavior. If I change the pattern from m/(?{ m!(0)!, s!!! })/ to m/(?{ m!(0)!, s!(0)!! })/ which should be equivalent valgrind is happy and we get what I expect, a division by zero error: $ ./perl -le'(m/(?{ m!(0)!, s!(0)!! })/) / 0' Illegal division by zero at -e line 1. $ ./perl -le'(m/(?{ m!(0)!, s!!! })/) / 0' Segmentation fault And here is the valgrind output: $ valgrind ./perl -le'(m/(?{ m!(0)!, s!!! })/) / 0' ==25240== Memcheck, a memory error detector ==25240== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al. ==25240== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info ==25240== Command: ./perl -le(m/(?{\ m!(0)!,\ s!!!\ })/)\ /\ 0 ==25240== ==25240== Stack overflow in thread 1: can't grow stack to 0xffe801f88 ==25240== ==25240== Process terminating with default action of signal 11 (SIGSEGV) ==25240== Access not within mapped region at address 0xFFE801F88 ==25240== at 0x5D01B0: Perl_sv_2pv_flags (sv.c:2939) ==25240== If you believe this happened as a result of a stack ==25240== overflow in your program's main thread (unlikely but ==25240== possible), you can try to increase the size of the ==25240== main thread stack using the --main-stacksize= flag. ==25240== The main thread stack size used in this run was 8388608. ==25240== Stack overflow in thread 1: can't grow stack to 0xffe801f68 ==25240== ==25240== Process terminating with default action of signal 11 (SIGSEGV) ==25240== Access not within mapped region at address 0xFFE801F68 ==25240== at 0x4A256B0: _vgnU_freeres (in /usr/lib/valgrind/vgpreload_core-amd64-linux.so) ==25240== If you believe this happened as a result of a stack ==25240== overflow in your program's main thread (unlikely but ==25240== possible), you can try to increase the size of the ==25240== main thread stack using the --main-stacksize= flag. ==25240== The main thread stack size used in this run was 8388608. ==25240== ==25240== HEAP SUMMARY: ==25240== in use at exit: 7,449,285 bytes in 11,675 blocks ==25240== total heap usage: 11,825 allocs, 150 frees, 8,150,685 bytes allocated ==25240== ==25240== LEAK SUMMARY: ==25240== definitely lost: 0 bytes in 0 blocks ==25240== indirectly lost: 0 bytes in 0 blocks ==25240== possibly lost: 0 bytes in 0 blocks ==25240== still reachable: 7,449,285 bytes in 11,675 blocks ==25240== suppressed: 0 bytes in 0 blocks ==25240== Rerun with --leak-check=full to see details of leaked memory ==25240== ==25240== For counts of detected and suppressed errors, rerun with: -v ==25240== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) Segmentation fault This is definitely a bug. Yves -- perl -Mre=debug -e "/just|another|perl|hacker/"
Date: Tue, 18 Oct 2016 10:21:06 +0200
To: Perl RT Bug Tracker <perlbug-followup [...] perl.org>
Subject: Re: [perl #129903] regexec.c stack overflow
CC: Perl5 Porteros <perl5-porters [...] perl.org>
From: demerphq <demerphq [...] gmail.com>
Download (untitled) / with headers
text/plain 368b
On 18 October 2016 at 10:17, demerphq <demerphq@gmail.com> wrote: Show quoted text
> ... Apparently > the problem is related to the empty-pattern special case behavior.
[...] Show quoted text
> > This is definitely a bug.
And yet another example of the empty pattern special case causing problems. Sigh, so much trouble for so little value. Yves -- perl -Mre=debug -e "/just|another|perl|hacker/"
Subject: Re: [perl #129903] regexec.c stack overflow
To: Perl RT Bug Tracker <perlbug-followup [...] perl.org>
CC: Perl5 Porteros <perl5-porters [...] perl.org>
From: demerphq <demerphq [...] gmail.com>
Date: Tue, 18 Oct 2016 11:29:33 +0200
Download (untitled) / with headers
text/plain 957b
On 18 October 2016 at 10:21, demerphq <demerphq@gmail.com> wrote: Show quoted text
> On 18 October 2016 at 10:17, demerphq <demerphq@gmail.com> wrote:
>> ... Apparently >> the problem is related to the empty-pattern special case behavior.
> [...]
>> >> This is definitely a bug.
> > And yet another example of the empty pattern special case causing > problems. Sigh, so much trouble for so little value.
Actually this case can be boiled down to: ./perl -le'/(?{ s!!! })/' which causes an infinite loop due to PL_curpm being set at the start of the match, which the s!!! then sees as the "last successful match". IMO it shouldn't as the outer match hasn't actually finished yet, but because of when and how we set up PL_curpm at match start we have this problem inside of (?{ }) constructs. I think we can fix it by checking PL_curpm against PL_reg_curpm when we do the empty pattern check; running tests now. Yves -- perl -Mre=debug -e "/just|another|perl|hacker/"
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 429b
On Tue Oct 18 02:30:01 2016, demerphq wrote: Show quoted text
> I think we can fix it by checking PL_curpm against PL_reg_curpm when > we do the empty pattern check; running tests now.
And do what? Even if they're the same, it still might also legitimately be the last successful match, no? It seems to me we can't offer the promised behaviour unless there's a variable that really is set only at the point we complete a successful match. Hugo
To: Perl RT Bug Tracker <perlbug-followup [...] perl.org>
Subject: Re: [perl #129903] regexec.c stack overflow
CC: Perl5 Porteros <perl5-porters [...] perl.org>
From: demerphq <demerphq [...] gmail.com>
Date: Tue, 18 Oct 2016 13:07:58 +0200
Download (untitled) / with headers
text/plain 1.5k
On 18 October 2016 at 12:03, Hugo van der Sanden via RT <perlbug-followup@perl.org> wrote: Show quoted text
> On Tue Oct 18 02:30:01 2016, demerphq wrote:
>> I think we can fix it by checking PL_curpm against PL_reg_curpm when >> we do the empty pattern check; running tests now.
> > And do what? Even if they're the same, it still might also legitimately be the last successful match, no?
No I don't think so, and all the tests pass. As far as I can tell PL_reg_curpm is a special opcode which we use when we are setting up PL_curpm before executing a (?{...}) or (??{ ... }) code block so that things like $1 and stuff like that point at the current pattern. Once the code is executed PL_curpm is unwound to no longer point at PL_reg_curpm. Essentially the problem comes down to PL_curpm being the place where things like $1 and friends are derived, as well as how we find the pattern for the last successful match, but inside of a (?{ ... }) we want $1 to point at the *current* pattern, not the *last* successfully matched one. This conflict currently causes use of the empty pattern in a code block to point at the pattern that contains the code, and trigger infinite recursion. So to fix this I think we would have to split PL_curpm in two, and have one var for $1 and friends representing "the last successful matched pattern or when in a regex code construct the currently executing pattern", and another for "last successful matched pattern" which would be use by the empty pattern magic. On the other hand we can simply forbid the use of the empty pattern inside of a regex code block and have a much simpler patch. :-) Yves
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 951b
On Tue Oct 18 01:01:21 2016, demerphq wrote: Show quoted text
> On 18 October 2016 at 03:17, Father Chrysostomos via RT > <perlbug-followup@perl.org> wrote:
> > On Mon Oct 17 10:26:19 2016, brian.carpenter@gmail.com wrote:
> >> I'm not convinced that this is an actual bug,
> > > > I think it is. > >
> >> but #p5p was silent when I > >> asked about it. Affects Perl back to 5.20.2 including v5.25.6 > >> (v5.25.5-114-g87af8d5). Valgrind fails pretty quickly but gdb just > >> goes on > >> forever. > >> > >> perl -e '/0'
> > > > That is nonsensical code.
> > Do you mean just the '/0' or the full pattern?
I do not know what happened, but the line that I quoted was: perl -e '/(?{m}(0)},s\/\/\/})//0' not: perl -e '/0' Show quoted text
> With some editing I read the original code as being the same as > > > (m/(?{ > m!(0)!, > s!!! > })/) / 0
I see now that I misread the code. the } for the m}} delimiter was what threw me off. -- Father Chrysostomos
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 1.7k
On Tue Oct 18 04:08:30 2016, demerphq wrote: Show quoted text
> On 18 October 2016 at 12:03, Hugo van der Sanden via RT > <perlbug-followup@perl.org> wrote:
> > On Tue Oct 18 02:30:01 2016, demerphq wrote:
> >> I think we can fix it by checking PL_curpm against PL_reg_curpm when > >> we do the empty pattern check; running tests now.
> > > > And do what? Even if they're the same, it still might also > > legitimately be the last successful match, no?
> > No I don't think so, and all the tests pass. > > As far as I can tell PL_reg_curpm is a special opcode which we use > when we are setting up PL_curpm before executing a (?{...}) or (??{ > ... }) code block so that things like $1 and stuff like that point at > the current pattern. Once the code is executed PL_curpm is unwound to > no longer point at PL_reg_curpm. Essentially the problem comes down to > PL_curpm being the place where things like $1 and friends are derived, > as well as how we find the pattern for the last successful match, but > inside of a (?{ ... }) we want $1 to point at the *current* pattern, > not the *last* successfully matched one. This conflict currently > causes use of the empty pattern in a code block to point at the > pattern that contains the code, and trigger infinite recursion. So to > fix this I think we would have to split PL_curpm in two, and have one > var for $1 and friends representing "the last successful matched > pattern or when in a regex code construct the currently executing > pattern", and another for "last successful matched pattern" which > would be use by the empty pattern magic. > > On the other hand we can simply forbid the use of the empty pattern > inside of a regex code block and have a much simpler patch. :-)
PL_curpm does far too many things. This is not the only bug that results. -- Father Chrysostomos
CC: Perl5 Porteros <perl5-porters [...] perl.org>
Subject: Re: [perl #129903] regexec.c stack overflow
To: Perl RT Bug Tracker <perlbug-followup [...] perl.org>
From: demerphq <demerphq [...] gmail.com>
Date: Wed, 19 Oct 2016 09:22:43 +0200
Download (untitled) / with headers
text/plain 1.9k

On 18 Oct 2016 23:46, "Father Chrysostomos via RT" <perlbug-followup@perl.org> wrote:
Show quoted text

>
> On Tue Oct 18 04:08:30 2016, demerphq wrote:
> > On 18 October 2016 at 12:03, Hugo van der Sanden via RT
> > <perlbug-followup@perl.org> wrote:
> > > On Tue Oct 18 02:30:01 2016, demerphq wrote:
> > >> I think we can fix it by checking PL_curpm against PL_reg_curpm when
> > >> we do the empty pattern check; running tests now.
> > >
> > > And do what? Even if they're the same, it still might also
> > > legitimately be the last successful match, no?
> >
> > No I don't think so, and all the tests pass.
> >
> > As far as I can tell PL_reg_curpm is a special opcode which we use
> > when we are setting up PL_curpm before executing a (?{...}) or (??{
> > ... }) code block so that things like $1 and stuff like that point at
> > the current pattern. Once the code is executed PL_curpm is unwound to
> > no longer point at PL_reg_curpm. Essentially the problem comes down to
> > PL_curpm being the place where things like $1 and friends are derived,
> > as well as how we find the pattern for the last successful match, but
> > inside of a (?{ ... }) we want $1 to point at the *current* pattern,
> > not the *last* successfully matched one. This conflict currently
> > causes use of the empty pattern in a code block to point at the
> > pattern that contains the code, and trigger infinite recursion. So to
> > fix this I think we would have to split PL_curpm in two, and have one
> > var for $1 and friends representing "the last successful matched
> > pattern or when in a regex code construct the currently executing
> > pattern", and another for "last successful matched pattern" which
> > would be use by the empty pattern magic.
> >
> > On the other hand we can simply forbid the use of the empty pattern
> > inside of a regex code block and have a much simpler patch. :-)
>
> PL_curpm does far too many things.  This is not the only bug that results.

Can you expand on that thought a bit please?

Yves

Date: Wed, 19 Oct 2016 17:51:49 +0200
To: Perl RT Bug Tracker <perlbug-followup [...] perl.org>
Subject: Re: [perl #129903] regexec.c stack overflow
CC: Perl5 Porteros <perl5-porters [...] perl.org>
From: demerphq <demerphq [...] gmail.com>
Download (untitled) / with headers
text/plain 2.4k
On 19 October 2016 at 09:22, demerphq <demerphq@gmail.com> wrote: Show quoted text
> On 18 Oct 2016 23:46, "Father Chrysostomos via RT" > <perlbug-followup@perl.org> wrote:
>> >> On Tue Oct 18 04:08:30 2016, demerphq wrote:
>> > On 18 October 2016 at 12:03, Hugo van der Sanden via RT >> > <perlbug-followup@perl.org> wrote:
>> > > On Tue Oct 18 02:30:01 2016, demerphq wrote:
>> > >> I think we can fix it by checking PL_curpm against PL_reg_curpm when >> > >> we do the empty pattern check; running tests now.
>> > > >> > > And do what? Even if they're the same, it still might also >> > > legitimately be the last successful match, no?
>> > >> > No I don't think so, and all the tests pass. >> > >> > As far as I can tell PL_reg_curpm is a special opcode which we use >> > when we are setting up PL_curpm before executing a (?{...}) or (??{ >> > ... }) code block so that things like $1 and stuff like that point at >> > the current pattern. Once the code is executed PL_curpm is unwound to >> > no longer point at PL_reg_curpm. Essentially the problem comes down to >> > PL_curpm being the place where things like $1 and friends are derived, >> > as well as how we find the pattern for the last successful match, but >> > inside of a (?{ ... }) we want $1 to point at the *current* pattern, >> > not the *last* successfully matched one. This conflict currently >> > causes use of the empty pattern in a code block to point at the >> > pattern that contains the code, and trigger infinite recursion. So to >> > fix this I think we would have to split PL_curpm in two, and have one >> > var for $1 and friends representing "the last successful matched >> > pattern or when in a regex code construct the currently executing >> > pattern", and another for "last successful matched pattern" which >> > would be use by the empty pattern magic. >> > >> > On the other hand we can simply forbid the use of the empty pattern >> > inside of a regex code block and have a much simpler patch. :-)
>> >> PL_curpm does far too many things. This is not the only bug that results.
> > Can you expand on that thought a bit please?
BTW, I pushed a patch to forbid use of the empty pattern inside of regex code blocks. Once I understand what else you worry about with PL_curpm I will do a patch to make it work properly. But I figure a run time catchable fatal exception is better than a C stack overflow until we can make it work properly. Yves -- perl -Mre=debug -e "/just|another|perl|hacker/"
RT-Send-CC: perl5-porters [...] perl.org
On Wed Oct 19 00:23:19 2016, demerphq wrote: Show quoted text
> On 18 Oct 2016 23:46, "Father Chrysostomos via RT" < > perlbug-followup@perl.org> wrote:
> > PL_curpm does far too many things. This is not the only bug that results.
> > Can you expand on that thought a bit please?
Having PL_curpm point to the actual pattern in the op tree results in different recursion levels sharing the same set of match variables: $u = ",echle etn sJ"; $t = "\nrka rPrhoatu"; $_ = $u.$t; sub foo { s/(.)//s or return; bar(); print chop $$1 } sub bar { s/(.)//s or return; foo(); print chop $$1 } foo (I thought I had reported this before, but I cannot find the ticket now.) Using a qr// works. Similarly, having m|(?{ // })| try to use the *current* pattern instead of the last pattern to match successfully—solely due to an implementation detail—seems wrong. PL_curpm is serving three purposes: the last match op, the last set of matched buffers ($1 etc.), and the innermost pattern enclosing the currently executing code. -- Father Chrysostomos
To: Perl RT Bug Tracker <perlbug-followup [...] perl.org>
Subject: Re: [perl #129903] regexec.c stack overflow
CC: Perl5 Porteros <perl5-porters [...] perl.org>
From: demerphq <demerphq [...] gmail.com>
Date: Wed, 19 Oct 2016 23:29:02 +0200
Download (untitled) / with headers
text/plain 1.3k
On 19 October 2016 at 22:51, Father Chrysostomos via RT <perlbug-followup@perl.org> wrote: Show quoted text
> On Wed Oct 19 00:23:19 2016, demerphq wrote:
>> On 18 Oct 2016 23:46, "Father Chrysostomos via RT" < >> perlbug-followup@perl.org> wrote:
>> > PL_curpm does far too many things. This is not the only bug that results.
>> >> Can you expand on that thought a bit please?
> > Having PL_curpm point to the actual pattern in the op tree results in different recursion levels sharing the same set of match variables: > > $u = ",echle etn sJ"; > $t = "\nrka rPrhoatu"; > $_ = $u.$t; > sub foo { s/(.)//s or return; bar(); print chop $$1 } > sub bar { s/(.)//s or return; foo(); print chop $$1 } > foo > > (I thought I had reported this before, but I cannot find the ticket now.)
Wow. It took me a while to work that through. :-) I see what you mean. That is going to be a lot harder to fix than the other issues with PL_curpm IMO. Show quoted text
> Using a qr// works. > > Similarly, having m|(?{ // })| try to use the *current* pattern instead of the last pattern to match successfully—solely due to an implementation detail—seems wrong.
Yes I agree. Show quoted text
> PL_curpm is serving three purposes: the last match op, the last set of matched buffers ($1 etc.), and the innermost pattern enclosing the currently executing code.
Yes, I see what you mean. I think I might have put it differently, but i get your point now for sure. Yves
Date: Thu, 20 Oct 2016 09:37:17 +0200
CC: Perl5 Porteros <perl5-porters [...] perl.org>
From: demerphq <demerphq [...] gmail.com>
To: Perl RT Bug Tracker <perlbug-followup [...] perl.org>
Subject: Re: [perl #129903] regexec.c stack overflow
Download (untitled) / with headers
text/plain 866b
On 19 October 2016 at 23:29, demerphq <demerphq@gmail.com> wrote: Show quoted text
> On 19 October 2016 at 22:51, Father Chrysostomos via RT
>> Similarly, having m|(?{ // })| try to use the *current* pattern instead of the last pattern to match successfully—solely due to an implementation detail—seems wrong.
> > Yes I agree.
So following up for this I put together a quick hack to make it possible to use the empty pattern in regexen, but honestly I dont see how it is very useful, it is all too easy to create an infinite loop (which luckily seems to be easy to detect). ./perl -le'$_="aa"; /a/; /(?{ s!!x! })/; print; //; print' xa Infinite recursion via empty pattern at -e line 1. I personally think this is no less ugly than the previous patch, but if you prefer it I can push it. FWIW, this seems to be yet another good reason to get rid of the empty pattern. Yves
From: demerphq <demerphq [...] gmail.com>
CC: Perl5 Porteros <perl5-porters [...] perl.org>
Date: Thu, 20 Oct 2016 09:41:40 +0200
Subject: Re: [perl #129903] regexec.c stack overflow
To: Perl RT Bug Tracker <perlbug-followup [...] perl.org>
Download (untitled) / with headers
text/plain 1.1k
On 20 October 2016 at 09:37, demerphq <demerphq@gmail.com> wrote: Show quoted text
> On 19 October 2016 at 23:29, demerphq <demerphq@gmail.com> wrote:
>> On 19 October 2016 at 22:51, Father Chrysostomos via RT
>>> Similarly, having m|(?{ // })| try to use the *current* pattern instead of the last pattern to match successfully—solely due to an implementation detail—seems wrong.
>> >> Yes I agree.
> > So following up for this I put together a quick hack to make it > possible to use the empty pattern in regexen, but honestly I dont see > how it is very useful, it is all too easy to create an infinite loop > (which luckily seems to be easy to detect). > > ./perl -le'$_="aa"; /a/; /(?{ s!!x! })/; print; //; print' > xa > Infinite recursion via empty pattern at -e line 1. > > I personally think this is no less ugly than the previous patch, but > if you prefer it I can push it. > > FWIW, this seems to be yet another good reason to get rid of the empty pattern.
Oh, here is another reason: ./perl -le'$_="ab"; my $p=qr/(?{ s!!x! })/; /$p/; print; /a/; /$p/; print; /b/; /$p/; print; //;' xab xxb xxx Infinite recursion via empty pattern at -e line 1. Crazy. -- perl -Mre=debug -e "/just|another|perl|hacker/"
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 815b
On Thu Oct 20 00:37:45 2016, demerphq wrote: Show quoted text
> On 19 October 2016 at 23:29, demerphq <demerphq@gmail.com> wrote: > So following up for this I put together a quick hack to make it > possible to use the empty pattern in regexen, but honestly I dont see > how it is very useful, it is all too easy to create an infinite loop > (which luckily seems to be easy to detect). > > ./perl -le'$_="aa"; /a/; /(?{ s!!x! })/; print; //; print' > xa > Infinite recursion via empty pattern at -e line 1. > > I personally think this is no less ugly than the previous patch, but > if you prefer it I can push it.
I would prefer it, but I do not feel strongly about it. Show quoted text
> FWIW, this seems to be yet another good reason to get rid of the empty > pattern.
I am still opposed to that, unfortunately. :-) -- Father Chrysostomos
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 1.3k
On Thu Oct 20 00:42:10 2016, demerphq wrote: Show quoted text
> On 20 October 2016 at 09:37, demerphq <demerphq@gmail.com> wrote:
> > On 19 October 2016 at 23:29, demerphq <demerphq@gmail.com> wrote:
> >> On 19 October 2016 at 22:51, Father Chrysostomos via RT
> >>> Similarly, having m|(?{ // })| try to use the *current* pattern > >>> instead of the last pattern to match successfully—solely due to an > >>> implementation detail—seems wrong.
> >> > >> Yes I agree.
> > > > So following up for this I put together a quick hack to make it > > possible to use the empty pattern in regexen, but honestly I dont see > > how it is very useful, it is all too easy to create an infinite loop > > (which luckily seems to be easy to detect). > > > > ./perl -le'$_="aa"; /a/; /(?{ s!!x! })/; print; //; print' > > xa > > Infinite recursion via empty pattern at -e line 1. > > > > I personally think this is no less ugly than the previous patch, but > > if you prefer it I can push it. > > > > FWIW, this seems to be yet another good reason to get rid of the > > empty pattern.
> > Oh, here is another reason: > > ./perl -le'$_="ab"; my $p=qr/(?{ s!!x! })/; /$p/; print; /a/; /$p/; > print; /b/; /$p/; print; //;' > xab > xxb > xxx > Infinite recursion via empty pattern at -e line 1. > > Crazy.
I would agree that the *Perl* code there is crazy and the author of it gets what he deserves. -- Father Chrysostomos
Date: Tue, 1 Nov 2016 13:35:14 +0100
Subject: Re: [perl #129903] regexec.c stack overflow
To: Perl RT Bug Tracker <perlbug-followup [...] perl.org>
From: demerphq <demerphq [...] gmail.com>
Download (untitled) / with headers
text/plain 1012b
On 20 October 2016 at 22:12, Father Chrysostomos via RT <perlbug-followup@perl.org> wrote: Show quoted text
> On Thu Oct 20 00:37:45 2016, demerphq wrote:
>> On 19 October 2016 at 23:29, demerphq <demerphq@gmail.com> wrote: >> So following up for this I put together a quick hack to make it >> possible to use the empty pattern in regexen, but honestly I dont see >> how it is very useful, it is all too easy to create an infinite loop >> (which luckily seems to be easy to detect). >> >> ./perl -le'$_="aa"; /a/; /(?{ s!!x! })/; print; //; print' >> xa >> Infinite recursion via empty pattern at -e line 1. >> >> I personally think this is no less ugly than the previous patch, but >> if you prefer it I can push it.
> > I would prefer it, but I do not feel strongly about it.
Well I just push a patch to meet your preference. Show quoted text
>> FWIW, this seems to be yet another good reason to get rid of the empty >> pattern.
> > I am still opposed to that, unfortunately. :-)
We can debate that one (again) in another thread. :-) Yves


This service is sponsored and maintained by Best Practical Solutions and runs on Perl.org infrastructure.

For issues related to this RT instance (aka "perlbug"), please contact perlbug-admin at perl.org