Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segfault in S_study_chunk (regcomp.c:4870) #16947

Closed
p5pRT opened this issue Apr 10, 2019 · 28 comments
Closed

Segfault in S_study_chunk (regcomp.c:4870) #16947

p5pRT opened this issue Apr 10, 2019 · 28 comments
Milestone

Comments

@p5pRT
Copy link

p5pRT commented Apr 10, 2019

Migrated from rt.perl.org#134017 (status was 'open')

Searchable as RT134017$

@p5pRT
Copy link
Author

p5pRT commented Apr 10, 2019

From @dur-randir

Created by @dur-randir

While fuzzing perl v5.29.9-63-g2496d8f3f7 built with afl and run
under libdislocator, I found the following program

0 =~ /0(?n)|()(()(()(?0)|(?0)))(0)/

to segfault. GDB stack trace is following

#0 0x000055555568f92c in S_study_chunk (pRExC_state=0x7fffffffd650,
scanp=0x7fffffffcec8, minlenp=0x7fffffffd3e0, deltap=0x7fffffffcee8,
  last=0x555555b7b944, data=0x7fffffffd240, stopparen=-1,
recursed_depth=0, and_withp=0x555555b7c600, flags=12288, depth=1) at
regcomp.c​:4870
#1 0x000055555568f171 in S_study_chunk (pRExC_state=0x7fffffffd650,
scanp=0x7fffffffd3d8, minlenp=0x7fffffffd3e0, deltap=0x7fffffffd400,
  last=0x555555b7b948, data=0x7fffffffd9c0, stopparen=-1,
recursed_depth=0, and_withp=0x0, flags=10240, depth=0) at
regcomp.c​:4639
#2 0x000055555569fbc4 in Perl_re_op_compile (patternp=0x0,
pat_count=1, expr=0x555555b79598, eng=0x555555b41d20
<PL_core_reg_engine>, old_re=0x0,
  is_bare_re=0x0, orig_rx_flags=0, pm_flags=0) at regcomp.c​:8176
#3 0x00005555555ba159 in Perl_pmruntime (o=0x555555b795d8,
expr=0x555555b79598, repl=0x0, flags=1, floor=0) at op.c​:7127
#4 0x000055555566ffc3 in Perl_yyparse (gramtype=258) at perly.y​:1234
#5 0x00005555555ec726 in S_parse_body (env=0x0, xsinit=0x5555555a11f8
<xs_init>) at perl.c​:2531
#6 0x00005555555ea9f8 in perl_parse (my_perl=0x555555b4c260,
xsinit=0x5555555a11f8 <xs_init>, argc=2, argv=0x7fffffffe1c8, env=0x0)
at perl.c​:1822
#7 0x00005555555a113b in main (argc=2, argv=0x7fffffffe1c8,
env=0x7fffffffe1e0) at perlmain.c​:126

This is present a regression between 5.24 and 5.26, bisect points to

commit da7cf1c
Author​: Yves Orton <demerphq@​gmail.com>
Date​: Tue May 10 09​:44​:31 2016 +0200

  fix #128109 - do not move RExC_open_parens[0] in reginsert

  In d5a00e4 I merged GOSUB and GOSTART,
  part of which involved making RExC_open_parens[0] refer to the start of
  the pattern, and RExC_close_parens[0] referring to the end of the pattern.

  This tripped up in reginsert in a subtle way, the start of the pattern
  cannot and should not move in reginsert(). Unlike a paren that might
  be at the start of the pattern which should move when something is inserted
  in front of it, the start is a fixed point and should never move.

Perl Info

Flags:
    category=core
    severity=medium

Site configuration information for perl 5.29.9:

Configured by dur-randir at Wed Feb 27 14:51:01 MSK 2019.

Summary of my perl5 (revision 5 version 29 subversion 9) configuration:
  Commit id: c1e47bad34ce1d9c84ed57c9b8978bcbd5a02e98
  Platform:
    osname=darwin
    osvers=13.4.0
    archname=darwin-thread-multi-2level
    uname='darwin isengard.local 13.4.0 darwin kernel version 13.4.0:
mon jan 11 18:17:34 pst 2016; root:xnu-2422.115.15~1release_x86_64
x86_64 '
    config_args='-de -Dusedevel -DDEBUGGING -Dusethreads'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=define
    usemultiplicity=define
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
    bincompat5005=undef
  Compiler:
    cc='cc'
    ccflags ='-fno-common -DPERL_DARWIN -mmacosx-version-min=10.9
-DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector
-I/usr/local/include -DPERL_USE_SAFE_PUTENV'
    optimize='-O3 -g'
    cppflags='-fno-common -DPERL_DARWIN -mmacosx-version-min=10.9
-DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector
-I/usr/local/include'
    ccversion=''
    gccversion='4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.56)'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags =' -mmacosx-version-min=10.9 -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/6.0/lib
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib
/usr/lib
    libs=-lpthread -lgdbm -ldbm -ldl -lm -lutil -lc
    perllibs=-lpthread -ldl -lm -lutil -lc
    libc=
    so=dylib
    useshrplib=false
    libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=bundle
    d_dlsymun=undef
    ccdlflags=' '
    cccdlflags=' '
    lddlflags=' -mmacosx-version-min=10.9 -bundle -undefined
dynamic_lookup -L/usr/local/lib -fstack-protector'



@INC for perl 5.29.9:
    lib
    /usr/local/lib/perl5/site_perl/5.29.9/darwin-thread-multi-2level
    /usr/local/lib/perl5/site_perl/5.29.9
    /usr/local/lib/perl5/5.29.9/darwin-thread-multi-2level
    /usr/local/lib/perl5/5.29.9


Environment for perl 5.29.9:
    DYLD_LIBRARY_PATH (unset)
    HOME=/Users/dur-randir
    LANG=en_US.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/Users/dur-randir/perlbrew/bin:/Users/dur-randir/perlbrew/perls/perl-5.22.1/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/texbin
    PERLBREW_HOME=/Users/dur-randir/.perlbrew
    PERLBREW_MANPATH=/Users/dur-randir/perlbrew/perls/perl-5.22.1/man
    PERLBREW_PATH=/Users/dur-randir/perlbrew/bin:/Users/dur-randir/perlbrew/perls/perl-5.22.1/bin
    PERLBREW_PERL=perl-5.22.1
    PERLBREW_ROOT=/Users/dur-randir/perlbrew
    PERLBREW_SHELLRC_VERSION=0.84
    PERLBREW_VERSION=0.84
    PERL_BADLANG (unset)
    SHELL=/usr/local/bin/zsh

@p5pRT
Copy link
Author

p5pRT commented Apr 10, 2019

From @dur-randir

Another similar example, caused by the same commit, but with another stack trace​:

0 =~ /0(?n)|()(()((?0)|(?0)))0*\N0/

Program terminated with signal SIGFPE, Arithmetic exception.
#0 0x0000555555687ab0 in S_make_trie (pRExC_state=0x7fffffffd310, startbranch=0x7ffff3a28bf8, first=0x7ffff3a28ecc, last=0x7ffff3a28ee4,
  tail=0x7ffff3a28ecc, word_count=2, flags=35, depth=2) at regcomp.c​:3255
3255 accept_state = TRIE_NODENUM( state );
(gdb) bt
#0 0x0000555555687ab0 in S_make_trie (pRExC_state=0x7fffffffd310, startbranch=0x7ffff3a28bf8, first=0x7ffff3a28ecc, last=0x7ffff3a28ee4,
  tail=0x7ffff3a28ecc, word_count=2, flags=35, depth=2) at regcomp.c​:3255
#1 0x000055555568fff3 in S_study_chunk (pRExC_state=0x7fffffffd310, scanp=0x7fffffffcb88, minlenp=0x7fffffffd0a0, deltap=0x7fffffffcba8,
  last=0x7ffff3a28f1c, data=0x7fffffffcf00, stopparen=-1, recursed_depth=0, and_withp=0x0, flags=8192, depth=1) at regcomp.c​:4960
#2 0x000055555568f171 in S_study_chunk (pRExC_state=0x7fffffffd310, scanp=0x7fffffffd098, minlenp=0x7fffffffd0a0, deltap=0x7fffffffd0c0,
  last=0x7ffff3a28ff8, data=0x7fffffffd680, stopparen=-1, recursed_depth=0, and_withp=0x0, flags=10240, depth=0) at regcomp.c​:4639
#3 0x000055555569fbc4 in Perl_re_op_compile (patternp=0x0, pat_count=1, expr=0x7ffff43daf18, eng=0x555555b41d20 <PL_core_reg_engine>, old_re=0x0,
  is_bare_re=0x0, orig_rx_flags=256, pm_flags=256) at regcomp.c​:8176
#4 0x00005555555ba159 in Perl_pmruntime (o=0x7ffff43daf58, expr=0x7ffff43daf18, repl=0x0, flags=1, floor=0) at op.c​:7127
#5 0x000055555566ffc3 in Perl_yyparse (gramtype=258) at perly.y​:1234
#6 0x00005555558390a9 in S_doeval_compile (gimme=1 '\001', outside=0x7ffff79fb9a0, seq=4294967265, hh=0x7ffff748b898) at pp_ctl.c​:3502
#7 0x0000555555840e44 in Perl_pp_entereval () at pp_ctl.c​:4478
#8 0x000055555570b895 in Perl_runops_debug () at dump.c​:2537
#9 0x00005555555ed560 in S_run_body (oldscope=1) at perl.c​:2716
#10 0x00005555555ecade in perl_run (my_perl=0x7ffff7dc9fff) at perl.c​:2639
#11 0x00005555555a114e in main (argc=4, argv=0x7fffffffe0b8, env=0x7fffffffe0e0) at perlmain.c​:127

@p5pRT
Copy link
Author

p5pRT commented Apr 10, 2019

From @khwilliamson

Here is what the pattern looks up before the start of optimization

  1​: BRANCH (4)
  2​: EXACT <0> (19)
  4​: BRANCH (FAIL)
  5​: NOTHING (6)
  6​: NOTHING (7)
  7​: BRANCH (12)
  8​: NOTHING (9)
  9​: GOSUB0[+0​:9] (16)
  12​: BRANCH (FAIL)
  13​: GOSUB0[+1​:14] (16)
  16​: TAIL (17)
  17​: EXACT <0> (19)
  19​: END (0)
--
Karl Williamson

@p5pRT
Copy link
Author

p5pRT commented Apr 10, 2019

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Apr 10, 2019

From @khwilliamson

On 4/10/19 5​:22 PM, Karl Williamson via RT wrote​:

Here is what the pattern looks up before the start of optimization

1&#8203;: BRANCH \(4\)
2&#8203;:   EXACT \<0> \(19\)
4&#8203;: BRANCH \(FAIL\)
5&#8203;:   NOTHING \(6\)
6&#8203;:   NOTHING \(7\)
7&#8203;:   BRANCH \(12\)
8&#8203;:     NOTHING \(9\)
9&#8203;:     GOSUB0\[\+0&#8203;:9\] \(16\)

12​: BRANCH (FAIL)
13​: GOSUB0[+1​:14] (16)
16​: TAIL (17)
17​: EXACT <0> (19)
19​: END (0)

This is the result of the first example in the ticket.
The segfault arises because the NEXTOPER() is getting corrupted
apparently, and is returning 0x0, and we can't access that memory address

@p5pRT
Copy link
Author

p5pRT commented Apr 11, 2019

From @hvds

On Wed, 10 Apr 2019 16:31:13 -0700, public@khwilliamson.com wrote:

On 4/10/19 5:22 PM, Karl Williamson via RT wrote:

Here is what the pattern looks up before the start of optimization

1: BRANCH (4)
2:   EXACT <0> (19)
4: BRANCH (FAIL)
5:   NOTHING (6)
6:   NOTHING (7)
7:   BRANCH (12)
8:     NOTHING (9)
9:     GOSUB0[+0:9] (16)

12: BRANCH (FAIL)
13: GOSUB0[+1:14] (16)
16: TAIL (17)
17: EXACT <0> (19)
19: END (0)

This is the result of the first example in the ticket.
The segfault arises because the NEXTOPER() is getting corrupted
apparently, and is returning 0x0, and we can't access that memory address

As far as I can see, the loop:

  for (cur = startbranch; cur != scan; cur = regnext(cur))

is falling off the end in one of its incarnations; I don't yet understand why that's happening.

I wonder whether it would be valid to change the loop condition to cur < scan, or maybe cur != scan && OP(cur) != END.

FWIW I find it marginally easier to track what's going on with the hv/study_chunk branch, even though it's a bit out of date. That gives this stack trace:

#0  0x00000000004eba16 in S_rck_make_trie (pRExC_state=0x7fffffffdab0, 
    params=0x7fffffffd270) at regcomp.c:5882
#1  0x00000000004e7123 in S_rck_branch (pRExC_state=0x7fffffffdab0, 
    params=0x7fffffffd270) at regcomp.c:4431
#2  0x00000000004ee4a2 in S_study_chunk_one_node (pRExC_state=0x7fffffffdab0, 
    params=0x7fffffffd270) at regcomp.c:6420
#3  0x00000000004e6893 in S_study_chunk (pRExC_state=0x7fffffffdab0, 
    params=0x7fffffffd270) at regcomp.c:4280
#4  0x00000000004eaf43 in S_rck_make_trie (pRExC_state=0x7fffffffdab0, 
    params=0x7fffffffd920) at regcomp.c:5661
#5  0x00000000004e7123 in S_rck_branch (pRExC_state=0x7fffffffdab0, 
    params=0x7fffffffd920) at regcomp.c:4431
#6  0x00000000004ee4a2 in S_study_chunk_one_node (pRExC_state=0x7fffffffdab0, 
    params=0x7fffffffd920) at regcomp.c:6420
#7  0x00000000004e6893 in S_study_chunk (pRExC_state=0x7fffffffdab0, 
    params=0x7fffffffd920) at regcomp.c:4280
#8  0x00000000004f7c6f in Perl_re_op_compile (patternp=0x0, pat_count=1, 
    expr=0xbbe8a8, eng=0x7a00a0 <PL_core_reg_engine>, old_re=0x0, 
    is_bare_re=0x0, orig_rx_flags=0, pm_flags=0) at regcomp.c:8158
#9  0x000000000041c753 in Perl_pmruntime (o=0xbbe8e8, expr=0xbbe8a8, repl=0x0, 
    flags=1, floor=0) at op.c:7030
#10 0x00000000004c9a65 in Perl_yyparse (gramtype=258) at perly.y:1228
#11 0x000000000044bc52 in S_parse_body (env=0x0, xsinit=0x74f799 <xs_init>)
    at perl.c:2503
#12 0x000000000044a70a in perl_parse (my_perl=0xb95010, 
    xsinit=0x74f799 <xs_init>, argc=3, argv=0x7fffffffe508, env=0x0)
    at perl.c:1797
#13 0x000000000074f700 in main (argc=3, argv=0x7fffffffe508, 
    env=0x7fffffffe528) at miniperlmain.c:127

.. but deepest stack has had at least two further levels of study_chunk -> rck_make_trie frames before coming back to hit this SEGV.

Unsurprisingly this does not fail if perl is built without PERL_ENABLE_TRIE_OPTIMISATION.

Changing the condition to cur < scan is enough to make both the two testcases here survive, and passes tests; I don't know if there's some reason it might be turning off a bunch of useful trie optimizations though.

Hugo

@p5pRT
Copy link
Author

p5pRT commented Apr 11, 2019

From @khwilliamson

On 4/11/19 10​:15 AM, Hugo van der Sanden via RT wrote​:

On Wed, 10 Apr 2019 16​:31​:13 -0700, public@​khwilliamson.com wrote​:

On 4/10/19 5​:22 PM, Karl Williamson via RT wrote​:

Here is what the pattern looks up before the start of optimization

 1&#8203;: BRANCH \(4\)
 2&#8203;:   EXACT \<0> \(19\)
 4&#8203;: BRANCH \(FAIL\)
 5&#8203;:   NOTHING \(6\)
 6&#8203;:   NOTHING \(7\)
 7&#8203;:   BRANCH \(12\)
 8&#8203;:     NOTHING \(9\)
 9&#8203;:     GOSUB0\[\+0&#8203;:9\] \(16\)
12&#8203;:   BRANCH \(FAIL\)
13&#8203;:     GOSUB0\[\+1&#8203;:14\] \(16\)
16&#8203;:   TAIL \(17\)
17&#8203;:   EXACT \<0> \(19\)
19&#8203;: END \(0\)

This is the result of the first example in the ticket.
The segfault arises because the NEXTOPER() is getting corrupted
apparently, and is returning 0x0, and we can't access that memory address

As far as I can see, the loop​:
for (cur = startbranch; cur != scan; cur = regnext(cur))
is falling off the end in one of its incarnations; I don't yet understand why that's happening.

I wonder whether it would be valid to change the loop condition to 'cur < scan', or maybe 'cur != scan && OP(cur) != END'.

FWIW I find it marginally easier to track what's going on with the hv/study_chunk branch, even though it's a bit out of date. That gives this stack trace​:
#0 0x00000000004eba16 in S_rck_make_trie (pRExC_state=0x7fffffffdab0,
params=0x7fffffffd270) at regcomp.c​:5882
#1 0x00000000004e7123 in S_rck_branch (pRExC_state=0x7fffffffdab0,
params=0x7fffffffd270) at regcomp.c​:4431
#2 0x00000000004ee4a2 in S_study_chunk_one_node (pRExC_state=0x7fffffffdab0,
params=0x7fffffffd270) at regcomp.c​:6420
#3 0x00000000004e6893 in S_study_chunk (pRExC_state=0x7fffffffdab0,
params=0x7fffffffd270) at regcomp.c​:4280
#4 0x00000000004eaf43 in S_rck_make_trie (pRExC_state=0x7fffffffdab0,
params=0x7fffffffd920) at regcomp.c​:5661
#5 0x00000000004e7123 in S_rck_branch (pRExC_state=0x7fffffffdab0,
params=0x7fffffffd920) at regcomp.c​:4431
#6 0x00000000004ee4a2 in S_study_chunk_one_node (pRExC_state=0x7fffffffdab0,
params=0x7fffffffd920) at regcomp.c​:6420
#7 0x00000000004e6893 in S_study_chunk (pRExC_state=0x7fffffffdab0,
params=0x7fffffffd920) at regcomp.c​:4280
#8 0x00000000004f7c6f in Perl_re_op_compile (patternp=0x0, pat_count=1,
expr=0xbbe8a8, eng=0x7a00a0 <PL_core_reg_engine>, old_re=0x0,
is_bare_re=0x0, orig_rx_flags=0, pm_flags=0) at regcomp.c​:8158
#9 0x000000000041c753 in Perl_pmruntime (o=0xbbe8e8, expr=0xbbe8a8, repl=0x0,
flags=1, floor=0) at op.c​:7030
#10 0x00000000004c9a65 in Perl_yyparse (gramtype=258) at perly.y​:1228
#11 0x000000000044bc52 in S_parse_body (env=0x0, xsinit=0x74f799 <xs_init>)
at perl.c​:2503
#12 0x000000000044a70a in perl_parse (my_perl=0xb95010,
xsinit=0x74f799 <xs_init>, argc=3, argv=0x7fffffffe508, env=0x0)
at perl.c​:1797
#13 0x000000000074f700 in main (argc=3, argv=0x7fffffffe508,
env=0x7fffffffe528) at miniperlmain.c​:127

.. but deepest stack has had at least two further levels of study_chunk -> rck_make_trie frames before coming back to hit this SEGV.

Unsurprisingly this does not fail if perl is built without PERL_ENABLE_TRIE_OPTIMISATION.

Changing the condition to 'cur < scan' is enough to make both the two testcases here survive, and passes tests; I don't know if there's some reason it might be turning off a bunch of useful trie optimizations though.

I would think that using cur != scan && OP(cur) != END'
as you also suggested would not turn off any real optimizations.

@p5pRT
Copy link
Author

p5pRT commented Apr 11, 2019

From @hvds

On Thu, 11 Apr 2019 10​:05​:43 -0700, public@​khwilliamson.com wrote​:

On 4/11/19 10​:15 AM, Hugo van der Sanden via RT wrote​:

On Wed, 10 Apr 2019 16​:31​:13 -0700, public@​khwilliamson.com wrote​:

On 4/10/19 5​:22 PM, Karl Williamson via RT wrote​:

Here is what the pattern looks up before the start of optimization

1​: BRANCH (4)
2​: EXACT <0> (19)
4​: BRANCH (FAIL)
5​: NOTHING (6)
6​: NOTHING (7)
7​: BRANCH (12)
8​: NOTHING (9)
9​: GOSUB0[+0​:9] (16)
12​: BRANCH (FAIL)
13​: GOSUB0[+1​:14] (16)
16​: TAIL (17)
17​: EXACT <0> (19)
19​: END (0)

This is the result of the first example in the ticket.
The segfault arises because the NEXTOPER() is getting corrupted
apparently, and is returning 0x0, and we can't access that memory
address

As far as I can see, the loop​:
for (cur = startbranch; cur != scan; cur = regnext(cur))
is falling off the end in one of its incarnations; I don't yet
understand why that's happening.

I wonder whether it would be valid to change the loop condition to
'cur < scan', or maybe 'cur != scan && OP(cur) != END'.

FWIW I find it marginally easier to track what's going on with the
hv/study_chunk branch, even though it's a bit out of date. That gives
this stack trace​:
#0 0x00000000004eba16 in S_rck_make_trie
(pRExC_state=0x7fffffffdab0,
params=0x7fffffffd270) at regcomp.c​:5882
#1 0x00000000004e7123 in S_rck_branch (pRExC_state=0x7fffffffdab0,
params=0x7fffffffd270) at regcomp.c​:4431
#2 0x00000000004ee4a2 in S_study_chunk_one_node
(pRExC_state=0x7fffffffdab0,
params=0x7fffffffd270) at regcomp.c​:6420
#3 0x00000000004e6893 in S_study_chunk (pRExC_state=0x7fffffffdab0,
params=0x7fffffffd270) at regcomp.c​:4280
#4 0x00000000004eaf43 in S_rck_make_trie
(pRExC_state=0x7fffffffdab0,
params=0x7fffffffd920) at regcomp.c​:5661
#5 0x00000000004e7123 in S_rck_branch (pRExC_state=0x7fffffffdab0,
params=0x7fffffffd920) at regcomp.c​:4431
#6 0x00000000004ee4a2 in S_study_chunk_one_node
(pRExC_state=0x7fffffffdab0,
params=0x7fffffffd920) at regcomp.c​:6420
#7 0x00000000004e6893 in S_study_chunk (pRExC_state=0x7fffffffdab0,
params=0x7fffffffd920) at regcomp.c​:4280
#8 0x00000000004f7c6f in Perl_re_op_compile (patternp=0x0,
pat_count=1,
expr=0xbbe8a8, eng=0x7a00a0 <PL_core_reg_engine>, old_re=0x0,
is_bare_re=0x0, orig_rx_flags=0, pm_flags=0) at regcomp.c​:8158
#9 0x000000000041c753 in Perl_pmruntime (o=0xbbe8e8, expr=0xbbe8a8,
repl=0x0,
flags=1, floor=0) at op.c​:7030
#10 0x00000000004c9a65 in Perl_yyparse (gramtype=258) at perly.y​:1228
#11 0x000000000044bc52 in S_parse_body (env=0x0, xsinit=0x74f799
<xs_init>)
at perl.c​:2503
#12 0x000000000044a70a in perl_parse (my_perl=0xb95010,
xsinit=0x74f799 <xs_init>, argc=3, argv=0x7fffffffe508, env=0x0)
at perl.c​:1797
#13 0x000000000074f700 in main (argc=3, argv=0x7fffffffe508,
env=0x7fffffffe528) at miniperlmain.c​:127

.. but deepest stack has had at least two further levels of
study_chunk -> rck_make_trie frames before coming back to hit this
SEGV.

Unsurprisingly this does not fail if perl is built without
PERL_ENABLE_TRIE_OPTIMISATION.

Changing the condition to 'cur < scan' is enough to make both the two
testcases here survive, and passes tests; I don't know if there's
some reason it might be turning off a bunch of useful trie
optimizations though.

I would think that using cur != scan && OP(cur) != END'
as you also suggested would not turn off any real optimizations.

Hmm, the second case still fails with that variant​:
% ./miniperl -e '0 =~ /0(?n)|()(()((?0)|(?0)))0*\N0/'
Floating point exception (core dumped)
%

I'll try digging into that when I have some more time - Yves, if you have any chance to look at this I'm sure that would help a lot.

Hugo

@p5pRT
Copy link
Author

p5pRT commented Apr 25, 2019

From @hvds

On Thu, 11 Apr 2019 10​:41​:35 -0700, hv wrote​:

On Thu, 11 Apr 2019 10​:05​:43 -0700, public@​khwilliamson.com wrote​:

On 4/11/19 10​:15 AM, Hugo van der Sanden via RT wrote​:

On Wed, 10 Apr 2019 16​:31​:13 -0700, public@​khwilliamson.com wrote​:

On 4/10/19 5​:22 PM, Karl Williamson via RT wrote​:

Here is what the pattern looks up before the start of
optimization

1​: BRANCH (4)
2​: EXACT <0> (19)
4​: BRANCH (FAIL)
5​: NOTHING (6)
6​: NOTHING (7)
7​: BRANCH (12)
8​: NOTHING (9)
9​: GOSUB0[+0​:9] (16)
12​: BRANCH (FAIL)
13​: GOSUB0[+1​:14] (16)
16​: TAIL (17)
17​: EXACT <0> (19)
19​: END (0)

This is the result of the first example in the ticket.
The segfault arises because the NEXTOPER() is getting corrupted
apparently, and is returning 0x0, and we can't access that memory
address

As far as I can see, the loop​:
for (cur = startbranch; cur != scan; cur = regnext(cur))
is falling off the end in one of its incarnations; I don't yet
understand why that's happening.

I wonder whether it would be valid to change the loop condition to
'cur < scan', or maybe 'cur != scan && OP(cur) != END'.

FWIW I find it marginally easier to track what's going on with the
hv/study_chunk branch, even though it's a bit out of date. That
gives
this stack trace​:
#0 0x00000000004eba16 in S_rck_make_trie
(pRExC_state=0x7fffffffdab0,
params=0x7fffffffd270) at regcomp.c​:5882
#1 0x00000000004e7123 in S_rck_branch (pRExC_state=0x7fffffffdab0,
params=0x7fffffffd270) at regcomp.c​:4431
#2 0x00000000004ee4a2 in S_study_chunk_one_node
(pRExC_state=0x7fffffffdab0,
params=0x7fffffffd270) at regcomp.c​:6420
#3 0x00000000004e6893 in S_study_chunk
(pRExC_state=0x7fffffffdab0,
params=0x7fffffffd270) at regcomp.c​:4280
#4 0x00000000004eaf43 in S_rck_make_trie
(pRExC_state=0x7fffffffdab0,
params=0x7fffffffd920) at regcomp.c​:5661
#5 0x00000000004e7123 in S_rck_branch (pRExC_state=0x7fffffffdab0,
params=0x7fffffffd920) at regcomp.c​:4431
#6 0x00000000004ee4a2 in S_study_chunk_one_node
(pRExC_state=0x7fffffffdab0,
params=0x7fffffffd920) at regcomp.c​:6420
#7 0x00000000004e6893 in S_study_chunk
(pRExC_state=0x7fffffffdab0,
params=0x7fffffffd920) at regcomp.c​:4280
#8 0x00000000004f7c6f in Perl_re_op_compile (patternp=0x0,
pat_count=1,
expr=0xbbe8a8, eng=0x7a00a0 <PL_core_reg_engine>, old_re=0x0,
is_bare_re=0x0, orig_rx_flags=0, pm_flags=0) at regcomp.c​:8158
#9 0x000000000041c753 in Perl_pmruntime (o=0xbbe8e8,
expr=0xbbe8a8,
repl=0x0,
flags=1, floor=0) at op.c​:7030
#10 0x00000000004c9a65 in Perl_yyparse (gramtype=258) at
perly.y​:1228
#11 0x000000000044bc52 in S_parse_body (env=0x0, xsinit=0x74f799
<xs_init>)
at perl.c​:2503
#12 0x000000000044a70a in perl_parse (my_perl=0xb95010,
xsinit=0x74f799 <xs_init>, argc=3, argv=0x7fffffffe508,
env=0x0)
at perl.c​:1797
#13 0x000000000074f700 in main (argc=3, argv=0x7fffffffe508,
env=0x7fffffffe528) at miniperlmain.c​:127

.. but deepest stack has had at least two further levels of
study_chunk -> rck_make_trie frames before coming back to hit this
SEGV.

Unsurprisingly this does not fail if perl is built without
PERL_ENABLE_TRIE_OPTIMISATION.

Changing the condition to 'cur < scan' is enough to make both the
two
testcases here survive, and passes tests; I don't know if there's
some reason it might be turning off a bunch of useful trie
optimizations though.

I would think that using cur != scan && OP(cur) != END'
as you also suggested would not turn off any real optimizations.

Hmm, the second case still fails with that variant​:
% ./miniperl -e '0 =~ /0(?n)|()(()((?0)|(?0)))0*\N0/'
Floating point exception (core dumped)
%

I'll try digging into that when I have some more time - Yves, if you
have any chance to look at this I'm sure that would help a lot.

With -Dr that second case hits an assertion with all three variants of the loop​:
% ./miniperl -Dr -e '0 =~ /0(?n)|()(()((?0)|(?0)))0*\N0/'
Compiling REx "0(?n)|()(()((?0)|(?0)))0*\N0"
~ tying lastbr BRANCH (11) to ender TAIL (15) offset 4
~ tying lastbr BRANCH (4) to ender END (22) offset 18
Freeing REx​: "0(?n)|()(()((?0)|(?0)))0*\N0"
~ tying lastbr BRANCH (11) to ender TAIL (15) offset 4
~ tying lastbr BRANCH (4) to ender END (22) offset 18
miniperl​: regcomp.c​:8324​: Perl_re_op_compile​: Assertion `scan && ((scan)->type) == 85' failed.
Aborted (core dumped)
%

Disturbingly, if I fix the assertion as advised with the patch below, that case survives with -Dr but still gives a floating point exception without -Dr. It turns out that _with_ -Dr we're hitting the startbranch..scan loop twice; _without_ -Dr we're hitting it a third time, and by the looks of it that third time the scan we're aiming at is in an optimized-out section​:
% ./miniperl -e '0 =~ /0(?n)|()(()((?0)|(?0)))0*\N0/'
(first loop, scan=db7d84)
cur db7d64 (BRANCH)
cur db7d74 (BRANCH)
(second loop, scan=db7da0)
cur db7d4c (BRANCH)
cur db7d58 (BRANCH)
(third loop, scan=db7d84)
cur db7d64 (BRANCH)
cur db7d74 (BRANCH)
cur db7d88 (STAR) # oops
cur db7d94 (REG_ANY)
cur db7d98 (EXACT)
Floating point exception (core dumped)

The difference appears to be because in at least some cases we only mark optimized-out nodes as OPTIMIZED if -Dr is enabled​: the point of divergence is the test
  if (PERL_ENABLE_TRIE_OPTIMISATION &&
  OP( startbranch ) == BRANCH )
.. which with -Dr is false (the op is OPTIMIZED) on the third time of asking.

Hugo

Inline Patch
diff --git a/regcomp.c b/regcomp.c
index fbd5c18..ba2b2b5 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -8316,12 +8316,11 @@ Perl_re_op_compile(pTHX_ SV ** const patternp, int pat_count,
          * the OPEN/CURLYM (CURLYM's are special and can act like OPEN's)
          * it refers to.
          *
-         * If for some reason someone writes code that optimises
-         * away a GOSUB opcode then the assert should be changed to
-         * an if(scan) to guard the ARG2L_SET() - Yves
-         *
          */
-        assert(scan && OP(scan) == GOSUB);
+        assert(scan);
+        if (OP(scan) == OPTIMIZED)
+            continue;
+        assert(OP(scan) == GOSUB);
         ARG2L_SET( scan, RExC_open_parens[ARG(scan)] - REGNODE_OFFSET(scan));
     }
 

@p5pRT
Copy link
Author

p5pRT commented May 5, 2019

From @dur-randir

On Thu, 25 Apr 2019 04​:18​:31 -0700, hv wrote​:

Disturbingly, if I fix the assertion as advised with the patch below,
that case survives with -Dr but still gives a floating point exception
without -Dr.

The difference appears to be because in at least some cases we only
mark optimized-out nodes as OPTIMIZED if -Dr is enabled​: the point of
divergence is the test
if (PERL_ENABLE_TRIE_OPTIMISATION &&
OP( startbranch ) == BRANCH )
.. which with -Dr is false (the op is OPTIMIZED) on the third time of
asking.

Maybe this will help, I have a case that fails even with this patch both with and without -Dr​:

"000000"=~/0(?0)|0(?|0|0)/

emitting 'panic​: regexp memory corruption'.

@p5pRT
Copy link
Author

p5pRT commented May 6, 2019

From @hvds

On Sun, 05 May 2019 00​:39​:15 -0700, randir wrote​:

On Thu, 25 Apr 2019 04​:18​:31 -0700, hv wrote​:

Disturbingly, if I fix the assertion as advised with the patch below,
that case survives with -Dr but still gives a floating point
exception
without -Dr.

The difference appears to be because in at least some cases we only
mark optimized-out nodes as OPTIMIZED if -Dr is enabled​: the point of
divergence is the test
if (PERL_ENABLE_TRIE_OPTIMISATION &&
OP( startbranch ) == BRANCH )
.. which with -Dr is false (the op is OPTIMIZED) on the third time of
asking.

Maybe this will help, I have a case that fails even with this patch
both with and without -Dr​:

"000000"=~/0(?0)|0(?|0|0)/

emitting 'panic​: regexp memory corruption'.

Cool, I'll take a look at that when I'm back home at the end of the week if nobody gets there first.

Hugo

@khwilliamson
Copy link
Contributor

I hoped the hv/gh17553 branch would fix this, but no.

@hvds
Copy link
Contributor

hvds commented Apr 10, 2020

I have a case that fails even with this patch both with and without -Dr​:

"000000"=~/0(?0)|0(?|0|0)/

emitting 'panic​: regexp memory corruption'.

Sorry, I also completely dropped this one. I'll try to dig some more - an initial look shows that the (?|0|0) is getting corrupted (wrongly optimized out, despite pointers jumping into it):

% ./perl -Ilib -Mre=Debug,DUMP_PRE_OPTIMIZE,All -e 'qr{x(?0)|x(?|y|y)}'
Program before optimization:
   1: BRANCH (7)
   2:   EXACT <x> (4)
   4:   GOSUB0[+0:4] (17)
   7: BRANCH (FAIL)
   8:   EXACT <x> (10)
  10:   BRANCH (13)
  11:     EXACT <y> (16)
  13:   BRANCH (FAIL)
  14:     EXACT <y> (16)
  16:   TAIL (17)
  17: END (0)
Final program:
   1: TRIE-EXACT[x] (17)
      <x> (4)
   4:   GOSUB0[-3:1] (17)
      <x> (10)
  16:   TAIL (17)
  17: END (0)
% ./perl -Ilib -Mre=Debug,All -e 'qr{x(?0)|x(?|y|z)}'
Final program:
   1: TRIE-EXACT[x] (17)
      <x> (4)
   4:   GOSUB0[-3:1] (17)
      <x> (10)
  10:   TRIE-EXACT[yz] (17)
        <y> 
        <z> 
  17: END (0)
% 

Clearly the first case should have had a "10: TRIE-EXACT[y]" (or possibly "EXACT <y>") that's gone missing.

@hvds
Copy link
Contributor

hvds commented Apr 11, 2020

Ok, what's happening is approximately:

  • the GOSUB causes us to recurse study_chunk by enframing, over the whole program, while in the middle of considering the set 1: BRANCH => 7: BRANCH => 17: END for rewriting as a trie
  • during that recursed study, the set 10: BRANCH => 13: BRANCH => 16: FAIL is considered for rewriting as a trie; in the event it spots that it can be optimized even further to rewrite as EXACT <y>
  • still in the recursed study, the outer set is rewritten as TRIE-EXACT[x] with endpoints 4 and 10
  • the recursed study completes, and returns to the original study. This does not know that the program has been rewritten underneath it (and as it happens, overwriting only nodes it has already passed over) so continues its TRIE investigation with 7: BRANCH (FAIL) (which is no longer part of the program)
  • it then reaches 8: EXACT <x> (10) (which is no longer part of the program) followed by the rewritten 10: EXACT <y> (16) (which is still part of the program), which join_exact combines into 8: EXACT <xy> (16).
  • thus we have a TRIE with an endpoint 10 that is no longer part of the program.

A possible solution is to inhibit all changes that involve rewriting nodes while enframed - things like join_exact, making tries, curly upgrades, switches between anyof nodes and exactf nodes etc.

In this case it's enough just to inhibit tries:

% git diff
diff --git a/regcomp.c b/regcomp.c
index 5e0f37599c..9275bc5ac9 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -4804,9 +4804,10 @@ S_study_chunk(pTHX_ RExC_state_t *pRExC_state, regnode **scanp,
                    }
                }
 
-                if (PERL_ENABLE_TRIE_OPTIMISATION &&
-                        OP( startbranch ) == BRANCH )
-                {
+                if (PERL_ENABLE_TRIE_OPTIMISATION
+                    && OP(startbranch) == BRANCH
+                    && !frame
+                ) {
                /* demq.
 
                    Assuming this was/is a branch we are dealing with: 'scan'
% ./miniperl -Dr -e 'qr{x(?0)|x(?|y|y)}'
Final program:
   1: TRIE-EXACT[x] (17)
      <x> (4)
   4:   GOSUB0[-3:1] (17)
      <xy> (17)
  17: END (0)
stclass AHOCORASICK-EXACT[x] minlen 1 
% 

@xsawyerx I think we should aim to have this (along with a test) as part of 5.32; I'll start looking at other types of transformation to inhibit similarly, I don't know to what extent we should be aiming to cover all of them for 5.32.

@hvds
Copy link
Contributor

hvds commented Apr 11, 2020

Created PR #17714 for this.

@xsawyerx
Copy link
Member

Let's have it approved and merged prior to 5.31.11 (in 9 days).

@khwilliamson
Copy link
Contributor

merged via 869e8e3

@hvds
Copy link
Contributor

hvds commented Apr 12, 2020

I want to keep this open for the other types of regexp transformation, I'll look further at those tomorrow.

@hvds hvds reopened this Apr 12, 2020
@hvds
Copy link
Contributor

hvds commented Apr 12, 2020

@xsawyerx @khwilliamson: I've created PR #17715 to cover all the other examples I can find of op types being changed during study_chunk.

The nature of the change is the same for each chunk, and in every case I expect the outer frame will eventually see the same ops and make the same decision, so I feel pretty confident this should not introduce problems.

@xsawyerx
Copy link
Member

@hvds Thank you for doing the work. I approved it as well.

@hvds
Copy link
Contributor

hvds commented Apr 16, 2020

@khwilliamson++ has merged the second branch, closing now.

@hvds hvds closed this as completed Apr 16, 2020
@hvds
Copy link
Contributor

hvds commented Apr 18, 2020

It turns out this a bit more complicated:

  • enframing is used for both GOSUB and SUSPEND
  • when used for GOSUB, the nodes acted on are studied multiply (once inside the frame, once outside it); when used for SUSPEND they are studied only once (inside the frame)
  • when studying we have a mix of optional optimizations and mandatory fixups
  • we want all such changes to happen only once, at the outermost layer in which they are studied.

The fixes merged do not take account of SUSPEND, so the optional optimizations and mandatory fixups are not happening at all, as reported by @dur-randir in #17715:

% ./miniperl -e '"sx" =~ m/ss++/i'
panic: regrepeat() called with unrecognized node type 53='EXACTFU_S_EDGE' at -e line 1.
% 

To fix this, we need to replace the simple if frame test to distinguish the GOSUB and SUSPEND cases. I think this is most easily done at this point by storing an extra boolean in the frame structure, and using this to derive a "no fixups" flag within study_chunk.

I'll start working on that now.

@hvds hvds reopened this Apr 18, 2020
@hvds
Copy link
Contributor

hvds commented Apr 18, 2020

I've pushed the branch https://github.com/Perl/perl5/tree/gh16947-3 with my proposed fix for this for early eyeballs (@khwilliamson, @demerphq in particular).

I'm testing it now, and will also try to add some more tests exploring interactions between GOSUB and SUSPEND at various depths.

@jkeenan
Copy link
Contributor

jkeenan commented Apr 18, 2020 via email

@dur-randir
Copy link
Member

I observed no new failures on this branch after a day, but since it'a stochastic process, it's not a bullet-proof confirmation. I'm keeping it to run more.

@hvds
Copy link
Contributor

hvds commented Apr 21, 2020

I've made a pull request #17736 for this.

I've eyeballed -Dr output for a variety of possible frame-to-frame interactions, and they all look ok to me.

There's a possibility that interaction between enframing and simple recursion may still be able to trip this up, though I haven't found an example that does so. If that were possible we'd also need to give study_chunk an additional argument so as to pass through either in_gosub or mutate_ok - not hard to do, but more invasive than I'd prefer to consider at this point in the release cycle. If time permits I'll put together such a patch anyway so we can see what it looks like.

@xsawyerx
Copy link
Member

Fix approved!

@hvds
Copy link
Contributor

hvds commented Apr 23, 2020

Merged #17736.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants