Unknown regexp modifier/unmatched [ becomes Assertion `(IV)elen >= 0' failed. #14965

Closed
p5pRT opened this issue Oct 4, 2015 · 10 comments

p5pRT commented Oct 4, 2015

Migrated from rt.perl.org#126261 (status was 'resolved')

Searchable as RT126261$

p5pRT commented Oct 4, 2015

From @dcollinsn

Greetings Porters,

I have compiled bleadperl with the afl-gcc compiler using:

./Configure -Dusedevel -Dprefix='/usr/local/perl-afl' -Dcc='ccache afl-gcc' -Duselongdouble -Duse64bitall -Doptimize=-g -Uversiononly -Uman1dir -Uman3dir -DDEBUGGING -DPERL_POISON -des
AFL_HARDEN=1 make && make test

And then fuzzed the resulting binary using:

AFL_NO_VAR_CHECK=1 afl-fuzz -i in -o out bin/perl @@

After reducing test cases using `afl-tmin` and performing additional minimization by hand, I have located the following test case, which triggers an assertion failure in a debugging build of the perl interpreter but (correctly) errors out in a non-debugging perl. The simplest test case is the file:

dcollins@nightshade64:/usr/local/perl-afl/out$ od -c allcrash/perlu/f3i000270
0000000 / 0 0 0 0 0 0 0 0 / 0 ? s > > 0
0000020 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0000040 0 0 0 0 0 0 0 > g x / 0 0 0 0
0000060 0 0 0 0 0 0 0 0 0 [ 0 0 0 0 0 0
0000100 0 337 0 0 0 0 0 0 0 0 0 0 [ . 0 0
0000120 . / i \ 0 0 0 0
0000130

In a non-debugging perl, the output is:
Unknown regexp modifier "/0" at allcrash/perlu/f3i000270 line 1, at end of line
Unmatched [ in regex; marked by <-- HERE in m/0000000000000[0000000▒0000000000[.00.1▒▒!\x{DF}!d [1[.00.1P▒▒1?​:\x{DF}|[ <-- HERE 0000000▒0000000000[.00.])/ at allcrash/perlu/f3i000270 line 1.

In a debugging perl, the output is:
perl: sv.c:11449: Perl_sv_vcatpvfn_flags: Assertion `(IV)elen >= 0' failed.
Aborted

**GDB**
(gdb) run
Starting program​: /home/dcollins/perldebug/perl allcrash/perlu/f3i000270
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
perl​: sv.c​:11450​: Perl_sv_vcatpvfn_flags​: Assertion `0' failed.

Program received signal SIGABRT, Aborted.
0x00007ffff6cf4107 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007ffff6cf4107 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff6cf54e8 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007ffff6ced226 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007ffff6ced2d2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x0000000000954dad in Perl_sv_vcatpvfn_flags (sv=0x1206bd0, pat=0x0, patlen=140737488343776, args=0x10, svargs=0xfefefefefefefe00, svargs@​entry=0x0, svmax=18957408, svmax@​entry=0, maybe_tainted=0x0, flags=4294967295,
  flags@​entry=4294955744) at sv.c​:11450
#5 0x0000000000a21dd1 in Perl_sv_vsetpvfn (sv=sv@​entry=0x1206bd0, pat=pat@​entry=0xf3e520 "%s in regex; marked by <-- HERE in m/%d%lu%4p <-- HERE %d%lu%4p/", patlen=64, args=args@​entry=0x7fffffffd2e0, svargs=svargs@​entry=0x0,
  svmax=svmax@​entry=0, maybe_tainted=0x0) at sv.c​:10730
#6 0x00000000007f8859 in Perl_vmess (pat=0xf3e520 "%s in regex; marked by <-- HERE in m/%d%lu%4p <-- HERE %d%lu%4p/", args=0x7fffffffd2e0) at util.c​:1472
#7 0x00000000007efa35 in Perl_vcroak (pat=<optimized out>, args=args@​entry=0x7fffffffd2e0) at util.c​:1701
#8 0x00000000007f0052 in Perl_croak (pat=<optimized out>) at util.c​:1748
#9 0x0000000000736580 in S_regatom (pRExC_state=0x7fffffffddc0, flagp=0x7fffffffd5ac, depth=9) at regcomp.c​:11798
#10 0x0000000000737b62 in S_regpiece (depth=<optimized out>, flagp=<optimized out>, pRExC_state=<optimized out>) at regcomp.c​:10876
#11 S_regbranch (pRExC_state=0x7fffffffddc0, flagp=0x50d8, first=-10564, depth=4294967295) at regcomp.c​:10801
#12 0x00000000006f764d in S_reg (pRExC_state=0x7fffffffddc0, paren=20696, flagp=0x7fffffffd850, depth=4294967295) at regcomp.c​:10596
#13 0x000000000070c735 in S_regclass (pRExC_state=0x7fffffffddc0, flagp=0x50d8, depth=6, stop_at_1=255, allow_multi_folds=false, silence_non_portable=96, strict=false, optimizable=true, ret_invlist=0x0) at regcomp.c​:15437
#14 0x000000000072812c in S_regatom (pRExC_state=0x7fffffffddc0, flagp=0x7fffffffdabc, depth=4) at regcomp.c​:11783
#15 0x0000000000737b62 in S_regpiece (depth=<optimized out>, flagp=<optimized out>, pRExC_state=<optimized out>) at regcomp.c​:10876
#16 S_regbranch (pRExC_state=0x7fffffffddc0, flagp=0x50d8, first=-8672, depth=4294967295) at regcomp.c​:10801
#17 0x000000000075a037 in S_reg (pRExC_state=0x7fffffffddc0, flagp=0x7fffffffdc8c, depth=<optimized out>, paren=<optimized out>) at regcomp.c​:10547
#18 0x00000000007944d1 in Perl_re_op_compile (patternp=0x7fffffffde20, pat_count=4, expr=0x1212f55, eng=0xffffffffffffffff, old_re=0x0, is_bare_re=0x4, orig_rx_flags=4, pm_flags=4) at regcomp.c​:6739
#19 0x00000000004e8cf2 in Perl_pmruntime (o=0x12118d0, expr=0x12119f8, repl=0x6, isreg=255, floor=7939376) at op.c​:5572
#20 0x000000000066f76d in Perl_yyparse (gramtype=18946256) at perly.y​:1032
#21 0x000000000053ad55 in S_parse_body (env=env@​entry=0x0, xsinit=xsinit@​entry=0x42c850 <xs_init>) at perl.c​:2304
#22 0x0000000000542ad3 in perl_parse (my_perl=<optimized out>, xsinit=xsinit@​entry=0x42c850 <xs_init>, argc=<optimized out>, argv=<optimized out>, env=env@​entry=0x0) at perl.c​:1634
#23 0x000000000042c478 in main (argc=2, argv=0x7fffffffe658, env=0x7fffffffe670) at perlmain.c​:114
(gdb) f 4
#4 0x0000000000954dad in Perl_sv_vcatpvfn_flags (sv=0x1206bd0, pat=0x0, patlen=140737488343776, args=0x10, svargs=0xfefefefefefefe00, svargs@​entry=0x0, svmax=18957408, svmax@​entry=0, maybe_tainted=0x0, flags=4294967295,
  flags@​entry=4294955744) at sv.c​:11450
11450 assert(0); /* in DEBUGGING build we want to crash */
(gdb) info locals
sv = 216
veclen = 0
dotstrlen = 1
esignbuf = "\n", <incomplete sequence \324>
iv = 6
elen = 18446744073709551098
i = -134224640
utf8buf = "\276\002\177\000\000\000\000\000\000\000\000\000\000"
zeros = 32
infnan = false
c = 0 '\000'
have = 0
q = 0xf3e546 "d%lu%4p <-- HERE %d%lu%4p/"
origlen = 0
nullstr = "(null)"
ebuf = "\356Ø\000\000\000\000\000\001\000\000\000\000\000\000\000@​.\037\001\000\000\000\000 1!\001\000\000\000\000\n\000\000\000\000\000\000\000\222\370\224\000\000\000\000\000@​\000\000\000\000\000\000\000\340\322\377\377\377\177\000\000\003D\b\000\000\000\000\000\000@​\000\000\000\000\000\000@​.\037\001\000\000\000\000\003D\b\000\000\000\000\000\n\000\000\000\000\000\000\000=\001\177\000\000\000\000\000\nڛ", '\000' <repeats 13 times>, "@​."
__PRETTY_FUNCTION__ = "Perl_sv_vcatpvfn_flags"
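
A side note on the `elen` value in the locals above: read as a signed 64-bit integer it is a small negative number, which is the condition the `(IV)elen >= 0` assertion guards against. The sketch below does that reinterpretation by hand. `as_signed64` is a hypothetical helper written for this note, not a perl function, and it assumes the two's-complement representation used on the x86_64 build shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reinterpret an unsigned 64-bit value as the
 * two's-complement signed value with the same bit pattern, without
 * relying on implementation-defined conversion. */
int64_t as_signed64(uint64_t v)
{
    if (v <= (uint64_t)INT64_MAX)
        return (int64_t)v;          /* already in the non-negative range */
    /* For the upper half of the range, ~v is small enough to negate. */
    return -(int64_t)(~v) - 1;
}
```

With the value from the gdb session, `as_signed64(18446744073709551098ULL)` is -518, that is, 2^64 - 518 viewed as an unsigned length.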

**VALGRIND**
==30947== Memcheck, a memory error detector
==30947== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==30947== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==30947== Command​: /home/dcollins/perldebug/perl allcrash/perlu/f3i000270
==30947==
==30947== Invalid read of size 2
==30947== at 0x4C2C3B6​: memcpy@​@​GLIBC_2.14 (vg_replace_strmem.c​:1018)
==30947== by 0x95538A​: memcpy (string3.h​:51)
==30947== by 0x95538A​: Perl_sv_vcatpvfn_flags (sv.c​:12727)
==30947== by 0xA21DD0​: Perl_sv_vsetpvfn (sv.c​:10730)
==30947== by 0x7F8858​: Perl_vmess (util.c​:1472)
==30947== by 0x7EFA34​: Perl_vcroak (util.c​:1701)
==30947== by 0x7F0051​: Perl_croak (util.c​:1748)
==30947== by 0x73657F​: S_regatom (regcomp.c​:11798)
==30947== by 0x737B61​: S_regpiece (regcomp.c​:10876)
==30947== by 0x737B61​: S_regbranch (regcomp.c​:10801)
==30947== by 0x6F764C​: S_reg (regcomp.c​:10596)
==30947== by 0x70C734​: S_regclass (regcomp.c​:15437)
==30947== by 0x72812B​: S_regatom (regcomp.c​:11783)
==30947== by 0x737B61​: S_regpiece (regcomp.c​:10876)
==30947== by 0x737B61​: S_regbranch (regcomp.c​:10801)
==30947== Address 0x5f7d616 is 38 bytes inside a block of size 39 alloc'd
==30947== at 0x4C27C0F​: malloc (vg_replace_malloc.c​:299)
==30947== by 0x7F013C​: Perl_safesysmalloc (util.c​:153)
==30947== by 0x98C5FF​: Perl_sv_grow (sv.c​:1624)
==30947== by 0x998847​: Perl_newSV (sv.c​:5528)
==30947== by 0x5E4879​: S_scan_const (toke.c​:2822)
==30947== by 0x5E4879​: Perl_yylex (toke.c​:4764)
==30947== by 0x66A804​: Perl_yyparse (perly.c​:322)
==30947== by 0x53AD54​: S_parse_body (perl.c​:2304)
==30947== by 0x542AD2​: perl_parse (perl.c​:1634)
==30947== by 0x42C477​: main (perlmain.c​:114)
==30947==
==30947== Invalid read of size 2
==30947== at 0x4C2C3A8​: memcpy@​@​GLIBC_2.14 (vg_replace_strmem.c​:1018)
==30947== by 0x95538A​: memcpy (string3.h​:51)
==30947== by 0x95538A​: Perl_sv_vcatpvfn_flags (sv.c​:12727)
==30947== by 0xA21DD0​: Perl_sv_vsetpvfn (sv.c​:10730)
==30947== by 0x7F8858​: Perl_vmess (util.c​:1472)
==30947== by 0x7EFA34​: Perl_vcroak (util.c​:1701)
==30947== by 0x7F0051​: Perl_croak (util.c​:1748)
==30947== by 0x73657F​: S_regatom (regcomp.c​:11798)
==30947== by 0x737B61​: S_regpiece (regcomp.c​:10876)
==30947== by 0x737B61​: S_regbranch (regcomp.c​:10801)
==30947== by 0x6F764C​: S_reg (regcomp.c​:10596)
==30947== by 0x70C734​: S_regclass (regcomp.c​:15437)
==30947== by 0x72812B​: S_regatom (regcomp.c​:11783)
==30947== by 0x737B61​: S_regpiece (regcomp.c​:10876)
==30947== by 0x737B61​: S_regbranch (regcomp.c​:10801)
==30947== Address 0x5f7d61a is 3 bytes after a block of size 39 alloc'd
==30947== at 0x4C27C0F​: malloc (vg_replace_malloc.c​:299)
==30947== by 0x7F013C​: Perl_safesysmalloc (util.c​:153)
==30947== by 0x98C5FF​: Perl_sv_grow (sv.c​:1624)
==30947== by 0x998847​: Perl_newSV (sv.c​:5528)
==30947== by 0x5E4879​: S_scan_const (toke.c​:2822)
==30947== by 0x5E4879​: Perl_yylex (toke.c​:4764)
==30947== by 0x66A804​: Perl_yyparse (perly.c​:322)
==30947== by 0x53AD54​: S_parse_body (perl.c​:2304)
==30947== by 0x542AD2​: perl_parse (perl.c​:1634)
==30947== by 0x42C477​: main (perlmain.c​:114)
==30947==
==30947== Syscall param write(buf) points to uninitialised byte(s)
==30947== at 0x4E41A60​: __write_nocancel (syscall-template.S​:81)
==30947== by 0xD99A0D​: PerlIOUnix_write (perlio.c​:2740)
==30947== by 0xDAFA8B​: Perl_PerlIO_write (perlio.c​:1584)
==30947== by 0xDAFA8B​: PerlIOBuf_flush (perlio.c​:3906)
==30947== by 0xDAA11E​: Perl_PerlIO_flush (perlio.c​:1607)
==30947== by 0xDB57AF​: PerlIOBuf_write (perlio.c​:4133)
==30947== by 0xC60E5A​: Perl_do_print (doio.c​:1393)
==30947== by 0x7ED6DE​: Perl_write_to_stderr (util.c​:1492)
==30947== by 0xB7668F​: Perl_die_unwind (pp_ctl.c​:1676)
==30947== by 0x7EFA8C​: Perl_vcroak (util.c​:1703)
==30947== by 0x7F0051​: Perl_croak (util.c​:1748)
==30947== by 0x73657F​: S_regatom (regcomp.c​:11798)
==30947== by 0x737B61​: S_regpiece (regcomp.c​:10876)
==30947== by 0x737B61​: S_regbranch (regcomp.c​:10801)
==30947== Address 0x5f7e585 is 325 bytes inside a block of size 8,192 alloc'd
==30947== at 0x4C29986​: calloc (vg_replace_malloc.c​:711)
==30947== by 0x7F0844​: Perl_safesyscalloc (util.c​:440)
==30947== by 0xD97979​: PerlIOBuf_get_base (perlio.c​:4244)
==30947== by 0xDB645D​: Perl_PerlIO_get_base (perlio.c​:1758)
==30947== by 0xDB645D​: PerlIOBuf_write (perlio.c​:4106)
==30947== by 0xC60E5A​: Perl_do_print (doio.c​:1393)
==30947== by 0x7ED6DE​: Perl_write_to_stderr (util.c​:1492)
==30947== by 0xB7668F​: Perl_die_unwind (pp_ctl.c​:1676)
==30947== by 0x7EFA8C​: Perl_vcroak (util.c​:1703)
==30947== by 0x7F0051​: Perl_croak (util.c​:1748)
==30947== by 0x73657F​: S_regatom (regcomp.c​:11798)
==30947== by 0x737B61​: S_regpiece (regcomp.c​:10876)
==30947== by 0x737B61​: S_regbranch (regcomp.c​:10801)
==30947== by 0x6F764C​: S_reg (regcomp.c​:10596)
==30947==
Unknown regexp modifier "/0" at allcrash/perlu/f3i000270 line 1, at end of line
Unmatched [ in regex; marked by <-- HERE in m/0000000000000[0000000▒0000000000[.00.pP▒▒▒PP\x{DF}P`H▒▒```P?​:\x{DF}P`?​:\x{DF}|[`p?​:\x{DF}|[ <-- HERE 0000000▒0000000000[.00.])/ at allcrash/perlu/f3i000270 line 1.
==30947==
==30947== HEAP SUMMARY​:
==30947== in use at exit​: 108,390 bytes in 498 blocks
==30947== total heap usage​: 701 allocs, 203 frees, 150,609 bytes allocated
==30947==
==30947== LEAK SUMMARY​:
==30947== definitely lost​: 0 bytes in 0 blocks
==30947== indirectly lost​: 0 bytes in 0 blocks
==30947== possibly lost​: 0 bytes in 0 blocks
==30947== still reachable​: 108,390 bytes in 498 blocks
==30947== suppressed​: 0 bytes in 0 blocks
==30947== Rerun with --leak-check=full to see details of leaked memory
==30947==
==30947== For counts of detected and suppressed errors, rerun with​: -v
==30947== Use --track-origins=yes to see where uninitialised values come from
==30947== ERROR SUMMARY​: 261 errors from 3 contexts (suppressed​: 0 from 0)
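
The first "Invalid read of size 2" above can be checked by hand: a 2-byte read starting 38 bytes into a 39-byte block covers offsets 38 and 39, so its last byte falls past the end of the allocation. A small sketch of that arithmetic follows; `overrun_bytes` is a hypothetical helper for illustration only, not part of Valgrind or perl.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: how many bytes of a read of `read_size` bytes,
 * starting `offset` bytes from the start of a block of `block_size`
 * bytes, lie beyond the end of that block. */
size_t overrun_bytes(size_t block_size, size_t offset, size_t read_size)
{
    size_t end = offset + read_size;    /* one past the last byte read */
    size_t first_outside;

    if (end <= block_size)
        return 0;                       /* the read fits entirely */
    first_outside = offset > block_size ? offset : block_size;
    return end - first_outside;
}
```

For the first report, `overrun_bytes(39, 38, 2)` is 1. The second report starts 3 bytes after the block (offset 42), so both bytes of that read are outside it.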

**PERL -V**
dcollins@​nightshade64​:~/perldebug$ ./perl -V
Summary of my perl5 (revision 5 version 23 subversion 4) configuration​:
  Commit id​: 94757bf
  Platform​:
  osname=linux, osvers=3.16.0-4-amd64, archname=x86_64-linux-ld
  uname='linux nightshade64 3.16.0-4-amd64 #1 smp debian 3.16.7-ckt11-1+deb8u4 (2015-09-19) x86_64 gnulinux '
  config_args='-Dusedevel -Dprefix=/usr/local/perl-afl -Dcc=ccache afl-gcc -Duselongdouble -Duse64bitall -Doptimize=-g -Uversiononly -Uman1dir -Uman3dir -DDEBUGGING -DPERL_POISON -des'
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=undef, usemultiplicity=undef
  use64bitint=define, use64bitall=define, uselongdouble=define
  usemymalloc=n, bincompat5005=undef
  Compiler​:
  cc='ccache afl-gcc', ccflags ='-fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
  optimize='-g',
  cppflags='-fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
  ccversion='', gccversion='4.9.2', gccosandvers=''
  intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678, doublekind=3
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16, longdblkind=3
  ivtype='long', ivsize=8, nvtype='long double', nvsize=16, Off_t='off_t', lseeksize=8
  alignbytes=16, prototype=define
  Linker and Libraries​:
  ld='ccache afl-gcc', ldflags =' -fstack-protector-strong -L/usr/local/lib'
  libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.9/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
  libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
  perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
  libc=libc-2.19.so, so=so, useshrplib=false, libperl=libperl.a
  gnulibc_version='2.19'
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
  cccdlflags='-fPIC', lddlflags='-shared -g -L/usr/local/lib -fstack-protector-strong'

Characteristics of this binary (from libperl)​:
  Compile-time options​: DEBUGGING HAS_TIMES PERLIO_LAYERS PERL_COPY_ON_WRITE
  PERL_DONT_CREATE_GVSV
  PERL_HASH_FUNC_ONE_AT_A_TIME_HARD PERL_MALLOC_WRAP
  PERL_PRESERVE_IVUV PERL_USE_DEVEL USE_64_BIT_ALL
  USE_64_BIT_INT USE_LARGE_FILES USE_LOCALE
  USE_LOCALE_COLLATE USE_LOCALE_CTYPE
  USE_LOCALE_NUMERIC USE_LOCALE_TIME USE_LONG_DOUBLE
  USE_PERLIO USE_PERL_ATOF
  Built under linux
  Compiled at Oct 2 2015 22​:41​:42
  @​INC​:
  /usr/local/perl-afl/lib/site_perl/5.23.4/x86_64-linux-ld
  /usr/local/perl-afl/lib/site_perl/5.23.4
  /usr/local/perl-afl/lib/5.23.4/x86_64-linux-ld
  /usr/local/perl-afl/lib/5.23.4
  .

p5pRT commented Oct 4, 2015

From @dcollinsn

f3i000270

p5pRT commented Dec 3, 2015

From @iabyn

On Sun, Oct 04, 2015 at 12:41:26PM -0700, Dan Collins wrote:

> In a non-debugging perl, the output is:
> Unknown regexp modifier "/0" at allcrash/perlu/f3i000270 line 1, at end of line
> Unmatched [ in regex; marked by <-- HERE in m/0000000000000[0000000▒0000000000[.00.1▒▒!\x{DF}!d [1[.00.1P▒▒1?:\x{DF}|[ <-- HERE 0000000▒0000000000[.00.])/ at allcrash/perlu/f3i000270 line 1.
>
> In a debugging perl, the output is:
> perl: sv.c:11449: Perl_sv_vcatpvfn_flags: Assertion `(IV)elen >= 0' failed.
> Aborted

It can be reduced to this:

  /[\x{df}[.00./i;

At its heart, the problem is that S_regclass(), when confronted with weirdo
chars like \xdf which can match more than one char, constructs a new char
class string and then reparses it. During this it temporarily sets
RExC_parse and RExC_end to point to the new string, but fails to update
RExC_precomp. If an error is subsequently detected, the error message (via
VFAIL) is supposed to contain the chars between RExC_precomp and
RExC_parse, followed by '<-- HERE'. Since those two pointers point into
different strings, the string printed out may include a big chunk of
random memory.

The basic fix seems simple; just localise RExC_precomp at the same time as
RExC_parse and RExC_end. The diff attached does just that, and indeed
stops the crashes / assertion failures.
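
To make the mechanism concrete, here is a minimal stand-alone sketch of the save/restore pattern described above. It is a model only, not the regcomp.c code: `ParseState`, `here_offset`, `reparse` and `demo_offset` are hypothetical names, and the substitute string and the 10-byte error position are borrowed from the expected message in the proposed test. The two strings are placed in one array purely so that the pointer subtraction stays well defined in C. In perl they are separate allocations, which is why the real result is garbage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the RExC_precomp / RExC_parse / RExC_end trio. */
typedef struct {
    const char *precomp;  /* start of the pattern quoted in error messages */
    const char *parse;    /* current parse position */
    const char *end;      /* one past the last byte of the pattern */
} ParseState;

/* Number of bytes an error message would print before "<-- HERE". */
ptrdiff_t here_offset(const ParseState *st)
{
    return st->parse - st->precomp;
}

/* Temporarily parse `substitute`, pretend an error is detected after
 * `consumed` bytes, and return the offset the message would use.
 * With localise_precomp == 0 this mimics the reported bug. */
ptrdiff_t reparse(ParseState *st, const char *substitute,
                  size_t consumed, int localise_precomp)
{
    ParseState saved = *st;          /* save all three pointers together */
    ptrdiff_t offset;

    st->parse = substitute;
    st->end = substitute + strlen(substitute);
    if (localise_precomp)
        st->precomp = substitute;    /* the extra assignment in the fix */

    st->parse += consumed;           /* the "parser" advances, then fails */
    offset = here_offset(st);

    *st = saved;                     /* restore on the way out */
    return offset;
}

/* Set up an "original" pattern and a "substitute" string in one array
 * and report the offset an error raised during the reparse would use. */
ptrdiff_t demo_offset(int localise_precomp)
{
    static char arena[64];
    char *substitute = arena;        /* rewritten class, at offset 0 */
    char *original = arena + 32;     /* user's pattern, at offset 32 */
    ParseState st;

    strcpy(substitute, "?:\\x{DF}|[\\x{DF}[.00.])");
    strcpy(original, "[\\x{DF}[.00.");
    st.precomp = original;
    st.parse = original + strlen(original);  /* arbitrary position */
    st.end = st.parse;

    return reparse(&st, substitute, 10, localise_precomp);
}
```

In this model `demo_offset(1)` returns 10, the length of `?:\x{DF}|[`, whereas `demo_offset(0)` returns -22. Passed on as an unsigned length, a negative offset like that becomes a huge number, which is the kind of value `elen` holds in the gdb session from the original report.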

However, when I tried to add a test in t/re/reg_mesg.t, I ran into two
problems.

First, the error message lists the faked-up pattern being recursively
compiled, rather than the original pattern. That could be rather
surprising to a programmer. I don't know whether there's a way to avoid
that.

Second, reg_mesg.t tests each entry twice, without and with "use re
'strict'". However, that causes the error message to change​:

  $ p -e'/[\x{DF}[.00./i'
  Unmatched [ in regex; marked by <-- HERE ...

  $ p -e'use re "strict"; /[\x{DF}[.00./i'
  Unmatched '[' in POSIX class in regex; marked by <-- HERE ...

I don't know whether that's intentional or not.

I then got into what exactly triggered the error, and whether it
did the recursive compilation thingy, and whether it was treated as
POSIX, and got very confused about [.XXX.] versus [:XXX:] etc. and what
constitutes a char class, and what error you get if just the closing ]
is missing rather than the closing .] or :].

I came to the conclusion that I'm a bit out of my depth here, and am
hoping that Karl or someone can finish this off.

--
A major Starfleet emergency breaks out near the Enterprise, but
fortunately some other ships in the area are able to deal with it to
everyone's satisfaction.
  -- Things That Never Happen in "Star Trek" #13

p5pRT commented Dec 3, 2015

From @iabyn

precomp.diff
diff --git a/regcomp.c b/regcomp.c
index 059745d..c810284 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -15443,6 +15443,7 @@ S_regclass(pTHX_ RExC_state_t *pRExC_state, I32 *flagp, U32 depth,
 	STRLEN len;
 	char *save_end = RExC_end;
 	char *save_parse = RExC_parse;
+	char *save_precomp = RExC_precomp;
         bool first_time = TRUE;     /* First multi-char occurrence doesn't get
                                        a "|" */
         I32 reg_flags;
@@ -15496,6 +15497,7 @@ S_regclass(pTHX_ RExC_state_t *pRExC_state, I32 *flagp, U32 depth,
 #endif
 
 	RExC_parse = SvPV(substitute_parse, len);
+	RExC_precomp = RExC_parse;
 	RExC_end = RExC_parse + len;
         RExC_in_multi_char_class = 1;
 	RExC_override_recoding = 1;
@@ -15505,6 +15507,7 @@ S_regclass(pTHX_ RExC_state_t *pRExC_state, I32 *flagp, U32 depth,
 
 	*flagp |= reg_flags&(HASWIDTH|SIMPLE|SPSTART|POSTPONED|RESTART_PASS1|NEED_UTF8);
 
+	RExC_precomp = save_precomp;
 	RExC_parse = save_parse;
 	RExC_end = save_end;
 	RExC_in_multi_char_class = 0;
diff --git a/t/re/reg_mesg.t b/t/re/reg_mesg.t
index 62e3e4a..478ddeb 100644
--- a/t/re/reg_mesg.t
+++ b/t/re/reg_mesg.t
@@ -263,6 +263,7 @@ my @death =
  '/(?[\ &!])/' => 'Incomplete expression within \'(?[ ])\' {#} m/(?[\ &!{#}])/',    # [perl #126180]
  '/(?[()-!])/' => 'Incomplete expression within \'(?[ ])\' {#} m/(?[()-!{#}])/',    # [perl #126204]
  '/(?[!()])/' => 'Incomplete expression within \'(?[ ])\' {#} m/(?[!(){#}])/',      # [perl #126404]
+  '/[\x{DF}[.00./i' => 'Unmatched [ {#} m/?:\x{DF}|[{#}\x{DF}[.00.])/', #  [perl #126261]
 );
 
 # These are messages that are warnings when not strict; death under 'use re

p5pRT commented Dec 3, 2015

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented Dec 3, 2015

From @khwilliamson

On 12/03/2015 03:10 AM, Dave Mitchell wrote:

> On Sun, Oct 04, 2015 at 12:41:26PM -0700, Dan Collins wrote:
>
>> In a non-debugging perl, the output is:
>> Unknown regexp modifier "/0" at allcrash/perlu/f3i000270 line 1, at end of line
>> Unmatched [ in regex; marked by <-- HERE in m/0000000000000[0000000▒0000000000[.00.1▒▒!\x{DF}!d [1[.00.1P▒▒1?:\x{DF}|[ <-- HERE 0000000▒0000000000[.00.])/ at allcrash/perlu/f3i000270 line 1.
>>
>> In a debugging perl, the output is:
>> perl: sv.c:11449: Perl_sv_vcatpvfn_flags: Assertion `(IV)elen >= 0' failed.
>> Aborted
>
> It can be reduced to this:
>
>   /[\x{df}[.00./i;
>
> At its heart, the problem is that S_regclass(), when confronted with weirdo
> chars like \xdf which can match more than one char, constructs a new char
> class string and then reparses it. During this it temporarily sets
> RExC_parse and RExC_end to point to the new string, but fails to update
> RExC_precomp. If an error is subsequently detected, the error message (via
> VFAIL) is supposed to contain the chars between RExC_precomp and
> RExC_parse, followed by '<-- HERE'. Since those two pointers point into
> different strings, the string printed out may include a big chunk of
> random memory.
>
> The basic fix seems simple; just localise RExC_precomp at the same time as
> RExC_parse and RExC_end. The diff attached does just that, and indeed
> stops the crashes / assertion failures.
>
> However, when I tried to add a test in t/re/reg_mesg.t, I ran into two
> problems.
>
> First, the error message lists the faked-up pattern being recursively
> compiled, rather than the original pattern. That could be rather
> surprising to a programmer. I don't know whether there's a way to avoid
> that.

I added the code that fails, and didn't think about this case. This
isn't the only place where this trick of reparsing something is done. I
believe I wasn't the one who first did it. All of these will have to be
investigated and fixed. I will do that.

> Second, reg_mesg.t tests each entry twice, without and with "use re
> 'strict'". However, that causes the error message to change:
>
>   $ p -e'/[\x{DF}[.00./i'
>   Unmatched [ in regex; marked by <-- HERE ...
>
>   $ p -e'use re "strict"; /[\x{DF}[.00./i'
>   Unmatched '[' in POSIX class in regex; marked by <-- HERE ...
>
> I don't know whether that's intentional or not.

It's sort of intentional, but can be improved on.

> I then got into what exactly triggered the error, and whether it
> did the recursive compilation thingy, and whether it was treated as
> POSIX, and got very confused about [.XXX.] versus [:XXX:] etc. and what
> constitutes a char class, and what error you get if just the closing ]
> is missing rather than the closing .] or :].
>
> I came to the conclusion that I'm a bit out of my depth here, and am
> hoping that Karl or someone can finish this off.

I have had a long-in-progress piece of work to fix up the POSIX class things,
and am getting closer to finishing that up real-soon-now™.

p5pRT commented Dec 22, 2015

From @khwilliamson

This is now fixed by commit 285b5ca

Thanks for the report
--
Karl Williamson

p5pRT commented Dec 22, 2015

@khwilliamson - Status changed from 'open' to 'pending release'

p5pRT commented May 13, 2016

From @khwilliamson

Thank you for submitting this report. You have helped make Perl better.
 
With the release of Perl 5.24.0 on May 9, 2016, this and 149 other issues have been resolved.

Perl 5.24.0 may be downloaded via https://metacpan.org/release/RJBS/perl-5.24.0

p5pRT commented May 13, 2016

@khwilliamson - Status changed from 'pending release' to 'resolved'
