Skip Menu |

Date: Sat, 3 Jan 2015 13:48:10 -0600
Subject: Segmentation Fault in Perl 5.21.8 while fuzzing Perl binary
From: Brian Carpenter <brian.carpenter [...] gmail.com>
To: perlbug [...] perl.org
I cloned the git repo on 01/02/2015 and built from source using the afl-gcc compiler:

CC=/path/to/afl-gcc ./Configure
AFL_HARDEN=1 make

This is perl 5, version 21, subversion 8 (v5.21.8 (v5.21.7-209-g4e27940)) built for x86_64-linux

Besides the above information, this version of Perl was compiled using all defaults (enter was pressed for every question).

While fuzzing the new perl binary, I found a testcase that causes a segfault. I've attached the file for inclusion in the bug report. Please note that a file named = containing zero bytes was also created in a tmp directory around 1 minute 5 seconds before the testcase was written to disk. Not sure if this is related but I wanted to add the information just in case it was important. Please advise if this is a security issue and if you will handle the CVE request.

@@ VALGRIND OUTPUT @@

valgrind -q /home/geeknik/perl5/perl ./id\:000000\,sig\:11\,src\:002461+016504\,op\:splice\,rep\:32
==54123== Use of uninitialised value of size 8
==54123==    at 0x651E15: S_regatom (regcomp.c:12632)
==54123==    by 0x656B49: S_regpiece (regcomp.c:10821)
==54123==    by 0x65AB0D: S_regbranch (regcomp.c:10747)
==54123==    by 0x66C9A5: S_reg.constprop.9 (regcomp.c:10497)
==54123==    by 0x690D3B: Perl_re_op_compile (regcomp.c:6697)
==54123==    by 0x4C1222: Perl_pmruntime (op.c:5629)
==54123==    by 0x5CBEBC: Perl_yyparse (perly.y:1000)
==54123==    by 0x4F3114: perl_parse (perl.c:2273)
==54123==    by 0x42A92B: main (perlmain.c:114)
==54123==
==54123== Invalid write of size 1
==54123==    at 0x64C878: S_regatom (regcomp.c:12484)
==54123==    by 0x656B49: S_regpiece (regcomp.c:10821)
==54123==    by 0x65AB0D: S_regbranch (regcomp.c:10747)
==54123==    by 0x66C9A5: S_reg.constprop.9 (regcomp.c:10497)
==54123==    by 0x693D3B: Perl_re_op_compile (regcomp.c:6895)
==54123==    by 0x4C1222: Perl_pmruntime (op.c:5629)
==54123==    by 0x5CBEBC: Perl_yyparse (perly.y:1000)
==54123==    by 0x4F3114: perl_parse (perl.c:2273)
==54123==    by 0x42A92B: main (perlmain.c:114)
==54123==  Address 0x5eea96c is 0 bytes after a block of size 3,500 alloc'd
==54123==    at 0x4C28BED: malloc (vg_replace_malloc.c:263)
==54123==    by 0x6CEFBC: Perl_safesysmalloc (util.c:144)
==54123==    by 0x692C7A: Perl_re_op_compile (regcomp.c:6746)
==54123==    by 0x4C1222: Perl_pmruntime (op.c:5629)
==54123==    by 0x5CBEBC: Perl_yyparse (perly.y:1000)
==54123==    by 0x4F3114: perl_parse (perl.c:2273)
==54123==    by 0x42A92B: main (perlmain.c:114)
==54123==
syntax error at ./id:000000,sig:11,src:002461+016504,op:splice,rep:32 line 11, near "printr)"
Unknown regexp modifier "/b" at ./id:000000,sig:11,src:002461+016504,op:splice,rep:32 line 17, at end of line
Unknown regexp modifier "/G" at ./id:000000,sig:11,src:002461+016504,op:splice,rep:32 line 17, at end of line
Unknown regexp modifier "/e" at ./id:000000,sig:11,src:002461+016504,op:splice,rep:32 line 17, at end of line
Unknown regexp modifier "/r" at ./id:000000,sig:11,src:002461+016504,op:splice,rep:32 line 17, at end of line
panic: reg_node overrun trying to emit 0, 5eea97c>=5eea96c at ./id:000000,sig:11,src:002461+016504,op:splice,rep:32 line 20.


@@ GDB OUTPUT @@

gdb /home/geeknik/perl5/perl core
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/geeknik/perl5/perl...done.
[New LWP 12797]

warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/home/geeknik/perl5/perl ./id:000000,sig:11,src:002461+016504,op:splice,rep:32'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f1223db11f0 in _int_free (av=0x7f12240c0e40, p=0x238f260) at malloc.c:5002
5002 malloc.c: No such file or directory.
(gdb) bt
#0  0x00007f1223db11f0 in _int_free (av=0x7f12240c0e40, p=0x238f260) at malloc.c:5002
#1  0x00007f1223db47bc in *__GI___libc_free (mem=<optimized out>) at malloc.c:3738
#2  0x00000000007bf7e8 in Perl_sv_clear (orig_sv=orig_sv@entry=0x238bbf8) at sv.c:6743
#3  0x00000000007b75f8 in Perl_sv_free2 (sv=0x238bbf8, rc=<optimized out>) at sv.c:7018
#4  0x000000000043cc85 in S_SvREFCNT_dec (sv=<optimized out>) at inline.h:166
#5  Perl_op_clear (o=o@entry=0x238df50) at op.c:931
#6  0x000000000043d568 in Perl_op_free (o=0x238df50) at op.c:781
#7  0x000000000043e119 in Perl_opslab_force_free (slab=0x2389980) at op.c:443
#8  0x00000000005f12c1 in Perl_cv_undef_flags (cv=0x2368328, flags=0) at pad.c:350
#9  0x00000000007bedc0 in Perl_sv_clear (orig_sv=orig_sv@entry=0x2368328) at sv.c:6592
#10 0x00000000007b75f8 in Perl_sv_free2 (sv=0x2368328, rc=<optimized out>) at sv.c:7018
#11 0x00000000004e77b5 in S_SvREFCNT_dec (sv=<optimized out>) at inline.h:166
#12 perl_destruct (my_perl=<optimized out>) at perl.c:780
#13 0x000000000042ab24 in main (argc=2, argv=0x7fff5c8b3a58, env=0x7fff5c8b3a70) at perlmain.c:127
#14 0x00007f1223d57ead in __libc_start_main (main=<optimized out>, argc=<optimized out>,
    ubp_av=<optimized out>, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
    stack_end=0x7fff5c8b3a48) at libc-start.c:244
#15 0x000000000042ac45 in _start ()
(gdb)



geeknik@deb7fuzz:~/perl5/utils$ ./perlbug -d
---
Flags:
    category=core
    severity=low
---
This perlbug was built using Perl 5.21.8 - Fri Jan  2 19:02:59 CST 2015
It is being executed now by  Perl 5.21.7 - Thu Dec 18 14:34:01 CST 2014.

Site configuration information for perl 5.21.7:

Configured by geeknik at Thu Dec 18 14:34:01 CST 2014.

Summary of my perl5 (revision 5 version 21 subversion 7) configuration:
  Commit id: e9d2bd8a490981edfc4ddabb5889ca0e86f29e29
  Platform:
    osname=linux, osvers=3.2.0-4-amd64, archname=x86_64-linux
    uname='linux deb7fuzz 3.2.0-4-amd64 #1 smp debian 3.2.63-2+deb7u2 x86_64 gnulinux '
    config_args=''
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='/home/geeknik/afl/afl-gcc', ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2',
    optimize='-O2',
    cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.7.2', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678, doublekind=3
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16, longdblkind=3
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='/home/geeknik/afl/afl-gcc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.7/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
    libs=-lnsl -ldl -lm -lcrypt -lutil -lc -lpthread
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc -lpthread
    libc=libc-2.13.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.13'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'


---
@INC for perl 5.21.7:
    /usr/local/lib/perl5/site_perl/5.21.7/x86_64-linux
    /usr/local/lib/perl5/site_perl/5.21.7
    /usr/local/lib/perl5/5.21.7/x86_64-linux
    /usr/local/lib/perl5/5.21.7
    .

---
Environment for perl 5.21.7:
    HOME=/home/geeknik
    LANG=en_US.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
    PERL_BADLANG (unset)
    SHELL=/bin/bash
Download crash.gz
application/x-gzip 689b

Message body not shown because it is not plain text.

RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 1.1k
On Sat Jan 03 11:49:13 2015, brian.carpenter@gmail.com wrote: Show quoted text
> I cloned the git repo on 01/02/2015 and built from source using the afl-gcc > compiler: > > CC=/path/to/afl-gcc ./Configure > AFL_HARDEN=1 make >
Can you describe how one obtains the afl-gcc compiler? Show quoted text
> This is perl 5, version 21, subversion 8 (v5.21.8 (v5.21.7-209-g4e27940)) > built for x86_64-linux > > Besides the above information, this version of Perl was compiled using all > defaults (enter was pressed for every question). > > While fuzzing the new perl binary, I found a testcase that causes a > segfault. I've attached the file for inclusion in the bug report. Please > note that a file named = containing zero bytes was also created in a tmp > directory around 1 minute 5 seconds before the testcase was written to > disk. Not sure if this is related but I wanted to add the information just > in case it was important. Please advise if this is a security issue and if > you will handle the CVE request. >
I'm a bit puzzled about the test case, since, insofar as it is intended to be a perl program, it fails to compile. It reports a syntax error near 'printr)'. Can you clarify? Thank you very much. -- James E Keenan (jkeenan@cpan.org)
Date: Sat, 03 Jan 2015 23:38:10 +0100
From: Lukas Mai <plokinom [...] gmail.com>
To: perl5-porters [...] perl.org
Subject: Re: [perl #123539] Segmentation Fault in Perl 5.21.8 while fuzzing Perl binary
Am 03.01.2015 um 22:54 schrieb James E Keenan via RT: Show quoted text
> On Sat Jan 03 11:49:13 2015, brian.carpenter@gmail.com wrote:
>> I cloned the git repo on 01/02/2015 and built from source using the afl-gcc >> compiler: >> >> CC=/path/to/afl-gcc ./Configure >> AFL_HARDEN=1 make >>
> > Can you describe how one obtains the afl-gcc compiler?
http://lcamtuf.coredump.cx/afl/ Show quoted text
>> This is perl 5, version 21, subversion 8 (v5.21.8 (v5.21.7-209-g4e27940)) >> built for x86_64-linux >> >> Besides the above information, this version of Perl was compiled using all >> defaults (enter was pressed for every question). >> >> While fuzzing the new perl binary, I found a testcase that causes a >> segfault. I've attached the file for inclusion in the bug report. Please >> note that a file named = containing zero bytes was also created in a tmp >> directory around 1 minute 5 seconds before the testcase was written to >> disk. Not sure if this is related but I wanted to add the information just >> in case it was important. Please advise if this is a security issue and if >> you will handle the CVE request. >>
> > I'm a bit puzzled about the test case, since, insofar as it is intended to be a perl program, it fails to compile. It reports a syntax error near 'printr)'. Can you clarify?
As the original bug report says: syntax error at ./id:000000,sig:11,src:002461+016504,op:splice,rep:32 line 11, near "printr)" Unknown regexp modifier "/b" at ./id:000000,sig:11,src:002461+016504,op:splice,rep:32 line 17, at end of line Unknown regexp modifier "/G" at ./id:000000,sig:11,src:002461+016504,op:splice,rep:32 line 17, at end of line Unknown regexp modifier "/e" at ./id:000000,sig:11,src:002461+016504,op:splice,rep:32 line 17, at end of line Unknown regexp modifier "/r" at ./id:000000,sig:11,src:002461+016504,op:splice,rep:32 line 17, at end of line panic: reg_node overrun trying to emit 0, 5eea97c>=5eea96c at ./id:000000,sig:11,src:002461+016504,op:splice,rep:32 line 20. Perl panics/crashes while trying to handle a syntax error, apparently. -- Lukas Mai <plokinom@gmail.com>
Date: Sat, 3 Jan 2015 16:04:10 -0600
To: perlbug-followup [...] perl.org
From: Brian Carpenter <brian.carpenter [...] gmail.com>
Subject: Re: [perl #123539] Segmentation Fault in Perl 5.21.8 while fuzzing Perl binary
Download (untitled) / with headers
text/plain 699b
The original test case that I fed to AFL was this:

#!/usr/local/bin/perl
 print "Content-type: text/html\n\n";
 print "Hello World.\n";
 print "Heres the form info:<P>\n";
 my($buffer);
 my(@pairs);
 my($pair);
 read(STDIN,$buffer,$ENV{'CONTENT_LENGTH'});
 @pairs = split(/&/, $buffer);
 foreach $pair (@pairs)
   {
   print "$pair<BR>\n"
   }
print "<P>Note that further parsing is\n";
print "necessary to turn the plus signs\n";
print "into spaces and get rid of some\n";
print "other web encoding.\n";

After about 25 million iterations, that script was turned into the test case that I submitted. Shouldn't perl fail gracefully instead of just crashing because of invalid input/characters/etc?

From: Brian Carpenter <brian.carpenter [...] gmail.com>
To: perlbug-followup [...] perl.org
Subject: Re: [perl #123539] Segmentation Fault in Perl 5.21.8 while fuzzing Perl binary
Date: Sat, 3 Jan 2015 16:05:15 -0600
Download (untitled) / with headers
text/plain 119b
The afl-gcc is available as part of American Fuzzy Lop which you can obtain from here: http://lcamtuf.coredump.cx/afl/
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 396b
On Sat Jan 03 14:50:18 2015, brian.carpenter@gmail.com wrote: Show quoted text
> Shouldn't perl fail gracefully instead of just > crashing because of invalid input/characters/etc?
Yes, it should. I don’t get a crash when I run the script. I do get a panic message, though, in bleadperl and in 5.20.1. 5.18.3 just shows ‘normal’ errors, no panics. I’m going to run a bisect. -- Father Chrysostomos
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 965b
On Sat Jan 03 17:35:31 2015, sprout wrote: Show quoted text
> On Sat Jan 03 14:50:18 2015, brian.carpenter@gmail.com wrote:
> > Shouldn't perl fail gracefully instead of just > > crashing because of invalid input/characters/etc?
> > Yes, it should. I don’t get a crash when I run the script. I do get > a panic message, though, in bleadperl and in 5.20.1. 5.18.3 just > shows ‘normal’ errors, no panics. I’m going to run a bisect.
31f05a37c4e9c37a7263491f2fc0237d836e1a80 is the first bad commit commit 31f05a37c4e9c37a7263491f2fc0237d836e1a80 Author: Karl Williamson <public@khwilliamson.com> Date: Mon Jan 27 15:35:00 2014 -0700 Work properly under UTF-8 LC_CTYPE locales This large (sorry, I couldn't figure out how to meaningfully split it up) commit causes Perl to fully support LC_CTYPE operations (case changing, character classification) in UTF-8 locales. As a side effect it resolves [perl #56820]. -- Father Chrysostomos
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 1.6k
On Sat Jan 03 18:00:41 2015, sprout wrote: Show quoted text
> On Sat Jan 03 17:35:31 2015, sprout wrote:
> > On Sat Jan 03 14:50:18 2015, brian.carpenter@gmail.com wrote:
> > > Shouldn't perl fail gracefully instead of just > > > crashing because of invalid input/characters/etc?
> > > > Yes, it should. I don’t get a crash when I run the script. I do get > > a panic message, though, in bleadperl and in 5.20.1. 5.18.3 just > > shows ‘normal’ errors, no panics. I’m going to run a bisect.
> > 31f05a37c4e9c37a7263491f2fc0237d836e1a80 is the first bad commit > commit 31f05a37c4e9c37a7263491f2fc0237d836e1a80 > Author: Karl Williamson <public@khwilliamson.com> > Date: Mon Jan 27 15:35:00 2014 -0700 > > Work properly under UTF-8 LC_CTYPE locales > > This large (sorry, I couldn't figure out how to meaningfully split it > up) commit causes Perl to fully support LC_CTYPE operations (case > changing, character classification) in UTF-8 locales. > > As a side effect it resolves [perl #56820].
Reduced case. This appears to have nothing to do with the syntax error. /TffffffffffffTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT5TT TTTTTTTTTTTTTTTTTTTTTTT3TTgTTTTTTTTTTTTTTTTTTTTT2TTTTTTTTTTTTTTTTTTTTTT THHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHiHHHHHH Hffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff ffffffffffffffffffffffffffffffffffffffffffffffffffffffff fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff ffffffffffff&ffff/il The output I get is: panic: reg_node overrun trying to emit 0, 7fcdf140cca8>=7fcdf140cca4 at /Users/sprout/Downloads/crash line 7. With some variations of the above, I got malloc errors. -- Father Chrysostomos
Date: Sat, 03 Jan 2015 22:32:56 -0700
CC: perl5-porters [...] perl.org
Subject: Re: [perl #123539] Segmentation Fault in Perl 5.21.8 while fuzzing Perl binary
From: Karl Williamson <public [...] khwilliamson.com>
To: perlbug-followup [...] perl.org
Download (untitled) / with headers
text/plain 1.9k
On 01/03/2015 08:39 PM, Father Chrysostomos via RT wrote: Show quoted text
> On Sat Jan 03 18:00:41 2015, sprout wrote:
>> On Sat Jan 03 17:35:31 2015, sprout wrote:
>>> On Sat Jan 03 14:50:18 2015, brian.carpenter@gmail.com wrote:
>>>> Shouldn't perl fail gracefully instead of just >>>> crashing because of invalid input/characters/etc?
>>> >>> Yes, it should. I don’t get a crash when I run the script. I do get >>> a panic message, though, in bleadperl and in 5.20.1. 5.18.3 just >>> shows ‘normal’ errors, no panics. I’m going to run a bisect.
>> >> 31f05a37c4e9c37a7263491f2fc0237d836e1a80 is the first bad commit >> commit 31f05a37c4e9c37a7263491f2fc0237d836e1a80 >> Author: Karl Williamson <public@khwilliamson.com> >> Date: Mon Jan 27 15:35:00 2014 -0700 >> >> Work properly under UTF-8 LC_CTYPE locales >> >> This large (sorry, I couldn't figure out how to meaningfully split it >> up) commit causes Perl to fully support LC_CTYPE operations (case >> changing, character classification) in UTF-8 locales. >> >> As a side effect it resolves [perl #56820].
> > Reduced case. This appears to have nothing to do with the syntax error. > > /TffffffffffffTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT5TT > TTTTTTTTTTTTTTTTTTTTTTT3TTgTTTTTTTTTTTTTTTTTTTTT2TTTTTTTTTTTTTTTTTTTTTT > THHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHiHHHHHH > Hffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff > ffffffffffffffffffffffffffffffffffffffffffffffffffffffff > fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff > ffffffffffff&ffff/il > > The output I get is: > > panic: reg_node overrun trying to emit 0, 7fcdf140cca8>=7fcdf140cca4 at /Users/sprout/Downloads/crash line 7. > > With some variations of the above, I got malloc errors. >
I didn't get these errors, but I did see some valgrind issues, which the attached patch resolves. Please try it out to see if it fixes your problems.

Message body is not shown because sender requested not to inline it.

RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 2.2k
On Sat Jan 03 21:33:22 2015, public@khwilliamson.com wrote: Show quoted text
> On 01/03/2015 08:39 PM, Father Chrysostomos via RT wrote:
> > On Sat Jan 03 18:00:41 2015, sprout wrote:
> >> On Sat Jan 03 17:35:31 2015, sprout wrote:
> >>> On Sat Jan 03 14:50:18 2015, brian.carpenter@gmail.com wrote:
> >>>> Shouldn't perl fail gracefully instead of just > >>>> crashing because of invalid input/characters/etc?
> >>> > >>> Yes, it should. I don’t get a crash when I run the script. I do > >>> get > >>> a panic message, though, in bleadperl and in 5.20.1. 5.18.3 just > >>> shows ‘normal’ errors, no panics. I’m going to run a bisect.
> >> > >> 31f05a37c4e9c37a7263491f2fc0237d836e1a80 is the first bad commit > >> commit 31f05a37c4e9c37a7263491f2fc0237d836e1a80 > >> Author: Karl Williamson <public@khwilliamson.com> > >> Date: Mon Jan 27 15:35:00 2014 -0700 > >> > >> Work properly under UTF-8 LC_CTYPE locales > >> > >> This large (sorry, I couldn't figure out how to meaningfully split > >> it > >> up) commit causes Perl to fully support LC_CTYPE operations (case > >> changing, character classification) in UTF-8 locales. > >> > >> As a side effect it resolves [perl #56820].
> > > > Reduced case. This appears to have nothing to do with the syntax > > error. > > > > /TffffffffffffTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT5TT > > TTTTTTTTTTTTTTTTTTTTTTT3TTgTTTTTTTTTTTTTTTTTTTTT2TTTTTTTTTTTTTTTTTTTTTT > > THHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHiHHHHHH > > Hffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff > > ffffffffffffffffffffffffffffffffffffffffffffffffffffffff > > fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff > > ffffffffffff&ffff/il > > > > The output I get is: > > > > panic: reg_node overrun trying to emit 0, 7fcdf140cca8>=7fcdf140cca4 > > at /Users/sprout/Downloads/crash line 7. > > > > With some variations of the above, I got malloc errors. > >
> > I didn't get these errors, but I did see some valgrind issues, which > the > attached patch resolves. Please try it out to see if it fixes your > problems.
Yes, it works for me, both with the reduced case and with the original script from this ticket. I have no idea how it works, though. -- Father Chrysostomos
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 379b
On Sat Jan 03 21:33:22 2015, public@khwilliamson.com wrote: Show quoted text
> > I didn't get these errors, but I did see some valgrind issues, which > the > attached patch resolves. Please try it out to see if it fixes your > problems.
The patch works for me as well. With it I get normal syntax errors as shown by the attachment. Thank you very much. -- James E Keenan (jkeenan@cpan.org)
Subject: 123539-crash-after-khw-patch.txt
Bareword found where operator expected at /home/jkeenan/learn/perl/p5p/noshebang-123539-crash line 21, near "HHHHHHHHHHHHHHHHHHHH" (Might be a runaway multi-line // string starting on line 17) (Missing semicolon on previous line?) syntax error at /home/jkeenan/learn/perl/p5p/noshebang-123539-crash line 11, near "printr)" Unknown regexp modifier "/b" at /home/jkeenan/learn/perl/p5p/noshebang-123539-crash line 17, at end of line Unknown regexp modifier "/G" at /home/jkeenan/learn/perl/p5p/noshebang-123539-crash line 17, at end of line Unknown regexp modifier "/e" at /home/jkeenan/learn/perl/p5p/noshebang-123539-crash line 17, at end of line Unknown regexp modifier "/r" at /home/jkeenan/learn/perl/p5p/noshebang-123539-crash line 17, at end of line syntax error at /home/jkeenan/learn/perl/p5p/noshebang-123539-crash line 21, near "HHHHHHHHHHHHHHHHHHHH" Unrecognized character \x83; marked by <-- HERE after HHHHHHHHHH<-- HERE near column 22 at /home/jkeenan/learn/perl/p5p/noshebang-123539-crash line 21.
Date: Sun, 04 Jan 2015 13:20:33 +0000
CC: perl5-porters [...] perl.org, hv [...] crypt.org
From: hv [...] crypt.org
To: perlbug-followup [...] perl.org
Subject: Re: [perl #123539] Segmentation Fault in Perl 5.21.8 while fuzzing Perl binary
Download (untitled) / with headers
text/plain 1.3k
"Father Chrysostomos via RT" <perlbug-followup@perl.org> wrote: :On Sat Jan 03 21:33:22 2015, public@khwilliamson.com wrote: :> I didn't get these errors, but I did see some valgrind issues, which :> the :> attached patch resolves. Please try it out to see if it fixes your :> problems. : :Yes, it works for me, both with the reduced case and with the original script from this ticket. I have no idea how it works, though. I don't understand the fix either, but I was able to come up with a somewhat shorter test case: % ./perl -we '/bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbAAbbb/il' panic: reg_node overrun trying to emit 0, 1720b50>=1720b50 at -e line 1. % That's the first hit (len=130, pos1=125, pos2=126) from this search: #!./perl my $lc = 'b'; my $uc = 'A'; # any of [AFHIJSTWY] for my $len (2 .. 408) { for my $pos1 (0 .. $len - 2) { for my $pos2 ($pos1 + 1 .. $len - 1) { my $s = $lc x $len; substr($s, $_, 1) = $uc for ($pos1, $pos2); eval qq{/$s/il}; die "$len: $pos1, $pos2" if $@; } } } I also don't know what's special about [AFHIJSTWY], but I assume they relate somehow to the PROBLEMATIC_LOCALE_FOLD characters. With Karl's suggested patch, the search completes without seeing any problems. Hugo
Date: Mon, 5 Jan 2015 12:24:58 +0100
CC: Perl RT Bug Tracker <perlbug-followup [...] perl.org>, Perl5 Porteros <perl5-porters [...] perl.org>
Subject: Re: [perl #123539] Segmentation Fault in Perl 5.21.8 while fuzzing Perl binary
To: "hv [...] crypt.org" <hv [...] crypt.org>, Father Chrysostomos <sprout [...] cpan.org>
From: demerphq <demerphq [...] gmail.com>
Download (untitled) / with headers
text/plain 1.7k
On 4 January 2015 at 14:20, <hv@crypt.org> wrote: Show quoted text
> "Father Chrysostomos via RT" <perlbug-followup@perl.org> wrote: > :On Sat Jan 03 21:33:22 2015, public@khwilliamson.com wrote: > :> I didn't get these errors, but I did see some valgrind issues, which > :> the > :> attached patch resolves. Please try it out to see if it fixes your > :> problems. > : > :Yes, it works for me, both with the reduced case and with the original script from this ticket. I have no idea how it works, though. > > I don't understand the fix either,
Just to explain, I am pretty sure the key part of the patch is: - const STRLEN unilen = reguni(pRExC_state, ender, s); + const STRLEN unilen = (SIZE_ONLY && ! FOLD) + ? UNISKIP(ender) + : (uvchr_to_utf8((U8*)s, ender) - (U8*)s); The regex engine does two compile passes over the pattern (the first is sometimes repeated so sometimes it is actually three passes), the first pass is a sizing pass, which calculates/estimates an upper limit on the size of the compiled regexp program that will result from the pattern. Once this has been calculated the caller allocates the buffer the program will be written into and then recompiles. During the first pass the SIZE_ONLY flag is set, and this flag is used in all of the compiler logic to decide what to do. Prior to Karl's patch it looks like we would try to write to the non-existent program buffer during the first path. With his patch it does the right thing on the first patch. Hugo: bear in mind that the logic in study_chunk() fires AFTER the final write pass. IOW, this is all happening well before the optimizer fires. And indeed the multi-pass nature is exactly what I want to change (someday). cheers, yves
To: demerphq <demerphq [...] gmail.com>
From: hv [...] crypt.org
Subject: Re: [perl #123539] Segmentation Fault in Perl 5.21.8 while fuzzing Perl binary
Date: Mon, 05 Jan 2015 16:28:31 +0000
CC: "hv [...] crypt.org" <hv [...] crypt.org>, Father Chrysostomos <sprout [...] cpan.org>, Perl RT Bug Tracker <perlbug-followup [...] perl.org>, Perl5 Porteros <perl5-porters [...] perl.org>
demerphq <demerphq@gmail.com> wrote: :On 4 January 2015 at 14:20, <hv@crypt.org> wrote: :> "Father Chrysostomos via RT" <perlbug-followup@perl.org> wrote: :> :On Sat Jan 03 21:33:22 2015, public@khwilliamson.com wrote: :> :> I didn't get these errors, but I did see some valgrind issues, which :> :> the :> :> attached patch resolves. Please try it out to see if it fixes your :> :> problems. :> : :> :Yes, it works for me, both with the reduced case and with the original script from this ticket. I have no idea how it works, though. :> :> I don't understand the fix either, : :Just to explain, I am pretty sure the key part of the patch is: : :- const STRLEN unilen = reguni(pRExC_state, ender, s); :+ const STRLEN unilen = (SIZE_ONLY && ! FOLD) :+ ? UNISKIP(ender) :+ : :(uvchr_to_utf8((U8*)s, ender) - (U8*)s); : :The regex engine does two compile passes over the pattern (the first :is sometimes repeated so sometimes it is actually three passes), the :first pass is a sizing pass, which calculates/estimates an upper limit :on the size of the compiled regexp program that will result from the :pattern. Once this has been calculated the caller allocates the buffer :the program will be written into and then recompiles. During the first :pass the SIZE_ONLY flag is set, and this flag is used in all of the :compiler logic to decide what to do. : :Prior to Karl's patch it looks like we would try to write to the :non-existent program buffer during the first path. With his patch it :does the right thing on the first patch. It's the other way round isn't it? reguni() will just return the size when SIZE_ONLY, but write to s otherwise; the patch makes it also write to s when SIZE_ONLY && !FOLD. Looking further I see this, way further up: /* In pass1, folded, we use a temporary buffer instead of the * actual node, as the node doesn't exist yet */ s = (SIZE_ONLY && FOLD) ? foldbuf : STRING(ret); .. at which point it starts to make a little more sense. Hugo
From: demerphq <demerphq [...] gmail.com>
To: "hv [...] crypt.org" <hv [...] crypt.org>
Subject: Re: [perl #123539] Segmentation Fault in Perl 5.21.8 while fuzzing Perl binary
Date: Mon, 5 Jan 2015 20:36:44 +0100
CC: Father Chrysostomos <sprout [...] cpan.org>, Perl RT Bug Tracker <perlbug-followup [...] perl.org>, Perl5 Porteros <perl5-porters [...] perl.org>
Download (untitled) / with headers
text/plain 2.3k
On 5 January 2015 at 17:28, <hv@crypt.org> wrote: Show quoted text
> demerphq <demerphq@gmail.com> wrote: > :On 4 January 2015 at 14:20, <hv@crypt.org> wrote: > :> "Father Chrysostomos via RT" <perlbug-followup@perl.org> wrote: > :> :On Sat Jan 03 21:33:22 2015, public@khwilliamson.com wrote: > :> :> I didn't get these errors, but I did see some valgrind issues, which > :> :> the > :> :> attached patch resolves. Please try it out to see if it fixes your > :> :> problems. > :> : > :> :Yes, it works for me, both with the reduced case and with the original script from this ticket. I have no idea how it works, though. > :> > :> I don't understand the fix either, > : > :Just to explain, I am pretty sure the key part of the patch is: > : > :- const STRLEN unilen = reguni(pRExC_state, ender, s); > :+ const STRLEN unilen = (SIZE_ONLY && ! FOLD) > :+ ? UNISKIP(ender) > :+ : > :(uvchr_to_utf8((U8*)s, ender) - (U8*)s); > : > :The regex engine does two compile passes over the pattern (the first > :is sometimes repeated so sometimes it is actually three passes), the > :first pass is a sizing pass, which calculates/estimates an upper limit > :on the size of the compiled regexp program that will result from the > :pattern. Once this has been calculated the caller allocates the buffer > :the program will be written into and then recompiles. During the first > :pass the SIZE_ONLY flag is set, and this flag is used in all of the > :compiler logic to decide what to do. > : > :Prior to Karl's patch it looks like we would try to write to the > :non-existent program buffer during the first path. With his patch it > :does the right thing on the first patch. > > It's the other way round isn't it? reguni() will just return the size > when SIZE_ONLY, but write to s otherwise;
Yes, I missed that. Show quoted text
> the patch makes it also write > to s when SIZE_ONLY && !FOLD. > > Looking further I see this, way further up: > /* In pass1, folded, we use a temporary buffer instead of the > * actual node, as the node doesn't exist yet */ > s = (SIZE_ONLY && FOLD) ? foldbuf : STRING(ret); > .. at which point it starts to make a little more sense.
Indeed. Sorry for the misdirection. Cheers, Yves -- perl -Mre=debug -e "/just|another|perl|hacker/"
RT-Send-CC: perl5-porters [...] perl.org
Fixed in blead by 405dffcb17b9cc9d0e5d7b41835b998ca7f1d873 -- Karl Williamson
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 5.1k
On Tue Jan 06 14:11:00 2015, khw wrote: Show quoted text
> Fixed in blead by > 405dffcb17b9cc9d0e5d7b41835b998ca7f1d873
which is suitable for a maintenance release. Here's the text of that commit message: This is a minimal patch suitable for a maintenance release. It extracts the guts of reguni and REGC without the conditional they have. The next commit will do some refactoring to avoid branching and to make things clearer. This bug is due to the current two pass structure of the Perl regular expression compiler. The first pass attempts to do just enough work to figure out how much space to malloc for the compiled pattern; and the 2nd pass actually fills in the details. One problem with this design is that in many cases quite a bit of work is required to figure out the size, and this work is thrown away and redone in the second pass. Another problem is that it is easy to forget to do enough work in the sizing pass, and that is what happened with the blamed commit. I understand that there are plans (God speed) to change the compiler design. When not under /i matching, the size of a node that will match a sequence of characters is just the number of bytes those characters take up. We have an easy way to calculate the number of bytes any code point will occupy in UTF-8, and it's just 1 byte per code point for non-UTF-8. So in the sizing pass, we don't actually have to figure out the representation of the characters. However under /i matching, we do. First of all, matching of UTF-8 strings is done by replacing each character of each string by its fold-case (function fc()) and then comparing. This is required by the nature of full Unicode matching which is not 1-1. If we do that replacement for the pattern at compile time, we avoid having to do it over-and-over as pattern matching backtracks at execution. And because fc(x) may not occupy the same number of bytes as x, and there is no easy way to know that size without actually doing the fc(), we have to do the fold in the sizing pass. Now, there are relatively few folds where sizeof(fc(x)) != sizeof(x), so we could construct an exception table for those few cases where it is, and look up through that. But there is another reason that we have to fold in the sizing pass. And that is because of the potential for multi-character folds being split across regnodes. The regular expression compiler generates EXACTish regnodes for matching sequences of characters exactly or via /i. The limit for how many bytes in a sequence such a node can match is 255 because the length is stored in a U8. If the pattern has a sequence longer than that, it is split into two or more EXACTish nodes in a row. (Actually, the compiler splits at a size much lower than that; I'm not sure why, but then two adjoining nodes whose total sum length is at most 255 get joined later in the third, optimizing pass.) Now consider, matching the character U+FB03 LATIN SMALL LIGATURE FFI. It matches the sequence of the three characters "f f i". Because of the design of the regex pattern matching code, if these characters are such that the first one or two are at the end of one EXACTish node, and the final two or one are in another EXACTish node, then U+FB03 wrongly would not match them. Matches can't cross node boundaries. If the pattern were tweaked so all three characters were in either the first or second node, then the match would succeed. And that is what the compiler does. When it reaches the node's size limit, and the final character is one that is a non-terminal character in a multi-char fold, what's in the node is backed-off until it ends with a character without this characteristic. This has to be done in the sizing pass, as we are repacking the nodes, which can affect the size of the pattern, and we have to know what the folds are in order to determine all this. (We don't fold non-UTF-8 patterns. This is for two reasons. One is that one character, the U+00B5 MICRO SIGN, folds to above-Latin1, and if we folded it, we would have to change the pattern into UTF-8, and that would slow everything down. I've thought about adding a regnode type for the much more common case of a sequence that doesn't have this character in it, and which could hence be folded at compile time. But I've not been able to justify this because of the 2nd reason, which is folds in this range are simple enough to be handled by an array lookup, so folding is fast at runtime.) Then there is the complication of matching under locale rules. This bug manifested itself only under /l matching. We can't fold at pattern compile time, because the folding rules won't be known until runtime. This isn't a problem for non-UTF-8 locales, as all folds are 1-1, and so there never will be a multi-char fold. But there could be such folds in a UTF-8 locale, so the regnodes have to be packed to work for that eventuality. The blamed commit did not do that, and because this issue doesn't arise unless there is a string long enough to trigger the problem, this wasn't found until now. What is needed, and what this commit does, is for the unfolded characters to be accumulated in both passes. The code that looks for potential multi-char fold issues handles both folded and unfolded-inputs, so will work.
Subject: Re: [perl #123539] Segmentation Fault in Perl 5.21.8 while fuzzing Perl binary
From: demerphq <demerphq [...] gmail.com>
To: Father Chrysostomos via RT <perlbug-comment [...] perl.org>
Date: Wed, 7 Jan 2015 11:04:26 +0100
CC: Perl5 Porteros <perl5-porters [...] perl.org>
Download (untitled) / with headers
text/plain 603b
On 6 January 2015 at 23:16, Karl Williamson via RT <perlbug-comment@perl.org> wrote: Show quoted text
> On Tue Jan 06 14:11:00 2015, khw wrote:
>> Fixed in blead by >> 405dffcb17b9cc9d0e5d7b41835b998ca7f1d873
> > which is suitable for a maintenance release. > > Here's the text of that commit message: > > This is a minimal patch suitable for a maintenance release. It extracts > the guts of reguni and REGC without the conditional they have. The next > commit will do some refactoring to avoid branching and to make things > clearer.
++ Nice explanation. Yves -- perl -Mre=debug -e "/just|another|perl|hacker/"
CC: perl5-porters [...] perl.org, hv [...] crypt.org
Date: Wed, 07 Jan 2015 09:04:34 +0000
From: hv [...] crypt.org
To: perlbug-comment [...] perl.org
Subject: Re: [perl #123539] Segmentation Fault in Perl 5.21.8 while fuzzing Perl binary
"Karl Williamson via RT" <perlbug-comment@perl.org> wrote: :On Tue Jan 06 14:11:00 2015, khw wrote: :> Fixed in blead by :> 405dffcb17b9cc9d0e5d7b41835b998ca7f1d873 : :which is suitable for a maintenance release. : :Here's the text of that commit message: [snip] Thanks Karl, that's very clear. So I assume we ended up splitting in pass 2 (or 3) resulting in a size growth that should have been anticipated in pass 1 but wasn't, and this is why the symptom was a "panic: reg_node overrun". Is that right? I think there'd be value in having a test for this that's more minimal than the original fuzz-generated example: I'm not sure if my shorter [1] case is shortest, or best-suited, but it'll be easier (particularly with some additional comments) for the next guy to make sense of if something in this area breaks again in the future. Hugo [1] % ./perl -we '/bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbAAbbb/il' panic: reg_node overrun trying to emit 0, 1720b50>=1720b50 at -e line 1. %
Date: Wed, 07 Jan 2015 12:01:07 -0700
CC: perl5-porters [...] perl.org
Subject: Re: [perl #123539] Segmentation Fault in Perl 5.21.8 while fuzzing Perl binary
To: hv [...] crypt.org, perlbug-comment [...] perl.org
From: Karl Williamson <public [...] khwilliamson.com>
Download (untitled) / with headers
text/plain 2.1k
On 01/07/2015 02:04 AM, hv@crypt.org wrote: Show quoted text
> "Karl Williamson via RT" <perlbug-comment@perl.org> wrote: > :On Tue Jan 06 14:11:00 2015, khw wrote: > :> Fixed in blead by > :> 405dffcb17b9cc9d0e5d7b41835b998ca7f1d873 > : > :which is suitable for a maintenance release. > : > :Here's the text of that commit message: > [snip] > > Thanks Karl, that's very clear. > > So I assume we ended up splitting in pass 2 (or 3) resulting in a size > growth that should have been anticipated in pass 1 but wasn't, and this > is why the symptom was a "panic: reg_node overrun". Is that right?
The problem was that it was reading uninitialized memory in pass 1. That ended up giving a different result than when the memory was properly initialized in pass2, so the nodes it was sized for were not correct. I don't known how this translated into the overrun. Show quoted text
> > I think there'd be value in having a test for this that's more minimal > than the original fuzz-generated example: I'm not sure if my shorter [1] > case is shortest, or best-suited, but it'll be easier (particularly with > some additional comments) for the next guy to make sense of if something > in this area breaks again in the future.
The reason I went with the test I did was that it had a bunch of f's in a row, which I know off the top of my head are part of multi-char folds, whereas yours didn't have any characters that I knew that about, so I would have had to look it up. Probably the bug is triggered by just exceeding the node size limit, depending on what's in memory at the time. But, by being sure that it will have to execute the loop to back off all the f's put some extra stress on the code, so I went with that. I thought the reference to the bug number in the .t was sufficient comment if someone needed to understand the test. I didn't want to really explain it there, as it is rather complicated. (It took me a couple hours to write that commit message) Patches welcome. Show quoted text
> > Hugo > > [1] % ./perl -we '/bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbAAbbb/il' > panic: reg_node overrun trying to emit 0, 1720b50>=1720b50 at -e line 1. > % >
Date: Sat, 10 Jan 2015 11:20:08 +0000
CC: hv [...] crypt.org, perlbug-comment [...] perl.org, perl5-porters [...] perl.org
Subject: Re: [perl #123539] Segmentation Fault in Perl 5.21.8 while fuzzing Perl binary
From: hv [...] crypt.org
To: Karl Williamson <public [...] khwilliamson.com>
Download (untitled) / with headers
text/plain 829b
Karl Williamson <public@khwilliamson.com> wrote: :On 01/07/2015 02:04 AM, hv@crypt.org wrote: :> "Karl Williamson via RT" <perlbug-comment@perl.org> wrote: :> :On Tue Jan 06 14:11:00 2015, khw wrote: :> :> Fixed in blead by :> :> 405dffcb17b9cc9d0e5d7b41835b998ca7f1d873 :> : :> :which is suitable for a maintenance release. :> : :> :Here's the text of that commit message: :> [snip] :> :> Thanks Karl, that's very clear. :> :> So I assume we ended up splitting in pass 2 (or 3) resulting in a size :> growth that should have been anticipated in pass 1 but wasn't, and this :> is why the symptom was a "panic: reg_node overrun". Is that right? : :The problem was that it was reading uninitialized memory in pass 1. Ah, this is the bit I was missing; I understand now why reproducing it (and testing it) usefully is hard. Hugo
Subject: Your ticket against Perl 5 has been resolved
Download (untitled) / with headers
text/plain 263b
Thanks for submitting this ticket The issue should be resolved with the release today of Perl v5.22, available at http://www.perl.org/get.html If you find that the problem persists, feel free to reopen this ticket -- Karl Williamson for the Perl 5 porters team


This service is sponsored and maintained by Best Practical Solutions and runs on Perl.org infrastructure.

For issues related to this RT instance (aka "perlbug"), please contact perlbug-admin at perl.org