Skip Menu |
Report information
Id: 131649
Status: resolved
Priority: 0/
Queue: perl5

Owner: Nobody
Requestors: jwilk [at] jwilk.net
Cc:
AdminCc:

Operating System: (no value)
PatchStatus: (no value)
Severity: (no value)
Type: (no value)
Perl Version: (no value)
Fixed In: (no value)

Attachments
0001-5.26-fix-131649-extended-charclass-can-trigger-assert.patch
0001-fix-131649-extended-charclass-can-trigger-assert.patch.gz



From: Jakub Wilk <jwilk [...] jwilk.net>
Date: Sun, 25 Jun 2017 00:25:07 +0200
To: perl5-security-report [...] perl.org
Subject: Heap-based buffer overflow in S_regatom
Download (untitled) / with headers
text/plain 4.6k
Consider the following Perl program: m/(?[(?s:(?[[x]][xx]xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx])/; __END__ This is what happens when I run it: The regex_sets feature is experimental in regex; marked by <-- HERE in m/(?[ <-- HERE (?s:(?[[x]][xx]xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx])/ at test line 1. The regex_sets feature is experimental in regex; marked by <-- HERE in m/(?[(?s:(?[ <-- HERE [x]][xx]xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx])/ at test line 1. panic: reg_node overrun trying to emit 0, 8677794>=8677790 at test line 1. Valgrind says it's a heap-based buffer overflow: Invalid write of size 1 at 0x80EE11B: S_regatom (regcomp.c:13490) by 0x80E6A9D: S_regpiece (regcomp.c:11669) by 0x80E6872: S_regbranch (regcomp.c:11594) by 0x80E58D4: S_reg (regcomp.c:11332) by 0x80D85AF: Perl_re_op_compile (regcomp.c:7309) by 0x806D7DF: Perl_pmruntime (op.c:5882) by 0x80C2417: Perl_yyparse (perly.y:1204) by 0x8086637: S_parse_body (perl.c:2377) by 0x808578B: perl_parse (perl.c:1692) by 0x8061751: main (perlmain.c:121) Address 0x428e698 is 0 bytes after a block of size 160 alloc'd at 0x402A1CC: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so) by 0x81082BC: Perl_safesysmalloc (util.c:153) by 0x80D7E43: Perl_re_op_compile (regcomp.c:7155) by 0x806D7DF: Perl_pmruntime (op.c:5882) by 0x80C2417: Perl_yyparse (perly.y:1204) by 0x8086637: S_parse_body (perl.c:2377) by 0x808578B: perl_parse (perl.c:1692) by 0x8061751: main (perlmain.c:121) $ perl -V Summary of my perl5 (revision 5 version 26 subversion 0) configuration: Commit id: 95388f2eb27e74cdbfb715c0097f16aeba4e6e4e Platform: osname=linux osvers=3.16.0-4-686-pae archname=i686-linux uname='linux dirac 3.16.0-4-686-pae #1 smp debian 3.16.43-2 (2017-04-30) i686 gnulinux ' config_args='-Dcc=gcc -d' hint=previous useposix=true d_sigaction=define useithreads=undef usemultiplicity=undef use64bitint=undef use64bitall=undef uselongdouble=undef usemymalloc=n default_inc_excludes_dot=define bincompat5005=undef Compiler: cc='gcc' ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2' optimize='-g' cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include' ccversion='' gccversion='4.9.2' gccosandvers='' intsize=4 longsize=4 ptrsize=4 doublesize=8 byteorder=1234 doublekind=3 d_longlong=define longlongsize=8 d_longdbl=define longdblsize=12 longdblkind=3 ivtype='long' ivsize=4 nvtype='double' nvsize=8 Off_t='off_t' lseeksize=8 alignbytes=4 prototype=define Linker and Libraries: ld='cc' ldflags =' -fstack-protector-strong -L/usr/local/lib' libpth=/usr/local/lib /usr/lib/gcc/i586-linux-gnu/4.9/include-fixed /usr/include/i386-linux-gnu /usr/lib /lib/i386-linux-gnu /lib/../lib /usr/lib/i386-linux-gnu /usr/lib/../lib /lib /usr/local/lib /usr/lib/gcc/i586-linux-gnu/4.9/include-fixed /usr/include/i386-linux-gnu /usr/lib /usr/local/lib /usr/lib/gcc/i586-linux-gnu/4.9/include-fixed /usr/include/i386-linux-gnu /usr/lib /usr/local/lib /usr/lib/gcc/i586-linux-gnu/4.9/include-fixed /usr/include/i386-linux-gnu /usr/lib /usr/local/lib /usr/lib/gcc/i586-linux-gnu/4.9/include-fixed /usr/include/i386-linux-gnu /usr/lib libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc libc=libc-2.19.so so=so useshrplib=false libperl=libperl.a gnulibc_version='2.19' Dynamic Linking: dlsrc=dl_dlopen.xs dlext=so d_dlsymun=undef ccdlflags='-Wl,-E' cccdlflags='-fPIC' lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong' Characteristics of this binary (from libperl): Compile-time options: HAS_TIMES PERLIO_LAYERS PERL_COPY_ON_WRITE PERL_DONT_CREATE_GVSV PERL_HASH_FUNC_ONE_AT_A_TIME_HARD PERL_MALLOC_WRAP PERL_OP_PARENT PERL_PRESERVE_IVUV USE_LARGE_FILES USE_LOCALE USE_LOCALE_COLLATE USE_LOCALE_CTYPE USE_LOCALE_NUMERIC USE_LOCALE_TIME USE_PERLIO USE_PERL_ATOF Built under linux Compiled at Jun 21 2017 19:00:20 @INC: /usr/local/lib/perl5/site_perl/5.26.0/i686-linux /usr/local/lib/perl5/site_perl/5.26.0 /usr/local/lib/perl5/5.26.0/i686-linux /usr/local/lib/perl5/5.26.0 -- Jakub Wilk
Subject: Re: [perl #131649] Heap-based buffer overflow in S_regatom
To: "perl5-security-report [...] perl.org" <perl5-security-report [...] perl.org>, karl williamson <public [...] khwilliamson.com>
CC: "bugs-bitbucket [...] rt.perl.org" <bugs-bitbucket [...] rt.perl.org>
Date: Mon, 26 Jun 2017 13:42:57 +0200
From: demerphq <demerphq [...] gmail.com>
Download (untitled) / with headers
text/plain 5.9k
I have a patch for this in yves/fix_131649 The extended charclass parser is a bit unusual in that it uses quite different and separate code for the sizing pass and the actual code generation pass, and the difference in the parsing rules means that things would cause an assertion failure in the second pass are not caught as errors in the first pass. My patch cleans up some of this logic and adds some trace, but it needs some review by Karl. Personally I would expect the first pass to use the same parsing rules as the second. I also was surprised to find that (?s:(?[[x]])) is allowed inside an extended regex, but (?[ [x] ]) is not. I understand the cause of the discrepancy, but it seems undesirable. Cheers, Yves On 25 June 2017 at 04:40, Jakub Wilk <perl5-security-report@perl.org> wrote: Show quoted text
> # New Ticket Created by Jakub Wilk > # Please include the string: [perl #131649] > # in the subject line of all future correspondence about this issue. > # <URL: https://rt.perl.org/Ticket/Display.html?id=131649 > > > > Consider the following Perl program: > > m/(?[(?s:(?[[x]][xx]xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx])/; > __END__ > > This is what happens when I run it: > > The regex_sets feature is experimental in regex; marked by <-- HERE in m/(?[ <-- HERE (?s:(?[[x]][xx]xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx])/ at test line 1. > The regex_sets feature is experimental in regex; marked by <-- HERE in m/(?[(?s:(?[ <-- HERE [x]][xx]xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx])/ at test line 1. > panic: reg_node overrun trying to emit 0, 8677794>=8677790 at test line 1. > > Valgrind says it's a heap-based buffer overflow: > > Invalid write of size 1 > at 0x80EE11B: S_regatom (regcomp.c:13490) > by 0x80E6A9D: S_regpiece (regcomp.c:11669) > by 0x80E6872: S_regbranch (regcomp.c:11594) > by 0x80E58D4: S_reg (regcomp.c:11332) > by 0x80D85AF: Perl_re_op_compile (regcomp.c:7309) > by 0x806D7DF: Perl_pmruntime (op.c:5882) > by 0x80C2417: Perl_yyparse (perly.y:1204) > by 0x8086637: S_parse_body (perl.c:2377) > by 0x808578B: perl_parse (perl.c:1692) > by 0x8061751: main (perlmain.c:121) > Address 0x428e698 is 0 bytes after a block of size 160 alloc'd > at 0x402A1CC: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so) > by 0x81082BC: Perl_safesysmalloc (util.c:153) > by 0x80D7E43: Perl_re_op_compile (regcomp.c:7155) > by 0x806D7DF: Perl_pmruntime (op.c:5882) > by 0x80C2417: Perl_yyparse (perly.y:1204) > by 0x8086637: S_parse_body (perl.c:2377) > by 0x808578B: perl_parse (perl.c:1692) > by 0x8061751: main (perlmain.c:121) > > $ perl -V > Summary of my perl5 (revision 5 version 26 subversion 0) configuration: > Commit id: 95388f2eb27e74cdbfb715c0097f16aeba4e6e4e > Platform: > osname=linux > osvers=3.16.0-4-686-pae > archname=i686-linux > uname='linux dirac 3.16.0-4-686-pae #1 smp debian 3.16.43-2 (2017-04-30) i686 gnulinux ' > config_args='-Dcc=gcc -d' > hint=previous > useposix=true > d_sigaction=define > useithreads=undef > usemultiplicity=undef > use64bitint=undef > use64bitall=undef > uselongdouble=undef > usemymalloc=n > default_inc_excludes_dot=define > bincompat5005=undef > Compiler: > cc='gcc' > ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2' > optimize='-g' > cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include' > ccversion='' > gccversion='4.9.2' > gccosandvers='' > intsize=4 > longsize=4 > ptrsize=4 > doublesize=8 > byteorder=1234 > doublekind=3 > d_longlong=define > longlongsize=8 > d_longdbl=define > longdblsize=12 > longdblkind=3 > ivtype='long' > ivsize=4 > nvtype='double' > nvsize=8 > Off_t='off_t' > lseeksize=8 > alignbytes=4 > prototype=define > Linker and Libraries: > ld='cc' > ldflags =' -fstack-protector-strong -L/usr/local/lib' > libpth=/usr/local/lib /usr/lib/gcc/i586-linux-gnu/4.9/include-fixed /usr/include/i386-linux-gnu /usr/lib /lib/i386-linux-gnu /lib/../lib /usr/lib/i386-linux-gnu /usr/lib/../lib /lib /usr/local/lib /usr/lib/gcc/i586-linux-gnu/4.9/include-fixed /usr/include/i386-linux-gnu /usr/lib /usr/local/lib /usr/lib/gcc/i586-linux-gnu/4.9/include-fixed /usr/include/i386-linux-gnu /usr/lib /usr/local/lib /usr/lib/gcc/i586-linux-gnu/4.9/include-fixed /usr/include/i386-linux-gnu /usr/lib /usr/local/lib /usr/lib/gcc/i586-linux-gnu/4.9/include-fixed /usr/include/i386-linux-gnu /usr/lib > libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc > perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc > libc=libc-2.19.so > so=so > useshrplib=false > libperl=libperl.a > gnulibc_version='2.19' > Dynamic Linking: > dlsrc=dl_dlopen.xs > dlext=so > d_dlsymun=undef > ccdlflags='-Wl,-E' > cccdlflags='-fPIC' > lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong' > > > Characteristics of this binary (from libperl): > Compile-time options: > HAS_TIMES > PERLIO_LAYERS > PERL_COPY_ON_WRITE > PERL_DONT_CREATE_GVSV > PERL_HASH_FUNC_ONE_AT_A_TIME_HARD > PERL_MALLOC_WRAP > PERL_OP_PARENT > PERL_PRESERVE_IVUV > USE_LARGE_FILES > USE_LOCALE > USE_LOCALE_COLLATE > USE_LOCALE_CTYPE > USE_LOCALE_NUMERIC > USE_LOCALE_TIME > USE_PERLIO > USE_PERL_ATOF > Built under linux > Compiled at Jun 21 2017 19:00:20 > @INC: > /usr/local/lib/perl5/site_perl/5.26.0/i686-linux > /usr/local/lib/perl5/site_perl/5.26.0 > /usr/local/lib/perl5/5.26.0/i686-linux > /usr/local/lib/perl5/5.26.0 > > -- > Jakub Wilk >
-- perl -Mre=debug -e "/just|another|perl|hacker/"
Subject: Re: [perl #131649] Heap-based buffer overflow in S_regatom
To: demerphq <demerphq [...] gmail.com>, "perl5-security-report [...] perl.org" <perl5-security-report [...] perl.org>
CC: "bugs-bitbucket [...] rt.perl.org" <bugs-bitbucket [...] rt.perl.org>
From: Karl Williamson <public [...] khwilliamson.com>
Date: Sat, 1 Jul 2017 22:03:36 -0600
Download (untitled) / with headers
text/plain 6.6k
On 06/26/2017 05:42 AM, demerphq wrote: Show quoted text
> I have a patch for this in yves/fix_131649 > > The extended charclass parser is a bit unusual in that it uses quite > different and separate code for the sizing pass and the actual code > generation pass, and the difference in the parsing rules means that > things would cause an assertion failure in the second pass are not > caught as errors in the first pass. > > My patch cleans up some of this logic and adds some trace, but it > needs some review by Karl. > > Personally I would expect the first pass to use the same parsing rules > as the second. I also was surprised to find that (?s:(?[[x]])) is > allowed inside an extended regex, but (?[ [x] ]) is not. I understand > the cause of the discrepancy, but it seems undesirable.
Maybe we should bite the bullet and fold the passes together, like the rest of regcomp. It looks like it was my naivete showing up in the initial design. I would have to refamiliarize myself with the code to be able to carefully evaluate your patch (that wouldn't take too long), but I suspect that there are other glitches awaiting if we don't fold the passes together. That makes it easy to keep them in sync as well. Show quoted text
> > Cheers, > Yves > > On 25 June 2017 at 04:40, Jakub Wilk <perl5-security-report@perl.org> wrote:
>> # New Ticket Created by Jakub Wilk >> # Please include the string: [perl #131649] >> # in the subject line of all future correspondence about this issue. >> # <URL: https://rt.perl.org/Ticket/Display.html?id=131649 > >> >> >> Consider the following Perl program: >> >> m/(?[(?s:(?[[x]][xx]xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx])/; >> __END__ >> >> This is what happens when I run it: >> >> The regex_sets feature is experimental in regex; marked by <-- HERE in m/(?[ <-- HERE (?s:(?[[x]][xx]xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx])/ at test line 1. >> The regex_sets feature is experimental in regex; marked by <-- HERE in m/(?[(?s:(?[ <-- HERE [x]][xx]xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx])/ at test line 1. >> panic: reg_node overrun trying to emit 0, 8677794>=8677790 at test line 1. >> >> Valgrind says it's a heap-based buffer overflow: >> >> Invalid write of size 1 >> at 0x80EE11B: S_regatom (regcomp.c:13490) >> by 0x80E6A9D: S_regpiece (regcomp.c:11669) >> by 0x80E6872: S_regbranch (regcomp.c:11594) >> by 0x80E58D4: S_reg (regcomp.c:11332) >> by 0x80D85AF: Perl_re_op_compile (regcomp.c:7309) >> by 0x806D7DF: Perl_pmruntime (op.c:5882) >> by 0x80C2417: Perl_yyparse (perly.y:1204) >> by 0x8086637: S_parse_body (perl.c:2377) >> by 0x808578B: perl_parse (perl.c:1692) >> by 0x8061751: main (perlmain.c:121) >> Address 0x428e698 is 0 bytes after a block of size 160 alloc'd >> at 0x402A1CC: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so) >> by 0x81082BC: Perl_safesysmalloc (util.c:153) >> by 0x80D7E43: Perl_re_op_compile (regcomp.c:7155) >> by 0x806D7DF: Perl_pmruntime (op.c:5882) >> by 0x80C2417: Perl_yyparse (perly.y:1204) >> by 0x8086637: S_parse_body (perl.c:2377) >> by 0x808578B: perl_parse (perl.c:1692) >> by 0x8061751: main (perlmain.c:121) >> >> $ perl -V >> Summary of my perl5 (revision 5 version 26 subversion 0) configuration: >> Commit id: 95388f2eb27e74cdbfb715c0097f16aeba4e6e4e >> Platform: >> osname=linux >> osvers=3.16.0-4-686-pae >> archname=i686-linux >> uname='linux dirac 3.16.0-4-686-pae #1 smp debian 3.16.43-2 (2017-04-30) i686 gnulinux ' >> config_args='-Dcc=gcc -d' >> hint=previous >> useposix=true >> d_sigaction=define >> useithreads=undef >> usemultiplicity=undef >> use64bitint=undef >> use64bitall=undef >> uselongdouble=undef >> usemymalloc=n >> default_inc_excludes_dot=define >> bincompat5005=undef >> Compiler: >> cc='gcc' >> ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2' >> optimize='-g' >> cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include' >> ccversion='' >> gccversion='4.9.2' >> gccosandvers='' >> intsize=4 >> longsize=4 >> ptrsize=4 >> doublesize=8 >> byteorder=1234 >> doublekind=3 >> d_longlong=define >> longlongsize=8 >> d_longdbl=define >> longdblsize=12 >> longdblkind=3 >> ivtype='long' >> ivsize=4 >> nvtype='double' >> nvsize=8 >> Off_t='off_t' >> lseeksize=8 >> alignbytes=4 >> prototype=define >> Linker and Libraries: >> ld='cc' >> ldflags =' -fstack-protector-strong -L/usr/local/lib' >> libpth=/usr/local/lib /usr/lib/gcc/i586-linux-gnu/4.9/include-fixed /usr/include/i386-linux-gnu /usr/lib /lib/i386-linux-gnu /lib/../lib /usr/lib/i386-linux-gnu /usr/lib/../lib /lib /usr/local/lib /usr/lib/gcc/i586-linux-gnu/4.9/include-fixed /usr/include/i386-linux-gnu /usr/lib /usr/local/lib /usr/lib/gcc/i586-linux-gnu/4.9/include-fixed /usr/include/i386-linux-gnu /usr/lib /usr/local/lib /usr/lib/gcc/i586-linux-gnu/4.9/include-fixed /usr/include/i386-linux-gnu /usr/lib /usr/local/lib /usr/lib/gcc/i586-linux-gnu/4.9/include-fixed /usr/include/i386-linux-gnu /usr/lib >> libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc >> perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc >> libc=libc-2.19.so >> so=so >> useshrplib=false >> libperl=libperl.a >> gnulibc_version='2.19' >> Dynamic Linking: >> dlsrc=dl_dlopen.xs >> dlext=so >> d_dlsymun=undef >> ccdlflags='-Wl,-E' >> cccdlflags='-fPIC' >> lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong' >> >> >> Characteristics of this binary (from libperl): >> Compile-time options: >> HAS_TIMES >> PERLIO_LAYERS >> PERL_COPY_ON_WRITE >> PERL_DONT_CREATE_GVSV >> PERL_HASH_FUNC_ONE_AT_A_TIME_HARD >> PERL_MALLOC_WRAP >> PERL_OP_PARENT >> PERL_PRESERVE_IVUV >> USE_LARGE_FILES >> USE_LOCALE >> USE_LOCALE_COLLATE >> USE_LOCALE_CTYPE >> USE_LOCALE_NUMERIC >> USE_LOCALE_TIME >> USE_PERLIO >> USE_PERL_ATOF >> Built under linux >> Compiled at Jun 21 2017 19:00:20 >> @INC: >> /usr/local/lib/perl5/site_perl/5.26.0/i686-linux >> /usr/local/lib/perl5/site_perl/5.26.0 >> /usr/local/lib/perl5/5.26.0/i686-linux >> /usr/local/lib/perl5/5.26.0 >> >> -- >> Jakub Wilk >>
> > >
To: Karl Williamson <public [...] khwilliamson.com>
Subject: Re: [perl #131649] Heap-based buffer overflow in S_regatom
CC: "perl5-security-report [...] perl.org" <perl5-security-report [...] perl.org>, "bugs-bitbucket [...] rt.perl.org" <bugs-bitbucket [...] rt.perl.org>
From: demerphq <demerphq [...] gmail.com>
Date: Sun, 2 Jul 2017 12:17:39 +0200
Download (untitled) / with headers
text/plain 2.6k
On 2 July 2017 at 06:03, Karl Williamson <public@khwilliamson.com> wrote: Show quoted text
> On 06/26/2017 05:42 AM, demerphq wrote:
>> >> I have a patch for this in yves/fix_131649 >> >> The extended charclass parser is a bit unusual in that it uses quite >> different and separate code for the sizing pass and the actual code >> generation pass, and the difference in the parsing rules means that >> things would cause an assertion failure in the second pass are not >> caught as errors in the first pass. >> >> My patch cleans up some of this logic and adds some trace, but it >> needs some review by Karl. >> >> Personally I would expect the first pass to use the same parsing rules >> as the second. I also was surprised to find that (?s:(?[[x]])) is >> allowed inside an extended regex, but (?[ [x] ]) is not. I understand >> the cause of the discrepancy, but it seems undesirable.
> > > Maybe we should bite the bullet and fold the passes together, like the rest > of regcomp. It looks like it was my naivete showing up in the initial > design. I would have to refamiliarize myself with the code to be able to > carefully evaluate your patch (that wouldn't take too long), but I suspect > that there are other glitches awaiting if we don't fold the passes together. > That makes it easy to keep them in sync as well.
I think we should formally define the grammar before we touch the code. Once we have a formal definition the rest should flow easily. We have something like the following unreduced grammar. extended_regexp: '(?[' eclass_expr '])' binop: '+' | '-' | ... simple_charclass_def: .... escape_class: '\\s' | ... eclass_expr: escape_class | '[' simple_charclass_def ']' | '(' eclass_expr ')' | eclass_expr op exclass_expr | '!' eclass_expr | '(?' modifiers ':' extended_regexp ')' Personally I think it is odd we only allow extended regexps to embedded in one another when they are wrapped in '(?...:)' style brackets that have modifiers. To fix that I think we should add | extended_regexp to the eclass_expr production. In other words /(?[ (?[ (?[ (?[ [x] ]) ]) ]) ])/ should be legal. We shouldn't have to write /(?s:(?[ (?s:(?[ (?s:(?[ (?s:(?[ [x] ])) ])) ])) ]))/ for instance, and if we allow /(?[ (?s:(?[ ... ])) ])/ we should allow /(?[ (?:(?[ ... ])) ])/ and /(?[ (?: (?[ ... ]) ) ])/ and etc... (note the whitespace) I think once we have a proper definition of the grammar allowed inside of an extended regex it should be easy to fix the code or prove it correct. Yves -- perl -Mre=debug -e "/just|another|perl|hacker/"
CC: Karl Williamson <public [...] khwilliamson.com>, "perl5-security-report [...] perl.org" <perl5-security-report [...] perl.org>, "bugs-bitbucket [...] rt.perl.org" <bugs-bitbucket [...] rt.perl.org>
To: demerphq <demerphq [...] gmail.com>
Date: Wed, 29 Nov 2017 08:58:23 +0000
From: Dave Mitchell <davem [...] iabyn.com>
Subject: Re: [perl #131649] Heap-based buffer overflow in S_regatom
Download (untitled) / with headers
text/plain 2.8k
On Sun, Jul 02, 2017 at 12:17:39PM +0200, demerphq wrote: Show quoted text
> On 2 July 2017 at 06:03, Karl Williamson <public@khwilliamson.com> wrote:
> > On 06/26/2017 05:42 AM, demerphq wrote:
> >> > >> I have a patch for this in yves/fix_131649 > >> > >> The extended charclass parser is a bit unusual in that it uses quite > >> different and separate code for the sizing pass and the actual code > >> generation pass, and the difference in the parsing rules means that > >> things would cause an assertion failure in the second pass are not > >> caught as errors in the first pass. > >> > >> My patch cleans up some of this logic and adds some trace, but it > >> needs some review by Karl. > >> > >> Personally I would expect the first pass to use the same parsing rules > >> as the second. I also was surprised to find that (?s:(?[[x]])) is > >> allowed inside an extended regex, but (?[ [x] ]) is not. I understand > >> the cause of the discrepancy, but it seems undesirable.
> > > > > > Maybe we should bite the bullet and fold the passes together, like the rest > > of regcomp. It looks like it was my naivete showing up in the initial > > design. I would have to refamiliarize myself with the code to be able to > > carefully evaluate your patch (that wouldn't take too long), but I suspect > > that there are other glitches awaiting if we don't fold the passes together. > > That makes it easy to keep them in sync as well.
> > I think we should formally define the grammar before we touch the > code. Once we have a formal definition the rest should flow easily. We > have something like the following unreduced grammar. > > extended_regexp: '(?[' eclass_expr '])' > binop: '+' | '-' | ... > simple_charclass_def: .... > escape_class: '\\s' | ... > eclass_expr: escape_class > | '[' simple_charclass_def ']' > | '(' eclass_expr ')' > | eclass_expr op exclass_expr > | '!' eclass_expr > | '(?' modifiers ':' extended_regexp ')' > > Personally I think it is odd we only allow extended regexps to > embedded in one another when they are wrapped in '(?...:)' style > brackets that have modifiers. To fix that I think we should add > > | extended_regexp > > to the eclass_expr production. In other words > > /(?[ (?[ (?[ (?[ [x] ]) ]) ]) ])/ > > should be legal. We shouldn't have to write > > /(?s:(?[ (?s:(?[ (?s:(?[ (?s:(?[ [x] ])) ])) ])) ]))/ > > for instance, and if we allow > > /(?[ (?s:(?[ ... ])) ])/ > > we should allow > > /(?[ (?:(?[ ... ])) ])/ > > and > > /(?[ (?: (?[ ... ]) ) ])/ > > and etc... (note the whitespace) > > I think once we have a proper definition of the grammar allowed inside > of an extended regex it should be easy to fix the code or prove it > correct.
Has there been any progress on this issue? -- Justice is when you get what you deserve. Law is when you get what you pay for.
Date: Wed, 29 Nov 2017 10:46:57 +0100
CC: Karl Williamson <public [...] khwilliamson.com>, "perl5-security-report [...] perl.org" <perl5-security-report [...] perl.org>, "bugs-bitbucket [...] rt.perl.org" <bugs-bitbucket [...] rt.perl.org>
To: Dave Mitchell <davem [...] iabyn.com>
Subject: Re: [perl #131649] Heap-based buffer overflow in S_regatom
From: demerphq <demerphq [...] gmail.com>
On 29 November 2017 at 09:58, Dave Mitchell <davem@iabyn.com> wrote: Show quoted text
> On Sun, Jul 02, 2017 at 12:17:39PM +0200, demerphq wrote:
>> On 2 July 2017 at 06:03, Karl Williamson <public@khwilliamson.com> wrote:
>> > On 06/26/2017 05:42 AM, demerphq wrote:
>> >> >> >> I have a patch for this in yves/fix_131649 >> >> >> >> The extended charclass parser is a bit unusual in that it uses quite >> >> different and separate code for the sizing pass and the actual code >> >> generation pass, and the difference in the parsing rules means that >> >> things would cause an assertion failure in the second pass are not >> >> caught as errors in the first pass. >> >> >> >> My patch cleans up some of this logic and adds some trace, but it >> >> needs some review by Karl. >> >> >> >> Personally I would expect the first pass to use the same parsing rules >> >> as the second. I also was surprised to find that (?s:(?[[x]])) is >> >> allowed inside an extended regex, but (?[ [x] ]) is not. I understand >> >> the cause of the discrepancy, but it seems undesirable.
>> > >> > >> > Maybe we should bite the bullet and fold the passes together, like the rest >> > of regcomp. It looks like it was my naivete showing up in the initial >> > design. I would have to refamiliarize myself with the code to be able to >> > carefully evaluate your patch (that wouldn't take too long), but I suspect >> > that there are other glitches awaiting if we don't fold the passes together. >> > That makes it easy to keep them in sync as well.
>> >> I think we should formally define the grammar before we touch the >> code. Once we have a formal definition the rest should flow easily. We >> have something like the following unreduced grammar. >> >> extended_regexp: '(?[' eclass_expr '])' >> binop: '+' | '-' | ... >> simple_charclass_def: .... >> escape_class: '\\s' | ... >> eclass_expr: escape_class >> | '[' simple_charclass_def ']' >> | '(' eclass_expr ')' >> | eclass_expr op exclass_expr >> | '!' eclass_expr >> | '(?' modifiers ':' extended_regexp ')' >> >> Personally I think it is odd we only allow extended regexps to >> embedded in one another when they are wrapped in '(?...:)' style >> brackets that have modifiers. To fix that I think we should add >> >> | extended_regexp >> >> to the eclass_expr production. In other words >> >> /(?[ (?[ (?[ (?[ [x] ]) ]) ]) ])/ >> >> should be legal. We shouldn't have to write >> >> /(?s:(?[ (?s:(?[ (?s:(?[ (?s:(?[ [x] ])) ])) ])) ]))/ >> >> for instance, and if we allow >> >> /(?[ (?s:(?[ ... ])) ])/ >> >> we should allow >> >> /(?[ (?:(?[ ... ])) ])/ >> >> and >> >> /(?[ (?: (?[ ... ]) ) ])/ >> >> and etc... (note the whitespace) >> >> I think once we have a proper definition of the grammar allowed inside >> of an extended regex it should be easy to fix the code or prove it >> correct.
> > Has there been any progress on this issue?
I am waiting on Karl to reply. As far as I know my patch does fix the issue. Yves -- perl -Mre=debug -e "/just|another|perl|hacker/"
To: demerphq <demerphq [...] gmail.com>, Dave Mitchell <davem [...] iabyn.com>
CC: "perl5-security-report [...] perl.org" <perl5-security-report [...] perl.org>, "bugs-bitbucket [...] rt.perl.org" <bugs-bitbucket [...] rt.perl.org>
Date: Wed, 29 Nov 2017 11:53:21 -0700
From: Karl Williamson <public [...] khwilliamson.com>
Subject: Re: [perl #131649] Heap-based buffer overflow in S_regatom
Download (untitled) / with headers
text/plain 3.1k
On 11/29/2017 02:46 AM, demerphq wrote: Show quoted text
> On 29 November 2017 at 09:58, Dave Mitchell <davem@iabyn.com> wrote:
>> On Sun, Jul 02, 2017 at 12:17:39PM +0200, demerphq wrote:
>>> On 2 July 2017 at 06:03, Karl Williamson <public@khwilliamson.com> wrote:
>>>> On 06/26/2017 05:42 AM, demerphq wrote:
>>>>> >>>>> I have a patch for this in yves/fix_131649 >>>>> >>>>> The extended charclass parser is a bit unusual in that it uses quite >>>>> different and separate code for the sizing pass and the actual code >>>>> generation pass, and the difference in the parsing rules means that >>>>> things would cause an assertion failure in the second pass are not >>>>> caught as errors in the first pass. >>>>> >>>>> My patch cleans up some of this logic and adds some trace, but it >>>>> needs some review by Karl. >>>>> >>>>> Personally I would expect the first pass to use the same parsing rules >>>>> as the second. I also was surprised to find that (?s:(?[[x]])) is >>>>> allowed inside an extended regex, but (?[ [x] ]) is not. I understand >>>>> the cause of the discrepancy, but it seems undesirable.
>>>> >>>> >>>> Maybe we should bite the bullet and fold the passes together, like the rest >>>> of regcomp. It looks like it was my naivete showing up in the initial >>>> design. I would have to refamiliarize myself with the code to be able to >>>> carefully evaluate your patch (that wouldn't take too long), but I suspect >>>> that there are other glitches awaiting if we don't fold the passes together. >>>> That makes it easy to keep them in sync as well.
>>> >>> I think we should formally define the grammar before we touch the >>> code. Once we have a formal definition the rest should flow easily. We >>> have something like the following unreduced grammar. >>> >>> extended_regexp: '(?[' eclass_expr '])' >>> binop: '+' | '-' | ... >>> simple_charclass_def: .... >>> escape_class: '\\s' | ... >>> eclass_expr: escape_class >>> | '[' simple_charclass_def ']' >>> | '(' eclass_expr ')' >>> | eclass_expr op exclass_expr >>> | '!' eclass_expr >>> | '(?' modifiers ':' extended_regexp ')' >>> >>> Personally I think it is odd we only allow extended regexps to >>> embedded in one another when they are wrapped in '(?...:)' style >>> brackets that have modifiers. To fix that I think we should add >>> >>> | extended_regexp >>> >>> to the eclass_expr production. In other words >>> >>> /(?[ (?[ (?[ (?[ [x] ]) ]) ]) ])/ >>> >>> should be legal. We shouldn't have to write >>> >>> /(?s:(?[ (?s:(?[ (?s:(?[ (?s:(?[ [x] ])) ])) ])) ]))/ >>> >>> for instance, and if we allow >>> >>> /(?[ (?s:(?[ ... ])) ])/ >>> >>> we should allow >>> >>> /(?[ (?:(?[ ... ])) ])/ >>> >>> and >>> >>> /(?[ (?: (?[ ... ]) ) ])/ >>> >>> and etc... (note the whitespace) >>> >>> I think once we have a proper definition of the grammar allowed inside >>> of an extended regex it should be easy to fix the code or prove it >>> correct.
>> >> Has there been any progress on this issue?
> > I am waiting on Karl to reply. As far as I know my patch does fix the issue. > > Yves >
I think his patch solves the issue, but my writing a grammar comes after the code freeze
RT-Send-CC: perl5-security-report [...] perl.org
Download (untitled) / with headers
text/plain 3.9k
On Wed, 29 Nov 2017 10:53:43 -0800, public@khwilliamson.com wrote: Show quoted text
> On 11/29/2017 02:46 AM, demerphq wrote:
> > On 29 November 2017 at 09:58, Dave Mitchell <davem@iabyn.com> wrote:
> >> On Sun, Jul 02, 2017 at 12:17:39PM +0200, demerphq wrote:
> >>> On 2 July 2017 at 06:03, Karl Williamson <public@khwilliamson.com> > >>> wrote:
> >>>> On 06/26/2017 05:42 AM, demerphq wrote:
> >>>>> > >>>>> I have a patch for this in yves/fix_131649 > >>>>> > >>>>> The extended charclass parser is a bit unusual in that it uses > >>>>> quite > >>>>> different and separate code for the sizing pass and the actual > >>>>> code > >>>>> generation pass, and the difference in the parsing rules means > >>>>> that > >>>>> things would cause an assertion failure in the second pass are > >>>>> not > >>>>> caught as errors in the first pass. > >>>>> > >>>>> My patch cleans up some of this logic and adds some trace, but it > >>>>> needs some review by Karl. > >>>>> > >>>>> Personally I would expect the first pass to use the same parsing > >>>>> rules > >>>>> as the second. I also was surprised to find that (?s:(?[[x]])) is > >>>>> allowed inside an extended regex, but (?[ [x] ]) is not. I > >>>>> understand > >>>>> the cause of the discrepancy, but it seems undesirable.
> >>>> > >>>> > >>>> Maybe we should bite the bullet and fold the passes together, like > >>>> the rest > >>>> of regcomp. It looks like it was my naivete showing up in the > >>>> initial > >>>> design. I would have to refamiliarize myself with the code to be > >>>> able to > >>>> carefully evaluate your patch (that wouldn't take too long), but I > >>>> suspect > >>>> that there are other glitches awaiting if we don't fold the passes > >>>> together. > >>>> That makes it easy to keep them in sync as well.
> >>> > >>> I think we should formally define the grammar before we touch the > >>> code. Once we have a formal definition the rest should flow easily. > >>> We > >>> have something like the following unreduced grammar. > >>> > >>> extended_regexp: '(?[' eclass_expr '])' > >>> binop: '+' | '-' | ... > >>> simple_charclass_def: .... > >>> escape_class: '\\s' | ... > >>> eclass_expr: escape_class > >>> | '[' simple_charclass_def ']' > >>> | '(' eclass_expr ')' > >>> | eclass_expr op exclass_expr > >>> | '!' eclass_expr > >>> | '(?' modifiers ':' extended_regexp ')' > >>> > >>> Personally I think it is odd we only allow extended regexps to > >>> embedded in one another when they are wrapped in '(?...:)' style > >>> brackets that have modifiers. To fix that I think we should add > >>> > >>> | extended_regexp > >>> > >>> to the eclass_expr production. In other words > >>> > >>> /(?[ (?[ (?[ (?[ [x] ]) ]) ]) ])/ > >>> > >>> should be legal. We shouldn't have to write > >>> > >>> /(?s:(?[ (?s:(?[ (?s:(?[ (?s:(?[ [x] ])) ])) ])) ]))/ > >>> > >>> for instance, and if we allow > >>> > >>> /(?[ (?s:(?[ ... ])) ])/ > >>> > >>> we should allow > >>> > >>> /(?[ (?:(?[ ... ])) ])/ > >>> > >>> and > >>> > >>> /(?[ (?: (?[ ... ]) ) ])/ > >>> > >>> and etc... (note the whitespace) > >>> > >>> I think once we have a proper definition of the grammar allowed > >>> inside > >>> of an extended regex it should be easy to fix the code or prove it > >>> correct.
> >> > >> Has there been any progress on this issue?
> > > > I am waiting on Karl to reply. As far as I know my patch does fix the > > issue. > > > > Yves > >
> > I think his patch solves the issue, but my writing a grammar comes > after > the code freeze
If I understand the issue, compiling an attacker controlled regexp can overflow a heap allocated buffer, making it a security issue. I suspect, given it's a regexp compilation that an attacker has a lot of control over the data written beyond the end of the buffer.[1] Tony [1] which doesn't matter much, given attacks like the one you linked https://daniel.haxx.se/blog/2016/10/14/a-single-byte-write-opened-a-root-execution-exploit/
RT-Send-CC: perl5-security-report [...] perl.org
Download (untitled) / with headers
text/plain 381b
On Wed, 29 Nov 2017 10:53:43 -0800, public@khwilliamson.com wrote: Show quoted text
> On 11/29/2017 02:46 AM, demerphq wrote:
> > I am waiting on Karl to reply. As far as I know my patch does fix the > > issue. > > > > Yves > >
> > I think his patch solves the issue, but my writing a grammar comes > after > the code freeze
Which patch? I don't see a yves/fix_131649 branch anywhere. Tony Tony
RT-Send-CC: perl5-security-report [...] perl.org
Download (untitled) / with headers
text/plain 143b
I plan to request a CVE ID for this issue in the next couple of days. If anyone has already requested an ID, please let me know. Thanks, Tony
RT-Send-CC: perl5-security-report [...] perl.org
Download (untitled) / with headers
text/plain 648b
On Wed, 19 Sep 2018 17:37:48 -0700, tonyc wrote: Show quoted text
> On Wed, 29 Nov 2017 10:53:43 -0800, public@khwilliamson.com wrote:
> > On 11/29/2017 02:46 AM, demerphq wrote:
> > > I am waiting on Karl to reply. As far as I know my patch does fix the > > > issue. > > > > > > Yves > > >
> > > > I think his patch solves the issue, but my writing a grammar comes > > after > > the code freeze
> > Which patch? > > I don't see a yves/fix_131649 branch anywhere.
It turns out this was fixed in 19a498a461d7c81ae3507c450953d1148efecf4f, which is included in 5.28. This doesn't cherry-pick cleanly to 5.26, the attached is a merged cherry-pick of the fix. Tony
Subject: 0001-5.26-fix-131649-extended-charclass-can-trigger-assert.patch
From 926efad7572e351b649af3139c4ca53abe20f19b Mon Sep 17 00:00:00 2001 From: Yves Orton <demerphq@gmail.com> Date: Mon, 26 Jun 2017 13:19:55 +0200 Subject: fix #131649 - extended charclass can trigger assert The extended charclass parser makes some assumptions during the first pass which are only true on well structured input, and it does not properly catch various errors. later on the code assumes that things the first pass will let through are valid, when in fact they should trigger errors. (cherry picked from commit 19a498a461d7c81ae3507c450953d1148efecf4f) --- pod/perldiag.pod | 27 ++++++++++++++++++++++++++- pod/perlrecharclass.pod | 4 ++-- regcomp.c | 28 ++++++++++++++++++---------- t/lib/warnings/regcomp | 6 +++--- t/re/reg_mesg.t | 29 ++++++++++++++++------------- t/re/regex_sets.t | 6 +++--- 6 files changed, 68 insertions(+), 32 deletions(-) diff --git a/pod/perldiag.pod b/pod/perldiag.pod index 106fe41121..c29925a2a4 100644 --- a/pod/perldiag.pod +++ b/pod/perldiag.pod @@ -5904,7 +5904,7 @@ yourself. a perl4 interpreter, especially if the next 2 tokens are "use strict" or "my $var" or "our $var". -=item Syntax error in (?[...]) in regex m/%s/ +=item Syntax error in (?[...]) in regex; marked by <-- HERE in m/%s/ (F) Perl could not figure out what you meant inside this construct; this notifies you that it is giving up trying. @@ -6402,6 +6402,31 @@ to find out why that isn't happening. (F) The unexec() routine failed for some reason. See your local FSF representative, who probably put it there in the first place. +=item Unexpected ']' with no following ')' in (?[... in regex; marked by <-- HERE in m/%s/ + +(F) While parsing an extended character class a ']' character was encountered +at a point in the definition where the only legal use of ']' is to close the +character class definition as part of a '])', you may have forgotten the close +paren, or otherwise confused the parser. + +=item Expecting close paren for nested extended charclass in regex; marked by <-- HERE in m/%s/ + +(F) While parsing a nested extended character class like: + + (?[ ... (?flags:(?[ ... ])) ... ]) + ^ + +we expected to see a close paren ')' (marked by ^) but did not. + +=item Expecting close paren for wrapper for nested extended charclass in regex; marked by <-- HERE in m/%s/ + +(F) While parsing a nested extended character class like: + + (?[ ... (?flags:(?[ ... ])) ... ]) + ^ + +we expected to see a close paren ')' (marked by ^) but did not. + =item Unexpected binary operator '%c' with no preceding operand in regex; marked by S<<-- HERE> in m/%s/ diff --git a/pod/perlrecharclass.pod b/pod/perlrecharclass.pod index 79480e4131..8c008507d1 100644 --- a/pod/perlrecharclass.pod +++ b/pod/perlrecharclass.pod @@ -1128,8 +1128,8 @@ hence both of the following work: Any contained POSIX character classes, including things like C<\w> and C<\D> respect the C<E<sol>a> (and C<E<sol>aa>) modifiers. -C<< (?[ ]) >> is a regex-compile-time construct. Any attempt to use -something which isn't knowable at the time the containing regular +Note that C<< (?[ ]) >> is a regex-compile-time construct. Any attempt +to use something which isn't knowable at the time the containing regular expression is compiled is a fatal error. In practice, this means just three limitations: diff --git a/regcomp.c b/regcomp.c index 2bee9d4460..5583b0a7bd 100644 --- a/regcomp.c +++ b/regcomp.c @@ -14823,8 +14823,9 @@ S_handle_regex_sets(pTHX_ RExC_state_t *pRExC_state, SV** return_invlist, TRUE /* Force /x */ ); switch (*RExC_parse) { - case '?': - if (RExC_parse[1] == '[') depth++, RExC_parse++; + case '(': + if (RExC_parse[1] == '?' && RExC_parse[2] == '[') + depth++, RExC_parse+=2; /* FALLTHROUGH */ default: break; @@ -14881,9 +14882,9 @@ S_handle_regex_sets(pTHX_ RExC_state_t *pRExC_state, SV** return_invlist, } case ']': - if (depth--) break; - RExC_parse++; - if (*RExC_parse == ')') { + if (RExC_parse[1] == ')') { + RExC_parse++; + if (depth--) break; node = reganode(pRExC_state, ANYOF, 0); RExC_size += ANYOF_SKIP; nextchar(pRExC_state); @@ -14895,20 +14896,25 @@ S_handle_regex_sets(pTHX_ RExC_state_t *pRExC_state, SV** return_invlist, return node; } - goto no_close; + /* We output the messages even if warnings are off, because we'll fail + * the very next thing, and these give a likely diagnosis for that */ + if (posix_warnings && av_tindex_skip_len_mg(posix_warnings) >= 0) { + output_or_return_posix_warnings(pRExC_state, posix_warnings, NULL); + } + RExC_parse++; + vFAIL("Unexpected ']' with no following ')' in (?[..."); } RExC_parse += UTF ? UTF8SKIP(RExC_parse) : 1; } - no_close: /* We output the messages even if warnings are off, because we'll fail * the very next thing, and these give a likely diagnosis for that */ if (posix_warnings && av_tindex_skip_len_mg(posix_warnings) >= 0) { output_or_return_posix_warnings(pRExC_state, posix_warnings, NULL); } - FAIL("Syntax error in (?[...])"); + vFAIL("Syntax error in (?[...])"); } /* Pass 2 only after this. */ @@ -15088,12 +15094,14 @@ redo_curchar: * inversion list, and RExC_parse points to the trailing * ']'; the next character should be the ')' */ RExC_parse++; - assert(UCHARAT(RExC_parse) == ')'); + if (UCHARAT(RExC_parse) != ')') + vFAIL("Expecting close paren for nested extended charclass"); /* Then the ')' matching the original '(' handled by this * case: statement */ RExC_parse++; - assert(UCHARAT(RExC_parse) == ')'); + if (UCHARAT(RExC_parse) != ')') + vFAIL("Expecting close paren for wrapper for nested extended charclass"); RExC_parse++; RExC_flags = save_flags; diff --git a/t/lib/warnings/regcomp b/t/lib/warnings/regcomp index 2b084c59b0..51ad57ccbe 100644 --- a/t/lib/warnings/regcomp +++ b/t/lib/warnings/regcomp @@ -59,21 +59,21 @@ Unmatched [ in regex; marked by <-- HERE in m/abc[ <-- HERE fi[.00./ at - line qr/(?[[[:word]]])/; EXPECT Assuming NOT a POSIX class since there is no terminating ':' in regex; marked by <-- HERE in m/(?[[[:word <-- HERE ]]])/ at - line 2. -syntax error in (?[...]) in regex m/(?[[[:word]]])/ at - line 2. +Unexpected ']' with no following ')' in (?[... in regex; marked by <-- HERE in m/(?[[[:word]] <-- HERE ])/ at - line 2. ######## # NAME qr/(?[ [[:digit: ])/ # OPTION fatal qr/(?[[[:digit: ])/; EXPECT Assuming NOT a POSIX class since no blanks are allowed in one in regex; marked by <-- HERE in m/(?[[[:digit: ] <-- HERE )/ at - line 2. -syntax error in (?[...]) in regex m/(?[[[:digit: ])/ at - line 2. +syntax error in (?[...]) in regex; marked by <-- HERE in m/(?[[[:digit: ]) <-- HERE / at - line 2. ######## # NAME qr/(?[ [:digit: ])/ # OPTION fatal qr/(?[[:digit: ])/ EXPECT Assuming NOT a POSIX class since no blanks are allowed in one in regex; marked by <-- HERE in m/(?[[:digit: ] <-- HERE )/ at - line 2. -syntax error in (?[...]) in regex m/(?[[:digit: ])/ at - line 2. +syntax error in (?[...]) in regex; marked by <-- HERE in m/(?[[:digit: ]) <-- HERE / at - line 2. ######## # NAME [perl #126141] # OPTION fatal diff --git a/t/re/reg_mesg.t b/t/re/reg_mesg.t index 39cfcf7df1..1b36d6df20 100644 --- a/t/re/reg_mesg.t +++ b/t/re/reg_mesg.t @@ -213,8 +213,9 @@ my @death = '/\b{gc}/' => "'gc' is an unknown bound type {#} m/\\b{gc{#}}/", '/\B{gc}/' => "'gc' is an unknown bound type {#} m/\\B{gc{#}}/", - '/(?[[[::]]])/' => "Syntax error in (?[...]) in regex m/(?[[[::]]])/", - '/(?[[[:w:]]])/' => "Syntax error in (?[...]) in regex m/(?[[[:w:]]])/", + + '/(?[[[::]]])/' => "Unexpected ']' with no following ')' in (?[... {#} m/(?[[[::]]{#}])/", + '/(?[[[:w:]]])/' => "Unexpected ']' with no following ')' in (?[... {#} m/(?[[[:w:]]{#}])/", '/(?[[:w:]])/' => "", '/([.].*)[.]/' => "", # [perl #127582] '/[.].*[.]/' => "", # [perl #127604] @@ -237,11 +238,12 @@ my @death = '/(?[ \p{foo} ])/' => 'Can\'t find Unicode property definition "foo" {#} m/(?[ \p{foo}{#} ])/', '/(?[ \p{ foo = bar } ])/' => 'Can\'t find Unicode property definition "foo = bar" {#} m/(?[ \p{ foo = bar }{#} ])/', '/(?[ \8 ])/' => 'Unrecognized escape \8 in character class {#} m/(?[ \8{#} ])/', - '/(?[ \t ]/' => 'Syntax error in (?[...]) in regex m/(?[ \t ]/', - '/(?[ [ \t ]/' => 'Syntax error in (?[...]) in regex m/(?[ [ \t ]/', - '/(?[ \t ] ]/' => 'Syntax error in (?[...]) in regex m/(?[ \t ] ]/', - '/(?[ [ ] ]/' => 'Syntax error in (?[...]) in regex m/(?[ [ ] ]/', - '/(?[ \t + \e # This was supposed to be a comment ])/' => 'Syntax error in (?[...]) in regex m/(?[ \t + \e # This was supposed to be a comment ])/', + '/(?[ \t ]/' => "Unexpected ']' with no following ')' in (?[... {#} m/(?[ \\t ]{#}/", + '/(?[ [ \t ]/' => "Syntax error in (?[...]) {#} m/(?[ [ \\t ]{#}/", + '/(?[ \t ] ]/' => "Unexpected ']' with no following ')' in (?[... {#} m/(?[ \\t ]{#} ]/", + '/(?[ [ ] ]/' => "Syntax error in (?[...]) {#} m/(?[ [ ] ]{#}/", + '/(?[ \t + \e # This was supposed to be a comment ])/' => + "Syntax error in (?[...]) {#} m/(?[ \\t + \\e # This was supposed to be a comment ]){#}/", '/(?[ ])/' => 'Incomplete expression within \'(?[ ])\' {#} m/(?[ {#}])/', 'm/(?[[a-\d]])/' => 'False [] range "a-\d" {#} m/(?[[a-\d{#}]])/', 'm/(?[[\w-x]])/' => 'False [] range "\w-" {#} m/(?[[\w-{#}x]])/', @@ -427,10 +429,10 @@ my @death_utf8 = mark_as_utf8( '/ネ\p{}ネ/' => 'Empty \p{} {#} m/ネ\p{{#}}ネ/', - '/ネ(?[[[:ネ]]])ネ/' => "Syntax error in (?[...]) in regex m/ネ(?[[[:ネ]]])ネ/", - '/ネ(?[[[:ネ: ])ネ/' => "Syntax error in (?[...]) in regex m/ネ(?[[[:ネ: ])ネ/", - '/ネ(?[[[::]]])ネ/' => "Syntax error in (?[...]) in regex m/ネ(?[[[::]]])ネ/", - '/ネ(?[[[:ネ:]]])ネ/' => "Syntax error in (?[...]) in regex m/ネ(?[[[:ネ:]]])ネ/", + '/ネ(?[[[:ネ]]])ネ/' => "Unexpected ']' with no following ')' in (?[... {#} m/ネ(?[[[:ネ]]{#}])ネ/", + '/ネ(?[[[:ネ: ])ネ/' => "Syntax error in (?[...]) {#} m/ネ(?[[[:ネ: ])ネ{#}/", + '/ネ(?[[[::]]])ネ/' => "Unexpected ']' with no following ')' in (?[... {#} m/ネ(?[[[::]]{#}])ネ/", + '/ネ(?[[[:ネ:]]])ネ/' => "Unexpected ']' with no following ')' in (?[... {#} m/ネ(?[[[:ネ:]]{#}])ネ/", '/ネ(?[[:ネ:]])ネ/' => "", '/ネ(?[ネ])ネ/' => 'Unexpected character {#} m/ネ(?[ネ{#}])ネ/', '/ネ(?[ + [ネ] ])/' => 'Unexpected binary operator \'+\' with no preceding operand {#} m/ネ(?[ +{#} [ネ] ])/', @@ -443,8 +445,9 @@ my @death_utf8 = mark_as_utf8( '/(?[ \x{ネ} ])ネ/' => 'Non-hex character {#} m/(?[ \x{ネ{#}} ])ネ/', '/(?[ \p{ネ} ])/' => 'Can\'t find Unicode property definition "ネ" {#} m/(?[ \p{ネ}{#} ])/', '/(?[ \p{ ネ = bar } ])/' => 'Can\'t find Unicode property definition "ネ = bar" {#} m/(?[ \p{ ネ = bar }{#} ])/', - '/ネ(?[ \t ]/' => 'Syntax error in (?[...]) in regex m/ネ(?[ \t ]/', - '/(?[ \t + \e # ネ This was supposed to be a comment ])/' => 'Syntax error in (?[...]) in regex m/(?[ \t + \e # ネ This was supposed to be a comment ])/', + '/ネ(?[ \t ]/' => "Unexpected ']' with no following ')' in (?[... {#} m/ネ(?[ \\t ]{#}/", + '/(?[ \t + \e # ネ This was supposed to be a comment ])/' => + "Syntax error in (?[...]) {#} m/(?[ \\t + \\e # ネ This was supposed to be a comment ]){#}/", 'm/(*ネ)ネ/' => q<Unknown verb pattern 'ネ' {#} m/(*ネ){#}ネ/>, '/\cネ/' => "Character following \"\\c\" must be printable ASCII", '/\b{ネ}/' => "'ネ' is an unknown bound type {#} m/\\b{ネ{#}}/", diff --git a/t/re/regex_sets.t b/t/re/regex_sets.t index 6a79f9d692..e9644bd4e6 100644 --- a/t/re/regex_sets.t +++ b/t/re/regex_sets.t @@ -158,13 +158,13 @@ for my $char ("٠", "٥", "٩") { eval { $_ = '/(?[(\c]) /'; qr/$_/ }; like($@, qr/^Syntax error/, '/(?[(\c]) / should not panic'); eval { $_ = '(?[\c#]' . "\n])"; qr/$_/ }; - like($@, qr/^Syntax error/, '/(?[(\c]) / should not panic'); + like($@, qr/^Unexpected/, '/(?[(\c]) / should not panic'); eval { $_ = '(?[(\c])'; qr/$_/ }; like($@, qr/^Syntax error/, '/(?[(\c])/ should be a syntax error'); eval { $_ = '(?[(\c]) ]\b'; qr/$_/ }; - like($@, qr/^Syntax error/, '/(?[(\c]) ]\b/ should be a syntax error'); + like($@, qr/^Unexpected/, '/(?[(\c]) ]\b/ should be a syntax error'); eval { $_ = '(?[\c[]](])'; qr/$_/ }; - like($@, qr/^Syntax error/, '/(?[\c[]](])/ should be a syntax error'); + like($@, qr/^Unexpected/, '/(?[\c[]](])/ should be a syntax error'); like("\c#", qr/(?[\c#])/, '\c# should match itself'); like("\c[", qr/(?[\c[])/, '\c[ should match itself'); like("\c\ ", qr/(?[\c\])/, '\c\ should match itself'); -- 2.11.0
RT-Send-CC: perl5-security-report [...] perl.org
Download (untitled) / with headers
text/plain 151b
On Sun, 23 Sep 2018 23:42:36 -0700, tonyc wrote: Show quoted text
> I plan to request a CVE ID for this issue in the next couple of days.
This is CVE-2018-18314. Tony
To: Tony Cook via RT <perl5-security-report-followup [...] perl.org>
CC: perl5-security-report [...] perl.org
Subject: Re: [perl #131649] Heap-based buffer overflow in S_regatom
Date: Mon, 29 Oct 2018 14:11:27 +1100
From: Tony Cook <tony [...] develop-help.com>
Download (untitled) / with headers
text/plain 895b
On Mon, Oct 01, 2018 at 05:19:54PM -0700, Tony Cook via RT wrote: Show quoted text
> On Wed, 19 Sep 2018 17:37:48 -0700, tonyc wrote:
> > On Wed, 29 Nov 2017 10:53:43 -0800, public@khwilliamson.com wrote:
> > > On 11/29/2017 02:46 AM, demerphq wrote:
> > > > I am waiting on Karl to reply. As far as I know my patch does fix the > > > > issue. > > > > > > > > Yves > > > >
> > > > > > I think his patch solves the issue, but my writing a grammar comes > > > after > > > the code freeze
> > > > Which patch? > > > > I don't see a yves/fix_131649 branch anywhere.
> > It turns out this was fixed in 19a498a461d7c81ae3507c450953d1148efecf4f, which is included in 5.28. > > This doesn't cherry-pick cleanly to 5.26, the attached is a merged cherry-pick of the fix.
And this was corrupted apparently at my end, turning all of the non-UTF-8 bytes into the replacement character. Attached a gzipped patch. Tony

Message body not shown because it is not plain text.

RT-Send-CC: perl5-porters [...] perl.org
Moved to public queue with the release of 5.26.3.
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 281b
This has been fixed in blead and all maintenance releases, so resolving A new ticket has been created to track creating a formal grammar for regex sets, mentioned originally in this ticket. The new ticket is https://rt.perl.org/Ticket/Display.html?id=133707 -- Karl Williamson


This service is sponsored and maintained by Best Practical Solutions and runs on Perl.org infrastructure.

For issues related to this RT instance (aka "perlbug"), please contact perlbug-admin at perl.org