Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CVE-2017-12837]Heap overflow in Perl__to_fold_latin1 when compiling case-insensitive regexp #16021

Closed
p5pRT opened this issue Jun 16, 2017 · 25 comments

Comments

@p5pRT
Copy link

p5pRT commented Jun 16, 2017

Migrated from rt.perl.org#131582 (status was 'resolved')

Searchable as RT131582$

@p5pRT
Copy link
Author

p5pRT commented Jun 16, 2017

From @jwilk

The following program makes Perl (v5.26.0 on i686-linux) crash​:

$pat = "\x800000000\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd1\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\x970\xdf0\xe90\\N{U+4000}0000000\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4A000000";
qr/$pat/i;
__END__

ASAN says it's a heap overflow​:

=================================================================
==1995==ERROR​: AddressSanitizer​: heap-buffer-overflow on address 0xb3801180 at pc 0x8536bcf bp 0xbff8d288 sp 0xbff8d27c
WRITE of size 1 at 0xb3801180 thread T0
  #0 0x8536bce in Perl__to_fold_latin1 /home/jwilk/perl/utf8.c​:2399
  #1 0x8536d80 in Perl__to_uni_fold_flags /home/jwilk/perl/utf8.c​:2432
  #2 0x8258642 in S_regatom /home/jwilk/perl/regcomp.c​:13576
  #3 0x8241470 in S_regpiece /home/jwilk/perl/regcomp.c​:11669
  #4 0x8240d0e in S_regbranch /home/jwilk/perl/regcomp.c​:11594
  #5 0x823e319 in S_reg /home/jwilk/perl/regcomp.c​:11332
  #6 0x8213b47 in Perl_re_op_compile /home/jwilk/perl/regcomp.c​:7309
  #7 0x8435c1a in Perl_pp_regcomp /home/jwilk/perl/pp_ctl.c​:107
  #8 0x832e3af in Perl_runops_standard /home/jwilk/perl/run.c​:41
  #9 0x80da894 in S_run_body /home/jwilk/perl/perl.c​:2524
  #10 0x80d93a1 in perl_run /home/jwilk/perl/perl.c​:2447
  #11 0x8061f13 in main /home/jwilk/perl/perlmain.c​:123
  #12 0xb7036722 in __libc_start_main (/lib/i386-linux-gnu/libc.so.6+0x19722)
  #13 0x8061c90 (/home/jwilk/perl/perl+0x8061c90)

0xb3801180 is located 0 bytes to the right of 80-byte region [0xb3801130,0xb3801180)
allocated by thread T0 here​:
  #0 0xb72946e4 in malloc (/usr/lib/i386-linux-gnu/libasan.so.1+0x4e6e4)
  #1 0x82a72a5 in Perl_safesysmalloc /home/jwilk/perl/util.c​:153
  #2 0x8212265 in Perl_re_op_compile /home/jwilk/perl/regcomp.c​:7155
  #3 0x8435c1a in Perl_pp_regcomp /home/jwilk/perl/pp_ctl.c​:107
  #4 0x832e3af in Perl_runops_standard /home/jwilk/perl/run.c​:41
  #5 0x80da894 in S_run_body /home/jwilk/perl/perl.c​:2524
  #6 0x80d93a1 in perl_run /home/jwilk/perl/perl.c​:2447
  #7 0x8061f13 in main /home/jwilk/perl/perlmain.c​:123
  #8 0xb7036722 in __libc_start_main (/lib/i386-linux-gnu/libc.so.6+0x19722)

SUMMARY​: AddressSanitizer​: heap-buffer-overflow /home/jwilk/perl/utf8.c​:2399 Perl__to_fold_latin1
Shadow bytes around the buggy address​:
  0x367001e0​: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x367001f0​: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x36700200​: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x36700210​: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x36700220​: fa fa fa fa fa fa 00 00 00 00 00 00 00 00 00 00
=>0x36700230​:[fa]fa fa fa 00 00 00 00 00 00 00 00 04 fa fa fa
  0x36700240​: fa fa fd fd fd fd fd fd fd fd fd fa fa fa fa fa
  0x36700250​: 00 00 00 00 00 00 00 00 04 fa fa fa fa fa fd fd
  0x36700260​: fd fd fd fd fd fd fd fa fa fa fa fa 00 00 00 00
  0x36700270​: 00 00 00 00 04 fa fa fa fa fa 00 00 00 00 00 00
  0x36700280​: 00 00 04 fa fa fa fa fa 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes)​:
  Addressable​: 00
  Partially addressable​: 01 02 03 04 05 06 07
  Heap left redzone​: fa
  Heap right redzone​: fb
  Freed heap region​: fd
  Stack left redzone​: f1
  Stack mid redzone​: f2
  Stack right redzone​: f3
  Stack partial redzone​: f4
  Stack after return​: f5
  Stack use after scope​: f8
  Global redzone​: f9
  Global init order​: f6
  Poisoned by user​: f7
  Contiguous container OOB​:fc
  ASan internal​: fe
==1995==ABORTING

--
Jakub Wilk

@p5pRT
Copy link
Author

p5pRT commented Jun 16, 2017

From @khwilliamson

On 06/15/2017 06​:37 PM, Jakub Wilk (via RT) wrote​:

# New Ticket Created by Jakub Wilk
# Please include the string​: [perl #131582]
# in the subject line of all future correspondence about this issue.
# <URL​: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=131582 >

The following program makes Perl (v5.26.0 on i686-linux) crash​:

$pat = "\x800000000\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd1\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\x970\xdf0\xe90\\N{U+4000}0000000\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4A000000";
qr/$pat/i;
__END__

ASAN says it's a heap overflow​:

=================================================================
==1995==ERROR​: AddressSanitizer​: heap-buffer-overflow on address 0xb3801180 at pc 0x8536bcf bp 0xbff8d288 sp 0xbff8d27c
WRITE of size 1 at 0xb3801180 thread T0
#0 0x8536bce in Perl__to_fold_latin1 /home/jwilk/perl/utf8.c​:2399
#1 0x8536d80 in Perl__to_uni_fold_flags /home/jwilk/perl/utf8.c​:2432
#2 0x8258642 in S_regatom /home/jwilk/perl/regcomp.c​:13576
#3 0x8241470 in S_regpiece /home/jwilk/perl/regcomp.c​:11669
#4 0x8240d0e in S_regbranch /home/jwilk/perl/regcomp.c​:11594
#5 0x823e319 in S_reg /home/jwilk/perl/regcomp.c​:11332
#6 0x8213b47 in Perl_re_op_compile /home/jwilk/perl/regcomp.c​:7309
#7 0x8435c1a in Perl_pp_regcomp /home/jwilk/perl/pp_ctl.c​:107
#8 0x832e3af in Perl_runops_standard /home/jwilk/perl/run.c​:41
#9 0x80da894 in S_run_body /home/jwilk/perl/perl.c​:2524
#10 0x80d93a1 in perl_run /home/jwilk/perl/perl.c​:2447
#11 0x8061f13 in main /home/jwilk/perl/perlmain.c​:123
#12 0xb7036722 in __libc_start_main (/lib/i386-linux-gnu/libc.so.6+0x19722)
#13 0x8061c90 (/home/jwilk/perl/perl+0x8061c90)

0xb3801180 is located 0 bytes to the right of 80-byte region [0xb3801130,0xb3801180)
allocated by thread T0 here​:
#0 0xb72946e4 in malloc (/usr/lib/i386-linux-gnu/libasan.so.1+0x4e6e4)
#1 0x82a72a5 in Perl_safesysmalloc /home/jwilk/perl/util.c​:153
#2 0x8212265 in Perl_re_op_compile /home/jwilk/perl/regcomp.c​:7155
#3 0x8435c1a in Perl_pp_regcomp /home/jwilk/perl/pp_ctl.c​:107
#4 0x832e3af in Perl_runops_standard /home/jwilk/perl/run.c​:41
#5 0x80da894 in S_run_body /home/jwilk/perl/perl.c​:2524
#6 0x80d93a1 in perl_run /home/jwilk/perl/perl.c​:2447
#7 0x8061f13 in main /home/jwilk/perl/perlmain.c​:123
#8 0xb7036722 in __libc_start_main (/lib/i386-linux-gnu/libc.so.6+0x19722)

SUMMARY​: AddressSanitizer​: heap-buffer-overflow /home/jwilk/perl/utf8.c​:2399 Perl__to_fold_latin1
Shadow bytes around the buggy address​:
0x367001e0​: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x367001f0​: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x36700200​: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x36700210​: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x36700220​: fa fa fa fa fa fa 00 00 00 00 00 00 00 00 00 00
=>0x36700230​:[fa]fa fa fa 00 00 00 00 00 00 00 00 04 fa fa fa
0x36700240​: fa fa fd fd fd fd fd fd fd fd fd fa fa fa fa fa
0x36700250​: 00 00 00 00 00 00 00 00 04 fa fa fa fa fa fd fd
0x36700260​: fd fd fd fd fd fd fd fa fa fa fa fa 00 00 00 00
0x36700270​: 00 00 00 00 04 fa fa fa fa fa 00 00 00 00 00 00
0x36700280​: 00 00 04 fa fa fa fa fa 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes)​:
Addressable​: 00
Partially addressable​: 01 02 03 04 05 06 07
Heap left redzone​: fa
Heap right redzone​: fb
Freed heap region​: fd
Stack left redzone​: f1
Stack mid redzone​: f2
Stack right redzone​: f3
Stack partial redzone​: f4
Stack after return​: f5
Stack use after scope​: f8
Global redzone​: f9
Global init order​: f6
Poisoned by user​: f7
Contiguous container OOB​:fc
ASan internal​: fe
==1995==ABORTING

I have a one line fix for this. The problem is a pointer was getting
changed temporarily, but the restoral was getting skipped sometimes.

I don't know how exploitable this is. It requires matching an exact
string (under /i) with a \N{} in it and the exactish string has to be
longer than 127 bytes, and the character at that edge must be one that
is in some Unicode fold at the non-final position of that fold. In this
example, that character is "A', and the fold of
  U+1E9A # LATIN SMALL LETTER A WITH RIGHT HALF RING
starts with 'a' and ends with a combining character. (We can't split
potential folds across regnode boundaries, since the code that does the
matching can't cope with that.)

I don't know for sure what the other preconditions are. If I change the
\N{} value to something else, the problem appears to go away.

So, an attacker can craft a pattern, that if compiled will crash the
interpreter. It might be able to write out-of-bounds to the attacker's
choosing, but I don't know how far and if this can happen without crashing.

So what to do? I can patch blead, and since this crashes perl, it
becomes a candidate for a maintenance release. But is it something
beyond a DOS attack, I don't know.

@p5pRT
Copy link
Author

p5pRT commented Jun 16, 2017

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Jun 20, 2017

From @xsawyerx

On 16 June 2017 at 15​:06, Karl Williamson <public@​khwilliamson.com> wrote​:

On 06/15/2017 06​:37 PM, Jakub Wilk (via RT) wrote​:

# New Ticket Created by Jakub Wilk
# Please include the string​: [perl #131582]
# in the subject line of all future correspondence about this issue.
# <URL​: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=131582 >

The following program makes Perl (v5.26.0 on i686-linux) crash​:

$pat = "\x800000000\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd1\xf2\xf2\xf2\xf2
\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\xf2\
xf2\xf2\xf2\xf2\xf2\xf2\xf2\x970\xdf0\xe90\\N{U+4000}00000
00\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\xd4\
xd4\xd4\xd4A000000";
qr/$pat/i;
__END__

ASAN says it's a heap overflow​:

=================================================================
==1995==ERROR​: AddressSanitizer​: heap-buffer-overflow on address
0xb3801180 at pc 0x8536bcf bp 0xbff8d288 sp 0xbff8d27c
WRITE of size 1 at 0xb3801180 thread T0
#0 0x8536bce in Perl__to_fold_latin1 /home/jwilk/perl/utf8.c​:2399
#1 0x8536d80 in Perl__to_uni_fold_flags /home/jwilk/perl/utf8.c​:2432
#2 0x8258642 in S_regatom /home/jwilk/perl/regcomp.c​:13576
#3 0x8241470 in S_regpiece /home/jwilk/perl/regcomp.c​:11669
#4 0x8240d0e in S_regbranch /home/jwilk/perl/regcomp.c​:11594
#5 0x823e319 in S_reg /home/jwilk/perl/regcomp.c​:11332
#6 0x8213b47 in Perl_re_op_compile /home/jwilk/perl/regcomp.c​:7309
#7 0x8435c1a in Perl_pp_regcomp /home/jwilk/perl/pp_ctl.c​:107
#8 0x832e3af in Perl_runops_standard /home/jwilk/perl/run.c​:41
#9 0x80da894 in S_run_body /home/jwilk/perl/perl.c​:2524
#10 0x80d93a1 in perl_run /home/jwilk/perl/perl.c​:2447
#11 0x8061f13 in main /home/jwilk/perl/perlmain.c​:123
#12 0xb7036722 in __libc_start_main (/lib/i386-linux-gnu/libc.so.6
+0x19722)
#13 0x8061c90 (/home/jwilk/perl/perl+0x8061c90)

0xb3801180 is located 0 bytes to the right of 80-byte region
[0xb3801130,0xb3801180)
allocated by thread T0 here​:
#0 0xb72946e4 in malloc (/usr/lib/i386-linux-gnu/libas
an.so.1+0x4e6e4)
#1 0x82a72a5 in Perl_safesysmalloc /home/jwilk/perl/util.c​:153
#2 0x8212265 in Perl_re_op_compile /home/jwilk/perl/regcomp.c​:7155
#3 0x8435c1a in Perl_pp_regcomp /home/jwilk/perl/pp_ctl.c​:107
#4 0x832e3af in Perl_runops_standard /home/jwilk/perl/run.c​:41
#5 0x80da894 in S_run_body /home/jwilk/perl/perl.c​:2524
#6 0x80d93a1 in perl_run /home/jwilk/perl/perl.c​:2447
#7 0x8061f13 in main /home/jwilk/perl/perlmain.c​:123
#8 0xb7036722 in __libc_start_main (/lib/i386-linux-gnu/libc.so.6
+0x19722)

SUMMARY​: AddressSanitizer​: heap-buffer-overflow
/home/jwilk/perl/utf8.c​:2399 Perl__to_fold_latin1
Shadow bytes around the buggy address​:
0x367001e0​: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x367001f0​: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x36700200​: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x36700210​: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x36700220​: fa fa fa fa fa fa 00 00 00 00 00 00 00 00 00 00
=>0x36700230​:[fa]fa fa fa 00 00 00 00 00 00 00 00 04 fa fa fa
0x36700240​: fa fa fd fd fd fd fd fd fd fd fd fa fa fa fa fa
0x36700250​: 00 00 00 00 00 00 00 00 04 fa fa fa fa fa fd fd
0x36700260​: fd fd fd fd fd fd fd fa fa fa fa fa 00 00 00 00
0x36700270​: 00 00 00 00 04 fa fa fa fa fa 00 00 00 00 00 00
0x36700280​: 00 00 04 fa fa fa fa fa 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes)​:
Addressable​: 00
Partially addressable​: 01 02 03 04 05 06 07
Heap left redzone​: fa
Heap right redzone​: fb
Freed heap region​: fd
Stack left redzone​: f1
Stack mid redzone​: f2
Stack right redzone​: f3
Stack partial redzone​: f4
Stack after return​: f5
Stack use after scope​: f8
Global redzone​: f9
Global init order​: f6
Poisoned by user​: f7
Contiguous container OOB​:fc
ASan internal​: fe
==1995==ABORTING

I have a one line fix for this. The problem is a pointer was getting
changed temporarily, but the restoral was getting skipped sometimes.

I don't know how exploitable this is. It requires matching an exact
string (under /i) with a \N{} in it and the exactish string has to be
longer than 127 bytes, and the character at that edge must be one that is
in some Unicode fold at the non-final position of that fold. In this
example, that character is "A', and the fold of
U+1E9A # LATIN SMALL LETTER A WITH RIGHT HALF RING
starts with 'a' and ends with a combining character. (We can't split
potential folds across regnode boundaries, since the code that does the
matching can't cope with that.)

I don't know for sure what the other preconditions are. If I change the
\N{} value to something else, the problem appears to go away.

So, an attacker can craft a pattern, that if compiled will crash the
interpreter. It might be able to write out-of-bounds to the attacker's
choosing, but I don't know how far and if this can happen without crashing.

So what to do? I can patch blead, and since this crashes perl, it becomes
a candidate for a maintenance release. But is it something beyond a DOS
attack, I don't know.

Let's patch blead. We can add it to the votes file and I believe we could
backport it at least to 5.24.3.

@p5pRT
Copy link
Author

p5pRT commented Jun 20, 2017

From @khwilliamson

On 06/20/2017 12​:18 PM, Sawyer X wrote​:

On 16 June 2017 at 15​:06, Karl Williamson <public@​khwilliamson.com
<mailto​:public@​khwilliamson.com>> wrote​:

On 06/15/2017 06&#8203;:37 PM\, Jakub Wilk \(via RT\) wrote&#8203;:

    \# New Ticket Created by  Jakub Wilk
    \# Please include the string&#8203;:  \[perl \#131582\]
    \# in the subject line of all future correspondence about this issue\.
    \# \<URL&#8203;: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=131582
    \<https://rt-archive.perl.org/perl5/Ticket/Display.html?id=131582> >


    The following program makes Perl \(v5\.26\.0 on i686\-linux\) crash&#8203;:

    $pat =
    "\\x800000000\\xd4\\xd4\\xd4\\xd4\\xd4\\xd4\\xd4\\xd1\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\xf2\\x970\\xdf0\\xe90\\\\N\{U\+4000\}0000000\\xd4\\xd4\\xd4\\xd4\\xd4\\xd4\\xd4\\xd4\\xd4\\xd4\\xd4\\xd4\\xd4\\xd4\\xd4\\xd4A000000";
    qr/$pat/i;
    \_\_END\_\_


    ASAN says it's a heap overflow&#8203;:

    =================================================================
    ==1995==ERROR&#8203;: AddressSanitizer&#8203;: heap\-buffer\-overflow on address
    0xb3801180 at pc 0x8536bcf bp 0xbff8d288 sp 0xbff8d27c
    WRITE of size 1 at 0xb3801180 thread T0
          \#0 0x8536bce in Perl\_\_to\_fold\_latin1
    /home/jwilk/perl/utf8\.c&#8203;:2399
          \#1 0x8536d80 in Perl\_\_to\_uni\_fold\_flags
    /home/jwilk/perl/utf8\.c&#8203;:2432
          \#2 0x8258642 in S\_regatom /home/jwilk/perl/regcomp\.c&#8203;:13576
          \#3 0x8241470 in S\_regpiece /home/jwilk/perl/regcomp\.c&#8203;:11669
          \#4 0x8240d0e in S\_regbranch /home/jwilk/perl/regcomp\.c&#8203;:11594
          \#5 0x823e319 in S\_reg /home/jwilk/perl/regcomp\.c&#8203;:11332
          \#6 0x8213b47 in Perl\_re\_op\_compile
    /home/jwilk/perl/regcomp\.c&#8203;:7309
          \#7 0x8435c1a in Perl\_pp\_regcomp /home/jwilk/perl/pp\_ctl\.c&#8203;:107
          \#8 0x832e3af in Perl\_runops\_standard /home/jwilk/perl/run\.c&#8203;:41
          \#9 0x80da894 in S\_run\_body /home/jwilk/perl/perl\.c&#8203;:2524
          \#10 0x80d93a1 in perl\_run /home/jwilk/perl/perl\.c&#8203;:2447
          \#11 0x8061f13 in main /home/jwilk/perl/perlmain\.c&#8203;:123
          \#12 0xb7036722 in \_\_libc\_start\_main
    \(/lib/i386\-linux\-gnu/libc\.so\.6\+0x19722\)
          \#13 0x8061c90 \(/home/jwilk/perl/perl\+0x8061c90\)

    0xb3801180 is located 0 bytes to the right of 80\-byte region
    \[0xb3801130\,0xb3801180\)
    allocated by thread T0 here&#8203;:
          \#0 0xb72946e4 in malloc
    \(/usr/lib/i386\-linux\-gnu/libasan\.so\.1\+0x4e6e4\)
          \#1 0x82a72a5 in Perl\_safesysmalloc /home/jwilk/perl/util\.c&#8203;:153
          \#2 0x8212265 in Perl\_re\_op\_compile
    /home/jwilk/perl/regcomp\.c&#8203;:7155
          \#3 0x8435c1a in Perl\_pp\_regcomp /home/jwilk/perl/pp\_ctl\.c&#8203;:107
          \#4 0x832e3af in Perl\_runops\_standard /home/jwilk/perl/run\.c&#8203;:41
          \#5 0x80da894 in S\_run\_body /home/jwilk/perl/perl\.c&#8203;:2524
          \#6 0x80d93a1 in perl\_run /home/jwilk/perl/perl\.c&#8203;:2447
          \#7 0x8061f13 in main /home/jwilk/perl/perlmain\.c&#8203;:123
          \#8 0xb7036722 in \_\_libc\_start\_main
    \(/lib/i386\-linux\-gnu/libc\.so\.6\+0x19722\)

    SUMMARY&#8203;: AddressSanitizer&#8203;: heap\-buffer\-overflow
    /home/jwilk/perl/utf8\.c&#8203;:2399 Perl\_\_to\_fold\_latin1
    Shadow bytes around the buggy address&#8203;:
        0x367001e0&#8203;: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
        0x367001f0&#8203;: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
        0x36700200&#8203;: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
        0x36700210&#8203;: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
        0x36700220&#8203;: fa fa fa fa fa fa 00 00 00 00 00 00 00 00 00 00
    =>0x36700230&#8203;:\[fa\]fa fa fa 00 00 00 00 00 00 00 00 04 fa fa fa
        0x36700240&#8203;: fa fa fd fd fd fd fd fd fd fd fd fa fa fa fa fa
        0x36700250&#8203;: 00 00 00 00 00 00 00 00 04 fa fa fa fa fa fd fd
        0x36700260&#8203;: fd fd fd fd fd fd fd fa fa fa fa fa 00 00 00 00
        0x36700270&#8203;: 00 00 00 00 04 fa fa fa fa fa 00 00 00 00 00 00
        0x36700280&#8203;: 00 00 04 fa fa fa fa fa 00 00 00 00 00 00 00 00
    Shadow byte legend \(one shadow byte represents 8 application bytes\)&#8203;:
        Addressable&#8203;:           00
        Partially addressable&#8203;: 01 02 03 04 05 06 07
        Heap left redzone&#8203;:       fa
        Heap right redzone&#8203;:      fb
        Freed heap region&#8203;:       fd
        Stack left redzone&#8203;:      f1
        Stack mid redzone&#8203;:       f2
        Stack right redzone&#8203;:     f3
        Stack partial redzone&#8203;:   f4
        Stack after return&#8203;:      f5
        Stack use after scope&#8203;:   f8
        Global redzone&#8203;:          f9
        Global init order&#8203;:       f6
        Poisoned by user&#8203;:        f7
        Contiguous container OOB&#8203;:fc
        ASan internal&#8203;:           fe
    ==1995==ABORTING


I have a one line fix for this\.  The problem is a pointer was
getting changed temporarily\, but the restoral was getting skipped
sometimes\.

I don't know how exploitable this is\.  It requires matching an exact
string \(under /i\) with a \\N\{\} in it and the exactish string has to
be longer than 127 bytes\, and the character at that edge must be one
that is in some Unicode fold at the non\-final position of that
fold\.  In this example\, that character is "A'\, and the fold of
         U\+1E9A \# LATIN SMALL LETTER A WITH RIGHT HALF RING
starts with 'a' and ends with a combining character\.  \(We can't
split potential folds across regnode boundaries\, since the code that
does the matching can't cope with that\.\)

I don't know for sure what the other preconditions are\.  If I change
the \\N\{\} value to something else\, the problem appears to go away\.

So\, an attacker can craft a pattern\, that if compiled will crash the
interpreter\.  It might be able to write out\-of\-bounds to the
attacker's choosing\, but I don't know how far and if this can happen
without crashing\.

So what to do?  I can patch blead\, and since this crashes perl\, it
becomes a candidate for a maintenance release\.  But is it something
beyond a DOS attack\, I don't know\.

Let's patch blead. We can add it to the votes file and I believe we
could backport it at least to 5.24.3.

And what do we do about this ticket, and about the commit message?

@p5pRT
Copy link
Author

p5pRT commented Jun 21, 2017

From @iabyn

On Fri, Jun 16, 2017 at 01​:06​:40PM -0600, Karl Williamson wrote​:

I have a one line fix for this. The problem is a pointer was getting
changed temporarily, but the restoral was getting skipped sometimes.

Can you show that change, please. I might help me better understand the
issue and thus be able to reply to this​:

So, an attacker can craft a pattern, that if compiled will crash the
interpreter. It might be able to write out-of-bounds to the attacker's
choosing, but I don't know how far and if this can happen without crashing.

--
Spock (or Data) is fired from his high-ranking position for not being able
to understand the most basic nuances of about one in three sentences that
anyone says to him.
  -- Things That Never Happen in "Star Trek" #19

@p5pRT
Copy link
Author

p5pRT commented Jun 21, 2017

From @khwilliamson

On 06/21/2017 08​:15 AM, Dave Mitchell wrote​:

On Fri, Jun 16, 2017 at 01​:06​:40PM -0600, Karl Williamson wrote​:

I have a one line fix for this. The problem is a pointer was getting
changed temporarily, but the restoral was getting skipped sometimes.

Can you show that change, please. I might help me better understand the
issue and thus be able to reply to this​:

So, an attacker can craft a pattern, that if compiled will crash the
interpreter. It might be able to write out-of-bounds to the attacker's
choosing, but I don't know how far and if this can happen without crashing.

Attached. I'm unsure if the setting of RExC_parse not changed by this
patch is correct; I haven't looked at it. I think it's wrong, but it
might be harmless.

@p5pRT
Copy link
Author

p5pRT commented Jun 21, 2017

From @khwilliamson

0100-perl-131582.patch
From d78e88add522aff0d218d7991bddb672971cbf21 Mon Sep 17 00:00:00 2001
From: Karl Williamson <khw@cpan.org>
Date: Wed, 21 Jun 2017 11:33:37 -0600
Subject: [PATCH 100/100] [perl #131582]

---
 regcomp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/regcomp.c b/regcomp.c
index d10dea0fbf..5c2e43a2d2 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -13285,6 +13285,7 @@ S_regatom(pTHX_ RExC_state_t *pRExC_state, I32 *flagp, U32 depth)
                             goto loopdone;
                         }
                         p = RExC_parse;
+                        RExC_parse = parse_start;
                         if (ender > 0xff) {
                             REQUIRE_UTF8(flagp);
                         }
-- 
2.11.0

@p5pRT
Copy link
Author

p5pRT commented Jun 22, 2017

From @iabyn

On Wed, Jun 21, 2017 at 11​:38​:29AM -0600, Karl Williamson wrote​:

On 06/21/2017 08​:15 AM, Dave Mitchell wrote​:

On Fri, Jun 16, 2017 at 01​:06​:40PM -0600, Karl Williamson wrote​:

I have a one line fix for this. The problem is a pointer was getting
changed temporarily, but the restoral was getting skipped sometimes.

Can you show that change, please. I might help me better understand the
issue and thus be able to reply to this​:

So, an attacker can craft a pattern, that if compiled will crash the
interpreter. It might be able to write out-of-bounds to the attacker's
choosing, but I don't know how far and if this can happen without crashing.

Attached. I'm unsure if the setting of RExC_parse not changed by this patch
is correct; I haven't looked at it. I think it's wrong, but it might be
harmless.

I don;'t understand that sentence :-(

Also, even after looking at the code for a bit, I still don't understand the
severity of this bug.

So from valgrind, I see its writing bytes at locations 0,1,21,22,23 beyond
the end of the malloced block (at which point valgrind crashes, so it
might otherwise have written further). I'm guessing that the bytes it
writes are in some way determined by the contents of the pattern.

So that seems rather serious.

Maybe combine this and the other regex compilation bug (#131598) into
one CVE?

--
All wight. I will give you one more chance. This time, I want to hear
no Wubens. No Weginalds. No Wudolf the wed-nosed weindeers.
  -- Life of Brian

@p5pRT
Copy link
Author

p5pRT commented Jun 22, 2017

From @khwilliamson

On 06/22/2017 04​:07 AM, Dave Mitchell wrote​:

On Wed, Jun 21, 2017 at 11​:38​:29AM -0600, Karl Williamson wrote​:

On 06/21/2017 08​:15 AM, Dave Mitchell wrote​:

On Fri, Jun 16, 2017 at 01​:06​:40PM -0600, Karl Williamson wrote​:

I have a one line fix for this. The problem is a pointer was getting
changed temporarily, but the restoral was getting skipped sometimes.

Can you show that change, please. I might help me better understand the
issue and thus be able to reply to this​:

So, an attacker can craft a pattern, that if compiled will crash the
interpreter. It might be able to write out-of-bounds to the attacker's
choosing, but I don't know how far and if this can happen without crashing.

Attached. I'm unsure if the setting of RExC_parse not changed by this patch
is correct; I haven't looked at it. I think it's wrong, but it might be
harmless.

I don;'t understand that sentence :-(

I looked, and it is harmless. In thinking about this, I got new insight
into how patterns are parsed.

The parser has the equivalent of the topic variable $_, but it is a
struct with a bunch of entries. One of those is RExC_parse, the pointer
to the current place in the input. However, in a few places, the code,
for whatever reason, has a local pointer, updating RExC_parse only on
exit. The problem is if that code calls something else, that something
else is usually expecting RExC_parse to be the current place. So,
RExC_parse has to be set to the current position for that call, and
restored upon return. Sometimes, the restoral isn't necessary, because
RExC_parse is going to immediately get updated anyway.

The lines that get patched are

  RExC_parse = p = oldp;
  goto loopdone;
  }
  p = RExC_parse;

The "}; is the end of an 'if'

What I meant in the sentence you didn't understand is that the lines
that say

RExC_parse = p = oldp;
goto loopdone;

is setting RExC_parse, but this isn't the proper value to restore it to.
  But it doesn't matter, because at loopdone, it gets set properly.

The other branch updates p, which is the parse pointer local to this
function, but doesn't restore RExC_parse, the global pointer. In some
cases it doesn't matter, the function absorbs things and RExC_parse gets
updated upon termination. But in some cases, the function needs the
original value of RExC_parse to work properly. In this case, it is
because the pattern won't fit into a single node, and so it uses the
original value to calculate things.

Also, even after looking at the code for a bit, I still don't understand the
severity of this bug.

So from valgrind, I see its writing bytes at locations 0,1,21,22,23 beyond
the end of the malloced block (at which point valgrind crashes, so it
might otherwise have written further). I'm guessing that the bytes it
writes are in some way determined by the contents of the pattern.

So that seems rather serious.

Maybe combine this and the other regex compilation bug (#131598) into
one CVE?

Maybe, I guess Sawyer decides.

@p5pRT
Copy link
Author

p5pRT commented Jun 23, 2017

From @demerphq

On 22 June 2017 at 22​:58, Karl Williamson <public@​khwilliamson.com> wrote​:

On 06/22/2017 04​:07 AM, Dave Mitchell wrote​:

On Wed, Jun 21, 2017 at 11​:38​:29AM -0600, Karl Williamson wrote​:

On 06/21/2017 08​:15 AM, Dave Mitchell wrote​:

On Fri, Jun 16, 2017 at 01​:06​:40PM -0600, Karl Williamson wrote​:

I have a one line fix for this. The problem is a pointer was getting
changed temporarily, but the restoral was getting skipped sometimes.

Can you show that change, please. I might help me better understand the
issue and thus be able to reply to this​:

So, an attacker can craft a pattern, that if compiled will crash the
interpreter. It might be able to write out-of-bounds to the attacker's
choosing, but I don't know how far and if this can happen without
crashing.

Attached. I'm unsure if the setting of RExC_parse not changed by this
patch
is correct; I haven't looked at it. I think it's wrong, but it might be
harmless.

I don;'t understand that sentence :-(

I looked, and it is harmless. In thinking about this, I got new insight
into how patterns are parsed.

The parser has the equivalent of the topic variable $_, but it is a struct
with a bunch of entries. One of those is RExC_parse, the pointer to the
current place in the input. However, in a few places, the code, for
whatever reason, has a local pointer, updating RExC_parse only on exit. The
problem is if that code calls something else, that something else is usually
expecting RExC_parse to be the current place. So, RExC_parse has to be set
to the current position for that call, and restored upon return. Sometimes,
the restoral isn't necessary, because RExC_parse is going to immediately get
updated anyway.

IIRC this is happens in some places where we aren't sure what we are
parsing yet.

Yves

@p5pRT
Copy link
Author

p5pRT commented Jul 12, 2017

From @xsawyerx

On Thu, 22 Jun 2017 13​:58​:37 -0700, public@​khwilliamson.com wrote​:

On 06/22/2017 04​:07 AM, Dave Mitchell wrote​:

On Wed, Jun 21, 2017 at 11​:38​:29AM -0600, Karl Williamson wrote​:

On 06/21/2017 08​:15 AM, Dave Mitchell wrote​:

On Fri, Jun 16, 2017 at 01​:06​:40PM -0600, Karl Williamson wrote​:

I have a one line fix for this. The problem is a pointer was
getting
changed temporarily, but the restoral was getting skipped
sometimes.

Can you show that change, please. I might help me better understand
the
issue and thus be able to reply to this​:

So, an attacker can craft a pattern, that if compiled will crash
the
interpreter. It might be able to write out-of-bounds to the
attacker's
choosing, but I don't know how far and if this can happen without
crashing.

Attached. I'm unsure if the setting of RExC_parse not changed by
this patch
is correct; I haven't looked at it. I think it's wrong, but it
might be
harmless.

I don;'t understand that sentence :-(

I looked, and it is harmless. In thinking about this, I got new
insight
into how patterns are parsed.

The parser has the equivalent of the topic variable $_, but it is a
struct with a bunch of entries. One of those is RExC_parse, the
pointer
to the current place in the input. However, in a few places, the
code,
for whatever reason, has a local pointer, updating RExC_parse only on
exit. The problem is if that code calls something else, that
something
else is usually expecting RExC_parse to be the current place. So,
RExC_parse has to be set to the current position for that call, and
restored upon return. Sometimes, the restoral isn't necessary,
because
RExC_parse is going to immediately get updated anyway.

The lines that get patched are

RExC_parse = p = oldp;
goto loopdone;
}
p = RExC_parse;

The "}; is the end of an 'if'

What I meant in the sentence you didn't understand is that the lines
that say

RExC_parse = p = oldp;
goto loopdone;

is setting RExC_parse, but this isn't the proper value to restore it
to.
But it doesn't matter, because at loopdone, it gets set properly.

The other branch updates p, which is the parse pointer local to this
function, but doesn't restore RExC_parse, the global pointer. In some
cases it doesn't matter, the function absorbs things and RExC_parse
gets
updated upon termination. But in some cases, the function needs the
original value of RExC_parse to work properly. In this case, it is
because the pattern won't fit into a single node, and so it uses the
original value to calculate things.

Also, even after looking at the code for a bit, I still don't
understand the
severity of this bug.

So from valgrind, I see its writing bytes at locations 0,1,21,22,23
beyond
the end of the malloced block (at which point valgrind crashes, so it
might otherwise have written further). I'm guessing that the bytes it
writes are in some way determined by the contents of the pattern.

So that seems rather serious.

Maybe combine this and the other regex compilation bug (#131598) into
one CVE?

Maybe, I guess Sawyer decides.

I think a single CVE for those, as long as they share a common fix or a common category (something more narrow than "regex compilation bug").

Do they fit together or are they categorically different in behavior, bug, or fix?

@p5pRT
Copy link
Author

p5pRT commented Aug 11, 2017

From @tonycoz

On Wed, 12 Jul 2017 13​:42​:50 -0700, xsawyerx@​cpan.org wrote​:

On Thu, 22 Jun 2017 13​:58​:37 -0700, public@​khwilliamson.com wrote​:

On 06/22/2017 04​:07 AM, Dave Mitchell wrote​:

On Wed, Jun 21, 2017 at 11​:38​:29AM -0600, Karl Williamson wrote​:

On 06/21/2017 08​:15 AM, Dave Mitchell wrote​:

On Fri, Jun 16, 2017 at 01​:06​:40PM -0600, Karl Williamson wrote​:

I have a one line fix for this. The problem is a pointer was
getting
changed temporarily, but the restoral was getting skipped
sometimes.

Can you show that change, please. I might help me better
understand
the
issue and thus be able to reply to this​:

So, an attacker can craft a pattern, that if compiled will crash
the
interpreter. It might be able to write out-of-bounds to the
attacker's
choosing, but I don't know how far and if this can happen
without
crashing.

Attached. I'm unsure if the setting of RExC_parse not changed by
this patch
is correct; I haven't looked at it. I think it's wrong, but it
might be
harmless.

I don;'t understand that sentence :-(

I looked, and it is harmless. In thinking about this, I got new
insight
into how patterns are parsed.

The parser has the equivalent of the topic variable $_, but it is a
struct with a bunch of entries. One of those is RExC_parse, the
pointer
to the current place in the input. However, in a few places, the
code,
for whatever reason, has a local pointer, updating RExC_parse only on
exit. The problem is if that code calls something else, that
something
else is usually expecting RExC_parse to be the current place. So,
RExC_parse has to be set to the current position for that call, and
restored upon return. Sometimes, the restoral isn't necessary,
because
RExC_parse is going to immediately get updated anyway.

The lines that get patched are

RExC_parse = p = oldp;
goto loopdone;
}
p = RExC_parse;

The "}; is the end of an 'if'

What I meant in the sentence you didn't understand is that the lines
that say

RExC_parse = p = oldp;
goto loopdone;

is setting RExC_parse, but this isn't the proper value to restore it
to.
But it doesn't matter, because at loopdone, it gets set properly.

The other branch updates p, which is the parse pointer local to this
function, but doesn't restore RExC_parse, the global pointer. In
some
cases it doesn't matter, the function absorbs things and RExC_parse
gets
updated upon termination. But in some cases, the function needs the
original value of RExC_parse to work properly. In this case, it is
because the pattern won't fit into a single node, and so it uses the
original value to calculate things.

Also, even after looking at the code for a bit, I still don't
understand the
severity of this bug.

So from valgrind, I see its writing bytes at locations 0,1,21,22,23
beyond
the end of the malloced block (at which point valgrind crashes, so
it
might otherwise have written further). I'm guessing that the bytes
it
writes are in some way determined by the contents of the pattern.

So that seems rather serious.

Maybe combine this and the other regex compilation bug (#131598)
into
one CVE?

Maybe, I guess Sawyer decides.

I think a single CVE for those, as long as they share a common fix or
a common category (something more narrow than "regex compilation
bug").

Do they fit together or are they categorically different in behavior,
bug, or fix?

As requested on IRC I've applied for a CVE ID for this issue.

Tony

@p5pRT
Copy link
Author

p5pRT commented Aug 12, 2017

From @tonycoz

On Fri, 11 Aug 2017 03​:16​:45 -0700, tonyc wrote​:

As requested on IRC I've applied for a CVE ID for this issue.

This is CVE-2017-12837.

Tony

@p5pRT
Copy link
Author

p5pRT commented Aug 24, 2017

From @khwilliamson

On Fri, 11 Aug 2017 17​:53​:41 -0700, tonyc wrote​:

On Fri, 11 Aug 2017 03​:16​:45 -0700, tonyc wrote​:

As requested on IRC I've applied for a CVE ID for this issue.

This is CVE-2017-12837.

Tony

This should go into 5.26.1; I will attach a patch.
--
Karl Williamson

@p5pRT
Copy link
Author

p5pRT commented Aug 25, 2017

From @khwilliamson

On Thu, 24 Aug 2017 13​:10​:14 -0700, khw wrote​:

On Fri, 11 Aug 2017 17​:53​:41 -0700, tonyc wrote​:

On Fri, 11 Aug 2017 03​:16​:45 -0700, tonyc wrote​:

As requested on IRC I've applied for a CVE ID for this issue.

This is CVE-2017-12837.

Tony

This should go into 5.26.1; I will attach a patch.

--
Karl Williamson

@p5pRT
Copy link
Author

p5pRT commented Aug 25, 2017

From @khwilliamson

0001-regcomp-perl-131582.patch
From 4b116c2495aacf6f869c1ea4416858680f76ca17 Mon Sep 17 00:00:00 2001
From: Karl Williamson <khw@cpan.org>
Date: Wed, 21 Jun 2017 11:33:37 -0600
Subject: [PATCH] regcomp [perl #131582]

---
 regcomp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/regcomp.c b/regcomp.c
index 5a9e56b080..6a07bf2c70 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -13321,6 +13321,7 @@ S_regatom(pTHX_ RExC_state_t *pRExC_state, I32 *flagp, U32 depth)
                             goto loopdone;
                         }
                         p = RExC_parse;
+                        RExC_parse = parse_start;
                         if (ender > 0xff) {
                             REQUIRE_UTF8(flagp);
                         }
-- 
2.11.0

@p5pRT
Copy link
Author

p5pRT commented Sep 10, 2017

From @steve-m-hay

Now in blead as commit 96c83ed. Will shortly also be in 5.24.3-RC1 and 5.26.1-RC1...

1 similar comment
@p5pRT
Copy link
Author

p5pRT commented Sep 10, 2017

From @steve-m-hay

Now in blead as commit 96c83ed. Will shortly also be in 5.24.3-RC1 and 5.26.1-RC1...

@p5pRT
Copy link
Author

p5pRT commented Sep 11, 2017

From @xsawyerx

This was set to be disclosed on September 22nd. Disclosure list was informed.

On 10 September 2017 at 16​:16, Steve Hay via RT <rt-comment@​perl.org> wrote​:

Now in blead as commit 96c83ed. Will shortly also be in 5.24.3-RC1 and 5.26.1-RC1...

@p5pRT
Copy link
Author

p5pRT commented Sep 11, 2017

From @tonycoz

On Fri, 11 Aug 2017 17​:53​:41 -0700, tonyc wrote​:

On Fri, 11 Aug 2017 03​:16​:45 -0700, tonyc wrote​:

As requested on IRC I've applied for a CVE ID for this issue.

This is CVE-2017-12837.

The details I entered when requesting the CVE ID​:

[Suggested description]
Heap buffer write overflow compiling tailored regular expression
------------------------------------------
[Vulnerability Type]
Buffer Overflow

------------------------------------------
[Vendor of Product]
Perl5 Porters
------------------------------------------
[Affected Product Code Base]
perl - 5.26.0
------------------------------------------
[Affected Component]

------------------------------------------
[Attack Type]
Local

------------------------------------------
[Impact ]

[-] CVE_Request.Impact_Code_execution
[-] CVE_Request.Impact_Denial_of_Service
[-] CVE_Request.Impact_Escalation_of_Privileges
[-] CVE_Request.Impact_Information_Disclosure
------------------------------------------
[Attack Vectors]

------------------------------------------
[Reference ]

------------------------------------------
[Discoverer ]
Jakub Wilk <jwilk@​jwilk.net>

Proposed update for the CVE entry once the issue is public (the field names are from the CVE allocation form)​:

Affected components​:

Regular expression compiler, S_regatom() in regcomp.c.

Attack vector​:

An attacker can provide a crafted regular expression with a
case-insensitive match with an "exactish" string match over 127
bytes including a \N{} escape and overwrite memory beyond the end
of a heap allocated block, possibly crashing perl.

Discoverer​:

Jakub Wilk <jwilk@​jwilk.net> (no change)

Affected Product Code Base​:

perl - 5.26.0, fixed in 5.26.1
perl - 5.24.x, fixed in 5.24.3
perl - 5.22.x and earlier

References​:

https://rt.perl.org/Public/Bug/Display.html?id=131582

Impact​:

[+] CVE_Request.Impact_Denial_of_Service

Additional information​:

(none)

Tony

@p5pRT
Copy link
Author

p5pRT commented Sep 24, 2017

From @tonycoz

On Mon, 11 Sep 2017 16​:35​:32 -0700, tonyc wrote​:

Affected Product Code Base​:

perl - 5.26.0, fixed in 5.26.1
perl - 5.24.x, fixed in 5.24.3
perl - 5.22.x and earlier

perl - 5.26.0, fixed in 5.26.1
perl - 5.24.x, fixed in 5.24.3
perl - 5.18.0 through 5.22.x

Tony

@p5pRT
Copy link
Author

p5pRT commented Sep 25, 2017

@xsawyerx - Status changed from 'open' to 'resolved'

@p5pRT p5pRT closed this as completed Sep 25, 2017
@p5pRT
Copy link
Author

p5pRT commented Sep 25, 2017

From @xsawyerx

Now open.

@p5pRT
Copy link
Author

p5pRT commented Sep 25, 2017

From @tonycoz

On Mon, 25 Sep 2017 03​:11​:22 -0700, xsawyerx@​cpan.org wrote​:

Now open.

Update to CVE requested.

Tony

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant