Skip Menu |
Report information
Id: 129342
Status: resolved
Priority: 0/
Queue: perl5

Owner: Nobody
Requestors: brian.carpenter [at] gmail.com
Cc:
AdminCc:

Operating System: (no value)
PatchStatus: (no value)
Severity: (no value)
Type: (no value)
Perl Version: (no value)
Fixed In: (no value)



Date: Fri, 23 Sep 2016 14:16:05 -0500
To: perl5-security-report [...] perl.org
From: "Brian 'geeknik' Carpenter" <brian.carpenter [...] gmail.com>
Subject: heap-buffer-overflow S_scan_const (toke.c:2975)
Download (untitled) / with headers
text/plain 2.5k
Triggered in Perl v5.25.5-8-g3c42ae1 with AFL+ASAN.

./perl -e 'y%\o-0%%'

=================================================================
==7216==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200000e1af at pc 0x000000681ac7 bp 0x7fff7a36dd50 sp 0x7fff7a36dd48
READ of size 1 at 0x60200000e1af thread T0
    #0 0x681ac6 in S_scan_const /root/perl/toke.c:2975:21
    #1 0x615b1f in Perl_yylex /root/perl/toke.c:4774:10
    #2 0x6addee in Perl_yyparse /root/perl/perly.c:334:19
    #3 0x59c491 in S_parse_body /root/perl/perl.c:2374:9
    #4 0x59282c in perl_parse /root/perl/perl.c:1689:2
    #5 0x4de5d5 in main /root/perl/perlmain.c:121:18
    #6 0x7f15f31aab44 in __libc_start_main /build/glibc-uPj9cH/glibc-2.19/csu/libc-start.c:287
    #7 0x4de26c in _start (/root/perl/perl+0x4de26c)

0x60200000e1af is located 1 bytes to the left of 10-byte region [0x60200000e1b0,0x60200000e1ba)
allocated by thread T0 here:
    #0 0x4c0beb in malloc (/root/perl/perl+0x4c0beb)
    #1 0x7f8617 in Perl_safesysmalloc /root/perl/util.c:153:21

SUMMARY: AddressSanitizer: heap-buffer-overflow /root/perl/toke.c:2975 S_scan_const
Shadow bytes around the buggy address:
  0x0c047fff9be0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff9bf0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff9c00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff9c10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff9c20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa 00 00
=>0x0c047fff9c30: fa fa fd fd fa[fa]00 02 fa fa 00 04 fa fa 01 fa
  0x0c047fff9c40: fa fa 05 fa fa fa 00 02 fa fa 00 02 fa fa 00 04
  0x0c047fff9c50: fa fa 02 fa fa fa 00 02 fa fa 00 07 fa fa 00 fa
  0x0c047fff9c60: fa fa 00 02 fa fa 05 fa fa fa 00 02 fa fa 06 fa
  0x0c047fff9c70: fa fa 00 02 fa fa 05 fa fa fa 00 05 fa fa 04 fa
  0x0c047fff9c80: fa fa 05 fa fa fa 05 fa fa fa 00 00 fa fa 00 02
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  ASan internal:           fe
==7216==ABORTING

A non-ASAN build of 5.20.2 errors out:
Missing braces on \o{} at -e line 1, within string
Execution of -e aborted due to compilation errors.
RT-Send-CC: rt-deliver-to-perl5-security-report [...] rt.perl.org
Download (untitled) / with headers
text/plain 553b
The tr/// range constructor in scan_const knows that the characters representing start and end of the range have been written to sv; however on parse errors we don't do that but still continue. The effect is to read a (possibly utf8) character before the start of sv's PV, but then still give the parse error. I don't think this is a security concern; if someone gives a +1 I'll push the patch and open up the ticket. The attached patch fixes it, so valgrind shows no error on 'y//\o-x/' nor on 'y//x-\o/', but I can't think how to write a test. Hugo
Subject: perl-129342
Download perl-129342
application/octet-stream 988b

Message body not shown because it is not plain text.

RT-Send-CC: rt-deliver-to-perl5-security-report [...] rt.perl.org
Download (untitled) / with headers
text/plain 679b
On Wed, 05 Oct 2016 07:08:23 -0700, hv wrote: Show quoted text
> The tr/// range constructor in scan_const knows that the characters > representing start and end of the range have been written to sv; > however on parse errors we don't do that but still continue. > > The effect is to read a (possibly utf8) character before the start of > sv's PV, but then still give the parse error. I don't think this is a > security concern; if someone gives a +1 I'll push the patch and open > up the ticket. > > The attached patch fixes it, so valgrind shows no error on 'y//\o-x/' > nor on 'y//x-\o/', but I can't think how to write a test.
Your fix looks fine to me. Patch with a test attached. Tony
Subject: 0001-perl-129342-test-for-buffer-overflow.patch
From d8435d6bd3614c149b02ee591dba8d7f769843fc Mon Sep 17 00:00:00 2001 From: Tony Cook <tony@develop-help.com> Date: Tue, 17 Jan 2017 11:52:53 +1100 Subject: (perl #129342) test for buffer overflow --- t/lib/croak/toke | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/t/lib/croak/toke b/t/lib/croak/toke index d35eab6..f1817b3 100644 --- a/t/lib/croak/toke +++ b/t/lib/croak/toke @@ -350,3 +350,10 @@ EXPECT syntax error at - line 4, near "Ï¡ time" (Might be a runaway multi-line Ï¡Ï¡ string starting on line 3) Execution of - aborted due to compilation errors. +######## +# NAME tr/// handling of mis-formatted \o characters +# may only fail with ASAN +tr/\o-0//; +EXPECT +Missing braces on \o{} at - line 2, within string +Execution of - aborted due to compilation errors. -- 2.1.4
RT-Send-CC: rt-deliver-to-perl5-security-report [...] rt.perl.org
Download (untitled) / with headers
text/plain 237b
On Mon, 16 Jan 2017 16:53:51 -0800, tonyc wrote: Show quoted text
> Your fix looks fine to me. > > Patch with a test attached.
Thanks Tony, I've pushed my fix and your test as 3dd4eaeb8a and 59143e29a7, and will move the ticket to the open queue. Hugo
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 1023b
I now think this is the wrong solution. There are other occurrences of where there is an error in the parse and we continue on, so that the tr code will have invalid inputs. Rather than fix them all (and we might miss some), why not have the tr code check if there has been an error, and if so, quit now. instead of forging ahead and getting into trouble? Trying to continue parsing in the face of errors is problematic. There are times when we can't obviously continue, and there are times when we can continue, so as to find the most number of problems in a single run. (I remember how frustrating it was to work with SNOBOL that quit on the first error. This was compounded by the fact that it was punch cards, and the turnaround was 8 hours per job, except in the middle of the night.) And there are times when we think we can continue, but it creates problems. This is one of them. And we could continue except in the case of tr. So I think the tr code should be the part that changes. -- Karl Williamson
RT-Send-CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 1.6k
On Fri, 20 Jan 2017 10:49:40 -0800, khw wrote: Show quoted text
> I now think this is the wrong solution. > > There are other occurrences of where there is an error in the parse > and we continue on, so that the tr code will have invalid inputs. > Rather than fix them all (and we might miss some), why not have the tr > code check if there has been an error, and if so, quit now. instead of > forging ahead and getting into trouble? > > Trying to continue parsing in the face of errors is problematic. > There are times when we can't obviously continue, and there are times > when we can continue, so as to find the most number of problems in a > single run. (I remember how frustrating it was to work with SNOBOL > that quit on the first error. This was compounded by the fact that it > was punch cards, and the turnaround was 8 hours per job, except in the > middle of the night.) And there are times when we think we can > continue, but it creates problems. This is one of them. And we could > continue except in the case of tr. So I think the tr code should be > the part that changes.
I'm happy to consider replacing my fix with such an approach - do you have an implementation to show? "Wrong" feels like an overstatement though, and I don't think further improvements belong on this ticket. More generally, scan_const as a whole has become unwieldy enough to merit some refactoring. And in considering the current complexity, I'm also looking askance at your recently added cases for range_min == range_max and range_min + 1 == range_max: a small compile-time benefit for code that nobody would normally write does not seem worth any added complexity to the implementation. Hugo
From: Karl Williamson <public [...] khwilliamson.com>
To: perlbug-followup [...] perl.org
Date: Sat, 21 Jan 2017 15:43:50 -0700
Subject: Re: [perl #129342] heap-buffer-overflow S_scan_const (toke.c:2975)
CC: perl5-porters [...] perl.org
Download (untitled) / with headers
text/plain 3.6k
On 01/21/2017 03:29 AM, Hugo van der Sanden via RT wrote: Show quoted text
> On Fri, 20 Jan 2017 10:49:40 -0800, khw wrote:
>> I now think this is the wrong solution. >> >> There are other occurrences of where there is an error in the parse >> and we continue on, so that the tr code will have invalid inputs. >> Rather than fix them all (and we might miss some), why not have the tr >> code check if there has been an error, and if so, quit now. instead of >> forging ahead and getting into trouble? >> >> Trying to continue parsing in the face of errors is problematic. >> There are times when we can't obviously continue, and there are times >> when we can continue, so as to find the most number of problems in a >> single run. (I remember how frustrating it was to work with SNOBOL >> that quit on the first error. This was compounded by the fact that it >> was punch cards, and the turnaround was 8 hours per job, except in the >> middle of the night.) And there are times when we think we can >> continue, but it creates problems. This is one of them. And we could >> continue except in the case of tr. So I think the tr code should be >> the part that changes.
> > I'm happy to consider replacing my fix with such an approach - do you have an implementation to show? "Wrong" feels like an overstatement though, and I don't think further improvements belong on this ticket.
Perhaps you take the word "wrong" more seriously than I meant it. The fix fixes the bug reported in this ticket. In that sense it's right. But in thinking about it, there appear to be similar issues that weren't addressed by this ticket,l and I think the approach you took could be extended to fix these, but we might miss some, or some might get added later by someone not familiar with the nuances. I was suggesting a different approach, as I believe that the tr code is the only part of this routine that relies on reading the results of previous loop iterations. I hadn't come up with an implementation, but it would essentially be done by looking at the syntax error count, at the point where it otherwise would be looking at the results of the previous iterations, and if that count was not zero, bailing out immediately. Perhaps this could be abstracted into a macro or function. I don't like the asymmetry currently where it calls yyerror or Perl_croak depending on whether to bail out later or now. I would prefer there to be something like yyerror_quit_now(), and it might display a message to the effect of not being able to continue parsing. Show quoted text
> > More generally, scan_const as a whole has become unwieldy enough to merit some refactoring.
It's been unwieldy longer than I've been involved in this project; and in fact a comment last touched by Dave Mitchell says it is "terrifying code". I've tried to help by adding comments. I haven't spent any time considering how to refactor, other than to extract out some code that was common to regcomp.c and this function (and with subtle variations) into separate functions. As always, patches welcome. And in considering the current complexity, I'm also looking askance at your recently added cases for range_min == range_max and range_min + 1 == range_max: a small compile-time benefit for code that nobody would normally write does not seem worth any added complexity to the implementation. You're right, but the reason I did this is that there is now code below that doesn't work correctly unless the range is at least 3 elements long. Rather than special casing that, it seemed to me preferable to just handle the cases earlier when it actually would be of some (albeit uncommon) benefit. Show quoted text
> > Hugo > > --- > via perlbug: queue: perl5 status: pending release > https://rt.perl.org/Ticket/Display.html?id=129342 >
Date: Tue, 14 Feb 2017 14:23:54 -0700
CC: perl5-porters [...] perl.org
To: perlbug-followup [...] perl.org
Subject: Re: [perl #129342] heap-buffer-overflow S_scan_const (toke.c:2975)
From: Karl Williamson <public [...] khwilliamson.com>
Download (untitled) / with headers
text/plain 6.2k
On 01/21/2017 03:43 PM, Karl Williamson wrote: Show quoted text
> On 01/21/2017 03:29 AM, Hugo van der Sanden via RT wrote:
>> On Fri, 20 Jan 2017 10:49:40 -0800, khw wrote:
>>> I now think this is the wrong solution. >>> >>> There are other occurrences of where there is an error in the parse >>> and we continue on, so that the tr code will have invalid inputs. >>> Rather than fix them all (and we might miss some), why not have the tr >>> code check if there has been an error, and if so, quit now. instead of >>> forging ahead and getting into trouble? >>> >>> Trying to continue parsing in the face of errors is problematic. >>> There are times when we can't obviously continue, and there are times >>> when we can continue, so as to find the most number of problems in a >>> single run. (I remember how frustrating it was to work with SNOBOL >>> that quit on the first error. This was compounded by the fact that it >>> was punch cards, and the turnaround was 8 hours per job, except in the >>> middle of the night.) And there are times when we think we can >>> continue, but it creates problems. This is one of them. And we could >>> continue except in the case of tr. So I think the tr code should be >>> the part that changes.
>> >> I'm happy to consider replacing my fix with such an approach - do you >> have an implementation to show? "Wrong" feels like an overstatement >> though, and I don't think further improvements belong on this ticket.
> > Perhaps you take the word "wrong" more seriously than I meant it. The > fix fixes the bug reported in this ticket. In that sense it's right. > But in thinking about it, there appear to be similar issues that weren't > addressed by this ticket,l and I think the approach you took could be > extended to fix these, but we might miss some, or some might get added > later by someone not familiar with the nuances. I was suggesting a > different approach, as I believe that the tr code is the only part of > this routine that relies on reading the results of previous loop > iterations. > > I hadn't come up with an implementation, but it would essentially be > done by looking at the syntax error count, at the point where it > otherwise would be looking at the results of the previous iterations, > and if that count was not zero, bailing out immediately. Perhaps this > could be abstracted into a macro or function. I don't like the > asymmetry currently where it calls yyerror or Perl_croak depending on > whether to bail out later or now. I would prefer there to be something > like yyerror_quit_now(), and it might display a message to the effect of > not being able to continue parsing.
>> >> More generally, scan_const as a whole has become unwieldy enough to >> merit some refactoring.
> > It's been unwieldy longer than I've been involved in this project; and > in fact a comment last touched by Dave Mitchell says it is "terrifying > code". I've tried to help by adding comments. I haven't spent any time > considering how to refactor, other than to extract out some code that > was common to regcomp.c and this function (and with subtle variations) > into separate functions. As always, patches welcome. > And in considering the current complexity, I'm also looking askance at > your recently added cases for range_min == range_max and range_min + 1 > == range_max: a small compile-time benefit for code that nobody would > normally write does not seem worth any added complexity to the > implementation. > > You're right, but the reason I did this is that there is now code below > that doesn't work correctly unless the range is at least 3 elements > long. Rather than special casing that, it seemed to me preferable to > just handle the cases earlier when it actually would be of some (albeit > uncommon) benefit. >
>>
In the end, I came to the conclusion that Hugo's original approach is probably the best. I pushed 6b58f9be8c82394d776908727ff4e9d582f87f12 to that effect: commit 6b58f9be8c82394d776908727ff4e9d582f87f12 Author: Karl Williamson <khw@cpan.org> Date: Mon Feb 13 19:22:59 2017 -0700 * toke.c: Make sure things are initialized Commit 3dd4eaeb8ac39e08179145b86aedda36584a3509 fixed a bug wherein the tr/// operator parsing code could be looking at uninitialized data. This happens only because we try to carry on when we find errors, so as to find as many errors as possible in a single run, as a convenience to the person debugging the script being compiled. And we failed to initialize stuff upon getting an error; stuff that was later looked at by tr///. That commit fixed the ticket by making sure the things mentioned there got initialized upon error, but didn't handle the various other places in the loop where the same thing could happen. At the time, I thought it would be easier to instead change the tr/// handling code to know that its inputs were problematic, and to avoid looking at them in that case. This is easily done, and would automatically catch all the cases in the loop, now and any added in the future. But then I thought, maybe tr/// isn't the only operator that could be thrown off by this. It is the most obvious one, to someone who knows how it goes about getting compiled; but there may be other operators that I don't know how they get compiled and have the same or a similar problem. The better solution then would be to extend 3dd4eaeb8ac39e08179145b86aedda36584a3509 to make sure everything gets initialized when there is an error. That is what this current commit does. The previous few commits have refactored things so as to minimize the number of places that need to be handled here, down to three. I kinda doubt that new constructs will be added, at this stage in the language development, that would require the same initialization handling. But, if they were, hopefully those doing it would follow the existing paradigm that this commit and 3dd4eaeb8ac39e08179145b86aedda36584a3509 establish. Another way to handle this would have been to, instead of doing an initialize-and-'continue', to instead jump to a common label at the bottom of the loop which does the initialization. I think it doesn't matter much which, so left it as this.
Download (untitled) / with headers
text/plain 313b
Thank you for filing this report. You have helped make Perl better. With the release today of Perl 5.26.0, this and 210 other issues have been resolved. Perl 5.26.0 may be downloaded via: https://metacpan.org/release/XSAWYERX/perl-5.26.0 If you find that the problem persists, feel free to reopen this ticket.


This service is sponsored and maintained by Best Practical Solutions and runs on Perl.org infrastructure.

For issues related to this RT instance (aka "perlbug"), please contact perlbug-admin at perl.org