New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Heap buffer overflow #17116
Comments
From nguyenmanhdung1710@gmail.comHi All, Thanks, ====================================== - Crafted PoC: ASAN says: 0x7fdd62cd4e35 is located 13877 bytes inside of 617548-byte region previously allocated by thread T0 here:
|
From nguyenmanhdung1710@gmail.comHi All, Thanks, ====================================== - Crafted PoC: ASAN says: 0x7f0d2e16a7e5 is located 27 bytes to the left of 1197244-byte region previously allocated by thread T0 here:
|
From @jkeenanOn Thu, 01 Aug 2019 07:15:46 GMT, nguyenmanhdung1710@gmail.com wrote:
Thank you for this report. However, in this report -- as well as the five other reports you filed today -- it's not clear what the perl program was that you used to generated the error or crash. Could you please *attach* the program you used to this ticket (or to your email response)? And do so likewise for the other five tickets? Thank you very much. -- |
The RT System itself - Status changed from 'new' to 'open' |
From nguyenmanhdung1710@gmail.comOn Thu, 01 Aug 2019 11:25:14 -0700, jkeenan wrote:
Hi James, git clone git://perl5.git.perl.org/perl.git perl-headgit log -n 1 HEAD autodoc.pl: Forget heuristics, we have a flag
|
From nguyenmanhdung1710@gmail.comOn Thu, 01 Aug 2019 00:16:00 -0700, nguyenmanhdung1710@gmail.com wrote:
As requested by James E Keenan, I add the binaries of Perl (commit 45f8e7b on the branch blead): |
From nguyenmanhdung1710@gmail.comHi James, git clone git://perl5.git.perl.org/perl.git perl-head*git log -n 1 HEAD* autodoc.pl: Forget heuristics, we have a flag We know if something is a macro because of the 'm' flag. Don't use the
|
From nguyenmanhdung1710@gmail.comOn Thu, 01 Aug 2019 21:21:28 -0700, nguyenmanhdung1710@gmail.com wrote:
Hi James, Sorry, I misunderstood your question. I found these bugs using fuzzing, hence the PoC inputs are crafted. I will try to create Perl programs triggering these bugs, but it is not easy for me. Also, by looking at the stacktrace, I think we can understand the root cause and propose patches. Thanks, |
From @tonycozAttached the PoC to save people time. |
The RT System itself - Status changed from 'new' to 'open' |
From @hvdsOn Sun, 11 Aug 2019 23:09:07 -0700, tonyc wrote:
Here's a shorter version. I was unaware of the compounding effect of multiple \Q - I would have expected "\Q\Q+" to result in '\Q\+' rather than '\\\+' - and I suspect the long string resulting is a prime mover in this bug. perl -e ' Hugo |
From @hvdsOn Mon, 12 Aug 2019 05:46:19 -0700, hv wrote:
Er, I would have expected '\\Q\+', I mean. |
From @khwilliamsonI vaguely recall an issue with \Q compounding years ago Sent from my iPhone
|
From @hvdsOn Mon, 12 Aug 2019 05:46:19 -0700, hv wrote:
We can simplify the printf pattern to "/\\1|%s%s%s /i", and we can also replace "\xff" with "\\xff" leaving us just simple ascii to pass around. Starting to trace this through, I note that the removal of the sizing pass appears to lead to under-allocation of RExC_rxi up front - RExC_size is counted in regnodes, but we allocate only that many chars. Fixing that doesn't solve this out-of-bounds read, but I imagine we probably want it in any case . I have a feeling Tony commented on this already, but I can't find the message - Karl, have you looked at this already? Hugo Inline Patchdiff --git a/regcomp.c b/regcomp.c
index cf9246473f..f8bef77f0b 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -7671,11 +7671,11 @@ Perl_re_op_compile(pTHX_ SV ** const patternp, int pat_count,
RExC_size = STR_SZ(RExC_end - RExC_start);
}
- Newxc(RExC_rxi, sizeof(regexp_internal) + RExC_size, char, regexp_internal);
+ Newxc(RExC_rxi, sizeof(regexp_internal) + RExC_size * sizeof(regnode), char, regexp_internal);
if ( RExC_rxi == NULL )
FAIL("Regexp out of space");
- Zero(RExC_rxi, sizeof(regexp_internal) + RExC_size, char);
+ Zero(RExC_rxi, sizeof(regexp_internal) + RExC_size * sizeof(regnode), char);
RXi_SET( RExC_rx, RExC_rxi );
/* We start from 0 (over from 0 in the case this is a reparse. The first |
From @tonycozOn Mon, Aug 12, 2019 at 07:35:59AM -0700, Hugo van der Sanden via RT wrote:
That was against #134329, which feels similar to this. After some discussion with Karl in IRC the bytes vs regnodes shouldn't Tony |
From @khwilliamsonOn Mon, 12 Aug 2019 05:47:29 -0700, hv wrote:
Using Tony's, case, this is not the same thing as #134329. But again, it's caused by the same lack of switching to long jumps. To reduce the number of passes, if we're already having to do a pass because we need to count the parenthetical groups, the code doesn't restart immediately when it decides it needs to use long jumps. This works, except in a few cases, it can't really keep going. And this ticket is one of them. The easiest thing is to restart immediately even when counting parens. This adds a pass. The other way would be to restart immediately only in those cases where it might be needed. I have to examine the code to decide how clear cut that is. If it isn't, err on the side of safety and always restart immediately. |
From @khwilliamsonOn Mon, 12 Aug 2019 07:49:58 -0700, tonyc wrote:
No this isn't the cause of the problems. I had thought the comments explained it adequately: /* On the first pass of the parse, we guess how big this will be. Then The point of this is to allocate a bunch of memory in one gulp. Then we give it back, and hope that the system isn't so busy that other processes gobble it up, as we allocate what we actually need as we go along parsing. I used the guess of one output byte + overhead per one input byte. That's the only thing we know about the string at this point: its length. I have changed other places in the core to do the same. But better heuristics are welcome. On small patterns, whatever we do won't matter much, but on larger patterns, my guess is that much of that space is literal characters that need to be matched, like in DNA sequences. So I think this is a reasonable guess. We could omit this entirely, of course. But by doing it, we make sure that the system has a sufficient amount of memory available at the beginning, and I think we reduce the number of mallocs that actually go out and have to consolidate free space. -- |
From @tonycozOn Fri, 23 Aug 2019 13:37:55 -0700, khw wrote:
If I understand a minimal exact match regexp will have 3 nodes: - the REG_MAGIC node at the front (regcomp.c:7717), but this is included in the size of the regexp_internal structure, so we don't need to count it) so a better initial allocation might be to allocate 2 nodes plus the nodes needed for the text. Right now for a 1 to 4 length input we allocate only enough for regexp_internal with it's built-in zeroth node, and 1 byte more. As soon as we try to emit something that will be reallocated, and if the end of program[] is on an allocation boundary that requires a new block to be allocated (as opposed to there being enough free space in the block). All that said, the performance impact is a single realloc() per compiled regexp, so it shouldn't be a big deal. I suspect the structure of change_engine_size() is more of a problem, since it (and its callers) mostly seem to allocate what's needed right now, so emitting 100 nodes might end up doing 100 realloc calls. $ gdb --args ./miniperl -Dr -e '/\w\W+\w+/' Breakpoint 1, S_change_engine_size (pRExC_state=0x7fffffffdd10, size=1) Breakpoint 1, S_change_engine_size (pRExC_state=0x7fffffffdd10, size=1) Breakpoint 1, S_change_engine_size (pRExC_state=0x7fffffffdd10, size=1) Breakpoint 1, S_change_engine_size (pRExC_state=0x7fffffffdd10, size=1) Breakpoint 1, S_change_engine_size (pRExC_state=0x7fffffffdd10, size=1) Breakpoint 1, S_change_engine_size (pRExC_state=0x7fffffffdd10, size=1) (exactish nodes allocate 64 nodes to start with and cause less churn.)
I think allocating a bit more space wouldn't hurt, any extra space is returned if the realloc implementation supports it. Tony |
From @khwilliamsonFixed by 439a3bf Thanks for finding and reporting this |
@khwilliamson - Status changed from 'open' to 'pending release' |
From nguyenmanhdung1710@gmail.comOn Mon, 26 Aug 2019 21:01:25 -0700, khw wrote:
Thanks for the patch. Do you think it is an exploitable bug? Can I request a CVE for this bug? |
From @khwilliamsonOn Thu, 01 Aug 2019 22:31:26 -0700, nguyenmanhdung1710@gmail.com wrote:
This was fixed by commit 439a3bf PATCH: [perl #134325] Heap buffer overflow so I'm merging this ticket into that ticket |
From nguyenmanhdung1710@gmail.comThanks for the patch. Do you think it is an exploitable bug? Can I request Le mar. 27 août 2019 à 06:01, Karl Williamson via RT <
|
From @xsawyerxOn Sat, 31 Aug 2019 11:15:36 -0700, nguyenmanhdung1710@gmail.com wrote:
Hi, I'm quoting Tony Cook here: All cases for both tickets are bad reads, either of freed memory, or None of the reads result in returning data to a potential attacker According to our usual criteria such reads aren't a security issue. Can an attacker craft a regexp with data at the offset 65535 point to I'm not sure. While we are looking into this, we would appreciate any help in proving this. If we can answer Tony's questions, we can discern better if this suits as a security issue. |
Migrated from rt.perl.org#134325 (status was 'pending release')
Searchable as RT134325$
The text was updated successfully, but these errors were encountered: