Simple Regex causes SEGV when run on specific data #8231
From ralphbolton@mail2Sexy.com
Created by ralphbolton@mail2sexy.com
I've managed to track down a problem with Perl 5.8.6 (perl-5.8.6-18, RPM
The problem seems to be in running a fairly simple regex on a specific
In case Perlbug has any transmission issues, see
Perl Info
|
From @iabyn
On Sun, Dec 04, 2005 at 03:26:23PM -0800, Ralph Bolton wrote:
The code can be reduced to the following (reading from a var rather than a
my $s = "\xa2\xf8"; open F, "<:utf8",\$s;
outputs:
utf8 "\xA2" does not map to Unicode at /tmp/p3 line 6, <F> line 1.
Presumably feeding in malformed utf8 is tripping something over. I haven't
-- |
The RT System itself - Status changed from 'new' to 'open' |
From @nwc10
On Mon, Dec 05, 2005 at 12:33:18PM +0000, Dave Mitchell wrote:
I can't get it to reliably crash with that minimal input on FreeBSD. With
Program received signal SIGBUS, Bus error.
That Move will expand to a call to memmove(d, s, i+1) - clearly a length of -1
Nicholas Clark |
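To illustrate why a length of -1 is fatal in a call like memmove(d, s, i+1), here is a minimal C sketch. The helper names `as_memmove_length` and `guarded_move` are hypothetical and do not come from the Perl source; the sketch only shows the signed-to-unsigned conversion and one possible guard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the byte count memmove() actually receives when a
 * signed length expression is converted to its size_t parameter. */
size_t as_memmove_length(ptrdiff_t n)
{
    return (size_t)n;   /* -1 wraps around to SIZE_MAX */
}

/* Hypothetical guard (an assumption, not Perl's Move() macro): refuse
 * negative or oversized lengths instead of passing them to memmove(). */
int guarded_move(char *dst, const char *src, ptrdiff_t n, size_t cap)
{
    assert(dst != NULL && src != NULL);
    if (n < 0 || (size_t)n > cap)
        return 0;                   /* rejected, nothing copied */
    memmove(dst, src, (size_t)n);
    return 1;                       /* copied n bytes */
}
```

With a 64-bit size_t, `as_memmove_length(-1)` is 2^64 - 1, so the copy runs off the end of mapped memory, which fits the SIGBUS/SEGV reported above.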
From BQW10602@nifty.com
On Mon, 5 Dec 2005 12:41:18 +0000, Nicholas Clark <nick@ccl4.org> wrote
utf8n_to_uvchr returns 0 for the malformed utf-8. Therefore regexec.c:S_reginclass (falsely?) matches [\000]
#!perl
Perhaps it would be better if reginclass() croaked on malformed utf-8.
Regards,
But the error message is not good,
Inline Patch
--- regexec.c~ Wed Nov 30 23:24:19 2005
+++ regexec.c Tue Dec 06 00:14:56 2005
@@ -4710,9 +4710,13 @@
STRLEN len = 0;
STRLEN plen;
- if (do_utf8 && !UTF8_IS_INVARIANT(c))
- c = utf8n_to_uvchr(p, UTF8_MAXBYTES, &len,
- ckWARN(WARN_UTF8) ? 0 : UTF8_ALLOW_ANY);
+ if (do_utf8 && !UTF8_IS_INVARIANT(c)) {
+ c = utf8n_to_uvchr(p, UTF8_MAXBYTES, &len,
+ ckWARN(WARN_UTF8) ? UTF8_CHECK_ONLY :
+ UTF8_ALLOW_ANYUV|UTF8_CHECK_ONLY);
+ if (len == (STRLEN)-1)
+ Perl_croak(aTHX_ "Malformed UTF-8 character (fatal)");
+ }
plen = lenp ? *lenp : UNISKIP(NATIVE_TO_UNI(c));
if (do_utf8 || (flags & ANYOF_UNICODE)) {
### END OF PATCH |
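The patch relies on the decoder reporting failure through the length (len == (STRLEN)-1), because a return value of 0 is also the valid character NUL. A minimal C sketch of that convention follows. `decode_utf8`, `is_malformed` and `MALFORMED` are hypothetical names; this is a simplified stand-in, not Perl's utf8n_to_uvchr(), and it omits overlong and surrogate checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MALFORMED ((size_t)-1)   /* sentinel length, as in the patch's check */

/* Hypothetical simplified decoder (assumption, not the Perl API).
 * Success: returns the code point and stores the bytes consumed in *len.
 * Failure: returns 0 and stores MALFORMED in *len. */
uint32_t decode_utf8(const unsigned char *p, size_t avail, size_t *len)
{
    size_t need, i;
    uint32_t cp;

    assert(p != NULL && len != NULL);
    *len = MALFORMED;
    if (avail == 0)
        return 0;

    if (p[0] < 0x80)                { *len = 1; return p[0]; }
    else if ((p[0] & 0xE0) == 0xC0) { need = 2; cp = p[0] & 0x1F; }
    else if ((p[0] & 0xF0) == 0xE0) { need = 3; cp = p[0] & 0x0F; }
    else if ((p[0] & 0xF8) == 0xF0) { need = 4; cp = p[0] & 0x07; }
    else return 0;   /* stray continuation byte or unsupported start byte */

    if (need > avail)
        return 0;    /* sequence truncated by the end of the buffer */
    for (i = 1; i < need; i++) {
        if ((p[i] & 0xC0) != 0x80)
            return 0;                /* not a continuation byte */
        cp = (cp << 6) | (p[i] & 0x3F);
    }
    *len = need;
    return cp;
}

/* Caller-side test: look at the length, never at the returned value. */
int is_malformed(const unsigned char *p, size_t avail)
{
    size_t len;
    (void)decode_utf8(p, avail, &len);
    return len == MALFORMED;
}
```

In this sketch both "\xa2\xf8" and a lone NUL byte make `decode_utf8` return 0, but only the former sets the sentinel length, which is the distinction a caller needs in order to avoid falsely matching [\000].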
From BQW10602@nifty.com
Oops, I am replying to my own previous mail.
For UTF8_ALLOW_ANY allows all kinds of malformed utf8,
Regards, |
From @rgs
SADAHIRO Tomoyuki wrote:
OK, I've applied this as #26258, thanks.
It should even be (F) (S utf8) since it's enabled by default.
|
@rgs - Status changed from 'open' to 'resolved' |
From BQW10602@nifty.com
[perl #37836] has been resolved by change 26258.
In the above perl script, reginclass() against the string was called
UTF8SKIP at "\xa2" is 1 and UTF8SKIP at "\xf8" is 5 (see utf8.h).
In find_byclass(), "\xa2" passed the test of (s + uskip <= strend).
In regrepeat() which was called from the regtry() indirectly,
//// regexec.c#find_byclass case ANYOF:
//// regexec.c#regrepeat case ANYOF:
At last pp_subst() tried to Move() a very large chunk.
//// pp_hot.c#pp_subst /* can do inplace substitution? */
Conclusions:
- UTF8SKIP is certainly fast (since it reads only the first octet)
Since change 26258, reginclass() croaks on malformed utf8. If reginclass()
Regards, |
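The overshoot described above can be sketched in C. `skip_by_first_byte` is a hypothetical stand-in for a first-octet lookup such as UTF8SKIP: the values for "\xa2" (1) and "\xf8" (5) are the ones quoted in the message, while the other ranges are an assumption based on the usual extended UTF-8 layout. The two walkers are illustrative only and are not the code in find_byclass() or regrepeat().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical first-octet length lookup; it never inspects later bytes. */
size_t skip_by_first_byte(unsigned char c)
{
    if (c < 0xC0) return 1;   /* ASCII or stray continuation byte (0xA2) */
    if (c < 0xE0) return 2;
    if (c < 0xF0) return 3;
    if (c < 0xF8) return 4;
    if (c < 0xFC) return 5;   /* 0xF8 claims a 5-byte sequence */
    return 6;                 /* assumption for the remaining lead bytes */
}

/* Advances by the claimed length without checking what remains.
 * The returned offset can exceed n. */
size_t walk_unchecked(const unsigned char *buf, size_t n)
{
    size_t off = 0;
    assert(buf != NULL);
    while (off < n)
        off += skip_by_first_byte(buf[off]);
    return off;
}

/* Stops before any character whose claimed length runs past the end. */
size_t walk_checked(const unsigned char *buf, size_t n)
{
    size_t off = 0;
    assert(buf != NULL);
    while (off < n) {
        size_t skip = skip_by_first_byte(buf[off]);
        if (skip > n - off)
            break;            /* would overshoot the buffer */
        off += skip;
    }
    return off;
}
```

For the two-byte input "\xa2\xf8", the unchecked walk ends at offset 6, four bytes past the end, so a later "end minus current position" computation goes negative. That is one way a very large length could reach Move(). The checked walk stops at offset 1.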
@khwilliamson - Status changed from 'resolved' to 'open' |
From @khwilliamson
This bug may be fixed, but the test for it in t/re/pat_rt_report.t is
I'm not sure how to fix this without further research.
--Karl Williamson |
From @khwilliamson
The bug had been fixed, but the test was defective, now fixed by 7602410 |
@khwilliamson - Status changed from 'open' to 'pending release' |
From @khwilliamson
Thank you for filing this report. You have helped make Perl better.
With the release today of Perl 5.26.0, this and 210 other issues have been
Perl 5.26.0 may be downloaded via:
If you find that the problem persists, feel free to reopen this ticket. |
@khwilliamson - Status changed from 'pending release' to 'resolved' |
Migrated from rt.perl.org#37836 (status was 'resolved')
Searchable as RT37836$