UTF-8 string substitution corrupts memory #7724
From sroy@search-box.com
This is a bug report for perl from sroy@search-box.com,
The following test program corrupts perl's memory:
#!/usr/bin/perl
use Encode;
$_ = decode_utf8('title: �Ã�¿�Â��Ã�¿�Ã�¿�Â��Â��Ã�¿�Â��Â��Ã�¿�Ã�¿�Â��Ã�¿�Â��Ã�¿�Ã�¿�Â�', 1);
Since it's a memory corruption, it may or may not crash when running
1. gdb debugperl (a perl compiled with debugging on)
The breakpoint stops the first time perl needs to check whether a
6. fin
Now the debugger is sitting at the line that corrupts prog->startp.
ANALYSIS: In the middle of processing the regular expression, The regex library
WORKAROUND: Adding a 'use utf8' pragma at the top of the program seems to load everything
Scott
Flags:
Site configuration information for perl v5.8.4:
Configured by Debian Project at Mon Oct 25 01:52:37 EST 2004.
Summary of my perl5 (revision 5 version 8 subversion 4) configuration:
Locally applied patches:
@INC for perl v5.8.4:
Environment for perl v5.8.4: |
From hsr@cs.stanford.edu
It looks like something in the mail chain doesn't like bytes with their high order bits set. Here's a corrected test program that's ASCII.
Scott
#!/usr/bin/perl
use Encode;
$_ = decode_utf8("\164\151\164\154\145\72\40\303\277\302\200\303\277\303\277\302\202\302\222\303\277\302\217\302\220\303\277\303\277\302\210\303\277\302\201\303\277\303\277\302\202", 1); |
The RT System itself - Status changed from 'new' to 'open' |
From @nwc10
On Sat, Dec 25, 2004 at 09:50:46PM -0000, sroy @ search-box. com wrote:
I can recreate this on OS X when running with the perl debugger. I can't
Thanks for the analysis, which seems to be spot on. (Seems, because I'm no
Ideally we'd really like to re-write the regexp engine sufficiently to remove
Currently there are kludges to save enough state to theoretically make the
/* XXX Here's a total kludge. But we need to re-enter for swash routines. */
void
but what doesn't make sense to me is why PL_bostr isn't being saved (or
The realistic fix is going to be to make it save and restore correctly for
Nicholas Clark |
From @nwc10
On Thu, Dec 30, 2004 at 06:56:49PM +0000, Nicholas Clark wrote:
OK. I underestimate my brute force and ignorance. With a watchpoint on
0 Perl_pp_match (my_perl=0x800200) at pp_hot.c:1288
where the culprit is #9, which was exercising this code *before* saving the
if (!gv_fetchmeth(stash, "SWASHNEW", 8, -1)) { /* demand load utf8 */
The appended patch seems to cure the problem for me, but I'm not confident
Nicholas Clark
==== //depot/perl/utf8.c#212 - /Users/nick/p4perl/perl/utf8.c ====
Inline Patch
--- /tmp/tmp.26186.0 Thu Dec 30 19:46:18 2004
+++ /Users/nick/p4perl/perl/utf8.c Thu Dec 30 19:24:46 2004
@@ -1581,6 +1581,8 @@ Perl_swash_init(pTHX_ char* pkg, char* n
HV *stash = gv_stashpvn(pkg, pkg_len, FALSE);
SV* errsv_save;
+ ENTER;
+ save_re_context();
if (!gv_fetchmeth(stash, "SWASHNEW", 8, -1)) { /* demand load utf8 */
ENTER;
errsv_save = newSVsv(ERRSV);
@@ -1601,10 +1603,8 @@ Perl_swash_init(pTHX_ char* pkg, char* n
PUSHs(sv_2mortal(newSViv(minbits)));
PUSHs(sv_2mortal(newSViv(none)));
PUTBACK;
- ENTER;
SAVEI32(PL_hints);
PL_hints = 0;
- save_re_context();
if (IN_PERL_COMPILETIME) {
/* XXX ought to be handled by lex_start */
SAVEI32(PL_in_my); |
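The ordering problem the patch above addresses can be sketched in a few lines of standalone C. This is only an illustration of the general save-before-re-entry pattern, not perl's actual implementation: `re_bostr`, `scope_enter_save`, `scope_leave_restore`, `demand_load`, `load_unprotected` and `load_protected` are all hypothetical names, loosely standing in for `PL_bostr`, `ENTER; save_re_context();`, `LEAVE;`, and the demand-load of utf8 respectively.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for global regex state such as PL_bostr. */
static const char *re_bostr = NULL;

/* A tiny save stack; an assumption for illustration, not perl's savestack. */
#define SAVE_MAX 16
static const char *save_stack[SAVE_MAX];
static size_t save_top = 0;

/* Roughly analogous to "ENTER; save_re_context();": remember current state. */
static void scope_enter_save(void)
{
    assert(save_top < SAVE_MAX);
    save_stack[save_top++] = re_bostr;
}

/* Roughly analogous to "LEAVE;": put back whatever was saved. */
static void scope_leave_restore(void)
{
    assert(save_top > 0);
    re_bostr = save_stack[--save_top];
}

/* Simulates a demand-load that re-enters the regex engine and
 * overwrites the global state as a side effect. */
static void demand_load(void)
{
    re_bostr = "inner subject";
}

/* Problem ordering: the re-entrant call runs before the state is saved,
 * so the save records the already-clobbered value. */
static const char *load_unprotected(const char *outer)
{
    re_bostr = outer;
    demand_load();
    scope_enter_save();
    scope_leave_restore();
    return re_bostr;
}

/* Patched ordering: save first, then make the re-entrant call. */
static const char *load_protected(const char *outer)
{
    re_bostr = outer;
    scope_enter_save();
    demand_load();
    scope_leave_restore();
    return re_bostr;
}
```

In this sketch, `load_protected` hands back the caller's original pointer, while `load_unprotected` hands back the value the inner call left behind. That is the same shape as the outer match continuing with stale state after the swash code has re-entered the engine.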
From @hvds
Nicholas Clark <nick@ccl4.org> wrote:
I notice that this changes the order things are stacked; I'm not sure if
Hugo |
From @nwc10
On Fri, Dec 31, 2004 at 03:04:49PM +0000, hv@crypt.org wrote:
Aha. I was rather hoping that someone would be able to tell me if the changes
if (!gv_fetchmeth(stash, "SWASHNEW", 8, -1)) { /* demand load utf8 */
except that seems to be wasteful, as it would mean doing all the save work
Nicholas Clark |
From @iabyn
On Fri, Dec 31, 2004 at 03:04:49PM +0000, hv@crypt.org wrote:
I'd have thought that the better approach would be to move the
if (!gv_fetchmeth(stash, "SWASHNEW", 8, -1)) { /* demand load utf8 */
block further down to just above the line
if (call_method("SWASHNEW", G_SCALAR))
then the call to Perl_load_module is protected by the PUSHSTACKi. Note
-- |
From @iabyn
On Fri, Dec 31, 2004 at 04:11:16PM +0000, Dave Mitchell wrote:
Nicholas just reminded me of this outstanding issue from December, so I've
Dave.
--
Change 24084 by davem@davem-splatty on 2005/03/26 21:25:47
[perl #33185] UTF-8 string substitution corrupts memory
Affected files ...
... //depot/perl/utf8.c#223 edit
Differences ...
==== //depot/perl/utf8.c#223 (text) ====
@@ -1578,6 +1578,11 @@
+ PUSHSTACKi(PERLSI_MAGIC); |
@iabyn - Status changed from 'open' to 'resolved' |
Migrated from rt.perl.org#33185 (status was 'resolved')
Searchable as RT33185$