Report information
Id: 62646
Status: resolved
Priority: 0/
Queue: perl5

Owner: Nobody
Requestors: fbriere [at] fbriere.net
perlbug [at] plan9.de
skylar2 [at] u.washington.edu
Cc:
AdminCc:

Operating System: Linux
PatchStatus: (no value)
Severity: low
Type: core
Perl Version: 5.8.8
Fixed In: (no value)



Subject: Maximum string length with substr
Date: Thu, 22 Jan 2009 11:11:42 -0800
To: perlbug [...] perl.org
From: "SKYLAR S. THOMPSON" <skylar2 [...] gs.washington.edu>
This is a bug report for perl from skylar2@u.washington.edu,
generated with the help of perlbug 1.35 running under perl v5.8.8.

-----------------------------------------------------------------
[Please enter your report here]

We've run into a problem where substr appears to have a maximum string length of 2^31 bytes even on 64-bit hosts (or at least AMD64 hosts). We've run into this problem with the Red Hat-supplied Perl 5.8.8 on RHEL5, and also a self-compiled install of 5.10.0. Here's a script to exercise the bug:

===
#!/usr/bin/perl
use strict;
use warnings;
my($file,$chars,$text,$length);
$file = "/dev/zero";
open(FILE, "< $file") or die "Can't open $file for reading: $!\n";
$text = q{ }; # Make sure there's at least one character to run substr on
do {
    read(FILE,$chars,1);
    $text .= $chars;
} while(substr($text,1,1)); # This appears to die when $text is 2GB
close(FILE);
$length = length($text);
print "substr died when text was $length bytes long.\n";
===

This will create this output:

===
substr outside of string at /net/gs/vol1/home/skylar2/cfm/scripts/nick/big_substr.pl line 13.
substr died when text was 2147483649 bytes long.
===

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=core
    severity=low
---
This perlbug was built using Perl v5.8.8 in the Red Hat build system.
It is being executed now by Perl v5.8.8 - Tue Oct 23 12:21:01 EDT 2007.

Site configuration information for perl v5.8.8:

Configured by Red Hat, Inc. at Tue Oct 23 12:21:01 EDT 2007.
Summary of my perl5 (revision 5 version 8 subversion 8) configuration: Platform: osname=linux, osvers=2.6.9-55.0.9.elsmp, archname=x86_64-linux-thread-multi uname='linux hs20-bc1-7.build.redhat.com 2.6.9-55.0.9.elsmp #1 smp tue sep 25 02:16:15 edt 2007 x86_64 x86_64 x86_64 gnulinux ' config_args='-des -Doptimize=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -Dversion=5.8.8 -Dmyhostname=localhost -Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Dlibpth=/usr/local/lib64 /lib64 /usr/lib64 -Dprivlib=/usr/lib/perl5/5.8.8 -Dsitelib=/usr/lib/perl5/site_perl/5.8.8 -Dvendorlib=/usr/lib/perl5/vendor_perl/5.8.8 -Darchlib=/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi -Dsitearch=/usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi -Dvendorarch=/usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi -Darchname=x86_64-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl=n -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dd_gethostent_r_proto -Ud_endhostent_r_pr! 
oto -Ud_sethostent_r_proto -Ud_endprotoent_r_proto -Ud_setprotoent_r_proto -Ud_endservent_r_proto -Ud_setservent_r_proto -Dinc_version_list=5.8.7 5.8.6 5.8.5 -Dscriptdir=/usr/bin' hint=recommended, useposix=true, d_sigaction=define usethreads=define use5005threads=undef useithreads=define usemultiplicity=define useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=define use64bitall=define uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm', optimize='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic', cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -I/usr/include/gdbm' ccversion='', gccversion='4.1.1 20070105 (Red Hat 4.1.1-52)', gccosandvers='' intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16 ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=8, prototype=define Linker and Libraries: ld='gcc', ldflags ='' libpth=/usr/local/lib64 /lib64 /usr/lib64 libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc libc=, so=so, useshrplib=true, libperl=libperl.so gnulibc_version='2.5' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/CORE' cccdlflags='-fPIC', lddlflags='-shared -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic' Locally applied patches: --- @INC for perl v5.8.8: /usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi 
/usr/lib64/perl5/site_perl/5.8.7/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.6/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl/5.8.7 /usr/lib/perl5/site_perl/5.8.6 /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.7/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.6/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl/5.8.7 /usr/lib/perl5/vendor_perl/5.8.6 /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib/perl5/vendor_perl /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/5.8.8 . --- Environment for perl v5.8.8: HOME=/net/gs/vol1/home/skylar2 LANG=en_US.UTF-8 LANGUAGE (unset) LC_ALL=en_US LC_CTYPE=en_US LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=/nfs/sge/bin/lx24-amd64:/nfs/sge/bin/lx24-amd64:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/net/gs/vol1/home/skylar2/bin:/usr/bin:/usr/pkg/bin:/usr/pkg/sbin:/usr/local/bin:/usr/local/sbin:/sbin:/usr/sbin:/opt/csw/bin:/opt/csw/sbin:/usr/local/maui/bin:/usr/local/maui/sbin:/net/gs/vol1/home/skylar2/bin:/net/gs/vol1/home/skylar2/software/gerris/bin:/net/gs/vol1/home/skylar2/software/gts/bin:/usr/local/pgsql/bin:/net/maccoss/vol2/software/python2.4/bin/ PERL_BADLANG (unset) SHELL=/bin/bash
Subject: Re: [perl #62646] Maximum string length with substr
Date: Fri, 23 Jan 2009 14:41:44 +0000
To: perl5-porters [...] perl.org
From: Nicholas Clark <nick [...] ccl4.org>
On Thu, Jan 22, 2009 at 11:12:20AM -0800, skylar2@u.washington.edu (via RT) wrote:
> We've run into a problem where substr appears to have a maximum string
> length of 2^31 bytes even on 64-bit hosts (or at least AMD64 hosts). We've
> run into this problem with the Red Hat-supplied Perl 5.8.8 on RHEL5, and
> also a self-compiled install of 5.10.0. Here's a script to exercise the
> bug:
Thanks for the bug report. I can replicate it somewhat more tersely:

$ ./perl -lwe 'print "$_ gives " . substr ("x" x $_, 1, 1) for 2147483648, 2147483649'
2147483648 gives x
substr outside of string at -e line 1.
Use of uninitialized value in concatenation (.) or string at -e line 1.
2147483649 gives

The problem seems to be this part of the implementation of substr:

int
Perl_magic_getsubstr(pTHX_ SV *sv, MAGIC *mg)
{
    STRLEN len;
    SV * const lsv = LvTARG(sv);
    const char * const tmps = SvPV_const(lsv,len);
    I32 offs = LvTARGOFF(sv);
    I32 rem = LvTARGLEN(sv);

    PERL_ARGS_ASSERT_MAGIC_GETSUBSTR;
    PERL_UNUSED_ARG(mg);

    if (SvUTF8(lsv))
        sv_pos_u2b(lsv, &offs, &rem);
    if (offs > (I32)len)
        offs = len;
    if (rem + offs > (I32)len)
        rem = len - offs;
    sv_setpvn(sv, tmps + offs, (STRLEN)rem);
    if (SvUTF8(lsv))
        SvUTF8_on(sv);
    return 0;
}

which is using variables of type I32, which will be a signed 32 bit integer on all platforms (except whichever Cray it is that only has 64 bit types).

It's using I32 because the interface of sv_pos_u2b() is using I32 pointers, clearly a mistake in hindsight. sv_pos_u2b() was added in 1998:

http://perl5.git.perl.org/perl.git/blame/fdf134946da249a71c49962435817212b8fa195a:/sv.c#l3236

I suspect we need to write a replacement that uses STRLEN pointers, and deprecate the old interface. (It doesn't work changing the type of a pointer on an existing function, as it will cause memory corruption when the revised function is writing 8 bytes, but old code passes in a pointer to a variable that is only 4 bytes long)

Nicholas Clark
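
To make the narrowing described above concrete, here is a minimal C sketch. It is not perl's code: `narrow_to_i32`, `offset_inside_i32` and `offset_inside_full` are hypothetical helpers that model a bounds check performed after a length has been squeezed into a signed 32-bit integer. The sketch assumes two's-complement wrap-around and spells it out by hand, because a plain out-of-range cast is implementation-defined in C. The exact length at which the real substr starts failing depends on code not shown in this thread, so the threshold in this model need not match the report.

```c
#include <stdint.h>

/* Model of a two's-complement narrowing cast such as (I32)len.
 * Written out explicitly so the result does not depend on the compiler. */
static int32_t narrow_to_i32(uint64_t value)
{
    uint32_t low = (uint32_t)(value & 0xffffffffu);
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;
    /* low is in [2^31, 2^32): map it onto [-2^31, -1]. */
    return (int32_t)(low - 0x80000000u) + INT32_MIN;
}

/* Hypothetical bounds check done in signed 32-bit arithmetic. */
static int offset_inside_i32(uint64_t offset, uint64_t length)
{
    return narrow_to_i32(offset) <= narrow_to_i32(length);
}

/* The same check done at full width. */
static int offset_inside_full(uint64_t offset, uint64_t length)
{
    return offset <= length;
}
```

In this model a length of 2147483649 narrows to -2147483647, so `offset_inside_i32(1, 2147483649)` rejects an offset that `offset_inside_full(1, 2147483649)` accepts.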
CC: perl-binary [...] plan9.de
Subject: substr not 64 bit clean
Date: Sat, 12 Sep 2009 17:51:26 +0200
To: perlbug [...] perl.org
From: Marc Lehmann <root [...] schmorp.de>
This is a bug report for perl from perlbug@plan9.de,
generated with the help of perlbug 1.36 running under perl 5.10.0.

-----------------------------------------------------------------
[Please enter your report here]

I found that substr acts weirdly when confronted with larger than ~2gb strings:

   warn length $data;
   substr $data, 0, 256;

   15701683970 at ...
   substr outside of string at ...

My perl uses a uint64_t for STRLEN (standard amd64), so I would expect substr to handle this, given that I pay the memory overhead to store an 8-byte length everywhere :)

It seems that pp_substr uses I32 for everything, which is of course not enough.

A cursory glance over the rest of pp.c indicates that perl simply can't handle >2gb scalars, even on a 64 bit system, as it uses I32 almost everywhere (pp_index, pp_reverse, about anything that deals with string offsets uses I32).

15GB seems like a lot, but I don't think the wish to, say, load a DVD image into memory is so far off in the future.

Maybe the safe way for now would be to disallow >31 bit scalar lengths?

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=core
    severity=medium
---
Site configuration information for perl 5.10.0:

Configured by Marc Lehmann at Sat Feb 21 02:30:27 CET 2009.
Summary of my perl5 (revision 5 version 10 subversion 0) configuration: Platform: osname=linux, osvers=2.6.24-etchnhalf.1-amd64, archname=amd64-linux uname='linux cerebro 2.6.24-etchnhalf.1-amd64 #1 smp mon jul 21 10:36:02 utc 2008 x86_64 gnulinux ' config_args='-Duselargefiles -Dxxxxuse64bitint -Uuse64bitall -Dusemymalloc=n -Dcc=gcc -Dccflags=-ggdb -gdwarf-2 -g3 -Dcppflags=-DPERL_ARENA_SIZE=16368 -D_GNU_SOURCE -I/opt/include -Doptimize=-O6 -msse2 -funroll-loops -fno-strict-aliasing -Dcccdlflags=-fPIC -Dldflags=-L/opt/perl/lib -L/opt/lib -Dlibs=-ldl -lm -lcrypt -Darchname=amd64-linux -Dprefix=/opt/perl -Dprivlib=/opt/perl/lib/perl5 -Darchlib=/opt/perl/lib/perl5 -Dvendorprefix=/opt/perl -Dvendorlib=/opt/perl/lib/perl5 -Dvendorarch=/opt/perl/lib/perl5 -Dsiteprefix=/opt/perl -Dsitelib=/opt/perl/lib/perl5 -Dsitearch=/opt/perl/lib/perl5 -Dsitebin=/opt/perl/bin -Dman1dir=/opt/perl/man/man1 -Dman3dir=/opt/perl/man/man3 -Dsiteman1dir=/opt/perl/man/man1 -Dsiteman3dir=/opt/perl/man/man3 -Dman1ext=1 -Dman3ext=3 -Dpager=/usr/bin/less -Uafs -Uusesfio -Uusenm -Uuseshrplib -Dd_dosuid -Dusethreads=undef -Duse5005threads=undef -Duseithreads=undef -Dusemultiplicity=undef -Demail=perl-binary@plan9.de -Dcf_email=perl-binary@plan9.de -Dcf_by=Marc Lehmann -Dlocincpth=/opt/perl/include /opt/include -Dmyhostname=localhost -Dmultiarch=undef -Dbin=/opt/perl/bin -Dxxxusedevel -DxxxDEBUGGING -Dxxxuse_debugging_perl -Dxxxuse_debugmalloc -des' hint=recommended, useposix=true, d_sigaction=define useithreads=undef, usemultiplicity=undef useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef use64bitint=define, use64bitall=undef, uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='gcc', ccflags ='-ggdb -gdwarf-2 -g3 -fno-strict-aliasing -pipe -I/opt/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64', optimize='-O6 -msse2 -funroll-loops -fno-strict-aliasing', cppflags='-DPERL_ARENA_SIZE=16368 -D_GNU_SOURCE -I/opt/include -ggdb -gdwarf-2 -g3 -fno-strict-aliasing -pipe 
-I/opt/include' ccversion='', gccversion='4.3.2', gccosandvers='' intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16 ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=8, prototype=define Linker and Libraries: ld='gcc', ldflags ='-L/opt/perl/lib -L/opt/lib -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib /lib64 /usr/lib64 libs=-ldl -lm -lcrypt perllibs=-ldl -lm -lcrypt libc=/lib/libc-2.7.so, so=so, useshrplib=false, libperl=libperl.a gnulibc_version='2.7' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E' cccdlflags='-fPIC', lddlflags='-shared -O6 -msse2 -funroll-loops -fno-strict-aliasing -L/opt/perl/lib -L/opt/lib -L/usr/local/lib' Locally applied patches: http://public.activestate.com/cgi-bin/perlbrowse/p/34209 http://public.activestate.com/cgi-bin/perlbrowse/p/34507 http://www.gossamer-threads.com/lists/perl/porters/232549 embed.fnc:Perl_vcroak NULLOK --- @INC for perl 5.10.0: /root/src/sex /opt/perl/lib/perl5 /opt/perl/lib/perl5 /opt/perl/lib/perl5 /opt/perl/lib/perl5 /opt/perl/lib/perl5 . --- Environment for perl 5.10.0: HOME=/root LANG (unset) LANGUAGE (unset) LC_CTYPE=en_US.UTF-8 LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=/root/s2:/root/s:/opt/bin:/opt/sbin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/X11/bin:/usr/games:/usr/local/bin:/usr/local/sbin:/root/pserv:. PERL5LIB=/root/src/sex PERL5_CPANPLUS_CONFIG=/root/.cpanplus/config PERLDB_OPTS=ornaments=0 PERL_ANYEVENT_DBI_TESTS=1 PERL_ANYEVENT_EDNS0=1 PERL_ANYEVENT_NET_TESTS=1 PERL_ANYEVENT_PROTOCOLS=ipv4,ipv6 PERL_ANYEVENT_STRICT=1 PERL_BADLANG (unset) PERL_UNICODE=E SHELL=/bin/bash
Subject: substr() behaves strangely for large values of LENGTH
Date: Sat, 10 Oct 2009 12:19:51 -0400 (EDT)
To: perlbug [...] perl.org
From: fbriere [...] fbriere.net (Frederic Briere)
This is a bug report for perl from fbriere@fbriere.net, generated with the help of perlbug 1.36 running under perl 5.10.0. ----------------------------------------------------------------- substr() is behaving strangely on values larger than an int: $ perl -le 'print substr "abcd", 0, $_ for 2**31-1, 2**31, 2**32' abcd abc (Obviously, this was run on a 32-bit machine.) ----------------------------------------------------------------- --- Flags: category=core severity=medium --- Site configuration information for perl 5.10.0: Configured by Debian Project at Sun Aug 16 22:37:28 UTC 2009. Summary of my perl5 (revision 5 version 10 subversion 0) configuration: Platform: osname=linux, osvers=2.6.26-2-amd64, archname=i486-linux-gnu-thread-multi uname='linux puccini 2.6.26-2-amd64 #1 smp fri aug 14 07:12:04 utc 2009 i686 gnulinux ' config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i486-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.10 -Darchlib=/usr/lib/perl/5.10 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.10.0 -Dsitearch=/usr/local/lib/perl/5.10.0 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Ud_ualarm -Uusesfio -Uusenm -DDEBUGGING=-g -Doptimize=-O2 -Duseshrplib -Dlibperl=libperl.so.5.10.0 -Dd_dosuid -des' hint=recommended, useposix=true, d_sigaction=define useithreads=define, usemultiplicity=define useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef use64bitint=undef, use64bitall=undef, uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64', optimize='-O2 -g', cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN 
-fno-strict-aliasing -pipe -I/usr/local/include' ccversion='', gccversion='4.3.4', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: ld='cc', ldflags =' -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib /usr/lib64 libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt perllibs=-ldl -lm -lpthread -lc -lcrypt libc=/lib/libc-2.9.so, so=so, useshrplib=true, libperl=libperl.so.5.10.0 gnulibc_version='2.9' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E' cccdlflags='-fPIC', lddlflags='-shared -O2 -g -L/usr/local/lib' Locally applied patches: --- @INC for perl 5.10.0: /etc/perl /usr/local/lib/perl/5.10.0 /usr/local/share/perl/5.10.0 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.10 /usr/share/perl/5.10 /usr/local/lib/site_perl . --- Environment for perl 5.10.0: HOME=/home/fbriere LANG=en_CA.UTF-8 LANGUAGE (unset) LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=/usr/local/bin:/usr/bin:/bin:/usr/games PERL_BADLANG (unset) SHELL=/bin/bash
Subject: [perl #62646] Maximum string length with substr
Date: Tue, 5 Jan 2010 22:18:42 +0000
To: perl5-porters [...] perl.org
From: Zefram <zefram [...] fysh.org>
Attached patch should fix bug #62646. -zefram
CC: perl5-porters [...] perl.org
Subject: Re: [perl #62646] Maximum string length with substr
Date: Tue, 5 Jan 2010 22:24:06 +0000
To: Zefram <zefram [...] fysh.org>
From: Nicholas Clark <nick [...] ccl4.org>
On Tue, Jan 05, 2010 at 10:18:42PM +0000, Zefram wrote:
> Attached patch should fix bug #62646.
> +++ b/t/re/substr.t
> @@ -682,4 +682,19 @@ is($x, "\x{100}\x{200}\xFFb");
>      is(substr($a,1,1), 'b');
>  }
>
> +# [perl #62646] offsets exceeding 32 bits on 64-bit system
> +SKIP: {
> +    skip("32-bit system", 4) unless ~0 > 0xffffffff;
> +    my $a = "abc";
> +    my $r;
> +    $w = 0;
> +    $r = substr($a, 0xffffffff, 1);
> +    is($r, undef);
> +    is($w, 1);
> +    $w = 0;
> +    $r = substr($a, 0xffffffff+1, 1);
> +    is($r, undef);
> +    is($w--, 1);
> +}
> +
>  }
Any reason for $w-- right at the end, instead of just $w? Nicholas Clark
Subject: Re: [perl #62646] Maximum string length with substr
Date: Tue, 5 Jan 2010 22:25:36 +0000
To: perl5-porters [...] perl.org
From: Zefram <zefram [...] fysh.org>
Nicholas Clark wrote:
>Any reason for $w-- right at the end, instead of just $w?
Bah. No, that's an editing mistake. -zefram
CC: perl5-porters [...] perl.org
Subject: Re: [perl #62646] Maximum string length with substr
Date: Tue, 5 Jan 2010 17:58:28 -0500
To: Zefram <zefram [...] fysh.org>
From: Eric Brine <ikegami [...] adaelis.com>
On Tue, Jan 5, 2010 at 5:18 PM, Zefram <zefram@fysh.org> wrote:
> Attached patch should fix bug #62646.

The patch doesn't change mg.c. Does that mean it doesn't fix lvalue substr?
Subject: Re: [perl #62646] Maximum string length with substr
Date: Tue, 5 Jan 2010 23:16:53 +0000
To: perl5-porters [...] perl.org
From: Zefram <zefram [...] fysh.org>
Eric Brine wrote:
>The patch doesn't change mg.c. Does that mean it doesn't fix lvalue substr?
Oh, missed that. And upon looking further, it seems that #62646 has widened in scope to consider all other string ops, which I didn't look at. Jesse, as a 5.12 blocker, are you concerned just with substr, or with all string ops? -zefram
CC: perl5-porters [...] perl.org
Subject: Re: [perl #62646] Maximum string length with substr
Date: Fri, 8 Jan 2010 10:54:00 -0800
To: Zefram <zefram [...] fysh.org>
From: Jesse Vincent <jesse [...] fsck.com>
On Tue 5.Jan'10 at 23:16:53 +0000, Zefram wrote:
> Eric Brine wrote:
> >The patch doesn't change mg.c. Does that mean it doesn't fix lvalue substr?
>
> Oh, missed that. And upon looking further, it seems that #62646 has
> widened in scope to consider all other string ops, which I didn't look at.
> Jesse, as a 5.12 blocker, are you concerned just with substr, or with
> all string ops?

Actually, it looks like this was Nicholas' blocker. I'd certainly rather a partial solution than no solution, so long as we're not making a full fix harder.

> -zefram
CC: perl5-porters [...] perl.org
Subject: Re: [perl #62646] Maximum string length with substr
Date: Fri, 15 Jan 2010 17:17:17 +0100
To: Zefram <zefram [...] fysh.org>
From: Rafael Garcia-Suarez <rgs [...] consttype.org>
2010/1/5 Zefram <zefram@fysh.org>:
> Attached patch should fix bug #62646.
Thanks, applied to bleadperl as b6d1426f94a845fb8fece8b6ad0b7d9f35f2d62e. I won't mark this bug fixed, since it doesn't apply to lvalue substr, and since probably other string functions have the limitation; however we may consider taking it off the 5.12-blockers list.
Subject: Re: [perl #62646] Maximum string length with substr
Date: Fri, 15 Jan 2010 16:23:40 +0000
To: perl5-porters [...] perl.org
From: Nicholas Clark <nick [...] ccl4.org>
On Fri, Jan 15, 2010 at 05:17:17PM +0100, Rafael Garcia-Suarez wrote:
> 2010/1/5 Zefram <zefram@fysh.org>:
> > Attached patch should fix bug #62646.
>
> Thanks, applied to bleadperl as
> b6d1426f94a845fb8fece8b6ad0b7d9f35f2d62e. I won't mark this bug fixed,
> since it doesn't apply to lvalue substr, and since probably other
> string functions have the limitation; however we may consider taking
> it off the 5.12-blockers list.

I feel that the lvalue substr bug should remain on the blockers list, for now.

Nicholas Clark
Subject: BUG OF THE DAY [perl #62646] lvalue substr
Date: Thu, 11 Feb 2010 12:04:32 -0800
To: perl5-porters [...] perl.org
From: Jesse Vincent <jesse [...] fsck.com>
Can be summarized as:

$ ./perl -le 'print substr "abcd", 0, $_ for 2**31-1, 2**31, 2**32'
abcd
abc

It's one of only three known release blockers left.
CC: perl5-porters [...] perl.org
Subject: Re: BUG OF THE DAY [perl #62646] lvalue substr
Date: Thu, 11 Feb 2010 18:10:52 -0500
To: Jesse Vincent <jesse [...] fsck.com>
From: Eric Brine <ikegami [...] adaelis.com>
On Thu, Feb 11, 2010 at 3:04 PM, Jesse Vincent <jesse@fsck.com> wrote:
> Can be summarized as:
>
> $ ./perl -le 'print substr "abcd", 0, $_ for 2**31-1, 2**31, 2**32'
> abcd
> abc
>
> It's one of only three known release blockers left.

Zefram did the non-lvalue portion. I offered to do the lvalue portion for him, but got no reply. The fix is straightforward and consists of changing some I32 to IV in pp*.c files (done?), in SV_PVLV, and in mg.c.
CC: Jesse Vincent <jesse [...] fsck.com>, perl5-porters [...] perl.org
Subject: Re: BUG OF THE DAY [perl #62646] lvalue substr
Date: Thu, 11 Feb 2010 18:58:23 -0500
To: Eric Brine <ikegami [...] adaelis.com>
From: jesse <jesse [...] fsck.com>
On Thu, Feb 11, 2010 at 06:10:52PM -0500, Eric Brine wrote:
> Zefram did the non-lvalue portion. I offered to do the lvalue portion for
> him, but got no reply. The fix is straightforward and consists of changing
> some I32 to IV in pp*.c files (done?), in SV_PVLV, and in mg.c

Ooh. I'm sorry I missed your offer. If you're still game, that would be great.

Best,
Jesse
CC: perl5-porters [...] perl.org
Subject: Re: BUG OF THE DAY [perl #62646] lvalue substr
Date: Thu, 11 Feb 2010 19:18:11 -0500
To: jesse <jesse [...] fsck.com>
From: Eric Brine <ikegami [...] adaelis.com>
On Thu, Feb 11, 2010 at 6:58 PM, jesse <jesse@fsck.com> wrote:
> Ooh. I'm sorry I missed your offer.

I had sent the offer directly to him.

> If you're still game, that would be great.

Already started. One problem (Craig noticed) is that sv_pos_u2b only works with 32bit string lengths. I'm not sure how to fix that.
CC: perl5-porters [...] perl.org
Subject: [perl #62646] [PATCH] Re: BUG OF THE DAY lvalue substr
Date: Fri, 12 Feb 2010 00:42:00 -0500
To: jesse <jesse [...] fsck.com>
From: Eric Brine <ikegami [...] adaelis.com>

On Thu, Feb 11, 2010 at 7:18 PM, Eric Brine <ikegami@adaelis.com> wrote:
> > If you're still game, that would be great.
>
> Already started. One problem (Craig noticed) is that sv_pos_u2b only works with 32bit string lengths. I'm not sure how to fix that.

Turns out sv_pos_u2b uses STRLEN internally even though it presents I32 for its arguments, so the changes needed were minor. All I did was to provide another interface. I called it sv_pos_u2b_proper, but you'll probably want to change that.

I think some more improvements relating to types can be made to pp_substr, but it's in code outside the mandate of the bug.

I wish I could do some testing with very large strings (2**31), but I don't have access to a system that can handle that.

Patch attached.

- ELB

CC: jesse <jesse [...] fsck.com>, perl5-porters [...] perl.org
Subject: Re: BUG OF THE DAY [perl #62646] lvalue substr
Date: Fri, 12 Feb 2010 07:14:44 +0000
To: Eric Brine <ikegami [...] adaelis.com>
From: Nicholas Clark <nick [...] ccl4.org>
On Thu, Feb 11, 2010 at 07:18:11PM -0500, Eric Brine wrote:
> On Thu, Feb 11, 2010 at 6:58 PM, jesse <jesse@fsck.com> wrote:
> > Ooh. I'm sorry I missed your offer.
>
> I had sent the offer directly to him
>
> > If you're still game, that would be great.
>
> Already started. One problem (Craig noticed) is that sv_pos_u2b only works
> with 32bit string lengths. I'm not sure how to fix that.

I think:

1: give it (well, all the public functions, as sv_pos_b2u will need it too) a "new" name, and fix it to take STRLEN sized arguments.
2: Make sv_pos_u2b a wrapper around the fixed version

Nicholas Clark
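
The two steps above can be sketched in C as follows. This is an illustration under stated assumptions, not the patch from this thread: `u2b_wide`, `u2b_legacy` and `utf8_char_bytes` are hypothetical names, the string is a plain byte buffer rather than an SV, and saturating at `INT32_MAX` in the legacy wrapper is a choice made for this sketch only. The point is that the old 32-bit entry point copies through pointer-sized temporaries, so the wide function never writes 8 bytes into a caller's 4-byte variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte length of a UTF-8 sequence, judged from its lead byte. */
static size_t utf8_char_bytes(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 1; /* invalid lead byte: step over it as a single byte */
}

/* Step 1: the "new name" interface. Converts a character offset and
 * character count into a byte offset and byte count, in place. */
static void u2b_wide(const char *buf, size_t buf_len,
                     size_t *offset, size_t *length)
{
    size_t byte_pos = 0, start, chars;

    for (chars = *offset; chars > 0 && byte_pos < buf_len; chars--)
        byte_pos += utf8_char_bytes((unsigned char)buf[byte_pos]);
    if (byte_pos > buf_len) byte_pos = buf_len;
    start = byte_pos;

    for (chars = *length; chars > 0 && byte_pos < buf_len; chars--)
        byte_pos += utf8_char_bytes((unsigned char)buf[byte_pos]);
    if (byte_pos > buf_len) byte_pos = buf_len;

    *offset = start;
    *length = byte_pos - start;
}

/* Step 2: the old 32-bit interface as a thin wrapper. */
static void u2b_legacy(const char *buf, size_t buf_len,
                       int32_t *offset, int32_t *length)
{
    size_t wide_offset, wide_length;

    assert(*offset >= 0 && *length >= 0); /* sketch handles non-negative input only */
    wide_offset = (size_t)*offset;
    wide_length = (size_t)*length;
    u2b_wide(buf, buf_len, &wide_offset, &wide_length);

    /* Assumption for this sketch: saturate instead of wrapping on overflow. */
    *offset = wide_offset > (size_t)INT32_MAX ? INT32_MAX : (int32_t)wide_offset;
    *length = wide_length > (size_t)INT32_MAX ? INT32_MAX : (int32_t)wide_length;
}
```

Old callers keep compiling against `u2b_legacy` unchanged, while new callers that need more than 31 bits use `u2b_wide` directly.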
Subject: Re: BUG OF THE DAY [perl #62646] lvalue substr
Date: Fri, 12 Feb 2010 02:55:32 -0500
To: Eric Brine <ikegami [...] adaelis.com>, jesse <jesse [...] fsck.com>, perl5-porters [...] perl.org
From: Eric Brine <ikegami [...] adaelis.com>
On Fri, Feb 12, 2010 at 2:14 AM, Nicholas Clark <nick@ccl4.org> wrote:
> On Thu, Feb 11, 2010 at 07:18:11PM -0500, Eric Brine wrote:
> > On Thu, Feb 11, 2010 at 6:58 PM, jesse <jesse@fsck.com> wrote:
> > > Ooh. I'm sorry I missed your offer.
> >
> > I had sent the offer directly to him
> >
> > > If you're still game, that would be great.
> >
> > Already started. One problem (Craig noticed) is that sv_pos_u2b only works
> > with 32bit string lengths. I'm not sure how to fix that.
>
> I think:
>
> 1: give it (well, all the public functions, as sv_pos_b2u will need it too)
>    a "new" name, and fix it to take STRLEN sized arguments.
> 2: Make sv_pos_u2b a wrapper around the fixed version

That's exactly what I did. (See the patch I posted.) I couldn't come up with a decent name for the function. It's called sv_pos_u2b_proper in the patch, but that's easily changeable.
Subject: Re: BUG OF THE DAY [perl #62646] lvalue substr
Date: Fri, 12 Feb 2010 10:53:27 +0000
To: perl5-porters [...] perl.org
From: Zefram <zefram [...] fysh.org>
Eric Brine wrote:
>Zefram did the non-lvalue portion. I offered to do the lvalue portion for
>him, but got no reply.

Yeah, sorry. I was intending to work on this, but for the past month or so I've been occupied with fixing Devel::Declare (which was yet another victim of my sub lookup patch) and a motherboard failure.

My patch for non-lvalue substr turned out to be faulty. The willingness to handle 64-bit offsets and lengths needs to go a bit deeper than I implemented. For the case of 64-bit offsets when the address space (and therefore size_t) is 32-bit, shims will be required before string operations can be handed off to standard library functions. I fear that the ramifications of trying to do this correctly are quite extensive.

-zefram
CC: perl5-porters [...] perl.org
Subject: Re: BUG OF THE DAY [perl #62646] lvalue substr
Date: Fri, 12 Feb 2010 11:00:44 +0000
To: Zefram <zefram [...] fysh.org>
From: Nicholas Clark <nick [...] ccl4.org>
On Fri, Feb 12, 2010 at 10:53:27AM +0000, Zefram wrote:
> Eric Brine wrote:
> >Zefram did the non-lvalue portion. I offered to do the lvalue portion for
> >him, but got no reply.
>
> Yeah, sorry. I was intending to work on this, but for the past month or
> so I've been occupied with fixing Devel::Declare (which was yet another
> victim of my sub lookup patch) and a motherboard failure.
>
> My patch for non-lvalue substr turned out to be faulty. The willingness
> to handle 64-bit offsets and lengths needs to go a bit deeper than
> I implemented. For the case of 64-bit offsets when the address space
> (and therefore size_t) is 32-bit, shims will be required before string
> operations can be handed off to standard library functions. I fear that
> the ramifications of trying to do this correctly are quite extensive.

In which case I think that the patch as Eric proposed is technically wrong. It uses IVs, which *can* be 64 bit when addresses are 32 bit. Whereas what we need is a (signed) STRLEN type which is the same size as pointers, so that it's not possible for it to hold an offset larger than memory.

Nicholas Clark
Subject: Re: BUG OF THE DAY [perl #62646] lvalue substr
Date: Fri, 12 Feb 2010 11:12:43 +0000
To: perl5-porters [...] perl.org
From: Zefram <zefram [...] fysh.org>
Nicholas Clark wrote:
>Whereas what we need is a (signed) STRLEN type which is the same size as
>pointers,

That's ssize_t, in standard C, just as the unsigned STRLEN is size_t.

When IV is larger than ssize_t, you need some logic to handle offsets and lengths that don't fit into ssize_t. This is part of the job of the notional shim layer.

-zefram
CC: perl5-porters [...] perl.org
Subject: Re: BUG OF THE DAY [perl #62646] lvalue substr
Date: Fri, 12 Feb 2010 11:18:11 +0000
To: Zefram <zefram [...] fysh.org>
From: Nicholas Clark <nick [...] ccl4.org>
On Fri, Feb 12, 2010 at 11:12:43AM +0000, Zefram wrote:
> Nicholas Clark wrote:
> >Whereas what we need is a (signed) STRLEN type which is the same size as
> >pointers,
>
> That's ssize_t, in standard C, just as the unsigned STRLEN is size_t.

config.h has the former available as SSize_t

> When IV is larger than ssize_t, you need some logic to handle offsets
> and lengths that don't fit into ssize_t. This is part of the job of
> the notional shim layer.

Are you envisaging a shim layer that merely copes with passing pointers to different length integers, or one that also range checks the values going in and out, for the case where the value overflows the smaller integer?

Nicholas Clark
Subject: Re: BUG OF THE DAY [perl #62646] lvalue substr
Date: Fri, 12 Feb 2010 11:25:25 +0000
To: perl5-porters [...] perl.org
From: Zefram <zefram [...] fysh.org>
Nicholas Clark wrote:
>Are you envisaging a shim layer that merely copes with passing pointers to
>different length integers, or one that also range checks the values going in
>and out, for the case where the value overflows the smaller integer?

Correctly handling the overflows is essential to the bug we're dealing
with.  Exactly where they need to be handled depends on what internal
interfaces turn out to be useful.

-zefram
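[Editorial sketch, not part of the original thread.] The range-checking flavour of shim discussed above could look roughly like the C below. It is only an illustration under stated assumptions: `narrow_offset` and the `NARROW_*` codes are hypothetical names, `int64_t` stands in for a 64-bit IV, and `int32_t` stands in for a pointer-sized signed type on a 32-bit address space. None of it is taken from the Perl source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch only. */
enum narrow_status {
    NARROW_TOO_SMALL = -1,
    NARROW_OK        = 0,
    NARROW_TOO_LARGE = 1
};

/*
 * Narrow a 64-bit offset (stand-in for an IV) to a 32-bit signed value
 * (stand-in for a pointer-sized signed type on a 32-bit build).
 * On overflow the function reports which way the value overflowed and
 * leaves *out untouched, instead of silently truncating.
 */
enum narrow_status
narrow_offset(int64_t wide, int32_t *out)
{
    assert(out != NULL);
    if (wide > (int64_t)INT32_MAX)
        return NARROW_TOO_LARGE;
    if (wide < (int64_t)INT32_MIN)
        return NARROW_TOO_SMALL;
    *out = (int32_t)wide;
    return NARROW_OK;
}
```

A caller can then decide per call site whether "too large" means "clamp to the end of the string" or "outside of string", which is the part that depends on the internal interfaces.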
Subject: Re: BUG OF THE DAY [perl #62646] lvalue substr
Date: Fri, 12 Feb 2010 11:03:35 -0500
To: Zefram <zefram [...] fysh.org>, perl5-porters [...] perl.org
From: Eric Brine <ikegami [...] adaelis.com>
On Fri, Feb 12, 2010 at 6:00 AM, Nicholas Clark <nick@ccl4.org> wrote:
> In which case I think that the patch as Eric proposed is technically wrong.

I did make a comment about that, but it didn't go into nearly enough detail
and my conclusion was wrong. The conversion from IV to STRLEN occurs in each
of these lines:

    const STRLEN upos = pos;
    const STRLEN urem = rem;
    STRLEN bpos = pos;
    STRLEN brem = rem;

The code can be adjusted so rem is never anything but a STRLEN, and a check
can be added to make sure pos is in range.

In the same code lurks another bug: On a 32-bit system, not all positions of
the string can be accessed. Large positive positions are treated as
negatives.

I'll address these today.
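[Editorial sketch, not part of the original thread.] One way to keep large positive positions from being read as negatives is to carry the sign separately from the magnitude and compare against the unsigned string length before any narrowing. The C below is a simplified illustration: `struct offset_arg` and `resolve_start` are hypothetical names, and the function simply rejects out-of-range positions rather than reproducing substr's full clamping rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a numeric argument that may be an IV or a UV. */
struct offset_arg {
    bool     is_negative; /* true: count back from the end of the string */
    uint64_t magnitude;   /* absolute value of the position */
};

/*
 * Resolve a position against a string of curlen units.
 * Returns true and stores the start index (0..curlen) on success,
 * false if the position lies outside the string.
 * The magnitude is compared as an unsigned 64-bit value, so it is never
 * reinterpreted as a negative number and never truncated before the check.
 */
bool
resolve_start(struct offset_arg pos, size_t curlen, size_t *start)
{
    assert(start != NULL);
    if (pos.magnitude > (uint64_t)curlen)
        return false;
    if (pos.is_negative)
        *start = curlen - (size_t)pos.magnitude;
    else
        *start = (size_t)pos.magnitude;
    return true;
}
```

With a 32-bit `size_t`, a position such as 3000000000 still resolves correctly here, because it is never passed through a signed 32-bit intermediate.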
Subject: Re: BUG OF THE DAY [perl #62646] lvalue substr
Date: Fri, 12 Feb 2010 16:21:34 +0000
To: Eric Brine <ikegami [...] adaelis.com>, jesse <jesse [...] fsck.com>, perl5-porters [...] perl.org
From: Nicholas Clark <nick [...] ccl4.org>
On Fri, Feb 12, 2010 at 07:14:44AM +0000, Nicholas Clark wrote:
> On Thu, Feb 11, 2010 at 07:18:11PM -0500, Eric Brine wrote:
> > On Thu, Feb 11, 2010 at 6:58 PM, jesse <jesse@fsck.com> wrote:
> >
> > > Ooh. I'm sorry I missed your offer.
> >
> > I had sent the offer directly to him
> >
> > > If you're still game, that would be great.
> >
> > Already started. One problem (Craig noticed) is that sv_pos_u2b only works
> > with 32bit string lengths. I'm not sure how to fix that.
>
> I think:
>
> 1: give it (well, all the public functions, as sv_pos_b2u will need it too)
>    a "new" name, and fix it to take STRLEN sized arguments.
> 2: Make sv_pos_u2b a wrapper around the fixed version

For reference, in my unpacked CPAN:

ack --sort -l '\b(?:Perl_)?sv_pos_(?:u2b|b2u)\b(?!\|)'

gave:

Convert-Binary-C/tests/include/perlinc/embed.h
Convert-Binary-C/tests/include/perlinc/proto.h
Devel-PPPort/PPPort.pm
Padre/share/languages/perl5/perlapi_current.yml
Perl-APIReference/lib/Perl/APIReference/V5_008_000.pm
Perl-APIReference/lib/Perl/APIReference/V5_008_001.pm
Perl-APIReference/lib/Perl/APIReference/V5_008_002.pm
Perl-APIReference/lib/Perl/APIReference/V5_008_003.pm
Perl-APIReference/lib/Perl/APIReference/V5_008_004.pm
Perl-APIReference/lib/Perl/APIReference/V5_008_005.pm
Perl-APIReference/lib/Perl/APIReference/V5_008_006.pm
Perl-APIReference/lib/Perl/APIReference/V5_008_007.pm
Perl-APIReference/lib/Perl/APIReference/V5_008_008.pm
Perl-APIReference/lib/Perl/APIReference/V5_008_009.pm
Perl-APIReference/lib/Perl/APIReference/V5_010_000.pm
Perl-APIReference/lib/Perl/APIReference/V5_010_001.pm
Perl-APIReference/lib/Perl/APIReference/V5_011_000.pm
Perl-APIReference/lib/Perl/APIReference/V5_011_001.pm
Perl-APIReference/lib/Perl/APIReference/V5_011_002.pm
perl/embed.h
perl/mg.c
perl/pp.c
perl/pp_ctl.c
perl/pp_sys.c
perl/proto.h
perl/sv.c

[I think that I need to investigate something a bit more sophisticated than
a brute force ack, but I would like full perl regexps for my search]

Google codesearch doesn't show any users of either, outside the core.

Looks like we can convert the I32 versions to wrappers, deprecate them, and
subsequently remove them, without actually affecting anyone.

Nicholas Clark
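[Editorial sketch, not part of the original thread.] The "new name plus thin wrapper" plan can be illustrated with a standalone C example. The functions below are hypothetical (`pos_u2b_wide`, `pos_u2b_legacy`) and operate on a raw UTF-8 buffer, not on an SV; they do not reproduce the real signatures of `sv_pos_u2b` or its replacement. The wide version takes `size_t` offsets, and the legacy 32-bit entry point merely forwards to it and range checks the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical wide version: convert a character offset into a byte offset
 * within a UTF-8 buffer of `bytes` bytes. Stops at the end of the buffer if
 * the character offset runs past it. Assumes well-formed UTF-8.
 */
size_t
pos_u2b_wide(const unsigned char *s, size_t bytes, size_t uoffset)
{
    size_t b = 0;

    assert(s != NULL);
    while (uoffset > 0 && b < bytes) {
        b++;                                        /* lead byte */
        while (b < bytes && (s[b] & 0xC0) == 0x80)
            b++;                                    /* continuation bytes */
        uoffset--;
    }
    return b;
}

/*
 * Hypothetical legacy interface with a 32-bit in/out offset, kept only as a
 * wrapper. Returns 0 on success, -1 if the input is negative or the byte
 * offset does not fit in 32 bits (in which case *offsetp is left unchanged).
 */
int
pos_u2b_legacy(const unsigned char *s, size_t bytes, int32_t *offsetp)
{
    size_t b;

    assert(offsetp != NULL);
    if (*offsetp < 0)
        return -1;
    b = pos_u2b_wide(s, bytes, (size_t)*offsetp);
    if (b > (size_t)INT32_MAX)
        return -1;
    *offsetp = (int32_t)b;
    return 0;
}
```

Keeping the old entry point as a wrapper means existing callers keep compiling while the logic lives in one place, which is what makes later deprecation and removal cheap.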
CC: perl5-porters [...] perl.org
Subject: [perl #62646] [PATCH] Re: BUG OF THE DAY lvalue substr, Take 2
Date: Sat, 13 Feb 2010 00:15:56 -0500
To: jesse <jesse [...] fsck.com>
From: Eric Brine <ikegami [...] adaelis.com>

Message body is not shown because sender requested not to inline it.

Take 2.

substr() must accept both UVs and IVs for its $pos and $len arguments in order to ensure any part of the string can be extracted, and to allow negative indexing. (Even then, there are limits on the range of negative indexing.) This patch provides that range.

It also fixes:

$ perl -wle'$[=2; print substr("abcdefghij", 1);'
panic: sv_setpvn called with negative strlen at -e line 1.

CC: jesse <jesse [...] fsck.com>, perl5-porters [...] perl.org
Subject: Re: [perl #62646] [PATCH] Re: BUG OF THE DAY lvalue substr, Take 2
Date: Sat, 13 Feb 2010 18:54:33 +0000
To: Eric Brine <ikegami [...] adaelis.com>
From: Nicholas Clark <nick [...] ccl4.org>
On Sat, Feb 13, 2010 at 12:15:56AM -0500, Eric Brine wrote:
> Take 2.
>
> substr() must accept both UVs and IVs for its $pos and $len arguments in
> order to ensure any part of the string can be extracted, and to allow
> negative indexing. (Even then, there are limits on the range of negative
> indexing.) This patch provides that range.
>
> It also fixes:
>
> $ perl -wle'$[=2; print substr("abcdefghij", 1);'
> panic: sv_setpvn called with negative strlen at -e line 1.

Whilst I've looked at the patch, and don't see anything wrong with it, for
some reason I'm uneasy about applying it today. (Tomorrow should be fine).

Whilst this *isn't* technically the same bug, it is related to I32 abuse and
the UTF-8 caching code. It's rather embarrassing that something as simple as
length can return the wrong result, or panic:

$ time ./perl -lwe 'open A, shift or die $!; read A, $a, (1<<32) + 4; chop $a; warn length $a; substr $a, 0x100000000, 1, chr 256; print ord substr $a, 0x100000000; print ord substr $a, 0xFFFFFFF0; warn length $a; warn length $a' /dev/zero
Hexadecimal number > 0xffffffff non-portable at -e line 1.
Hexadecimal number > 0xffffffff non-portable at -e line 1.
4294967299 at -e line 1.
256
panic: sv_len_utf8 cache 3 real 4294967299 for Ä at -e line 1.

real    37m17.458s
user    2m56.791s
sys     1m11.734s

I propose the following work around, which disables storing a bad value in
the cache for the length in Unicode characters, if that value has wrapped:

diff --git a/sv.c b/sv.c
index 02be580..87fc348 100644
--- a/sv.c
+++ b/sv.c
@@ -6072,6 +6072,10 @@ Perl_sv_len_utf8(pTHX_ register SV *const sv)
                }
                assert(mg);
                mg->mg_len = ulen;
+               /* For now, treat "overflowed" as "still unknown".
+                  See RT #72784.  */
+               if (ulen != (STRLEN) mg->mg_len)
+                   mg->mg_len = -1;
            }
        }
        return ulen;

It avoids the panic, and produces the right results (albeit slowly, but 10
times faster than the panic takes to end):

warn length $a; substr $a, 0x100000000, 1, chr 256; print ord substr $a, 0x100000000; print ord substr $a, 0xFFFFFFF0; warn length $a; warn length $a' /dev/zero
Hexadecimal number > 0xffffffff non-portable at -e line 1.
Hexadecimal number > 0xffffffff non-portable at -e line 1.
4294967299 at -e line 1.
256
0
4294967299 at -e line 1.
4294967299 at -e line 1.

real    3m41.042s
user    3m36.332s
sys     0m4.668s

[I've compiled with -DDEBUGGING so I get debug mode enabled.
If I turn it off:

$ time ./perl -lwe '${^UTF8CACHE}=1; open A, shift or die $!; read A, $a, (1<<32) + 4; chop $a; warn length $a; substr $a, 0x100000000, 1, chr 256; print ord substr $a, 0x100000000; print ord substr $a, 0xFFFFFFF0; warn length $a; warn length $a' /dev/zero
Hexadecimal number > 0xffffffff non-portable at -e line 1.
Hexadecimal number > 0xffffffff non-portable at -e line 1.
4294967299 at -e line 1.
256
0
4294967299 at -e line 1.
4294967299 at -e line 1.

real    2m21.951s
user    2m17.326s
sys     0m4.579s

I don't know how many fewer linear scans of the string that time drop equates
to. Once dromedary has the extra RAM installed it will become a lot less
painful to start to investigate these things. 6GB is not enough :-)
]

I've created a meta-ticket to group all the tickets related to I32 abuse, and
collated the remaining uses of Perl_sv_pos_b2u() and Perl_sv_pos_u2b() as
tickets under it:

http://rt.perl.org/rt3/Ticket/Display.html?id=72784

Nicholas Clark
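[Editorial sketch, not part of the original thread.] The idea behind the workaround above, treating a length that does not fit the cache field as "still unknown", can be shown in isolation. The C below is an illustration only: `struct len_cache`, `cache_store_len`, `cache_fetch_len` and `CACHE_UNKNOWN` are hypothetical, and the signed 32-bit field is an assumption standing in for the cache slot. Unlike the patch, which stores first and then compares, this sketch checks the range before storing, which avoids relying on how an out-of-range unsigned value converts to a signed one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_UNKNOWN (-1)

/* Hypothetical cache slot; assumed to be a signed 32-bit field. */
struct len_cache {
    int32_t chars; /* cached length in characters, or CACHE_UNKNOWN */
};

/* Store ulen only if it fits; otherwise record "still unknown". */
void
cache_store_len(struct len_cache *c, size_t ulen)
{
    assert(c != NULL);
    c->chars = (ulen <= (size_t)INT32_MAX) ? (int32_t)ulen : CACHE_UNKNOWN;
}

/*
 * Returns 1 and sets *ulen if a valid length is cached,
 * 0 if the caller has to recompute the length by scanning the string.
 */
int
cache_fetch_len(const struct len_cache *c, size_t *ulen)
{
    assert(c != NULL && ulen != NULL);
    if (c->chars < 0)
        return 0;
    *ulen = (size_t)c->chars;
    return 1;
}
```

The cost is the one visible in the timings above: every lookup that finds "unknown" falls back to a linear scan, so very long strings get correct but slow answers.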
CC: jesse <jesse [...] fsck.com>, perl5-porters [...] perl.org
Subject: Re: [perl #62646] [PATCH] Re: BUG OF THE DAY lvalue substr, Take 2
Date: Tue, 16 Feb 2010 10:39:12 +0000
To: Eric Brine <ikegami [...] adaelis.com>
From: Nicholas Clark <nick [...] ccl4.org>
On Sun, Feb 14, 2010 at 02:23:37PM -0500, Eric Brine wrote:
> On Sun, Feb 14, 2010 at 11:50 AM, Nicholas Clark <nick@ccl4.org> wrote:
>
> > So I renamed "_proper" to "_flags", and changed it to return the byte
> > conversion of uoffset:
> >
> > -void
> > -Perl_sv_pos_u2b_proper(pTHX_ register SV *const sv, STRLEN *const offsetp,
> > STRLEN *const lenp)
> >
> > +STRLEN
> > +Perl_sv_pos_u2b_flags(pTHX_ SV *const sv, STRLEN uoffset, STRLEN *const
> > lenp,
> > + U32 flags)
>
> It looks like you haven't committed this yet. Before you do, can you please
> move PERL_ARGS_ASSERT_SV_POS_U2B to Perl_sv_pos_u2b() and add
> PERL_ARGS_ASSERT_SV_POS_U2B_FLAGS to Perl_sv_pos_u2b_flags()? Thanks

I had committed it, and fixed that part, and found a bug in the test that's
supposed to catch those:

http://rt.perl.org/rt3/Ticket/Display.html?id=72800

Nicholas Clark
RT-Send-CC: perl5-porters [...] perl.org
Thanks for the report. Sorry for the delay in replying.

On Sat Sep 12 08:52:05 2009, perlbug@plan9.de wrote:
> This is a bug report for perl from perlbug@plan9.de,
> generated with the help of perlbug 1.36 running under perl 5.10.0.
>
> -----------------------------------------------------------------
> [Please enter your report here]
>
> I found that substr acts weirdly when confronted with larger than ~2gb
> strings:
>
>    warn length $data;
>    substr $data, 0, 256;
>
>    15701683970 at ...
>    substr outside of string at ...
>
> My perl uses a uint64_t for STRLEN (standard amd64), so I would expect
> substr to handle this, given that I pay the memory overhead to store an
> 8-byte length everywhere :)
>
> It seems that pp_substr uses I32 for everything, which is of course not
> enough.
>
> A cursory glance over the rest of pp.c indicates that perl simply can't
> handle >2gb scalars, even on a 64 bit system, as it uses I32 almost
> everywhere (pp_index, pp_reverse, about anything that deals with string
> offsets uses I32).

Yes. It's defective.

> 15GB seems like a lot, but I don't think the wish to, say, load a DVD
> image into memory is so far off in the future.
>
> Maybe the safe way for now would be to disallow >31 bit scalar
> lengths?

I don't think that there's actually an easy way to do that.

There's a patch in blead now specifically for substr. Jesse isn't planning
to fix anything else for 5.12.0, but I believe that many of the others can be
fixed without breaking binary compatibility, so hopefully they will make it
into 5.12.1 etc, not just 5.14.0

I made a new ticket to track tickets relating to the misuse of I32:

http://rt.perl.org/rt3/Ticket/Display.html?id=72784

(the intent being that we close this ticket as it's just tracking substr)

Nicholas Clark
The substr bug is resolved with 777f7c56. The other bugs are being tracked by ticket 72784 http://rt.perl.org/rt3/Ticket/Display.html?id=72784

