Skip Menu |
Report information
Id: 132595
Status: resolved
Priority: 0/
Queue: perl5

Owner: Nobody
Requestors: zefram [at] fysh.org
Cc:
AdminCc:

Operating System: (no value)
PatchStatus: (no value)
Severity: low
Type: core
Perl Version: 5.27.7
Fixed In: 5.27.9



To: perlbug [...] perl.org
CC: zefram [...] fysh.org
Date: Sun, 17 Dec 2017 13:29:31 +0000
Subject: multiconcat reading and fetching wrong order
From: zefram [...] fysh.org
Download (untitled) / with headers
text/plain 4.6k
This is a bug report for perl from zefram@fysh.org, generated with the help of perlbug 1.41 running under perl 5.27.7. ----------------------------------------------------------------- [Please describe your issue here] The multiconcat op is stated to replicate the order of magic handling and so on that was exhibited by the unoptimised concat op. But here's a case where the replication is not exact: $ perl5.27.5 -lwe '$i = 0; { package T; sub TIESCALAR { bless({}, $_[0]) } sub FETCH { ++$::i; $::foo = "b".$::i; "c".$::i } } $foo = "a"; tie $bar, "T"; print $foo.$bar.$foo.$bar' b1c1b1c2 $ perl5.27.6 -lwe '$i = 0; { package T; sub TIESCALAR { bless({}, $_[0]) } sub FETCH { ++$::i; $::foo = "b".$::i; "c".$::i } } $foo = "a"; tie $bar, "T"; print $foo.$bar.$foo.$bar' ac1b2c2 Blead behaves the same as 5.27.6. I wouldn't consider the new behaviour wrong in itself; the issue is just that it's not replicating the unoptimised behaviour, and so multiconcat is failing in its aim of being a very pure optimisation. FWIW, the behaviour of the same test case has changed before: 5.6.0 to 5.7.1: ac1b1c2 5.8.0 to 5.8.8, 5.9.0 to 5.9.2: b1c1b1c2 5.8.9, 5.9.3 to 5.13.1: ac1b1c2 5.13.2 to 5.27.5: b1c1b1c2 [Please do not change anything below this line] ----------------------------------------------------------------- --- Flags: category=core severity=low --- Site configuration information for perl 5.27.7: Configured by zefram at Sun Dec 17 11:32:35 GMT 2017. Summary of my perl5 (revision 5 version 27 subversion 7) configuration: Commit id: da4e040f42421764ef069371d77c008e6b801f45 Platform: osname=linux osvers=3.16.0-4-amd64 archname=x86_64-linux-thread-multi uname='linux barba.rous.org 3.16.0-4-amd64 #1 smp debian 3.16.43-2+deb8u2 (2017-06-26) x86_64 gnulinux ' config_args='-des -Dprefix=/home/zefram/usr/perl/perl_install/perl-git-blead-i64-f52 -Duselargefiles -Dusethreads -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dusedevel -Uversiononly -Ui_db -DDEBUGGING' hint=recommended useposix=true d_sigaction=define useithreads=define usemultiplicity=define use64bitint=define use64bitall=define uselongdouble=undef usemymalloc=n default_inc_excludes_dot=define bincompat5005=undef Compiler: cc='cc' ccflags ='-D_REENTRANT -D_GNU_SOURCE -fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2' optimize='-O2 -g' cppflags='-D_REENTRANT -D_GNU_SOURCE -fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include' ccversion='' gccversion='4.9.2' gccosandvers='' intsize=4 longsize=8 ptrsize=8 doublesize=8 byteorder=12345678 doublekind=3 d_longlong=define longlongsize=8 d_longdbl=define longdblsize=16 longdblkind=3 ivtype='long' ivsize=8 nvtype='double' nvsize=8 Off_t='off_t' lseeksize=8 alignbytes=8 prototype=define Linker and Libraries: ld='cc' ldflags =' -fstack-protector-strong -L/usr/local/lib' libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.9/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib libs=-lpthread -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc libc=libc-2.19.so so=so useshrplib=true libperl=libperl.so gnulibc_version='2.19' Dynamic Linking: dlsrc=dl_dlopen.xs dlext=so d_dlsymun=undef ccdlflags='-Wl,-E -Wl,-rpath,/home/zefram/usr/perl/perl_install/perl-git-blead-i64-f52/lib/5.27.7/x86_64-linux-thread-multi/CORE' cccdlflags='-fPIC' lddlflags='-shared -O2 -g -L/usr/local/lib -fstack-protector-strong' --- @INC for perl 5.27.7: /home/zefram/usr/perl/pg/lib /home/zefram/usr/perl/perl_install/perl-git-blead-i64-f52/lib/site_perl/5.27.7/x86_64-linux-thread-multi /home/zefram/usr/perl/perl_install/perl-git-blead-i64-f52/lib/site_perl/5.27.7 /home/zefram/usr/perl/perl_install/perl-git-blead-i64-f52/lib/5.27.7/x86_64-linux-thread-multi /home/zefram/usr/perl/perl_install/perl-git-blead-i64-f52/lib/5.27.7 --- Environment for perl 5.27.7: HOME=/home/zefram LANG (unset) LANGUAGE (unset) LD_LIBRARY_PATH=/home/zefram/usr/perl/pg LOGDIR (unset) PATH=/home/zefram/usr/perl/util:/home/zefram/pub/x86_64-unknown-linux-gnu/bin:/home/zefram/pub/common/bin:/usr/bin:/bin:/usr/local/bin:/usr/games PERL5LIB=/home/zefram/usr/perl/pg/lib PERLDOC=-oman PERL_BADLANG (unset) SHELL=/usr/bin/zsh
From: Dave Mitchell <davem [...] iabyn.com>
Date: Tue, 20 Feb 2018 09:21:17 +0000
Subject: Re: [perl #132595] multiconcat reading and fetching wrong order
To: perl5-porters [...] perl.org
On Sun, Dec 17, 2017 at 05:29:41AM -0800, Zefram wrote: Show quoted text
> The multiconcat op is stated to replicate the order of magic handling > and so on that was exhibited by the unoptimised concat op. But here's > a case where the replication is not exact: > > $ perl5.27.5 -lwe '$i = 0; { package T; sub TIESCALAR { bless({}, $_[0]) } sub FETCH { ++$::i; $::foo = "b".$::i; "c".$::i } } $foo = "a"; tie $bar, "T"; print $foo.$bar.$foo.$bar' > b1c1b1c2 > $ perl5.27.6 -lwe '$i = 0; { package T; sub TIESCALAR { bless({}, $_[0]) } sub FETCH { ++$::i; $::foo = "b".$::i; "c".$::i } } $foo = "a"; tie $bar, "T"; print $foo.$bar.$foo.$bar' > ac1b2c2 > > Blead behaves the same as 5.27.6. I wouldn't consider the new > behaviour wrong in itself; the issue is just that it's not replicating > the unoptimised behaviour, and so multiconcat is failing in its aim of > being a very pure optimisation. > > FWIW, the behaviour of the same test case has changed before: > > 5.6.0 to 5.7.1: ac1b1c2 > 5.8.0 to 5.8.8, 5.9.0 to 5.9.2: b1c1b1c2 > 5.8.9, 5.9.3 to 5.13.1: ac1b1c2 > 5.13.2 to 5.27.5: b1c1b1c2
Now fixed with: commit af39014264c90cfc5a35e4f6e39ba038a8fb0c29 Author: David Mitchell <davem@iabyn.com> AuthorDate: Sat Feb 10 15:27:59 2018 +0000 Commit: David Mitchell <davem@iabyn.com> CommitDate: Mon Feb 19 22:06:49 2018 +0000 redo magic/overload handing in pp_multiconcat The way pp_multiconcat handles things like tieing and overloading doesn't work very well at the moment. There's a lot of code to handle edge cases, and there are still open bugs. The basic algorithm in pp_multiconcat is to first stringify (i.e. call SvPV() on) *all* args, then use the obtained values to calculate the total length and utf8ness required, then do a single SvGROW and copy all the bytes from all the args. This ordering is wrong when variables with visible side effects, such as tie/overload, are encountered. The current approach is to stringify args up until such an arg is encountered, concat all args up until that one together via the normal fast route, then jump to a special block of code which concats any remaining args one by one the "hard" way, handling overload etc. This is problematic because we sometimes need to go back in time. For example in ($undef . $overloaded), we're supposed to call $overloaded->concat($undef, reverse=1) so to speak, but by the time of the method call, we've already tried to stringify $undef and emitted a spurious 'uninit var' warning. The new approach taken in this commit is to: 1) Bail out of the stringify loop under a greater range of problematical variable classes - namely we stop when encountering *anything* which might cause external effects, so in addition to tied and overloaded vars, we now stop for any sort of get magic, or any undefined value where warnings are in scope. 2) If we bail out, we throw away any stringification results so far, and concatenate *all* args the slow way, even ones we're already stringified. This solves the "going back in time" problem mentioned above. It's safe because the only vars that get processed twice are ones for which the first stringification could have no side effects. The slow concat loop now uses S_do_concat(), which is a new static inline function which implements the main body of pp_concat() - so they share identical code. An intentional side-effect of this commit is to fix three tickets: RT #132783 RT #132827 RT #132595 so tests for them are included in this commit. One effect of this commit is that string concatenation of magic or undefined vars will now be slower than before, e.g. "pid=$$" "value=$undef" but they will probably still be faster than before pp_multiconcat was introduced. -- Please note that ash-trays are provided for the use of smokers, whereas the floor is provided for the use of all patrons. -- Bill Royston
CC: Perl5 Porters <perl5-porters [...] perl.org>
To: Dave Mitchell <davem [...] iabyn.com>
Date: Tue, 20 Feb 2018 11:06:36 +0100
Subject: Re: [perl #132595] multiconcat reading and fetching wrong order
From: Leon Timmermans <fawaka [...] gmail.com>
Download (untitled) / with headers
text/plain 2.6k
On Tue, Feb 20, 2018 at 10:21 AM, Dave Mitchell <davem@iabyn.com> wrote: Show quoted text
> Now fixed with: > > commit af39014264c90cfc5a35e4f6e39ba038a8fb0c29 > Author: David Mitchell <davem@iabyn.com> > AuthorDate: Sat Feb 10 15:27:59 2018 +0000 > Commit: David Mitchell <davem@iabyn.com> > CommitDate: Mon Feb 19 22:06:49 2018 +0000 > > redo magic/overload handing in pp_multiconcat > > The way pp_multiconcat handles things like tieing and overloading > doesn't work very well at the moment. There's a lot of code to handle > edge cases, and there are still open bugs. > > The basic algorithm in pp_multiconcat is to first stringify (i.e. call > SvPV() on) *all* args, then use the obtained values to calculate the total > length and utf8ness required, then do a single SvGROW and copy all the > bytes from all the args. > > This ordering is wrong when variables with visible side effects, such as > tie/overload, are encountered. The current approach is to stringify args > up until such an arg is encountered, concat all args up until that one > together via the normal fast route, then jump to a special block of code > which concats any remaining args one by one the "hard" way, handling > overload etc. > > This is problematic because we sometimes need to go back in time. For > example in ($undef . $overloaded), we're supposed to call > $overloaded->concat($undef, reverse=1) > so to speak, but by the time of the method call, we've already tried to > stringify $undef and emitted a spurious 'uninit var' warning. > > The new approach taken in this commit is to: > > 1) Bail out of the stringify loop under a greater range of problematical > variable classes - namely we stop when encountering *anything* which > might cause external effects, so in addition to tied and overloaded vars, > we now stop for any sort of get magic, or any undefined value where > warnings are in scope. > > 2) If we bail out, we throw away any stringification results so far, > and concatenate *all* args the slow way, even ones we're already > stringified. This solves the "going back in time" problem mentioned above. > It's safe because the only vars that get processed twice are ones for which > the first stringification could have no side effects. > > The slow concat loop now uses S_do_concat(), which is a new static inline > function which implements the main body of pp_concat() - so they share > identical code. > > An intentional side-effect of this commit is to fix three tickets: > > RT #132783 > RT #132827 > RT #132595 > > so tests for them are included in this commit.
Good stuff :-) Leon


This service is sponsored and maintained by Best Practical Solutions and runs on Perl.org infrastructure.

For issues related to this RT instance (aka "perlbug"), please contact perlbug-admin at perl.org