Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multiconcat reading and fetching wrong order #16311

Closed
p5pRT opened this issue Dec 17, 2017 · 5 comments
Closed

multiconcat reading and fetching wrong order #16311

p5pRT opened this issue Dec 17, 2017 · 5 comments

Comments

@p5pRT
Copy link

p5pRT commented Dec 17, 2017

Migrated from rt.perl.org#132595 (status was 'resolved')

Searchable as RT132595$

@p5pRT
Copy link
Author

p5pRT commented Dec 17, 2017

From zefram@fysh.org

Created by zefram@fysh.org

The multiconcat op is stated to replicate the order of magic handling
and so on that was exhibited by the unoptimised concat op. But here's
a case where the replication is not exact​:

$ perl5.27.5 -lwe '$i = 0; { package T; sub TIESCALAR { bless({}, $_[0]) } sub FETCH { ++$​::i; $​::foo = "b".$​::i; "c".$​::i } } $foo = "a"; tie $bar, "T"; print $foo.$bar.$foo.$bar'
b1c1b1c2
$ perl5.27.6 -lwe '$i = 0; { package T; sub TIESCALAR { bless({}, $_[0]) } sub FETCH { ++$​::i; $​::foo = "b".$​::i; "c".$​::i } } $foo = "a"; tie $bar, "T"; print $foo.$bar.$foo.$bar'
ac1b2c2

Blead behaves the same as 5.27.6. I wouldn't consider the new
behaviour wrong in itself; the issue is just that it's not replicating
the unoptimised behaviour, and so multiconcat is failing in its aim of
being a very pure optimisation.

FWIW, the behaviour of the same test case has changed before​:

5.6.0 to 5.7.1​: ac1b1c2
5.8.0 to 5.8.8, 5.9.0 to 5.9.2​: b1c1b1c2
5.8.9, 5.9.3 to 5.13.1​: ac1b1c2
5.13.2 to 5.27.5​: b1c1b1c2

Perl Info

Flags:
    category=core
    severity=low

Site configuration information for perl 5.27.7:

Configured by zefram at Sun Dec 17 11:32:35 GMT 2017.

Summary of my perl5 (revision 5 version 27 subversion 7) configuration:
  Commit id: da4e040f42421764ef069371d77c008e6b801f45
  Platform:
    osname=linux
    osvers=3.16.0-4-amd64
    archname=x86_64-linux-thread-multi
    uname='linux barba.rous.org 3.16.0-4-amd64 #1 smp debian 3.16.43-2+deb8u2 (2017-06-26) x86_64 gnulinux '
    config_args='-des -Dprefix=/home/zefram/usr/perl/perl_install/perl-git-blead-i64-f52 -Duselargefiles -Dusethreads -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dusedevel -Uversiononly -Ui_db -DDEBUGGING'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=define
    usemultiplicity=define
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
    bincompat5005=undef
  Compiler:
    cc='cc'
    ccflags ='-D_REENTRANT -D_GNU_SOURCE -fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2'
    optimize='-O2 -g'
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
    ccversion=''
    gccversion='4.9.2'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags =' -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.9/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
    libs=-lpthread -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    libc=libc-2.19.so
    so=so
    useshrplib=true
    libperl=libperl.so
    gnulibc_version='2.19'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='-Wl,-E -Wl,-rpath,/home/zefram/usr/perl/perl_install/perl-git-blead-i64-f52/lib/5.27.7/x86_64-linux-thread-multi/CORE'
    cccdlflags='-fPIC'
    lddlflags='-shared -O2 -g -L/usr/local/lib -fstack-protector-strong'



@INC for perl 5.27.7:
    /home/zefram/usr/perl/pg/lib
    /home/zefram/usr/perl/perl_install/perl-git-blead-i64-f52/lib/site_perl/5.27.7/x86_64-linux-thread-multi
    /home/zefram/usr/perl/perl_install/perl-git-blead-i64-f52/lib/site_perl/5.27.7
    /home/zefram/usr/perl/perl_install/perl-git-blead-i64-f52/lib/5.27.7/x86_64-linux-thread-multi
    /home/zefram/usr/perl/perl_install/perl-git-blead-i64-f52/lib/5.27.7


Environment for perl 5.27.7:
    HOME=/home/zefram
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH=/home/zefram/usr/perl/pg
    LOGDIR (unset)
    PATH=/home/zefram/usr/perl/util:/home/zefram/pub/x86_64-unknown-linux-gnu/bin:/home/zefram/pub/common/bin:/usr/bin:/bin:/usr/local/bin:/usr/games
    PERL5LIB=/home/zefram/usr/perl/pg/lib
    PERLDOC=-oman
    PERL_BADLANG (unset)
    SHELL=/usr/bin/zsh

@p5pRT
Copy link
Author

p5pRT commented Feb 20, 2018

From @iabyn

On Sun, Dec 17, 2017 at 05​:29​:41AM -0800, Zefram wrote​:

The multiconcat op is stated to replicate the order of magic handling
and so on that was exhibited by the unoptimised concat op. But here's
a case where the replication is not exact​:

$ perl5.27.5 -lwe '$i = 0; { package T; sub TIESCALAR { bless({}, $_[0]) } sub FETCH { ++$​::i; $​::foo = "b".$​::i; "c".$​::i } } $foo = "a"; tie $bar, "T"; print $foo.$bar.$foo.$bar'
b1c1b1c2
$ perl5.27.6 -lwe '$i = 0; { package T; sub TIESCALAR { bless({}, $_[0]) } sub FETCH { ++$​::i; $​::foo = "b".$​::i; "c".$​::i } } $foo = "a"; tie $bar, "T"; print $foo.$bar.$foo.$bar'
ac1b2c2

Blead behaves the same as 5.27.6. I wouldn't consider the new
behaviour wrong in itself; the issue is just that it's not replicating
the unoptimised behaviour, and so multiconcat is failing in its aim of
being a very pure optimisation.

FWIW, the behaviour of the same test case has changed before​:

5.6.0 to 5.7.1​: ac1b1c2
5.8.0 to 5.8.8, 5.9.0 to 5.9.2​: b1c1b1c2
5.8.9, 5.9.3 to 5.13.1​: ac1b1c2
5.13.2 to 5.27.5​: b1c1b1c2

Now fixed with​:

commit af39014
Author​: David Mitchell <davem@​iabyn.com>
AuthorDate​: Sat Feb 10 15​:27​:59 2018 +0000
Commit​: David Mitchell <davem@​iabyn.com>
CommitDate​: Mon Feb 19 22​:06​:49 2018 +0000

  redo magic/overload handing in pp_multiconcat
 
  The way pp_multiconcat handles things like tieing and overloading
  doesn't work very well at the moment. There's a lot of code to handle
  edge cases, and there are still open bugs.
 
  The basic algorithm in pp_multiconcat is to first stringify (i.e. call
  SvPV() on) *all* args, then use the obtained values to calculate the total
  length and utf8ness required, then do a single SvGROW and copy all the
  bytes from all the args.
 
  This ordering is wrong when variables with visible side effects, such as
  tie/overload, are encountered. The current approach is to stringify args
  up until such an arg is encountered, concat all args up until that one
  together via the normal fast route, then jump to a special block of code
  which concats any remaining args one by one the "hard" way, handling
  overload etc.
 
  This is problematic because we sometimes need to go back in time. For
  example in ($undef . $overloaded), we're supposed to call
  $overloaded->concat($undef, reverse=1)
  so to speak, but by the time of the method call, we've already tried to
  stringify $undef and emitted a spurious 'uninit var' warning.
 
  The new approach taken in this commit is to​:
 
  1) Bail out of the stringify loop under a greater range of problematical
  variable classes - namely we stop when encountering *anything* which
  might cause external effects, so in addition to tied and overloaded vars,
  we now stop for any sort of get magic, or any undefined value where
  warnings are in scope.
 
  2) If we bail out, we throw away any stringification results so far,
  and concatenate *all* args the slow way, even ones we're already
  stringified. This solves the "going back in time" problem mentioned above.
  It's safe because the only vars that get processed twice are ones for which
  the first stringification could have no side effects.
 
  The slow concat loop now uses S_do_concat(), which is a new static inline
  function which implements the main body of pp_concat() - so they share
  identical code.
 
  An intentional side-effect of this commit is to fix three tickets​:
 
  RT #132783
  RT #132827
  RT #132595
 
  so tests for them are included in this commit.
 
  One effect of this commit is that string concatenation of magic or
  undefined vars will now be slower than before, e.g.
 
  "pid=$$"
  "value=$undef"
 
  but they will probably still be faster than before pp_multiconcat was
  introduced.

--
Please note that ash-trays are provided for the use of smokers,
whereas the floor is provided for the use of all patrons.
  -- Bill Royston

@p5pRT
Copy link
Author

p5pRT commented Feb 20, 2018

The RT System itself - Status changed from 'new' to 'open'

@p5pRT p5pRT closed this as completed Feb 20, 2018
@p5pRT
Copy link
Author

p5pRT commented Feb 20, 2018

@iabyn - Status changed from 'open' to 'resolved'

@p5pRT
Copy link
Author

p5pRT commented Feb 20, 2018

From @Leont

On Tue, Feb 20, 2018 at 10​:21 AM, Dave Mitchell <davem@​iabyn.com> wrote​:

Now fixed with​:

commit af39014
Author​: David Mitchell <davem@​iabyn.com>
AuthorDate​: Sat Feb 10 15​:27​:59 2018 +0000
Commit​: David Mitchell <davem@​iabyn.com>
CommitDate​: Mon Feb 19 22​:06​:49 2018 +0000

redo magic/overload handing in pp\_multiconcat

The way pp\_multiconcat handles things like tieing and overloading
doesn't work very well at the moment\. There's a lot of code to handle
edge cases\, and there are still open bugs\.

The basic algorithm in pp\_multiconcat is to first stringify \(i\.e\. call
SvPV\(\) on\) \*all\* args\, then use the obtained values to calculate the total
length and utf8ness required\, then do a single SvGROW and copy all the
bytes from all the args\.

This ordering is wrong when variables with visible side effects\, such as
tie/overload\, are encountered\. The current approach is to stringify args
up until such an arg is encountered\, concat all args up until that one
together via the normal fast route\, then jump to a special block of code
which concats any remaining args one by one the "hard" way\, handling
overload etc\.

This is problematic because we sometimes need to go back in time\. For
example in \($undef \. $overloaded\)\, we're supposed to call
    $overloaded\->concat\($undef\, reverse=1\)
so to speak\, but by the time of the method call\, we've already tried to
stringify $undef and emitted a spurious 'uninit var' warning\.

The new approach taken in this commit is to&#8203;:

1\) Bail out of the stringify loop under a greater range of problematical
variable classes \- namely we stop when encountering \*anything\* which
might cause external effects\, so in addition to tied and overloaded vars\,
we now stop for any sort of get magic\, or any undefined value where
warnings are in scope\.

2\) If we bail out\, we throw away any stringification results so far\,
and concatenate \*all\* args the slow way\, even ones we're already
stringified\. This solves the "going back in time" problem mentioned above\.
It's safe because the only vars that get processed twice are ones for which
the first stringification could have no side effects\.

The slow concat loop now uses S\_do\_concat\(\)\, which is a new static inline
function which implements the main body of pp\_concat\(\) \- so they share
identical code\.

An intentional side\-effect of this commit is to fix three tickets&#8203;:

    RT \#132783
    RT \#132827
    RT \#132595

so tests for them are included in this commit\.

Good stuff :-)

Leon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant