peephole optimiser could prune more dead code #10481

Closed
p5pRT opened this issue Jul 9, 2010 · 59 comments

Comments


p5pRT commented Jul 9, 2010

Migrated from rt.perl.org#76438 (status was 'open')

Searchable as RT76438$


p5pRT commented Jul 9, 2010

From @nwc10

Created by @nwc10

$ ./perl -Ilib -MO=Deparse -e 'if ("Pie" eq "Good") {print}'
'???';
-e syntax OK

but

$ ./perl -Ilib -MO=Deparse -e 'if ($a && "Pie" eq "Good") {print}'
if ($a and !1) {
  print $_;
}
-e syntax OK

which demonstrates that "Pie" eq "Good" is constant folded, but that the
optree for the block still exists.

The peephole optimiser is correct not to optimise this to nothing, as it
can't know that $a is neither tied nor overloaded, so cannot assume that
the lookup of $a has no side effects.

However, it can know that the conditional to the if block is always false,
and so could optimise away the ops for the block, freeing up their memory.
Hence the code should become

  $a and !1;

or even the perl equivalent of

  (void) (bool) $a;

Wishlist, because I've no idea how much real world perl code ends up with
constructions like this, and would benefit.
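
To make the side-effect argument concrete, here is a minimal sketch. It assumes a hypothetical tie class, `CountingScalar`, invented purely for illustration (it is not part of perl or of this ticket), and a package variable `$flag` standing in for `$a`. The block never runs, yet the read of the variable is still observable through FETCH, which is why the lookup itself cannot simply be dropped.

```perl
use strict;
use warnings;

# Hypothetical tie class (illustration only): counts how often the
# scalar is read.
package CountingScalar;
sub TIESCALAR { my ($class, $value) = @_; return bless { value => $value, fetches => 0 }, $class }
sub FETCH     { my ($self) = @_; $self->{fetches}++; return $self->{value} }
sub STORE     { my ($self, $value) = @_; $self->{value} = $value }

package main;

our $flag;                                  # stands in for $a from the report
my $tied = tie $flag, 'CountingScalar', 1;

my $ran = 0;
# The right-hand side is constant folded to false, so the block is dead...
if ($flag && "Pie" eq "Good") { $ran = 1 }

# ...but the left-hand side is still evaluated, which invokes FETCH.
printf "block ran: %d, FETCH calls: %d\n", $ran, $tied->{fetches};
```

Only the block is provably dead here; the evaluation of the left operand is not.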

Nicholas Clark

Perl Info

Flags:
    category=core
    severity=wishlist

Site configuration information for perl 5.13.2:

Configured by nick at Fri Jul  9 10:52:23 BST 2010.

Summary of my perl5 (revision 5 version 13 subversion 2) configuration:
  Derived from: a2d3de138935fbe8f4190ee9176b8fdd812a91d5
  Platform:
    osname=linux, osvers=2.6.18.8-xenu, archname=x86_64-linux
    uname='linux eris 2.6.18.8-xenu #1 smp sat oct 3 10:27:42 bst 2009 x86_64 gnulinux '
    config_args='-Dusedevel=y -Dcc=ccache gcc -Dld=gcc -Ubincompat5005 -Uinstallusrbinperl -Dcf_email=nick@ccl4.org -Dperladmin=nick@ccl4.org -Dinc_version_list=  -Dinc_version_list_init=0 -Doptimize=-Os -Uusethreads -Duse64bitall -Uusemymalloc -Duseperlio -Dprefix=~/Sandpit/snap5.9.x-v5.13.2-220-ga2d3de1 -Uusevendorprefix -Uvendorprefix=~/Sandpit/snap5.9.x-v5.13.2-220-ga2d3de1 -Dinstallman1dir=none -Dinstallman3dir=none -Uuserelocatableinc -Accccflags=-DPERL_OLD_COPY_ON_WRITE -de'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='ccache gcc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-Os',
    cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.3.2', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib /lib64 /usr/lib64
    libs=-lnsl -ldb -ldl -lm -lcrypt -lutil -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.7.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.7'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -Os -L/usr/local/lib -fstack-protector'

Locally applied patches:
    


@INC for perl 5.13.2:
    lib
    /home/nick/Sandpit/snap5.9.x-v5.13.2-220-ga2d3de1/lib/perl5/site_perl/5.13.2/x86_64-linux
    /home/nick/Sandpit/snap5.9.x-v5.13.2-220-ga2d3de1/lib/perl5/site_perl/5.13.2
    /home/nick/Sandpit/snap5.9.x-v5.13.2-220-ga2d3de1/lib/perl5/5.13.2/x86_64-linux
    /home/nick/Sandpit/snap5.9.x-v5.13.2-220-ga2d3de1/lib/perl5/5.13.2
    .


Environment for perl 5.13.2:
    HOME=/home/nick
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/nick/bin:/usr/local/bin:/usr/bin:/bin:/usr/games:/usr/local/sbin:/sbin:/usr/sbin
    PERL_BADLANG (unset)
    SHELL=/bin/bash


p5pRT commented Jul 10, 2010

From james@mastros.biz

On 9 July 2010 16​:56, Nicholas Clark <perlbug-followup@​perl.org> wrote​:

$ ./perl -Ilib -MO=Deparse -e 'if ("Pie" eq "Good") {print}'
'???';
-e syntax OK

but

$ ./perl -Ilib -MO=Deparse -e 'if ($a && "Pie" eq "Good") {print}'
if ($a and !1) {
   print $_;
}
-e syntax OK

which demonstrates that "Pie" eq "Good" is constant folded, but that the
optree for the block still exists.

The peephole optimiser is correct not to optimise this to nothing, as it
can't know that $a is neither tied nor overloaded, so cannot assume that
the lookup of $a has no side effects.

However, it can know that the conditional to the if block is always false,
and so could optimise away the ops for the block, freeing up their memory.
Hence the code should become

   $a and !1;

or even the perl equivalent of

   (void) (bool) $a;

Wishlist, because I've no idea how much real world perl code ends up with
constructions like this, and would benefit

I do wonder, sometimes, if we worry entirely too much about just when
tie and overload calls are done. Would it break actual real-world code
to not retrieve the value of $a (tie) or not boolify it (overload)
when the value will be thrown away anyway? Clearly, we can't do this
in a maintenance release, but perhaps we should add warnings that we are
planning on doing it to 5.14.0? It seems to me that doing this would
allow all sorts of optimizations that we currently think of, and then
say "that'd change overloading", and throw out, with very little
impact on real-world code, which either doesn't use overloading, or
would be happy if overloading were made faster by avoiding it where
possible.
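
For anyone unsure what "not boolify it" would change, a minimal sketch follows. The class `NoisyBool` is hypothetical and exists only for this illustration. It records each call to its `bool` overload, so the call that such an optimisation would skip becomes visible.

```perl
use strict;
use warnings;

# Hypothetical class (illustration only): its bool overload has a
# visible side effect, namely bumping a counter.
package NoisyBool;
use overload
    'bool'   => sub { $_[0]{calls}++; return 1 },
    fallback => 1;
sub new { my ($class) = @_; return bless { calls => 0 }, $class }

package main;

my $obj = NoisyBool->new;
my $ran = 0;

# The condition as a whole is always false, but $obj is boolified first.
if ($obj && "Pie" eq "Good") { $ran = 1 }

printf "block ran: %d, bool overload calls: %d\n", $ran, $obj->{calls};
```

Code that relies on that call being made is exactly the code that would break if the boolification were optimised away.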

  -=- James Mastros / theorbtwo


p5pRT commented Jul 10, 2010

The RT System itself - Status changed from 'new' to 'open'


p5pRT commented Jul 10, 2010

From @jbenjore

On Fri, Jul 9, 2010 at 8​:56 AM, Nicholas Clark
<perlbug-followup@​perl.org> wrote​:

# New Ticket Created by  Nicholas Clark
# Please include the string​:  [perl #76438]
# in the subject line of all future correspondence about this issue.
# <URL​: http​://rt.perl.org/rt3/Ticket/Display.html?id=76438 >

This is a bug report for perl from nick@​ccl4.org,
generated with the help of perlbug 1.39 running under perl 5.13.2.

-----------------------------------------------------------------
[Please describe your issue here]

$ ./perl -Ilib -MO=Deparse -e 'if ("Pie" eq "Good") {print}'
'???';
-e syntax OK

but

$ ./perl -Ilib -MO=Deparse -e 'if ($a && "Pie" eq "Good") {print}'
if ($a and !1) {
   print $_;
}
-e syntax OK

which demonstrates that "Pie" eq "Good" is constant folded, but that the
optree for the block still exists.

The peephole optimiser is correct not to optimise this to nothing, as it
can't know that $a is neither tied nor overloaded, so cannot assume that
the lookup of $a has no side effects.

However, it can know that the conditional to the if block is always false,
and so could optimise away the ops for the block, freeing up their memory.
Hence the code should become

   $a and !1;

or even the perl equivalent of

   (void) (bool) $a;

Well actually, the 'bool' call was in scalar context, not void.

Josh


p5pRT commented Jul 10, 2010

From @demerphq

On 10 July 2010 13​:36, James Mastros <james@​mastros.biz> wrote​:

On 9 July 2010 16​:56, Nicholas Clark <perlbug-followup@​perl.org> wrote​:

$ ./perl -Ilib -MO=Deparse -e 'if ("Pie" eq "Good") {print}'
'???';
-e syntax OK

but

$ ./perl -Ilib -MO=Deparse -e 'if ($a && "Pie" eq "Good") {print}'
if ($a and !1) {
   print $_;
}
-e syntax OK

which demonstrates that "Pie" eq "Good" is constant folded, but that the
optree for the block still exists.

The peephole optimiser is correct not to optimise this to nothing, as it
can't know that $a is neither tied nor overloaded, so cannot assume that
the lookup of $a has no side effects.

However, it can know that the conditional to the if block is always false,
and so could optimise away the ops for the block, freeing up their memory.
Hence the code should become

   $a and !1;

or even the perl equivalent of

   (void) (bool) $a;

Wishlist, because I've no idea how much real world perl code ends up with
constructions like this, and would benefit

I do wonder, sometimes, if we worry entirely too much about just when
tie and overload calls are done.  Would it break actual real-world code
to not retrieve the value of $a (tie) or not boolify it (overload)
when the value will be thrown away anyway?  Clearly, we can't do this
in a maintenance release, but perhaps we should add warnings that we are
planning on doing it to 5.14.0?  It seems to me that doing this would
allow all sorts of optimizations that we currently think of, and then
say "that'd change overloading", and throw out, with very little
impact on real-world code, which either doesn't use overloading, or
would be happy if overloading were made faster by avoiding it where
possible.

This came up in another thread. JIT compilation techniques combined
with smaller less generic ops would give us the opportunity to rewrite
this as a more efficient structure.

I also proposed a "no magic" pragma/syntax that would allow the
optimiser to assume that all variables were "normal", and that funky
stuff wasn't going to occur during the scope of the pragma. And in such
a block i would expect this block to be optimised away.

cheers,
Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"


p5pRT commented Jul 10, 2010

From @Leont

On Sat, Jul 10, 2010 at 4​:17 PM, demerphq <demerphq@​gmail.com> wrote​:

I also proposed a "no magic" pragma/syntax that would allow the
optimiser to assume that all variables were "normal", and that funky
stuff wasn't going to occur during the scope of the pragma. And in such
a block i would expect this block to be optimised away.

The use of magic is too pervasive for that. Not only because many
special variables use active magic ($!, $1, %ENV, %SIG, etc…) but also
autovivification, m//g state, tainting and utf8 caching, $#array, pos(),
lvalue substr(), scalar(keys) and a number of more obscure things.

I don't think it's workable.

Leon


p5pRT commented Jul 10, 2010

From ben@morrow.me.uk

Quoth fawaka@​gmail.com (Leon Timmermans)​:

On Sat, Jul 10, 2010 at 4​:17 PM, demerphq <demerphq@​gmail.com> wrote​:

I also proposed a "no magic" pragma/syntax that would allow the
optimiser to assume that all variables were "normal", and that funky
stuff wasn't going to occur during the scope of the pragma. And in such
a block i would expect this block to be optimised away.

The use of magic is too pervasive for that. Not only because many
special variables use active magic ($!, $1, %ENV, %SIG, etc…) but also
autovivification, m//g state, tainting and utf8 caching, $#array, pos(),
lvalue substr(), scalar(keys) and a number of more obscure things.

I don't think it's workable.

Most of those forms of magic don't have meaningful side-effects, though,
at least on mg_get. (That is, they may have side-effects, but it doesn't
matter if they aren't invoked. Tainting is the obvious exception.) This
is what really matters from the pov of optimisation.

Ben


p5pRT commented Jul 10, 2010

From @Leont

On Sat, Jul 10, 2010 at 9​:16 PM, Ben Morrow <ben@​morrow.me.uk> wrote​:

Quoth fawaka@​gmail.com (Leon Timmermans)​:
Most of those forms of magic don't have meaningful side-effects, though,
at least on mg_get. (That is, they may have side-effects, but it doesn't
matter if they aren't invoked. Tainting is the obvious exception.) This
is what really matters from the pov of optimisation.

If you're only ignoring get magic and not set magic, I guess it's
possible if you special-case special variables with get magic such as
$!. @_ elements should probably also be exempt or else passing a tied
argument will break very confusingly. I'm not entirely sure what the
consequences would be of ignoring substr, pos, $#array and
autovivification get magic. I suspect they'll work in the common case,
but will break in corner cases.

Leon


p5pRT commented Jul 10, 2010

From @druud62

James Mastros wrote​:

I do wonder, sometimes, if we worry entirely too much about just when
tie and overload calls are done. Would it break actual real-world code
to not retrieve the value of $a (tie) or not boolify it (overload)
when the value will be thrown away anyway? Clearly, we can't do this
in a maintenance release, but perhaps we should add warnings that we are
planning on doing it to 5.14.0? It seems to me that doing this would
allow all sorts of optimizations that we currently think of, and then
say "that'd change overloading", and throw out, with very little
impact on real-world code, which either doesn't use overloading, or
would be happy if overloading were made faster by avoiding it where
possible.

  no tie;

  no overload;

or

  use optimize qw( :no_overload :no_tie );

?

--
Ruud


p5pRT commented Jul 11, 2010

From ben@morrow.me.uk

Quoth rvtol+usenet@​isolution.nl ("Dr.Ruud")​:

James Mastros wrote​:

I do wonder, sometimes, if we worry entirely too much about just when
tie and overload calls are done. Would it break actual real-world code
to not retrieve the value of $a (tie) or not boolify it (overload)
when the value will be thrown away anyway? Clearly, we can't do this
in a maintenance release, but perhaps we should add warnings that we are
planning on doing it to 5.14.0? It seems to me that doing this would
allow all sorts of optimizations that we currently think of, and then
say "that'd change overloading", and throw out, with very little
impact on real-world code, which either doesn't use overloading, or
would be happy if overloading were made faster by avoiding it where
possible.

no tie;

no overload;

or

use optimize qw( :no_overload :no_tie );

Obviously, this should be

  use less "magic";

:)

Ben


p5pRT commented Jul 11, 2010

From james@mastros.biz

On 11 July 2010 00​:25, Leon Timmermans <fawaka@​gmail.com> wrote​:

On Sat, Jul 10, 2010 at 9​:16 PM, Ben Morrow <ben@​morrow.me.uk> wrote​:

Quoth fawaka@​gmail.com (Leon Timmermans)​:
Most of those forms of magic don't have meaningful side-effects, though,
at least on mg_get. (That is, they may have side-effects, but it doesn't
matter if they aren't invoked. Tainting is the obvious exception.) This
is what really matters from the pov of optimisation.

If you're only ignoring get magic and not set magic, I guess it's
possible if you special-case special variables with get magic such as
$!. @_ elements should probably also be exempt or else passing a tied
argument will break very confusingly. I'm not entirely sure what the
consequences would be of ignoring substr, pos, $#array and
autovivification get magic. I suspect they'll work in the common case,
but will break in corner cases.

I think what Nick was getting at, and certainly what I was getting at,
was not that we should bypass get magic, but rather that, in marked
blocks, it should be acceptable to not call get magic *when the output
is not relevant*, and to cache the value of the get magic during that
lexical scope, each execution. That is, we assume that values act
like values, and not like hidden accessors. $! would still work, so
long as we don't look at $! twice and expect it to change. That is​:

  {
      use less 'magic'; # best name I've seen, we aren't not using it, just using less of it.
      if ($!) {
          die "Rock me, Amadeus: $!"
      }
  }

is OK. We invoke the get magic, once, and assume the value hasn't
changed by the second time we want it, possibly, but it shouldn't
anyway.

@​_ elements is probably a matter of "don't do that, then". However,
it's only a problem if the thing passed in has get magic that makes it
not act like a normal value during the period the pragma is in effect.

  -=- James Mastros / theorbtwo


p5pRT commented Jul 11, 2010

From @nwc10

On Sun, Jul 11, 2010 at 10​:35​:05AM +0100, James Mastros wrote​:

I think what Nick was getting at, and certainly what I was getting at,
was not that we should bypass get magic, but rather that, in marked
blocks, it should be acceptable to not call get magic *when the output
is not relevant*, and to cache the value of the get magic during that
lexical scope, each execution. That is, we assume that values act
like values, and not like hidden accessors. $! would still work, so
long as we don't look at $! twice and expect it to change. That is​:

No, that's not what *I* was getting at. That's the entire thread that has gone
sideways from what I originally reported. What *I* reported was that​:

$ ./perl -Ilib -MO=Deparse -e 'if ($a && "Pie" eq "Good") {print}'
if ($a and !1) {
  print $_;
}
-e syntax OK

ie there is provably dead code still in the optree - the print statement.

(Which could be removed by the optimiser, without any change to any semantic
of the language. ie it's 100% safe)
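
The same check can be made without reading Deparse output. The following is a rough sketch, assuming the condition is wrapped in a sub named `candidate` and that `$flag` stands in for `$a` (both names are invented here). It walks the sub's optree with the core B module and reports whether a `print` op is still present. What it prints depends on the perl it runs under, so no particular output is claimed.

```perl
use strict;
use warnings;
use B qw(svref_2object OPf_KIDS);

our $flag;   # stands in for $a

# Same shape as the one-liner, wrapped in a sub so B can inspect it.
sub candidate {
    if ($flag && "Pie" eq "Good") { print "unreachable\n" }
}

# Count every op name found in the tree below the sub's root op.
my %seen;
sub walk {
    my ($op) = @_;
    return unless $$op;                    # B::NULL wraps a null pointer
    $seen{ $op->name }++;
    return unless $op->flags & OPf_KIDS;   # leaf op: nothing below it
    for (my $kid = $op->first; $$kid; $kid = $kid->sibling) {
        walk($kid);
    }
}

walk(svref_2object(\&candidate)->ROOT);

printf "print op in optree: %s\n", $seen{'print'} ? "yes" : "no";
```

If the answer is "yes", the ops for the unreachable block are still allocated, which is the memory this ticket is about.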

Nicholas Clark


p5pRT commented Jul 11, 2010

From @ikegami

On Sun, Jul 11, 2010 at 5​:35 AM, James Mastros <james@​mastros.biz> wrote​:

use less 'magic'; # best name I've seen, we aren't not using it, just using less of it.

We're not instructing Perl to be less magical, we're promising Perl we won't
be using magic.


p5pRT commented Jul 11, 2010

From @xdg

On Sat, Jul 10, 2010 at 7​:36 AM, James Mastros <james@​mastros.biz> wrote​:

I do wonder, sometimes, if we worry entirely too much about just when
tie and overload calls are done.  Would it break actual real-world code
to not retrieve the value of $a (tie) or not boolify it (overload)
when the value will be thrown away anyway?

I think explicitly clarifying that side effects (like magic) will not
happen if the compiler optimizes a block or expression is a good idea.

Then maybe we could use the less pragma when that isn't desired. (
e.g. "use less 'optimization'" )

-- David


p5pRT commented Jul 11, 2010

From @ikegami

On Sun, Jul 11, 2010 at 3​:40 PM, David Golden <xdaveg@​gmail.com> wrote​:

Then maybe we could use the less pragma when that isn't desired. (
e.g. "use less 'optimization'"

Sounds great to me, except there are backward compatibility issues to
defaulting to aggressive optimisation.


p5pRT commented Jul 11, 2010

From @jbenjore

On Sun, Jul 11, 2010 at 2​:56 PM, Eric Brine <ikegami@​adaelis.com> wrote​:

On Sun, Jul 11, 2010 at 3​:40 PM, David Golden <xdaveg@​gmail.com> wrote​:

Then maybe we could use the less pragma when that isn't desired.  (
e.g. "use less 'optimization'"

Sounds great to me, except there are backward compatibility issues to
defaulting to aggressive optimisation.

Sounds like you want some instrumentation for where your magic is
actually happening at.

Josh


p5pRT commented Jul 12, 2010

From @xdg

On Sun, Jul 11, 2010 at 5​:58 PM, Joshua ben Jore <twists@​gmail.com> wrote​:

On Sun, Jul 11, 2010 at 2​:56 PM, Eric Brine <ikegami@​adaelis.com> wrote​:

On Sun, Jul 11, 2010 at 3​:40 PM, David Golden <xdaveg@​gmail.com> wrote​:

Then maybe we could use the less pragma when that isn't desired.  (
e.g. "use less 'optimization'"

Sounds great to me, except there are backward compatibility issues to
defaulting to aggressive optimisation.

Sounds like you want some instrumentation for where your magic is
actually happening at.

(Insert xdg rant on backwards compatibility)

I'm suggesting that we disclaim any implicit guarantee that the
compiler won't optimize away expressions that have side effects when
evaluated.

Whether any particular optimization is "worth it" is open for later
debate, but at least the door would be open.

Notwithstanding that Nicholas points out that this example isn't what
he was talking about, the question that has been raised is whether we
could just short-circuit an *entire* logical operation if "static"
analysis can determine whether it is true or false.

Effectively​:

  if ($a && 0) { ... } # could be optimized away entirely

I don't know how much code *relies* on something like $a being tied
and getting evaluated first in a logic operation. Likewise, I don't
know how much code actually relies on something like C<< && 0 >>. The
only thing that comes to mind is C<< && DEBUG >> where DEBUG is a
constant.

  warn "..." if $something && DEBUG;
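
A small sketch of what is at stake with that idiom, assuming a hypothetical tie class `FetchCounter` written only for this illustration. With the variable on the left, it has to be read before the constant is consulted. With the constant on the left, the whole condition is folded at compile time and the variable is never touched.

```perl
use strict;
use warnings;
use constant DEBUG => 0;

# Hypothetical tie class (illustration only): counts reads.
package FetchCounter;
sub TIESCALAR { my ($class, $value) = @_; return bless { value => $value, fetches => 0 }, $class }
sub FETCH     { my ($self) = @_; $self->{fetches}++; return $self->{value} }
sub STORE     { my ($self, $value) = @_; $self->{value} = $value }

package main;

my $tied = tie my $something, 'FetchCounter', 1;

# Variable first: $something is read, then DEBUG makes the test false.
warn "debugging\n" if $something && DEBUG;
my $after_variable_first = $tied->{fetches};

# Constant first: the condition folds to false and nothing is read.
warn "debugging\n" if DEBUG && $something;
my $after_constant_first = $tied->{fetches};

print "fetches after '\$something && DEBUG': $after_variable_first\n";
print "fetches after 'DEBUG && \$something': $after_constant_first\n";
```

Optimising the first form away entirely would make it behave like the second, which is the semantic change under discussion.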

-- David


p5pRT commented Jul 12, 2010

From @ikegami

On Sun, Jul 11, 2010 at 10​:22 PM, David Golden <xdaveg@​gmail.com> wrote​:

I'm suggesting that we disclaim any implicit guarantee that the
compiler won't optimize away expressions that have side effects when
evaluated.

Without that guarantee,

my $x = f()
  or DEBUG && warn(...);
return $x;

would be buggy. Dunno if that matters

@p5pRT
Copy link
Author

p5pRT commented Jul 12, 2010

From @xdg

On Mon, Jul 12, 2010 at 12​:51 AM, Eric Brine <ikegami@​adaelis.com> wrote​:

On Sun, Jul 11, 2010 at 10​:22 PM, David Golden <xdaveg@​gmail.com> wrote​:

I'm suggesting that we disclaim any implicit guarantee that the
compiler won't optimize away expressions that have side effects when
evaluated.

Without that guarantee,

my $x = f()
   or DEBUG && warn(...);
return $x;

would be buggy. Dunno if that matters

I said "... have side effects when evaluated ..." but your example has
a side effect (function call) in an assignment before the logic
expression. Maybe "evaluated" isn't the right term, but I was
intending it to mean the action of reading a value from a variable.

David


p5pRT commented Jul 12, 2010

From @demerphq

On 12 July 2010 13​:07, David Golden <xdaveg@​gmail.com> wrote​:

On Mon, Jul 12, 2010 at 12​:51 AM, Eric Brine <ikegami@​adaelis.com> wrote​:

On Sun, Jul 11, 2010 at 10​:22 PM, David Golden <xdaveg@​gmail.com> wrote​:

I'm suggesting that we disclaim any implicit guarantee that the
compiler won't optimize away expressions that have side effects when
evaluated.

Without that guarantee,

my $x = f()
   or DEBUG && warn(...);
return $x;

would be buggy. Dunno if that matters

I said "... have side effects when evaluated ..." but your example has
a side effect (function call) in an assignment before the logic
expression.  Maybe "evaluated" isn't the right term, but I was
intending it to mean the action of reading a value from a variable.

I think this is close to something i mentioned.

My thought was​: given that

$b=$a++ + $a++;

is not defined, that we could also assume that changing fetch magic
inside of fetch magic would only take effect after that statement
concluded, and thus

$b=$a + $a;

where $a is tied/overloaded and the magic changes on invocation of the
fetch is also undefined.

Thus we could check for magic at the beginning of the expression, and
then cache it for the duration, although we would guarantee that the
magic WAS called twice if there was magic.

But the problem with any of these changes is that it could/would break
stuff somewhere. Which is why i figured some kind of compiler hint was
in order, as it would mean that new code could be optimised relatively
sanely and old code would continue on unbroken.

cheers,
Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"


p5pRT commented Jul 12, 2010

From @demerphq

On 12 July 2010 13​:36, demerphq <demerphq@​gmail.com> wrote​:

On 12 July 2010 13​:07, David Golden <xdaveg@​gmail.com> wrote​:

On Mon, Jul 12, 2010 at 12​:51 AM, Eric Brine <ikegami@​adaelis.com> wrote​:

On Sun, Jul 11, 2010 at 10​:22 PM, David Golden <xdaveg@​gmail.com> wrote​:

I'm suggesting that we disclaim any implicit guarantee that the
compiler won't optimize away expressions that have side effects when
evaluated.

Without that guarantee,

my $x = f()
   or DEBUG && warn(...);
return $x;

would be buggy. Dunno if that matters

I said "... have side effects when evaluated ..." but your example has
a side effect (function call) in an assignment before the logic
expression.  Maybe "evaluated" isn't the right term, but I was
intending it to mean the action of reading a value from a variable.

I think this is close to something i mentioned.

My thought was​: given that

$b=$a++ + $a++;

is not defined, that we could also assume that changing fetch magic
inside of fetch magic would only take effect after that statement
concluded, and thus

$b=$a + $a;

where $a is tied/overloaded and the magic changes on invocation of the
fetch is also undefined.

Gah. That came out all wrong.

I mean​: given that $b=$a++ + $a++; is undefined, that is that an
expression using a mutator on a variable mentioned twice is undefined,
it seems to me that we can also consider a whole whack of fetch magic
to also be undefined.

Thus we could check for magic at the beginning of the expression, and
then cache it for the duration, although we would guarantee that the
magic WAS called twice if there was magic.

But the problem with any of these changes is that it could/would break
stuff somewhere. Which is why i figured some kind of compiler hint was
in order, as it would mean that new code could be optimised relatively
sanely and old code would continue on unbroken.

Cheers,
yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"


p5pRT commented Jul 12, 2010

From @rurban

2010/7/12 David Golden <xdaveg@​gmail.com>​:

On Sun, Jul 11, 2010 at 5​:58 PM, Joshua ben Jore <twists@​gmail.com> wrote​:

On Sun, Jul 11, 2010 at 2​:56 PM, Eric Brine <ikegami@​adaelis.com> wrote​:

On Sun, Jul 11, 2010 at 3​:40 PM, David Golden <xdaveg@​gmail.com> wrote​:
I'm suggesting that we disclaim any implicit guarantee that the
compiler won't optimize away expressions that have side effects when
evaluated.

Whether any particular optimization is "worth it" is open for later
debate, but at least the door would be open.

That's not the point.
Even with side effects from mg_get we can optimize this conditional
to $a only.

perl -MO=Concise,-exec -e'$a and "cmp" eq "cc"'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <#> gvsv[*a] s
4 <|> and(other->5) vK/1
5 <@> leave[1 ref] vKP/REFC

can be optimized to

perl -MO=Concise,-exec -e'$a'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <#> gvsv[*a] s
4 <@> leave[1 ref] vKP/REFC

gvsv is just checking magic and doing the side effect, and there would
be no better op to cut through that.

So the question of whether we should assert for less magic is bogus, as
gvsv is doing the needed run-time check super cheap.
We could gain a little if we know about the lvalue context, to get rid of
the test in pp_hot:pp_gvsv

    if (PL_op->op_private & OPpLVAL_INTRO)
        PUSHs(save_scalar(cGVOP_gv));
    else
        PUSHs(GvSV(cGVOP_gv));
    RETURN;

Notwithstanding that Nicholas points out that this example isn't what
he was talking about, the question that has been raised is whether we
could just short-circuit an *entire* logical operation if "static"
analysis can determine whether it is true or false.

Effectively​:

   if ($a && 0) { ... } # could be optimized away entirely

Not entirely.
The pp_and the {} block could be optimized away to gvsv $a - for the get magic.

if ($a && 0) { ... }
=>
$a

This would nullify a lot of ops if the {} block is large, where we
would come to a state where copying this part of the optree to exec
order and running the defragmented tree without null ops would
actually be faster than nullifying it at compile-time and running the
nullified original tree at run-time.
This could be known in advance in the optimizer by some heuristic,
when the cost of compile-time defragmentation is less than nullifying
and skipping the null ops.

Bad that the optimizer module needs some core support. op_seq is gone.
But I can try that in the XS first.
--
Reini


p5pRT commented Jul 12, 2010

From @iabyn

On Mon, Jul 12, 2010 at 06​:06​:15PM +0200, Reini Urban wrote​:

gvsv is just checking magic and doing the side effect

Huh? gvsv doesn't call magic.

Also, just as a data point, note that pp_concat *explicitly* calls get
magic twice on $a . $a​:

        if (left == right)
            /* $r.$r: do magic twice: tied might return different 2nd time */
            SvGETMAGIC(right);
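
Seen from the Perl side, the case that comment describes looks roughly like the sketch below. `Ticker` is a hypothetical tie class made up for this illustration; each read hands back the next integer, so the result of `$r . $r` shows whether the two operands came from separate FETCH calls. The exact count may vary between perl versions, so the script reports it rather than assuming it.

```perl
use strict;
use warnings;

# Hypothetical tie class (illustration only): every read returns the
# next integer and is counted.
package Ticker;
sub TIESCALAR { my ($class) = @_; return bless { counter => 1, fetches => 0 }, $class }
sub FETCH     { my ($self) = @_; $self->{fetches}++; return $self->{counter}++ }
sub STORE     { }   # writes are ignored in this sketch

package main;

my $tied = tie my $r, 'Ticker';

# Both operands are the same tied scalar.
my $joined = $r . $r;

print "result: $joined, FETCH calls: $tied->{fetches}\n";
```

Caching the fetched value for the duration of the expression would make both halves of the result identical.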

--
Justice is when you get what you deserve.
Law is when you get what you pay for.


p5pRT commented Jul 12, 2010

From @rurban

2010/7/12 Dave Mitchell <davem@​iabyn.com>​:

On Mon, Jul 12, 2010 at 06​:06​:15PM +0200, Reini Urban wrote​:

gvsv is just checking magic and doing the side effect

Huh? gvsv doesn't call magic.

Oops, right.
Something like op_defined or op_ref would be the cheapest existing op
then doing the SvGETMAGIC, or we would need a new op,
probably named pp_sideeffect or pp_getmagic.

Also, just as a data point, note that pp_concat *explicitly* calls get
magic twice on $a . $a​:

       if (left == right)
           /* $r.$r​: do magic twice​: tied might return different 2nd time */
           SvGETMAGIC(right);
--
Reini


p5pRT commented Jul 12, 2010

From @ikegami

On Mon, Jul 12, 2010 at 7​:07 AM, David Golden <xdaveg@​gmail.com> wrote​:

I said "... have side effects when evaluated ..." but your example has
a side effect (function call) in an assignment before the logic
expression.

I realise you knew it would break. What I was pointing out is that it breaks
a common idiom. "or" and "and" are used for flow control all over in my
code. You're suggesting we can no longer count on the argument evaluation
order of logical operators. It has long been documented that "the right
expression is evaluated only if the left expression is false.", but your
suggestion is to evaluate the RHS first if it's constant. It would change
the function of a fundamental operator.


p5pRT commented Jul 13, 2010

From @demerphq

On 12 July 2010 20​:30, Dave Mitchell <davem@​iabyn.com> wrote​:

On Mon, Jul 12, 2010 at 06​:06​:15PM +0200, Reini Urban wrote​:

gvsv is just checking magic and doing the side effect

Huh? gvsv doesn't call magic.

Also, just as a data point, note that pp_concat *explicitly* calls get
magic twice on $a . $a​:

       if (left == right)
           /* $r.$r​: do magic twice​: tied might return different 2nd time */
           SvGETMAGIC(right);

I'd like to argue that this was misguided. I don't think we guarantee
any particular order in this case for the fetch calls and thus the
statement is /still/ undefined even with this change.

cheers,
Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"


p5pRT commented Jul 13, 2010

From ben@morrow.me.uk

Quoth demerphq@​gmail.com (demerphq)​:

On 12 July 2010 20​:30, Dave Mitchell <davem@​iabyn.com> wrote​:

On Mon, Jul 12, 2010 at 06​:06​:15PM +0200, Reini Urban wrote​:

gvsv is just checking magic and doing the side effect

Huh? gvsv doesn't call magic.

Also, just as a data point, note that pp_concat *explicitly* calls get
magic twice on $a . $a​:

       if (left == right)
           /* $r.$r: do magic twice: tied might return different 2nd time */
           SvGETMAGIC(right);

I'd like to argue that this was misguided. I don't think we guarantee
any particular order in this case for the fetch calls and thus the
statement is /still/ undefined even with this change.

Perl doesn't have undefined behaviour. No matter what weasel words
copied from stdc made it into the ++ docs, Perl's actual evaluation
order has always been straightforward and well-defined. Changing this
may be worth it, for a sufficiently beneficial optimisation, but it is
definitely a backwards-incompatible change.

Ben

@p5pRT
Copy link
Author

p5pRT commented Jul 13, 2010

From @demerphq

On 13 July 2010 13​:43, Ben Morrow <ben@​morrow.me.uk> wrote​:

Quoth demerphq@​gmail.com (demerphq)​:

On 12 July 2010 20​:30, Dave Mitchell <davem@​iabyn.com> wrote​:

On Mon, Jul 12, 2010 at 06​:06​:15PM +0200, Reini Urban wrote​:

gvsv is just checking magic and doing the sideeffect

Huh? gvsv doesn't call magic.

Also, just as a data point, note that pp_concat *explicitly* calls get
magic twice on $a . $a​:

       if (left == right)
           /* $r.$r​: do magic twice​: tied might return different 2nd time */
           SvGETMAGIC(right);

I'd like to argue that this was misguided. I don't think we guarantee
any particular order in this case for the fetch calls and thus the
statement is /still/ undefined even with this change.

Perl doesn't have undefined behaviour. No matter what weasel words
copied from stdc made it into the ++ docs, Perl's actual evaluation
order has always been straightforward and well-defined. Changing this
may be worth it, for a sufficiently beneficial optimisation, but it is
definitely a backwards-incompatible change.

Just so everyone can conveniently see​:

From perldoc perlop​:

  Auto-increment and Auto-decrement

  "++" and "--" work as in C. That is, if placed before a variable,
  they increment or decrement the variable by one before returning the
  value, and if placed after, increment or decrement after returning
  the value.

      $i = 0; $j = 0;
      print $i++; # prints 0
      print ++$j; # prints 1

  Note that just as in C, Perl doesn't define when the variable is
  incremented or decremented. You just know it will be done sometime
  before or after the value is returned. This also means that modifying
  a variable twice in the same statement will lead to undefined
  behaviour. Avoid statements like:

      $i = $i ++;
      print ++ $i + $i ++;

  Perl will not guarantee what the result of the above statements is.

If we are going to say that these statements are well defined then we
should probably document exactly what the rules are, as well as
correcting the above docs.

I'll just say that in this case I would much prefer we don't change the
documentation, except to make this much more prominent in the Tie and
Overload documentation. There is more benefit for more people if we
can take advantage of the undefinedness than there is harm done to
people doing naughty things like this despite the documentation (they
are saved only by the lack of prominence of this documentation).

cheers,
Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Jul 13, 2010

From @avar

On Tue, Jul 13, 2010 at 11​:43, Ben Morrow <ben@​morrow.me.uk> wrote​:

Perl doesn't have undefined behaviour. No matter what weasel words
copied from stdc made it into the ++ docs, Perl's actual evaluation
order has always been straightforward and well-defined. Changing this
may be worth it, for a sufficiently beneficial optimisation, but it is
definitely a backwards-incompatible change.

Undefined doesn't mean that the implementation doesn't act
consistently, just that its documentation explicitly denies
responsibility for having those things work in the future. If they
work now they only work incidentally, and you shouldn't rely on them.

Of course we can't liberally change things that are documented to be
undefined as liberally as a C compiler would, because there's only one
perl(1) but multiple cc(1)'s.

@p5pRT
Copy link
Author

p5pRT commented Jul 13, 2010

From @nwc10

On Tue, Jul 13, 2010 at 01​:11​:11PM +0000, Ævar Arnfjörð Bjarmason wrote​:

On Tue, Jul 13, 2010 at 11​:43, Ben Morrow <ben@​morrow.me.uk> wrote​:

Perl doesn't have undefined behaviour. No matter what weasel words
copied from stdc made it into the ++ docs, Perl's actual evaluation
order has always been straightforward and well-defined. Changing this
may be worth it, for a sufficiently beneficial optimisation, but it is
definitely a backwards-incompatible change.

Undefined doesn't mean that the implementation doesn't act
consistently, just that its documentation explicitly denies
responsibility for having those things work in the future. If they
work now they only work incidentally, and you shouldn't rely on them.

http://www.lysator.liu.se/c/c-faq/c-5.html#5-23

  Briefly​: implementation-defined means that an implementation must choose
  some behavior and document it. Unspecified means that an implementation
  should choose some behavior, but need not document it. Undefined means
  that absolutely anything might happen.

I suspect that all of our documentation should say "unspecified" rather than
"undefined".

Of course we can't liberally change things that are documented to be
undefined as liberally as a C compiler would, because there's only one
perl(1) but multiple cc(1)'s.

But whatever we call it, that's the key problem. There is only one
implementation, and as that implementation strives hard to internally avoid
C undefined behaviour, its output will be deterministic, in some fashion.
Hence people come to rely on the current behaviour of the implementation,
documented or not.

Nicholas Clark

@p5pRT
Copy link
Author

p5pRT commented Jul 15, 2010

From @demerphq

On 15 July 2010 01​:12, Dave Mitchell <davem@​iabyn.com> wrote​:

On Wed, Jul 14, 2010 at 09​:39​:35AM +0200, demerphq wrote​:

Whomever decided that

  $a . $a

is specified when $a is tied and returns a different value each fetch
had forgotten this fact.

You'll have to argue that with Hugo then!

commit 8d6d96c
Author​:     Hugo van der Sanden <hv@​crypt.org>
AuthorDate​: Sat May 26 18​:05​:12 2001 +0100
Commit​:     Jarkko Hietaniemi <jhi@​iki.fi>
CommitDate​: Sat May 26 22​:31​:46 2001 +0000

   Re​: 5.6.*, bleadperl​: bugs in pp_concat
   Message-Id​: <200105261605.RAA12295@​crypt.compulink.co.uk>

I'm not sure what your point is? Simply because Hugo wrote/pushed a
patch that somehow proves something? I don't think so. Just because a
committer didn't think through the full ramifications of a patch, or
even knew of the ramifications but still went through with it on the
grounds of providing "least worst" behaviour doesn't make that patch
law over long existing documentation.

The documentation for ++ is pretty clear.

If the concatenation of a tied variable that mutates is well
specified, then it would mean that one can take a construct documented
to have unspecified behaviour wrap it up in a tie to resolve the
unspecifiedness, which seems to me to be simply absurd.

Thus the onus is not on me to show why this is unspecified, as the
docs say it is, the onus instead is on those who disagree with the
documentation to find a way to get out of this logical absurdity.

I have to say that I'm struggling to see why what you just posted
doesn't essentially boil down to a position that the docs are
meaningless and that whatever is committed is right. If so then you
might as well stop fixing those "bugs" as they aren't really "bugs"
then are they? I'm pretty sure you don't think this, so why do you
think that this patch is different?

cheers,
Yves
--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Jul 15, 2010

From @iabyn

On Thu, Jul 15, 2010 at 10​:31​:51AM +0200, demerphq wrote​:

On 15 July 2010 01​:12, Dave Mitchell <davem@​iabyn.com> wrote​:

On Wed, Jul 14, 2010 at 09​:39​:35AM +0200, demerphq wrote​:

Whomever decided that

  $a . $a

is specified when $a is tied and returns a different value each fetch
had forgotten this fact.

You'll have to argue that with Hugo then!

commit 8d6d96c
Author​:     Hugo van der Sanden <hv@​crypt.org>
AuthorDate​: Sat May 26 18​:05​:12 2001 +0100
Commit​:     Jarkko Hietaniemi <jhi@​iki.fi>
CommitDate​: Sat May 26 22​:31​:46 2001 +0000

   Re​: 5.6.*, bleadperl​: bugs in pp_concat
   Message-Id​: <200105261605.RAA12295@​crypt.compulink.co.uk>

I'm not sure what your point is? Simply because Hugo wrote/pushed a
patch that somehow proves something?

I wasn't making a point, I was just providing information.

I don't think so. Just because a
committer didn't think through the full ramifications of a patch, or
even knew of the ramifications but still went through with it on the
grounds of providing "least worst" behaviour doesn't make that patch
law over long existing documentation.

The documentation for ++ is pretty clear.

If the concatenation of a tied variable that mutates is well
specified, then it would mean that one can take a construct documented
to have unspecified behaviour wrap it up in a tie to resolve the
unspecifiedness, which seems to me to be simply absurd.

Thus the onus is not on me to show why this is unspecified, as the
docs say it is, the onus instead is on those who disagree with the
documentation to find a way to get out of this logical absurdity.

I have to say that I'm struggling to see why what you just posted
doesn't essentially boil down to a position that the docs are
meaningless and that whatever is committed is right. If so then you
might as well stop fixing those "bugs" as they aren't really "bugs"
then are they? I'm pretty sure you don't think this, so why do you
think that this patch is different?

Just to make it clear, I didn't post that patch to prove a point one way
or another, I just dug it up as a point of info so that people could, if
interested, examine it, look at the reasoning behind it (e.g. the p5p
discussion if any), and draw whatever conclusions they want. For the
record, I haven't read the original 2001 p5p thread, and haven't drawn any
conclusions.

However, for my opinions for the topic in hand...

as regards tiedness, there are actually two orthogonal issues of
correctness. The first is which order in which the two $a's in $a.$a are
evaluated; the second is how many times $a is evaluated. It is quite
possible for the order not to be defined, but still for the fact that $a
is evaluated twice to be defined. For example, someone might be using tie
to instrument the number of accesses to a variable. Having said that, tied
hash elements only have mg_get called once on them until reset by a
mg_set, and I recently extended that mechanism to tied arrays too.

On the other hand, it may not be documented or specified, but I think most
people would expect that in the following, f() is called before g()​:
  f() . g()

Finally, my feeling is that any 'no magic;' scopes aren't really viable in
terms of providing enough guarantees of side-effects for aggressive
optimisation while still providing perly behaviour.

--
There's a traditional definition of a shyster​: a lawyer who, when the law
is against him, pounds on the facts; when the facts are against him,
pounds on the law; and when both the facts and the law are against him,
pounds on the table.
  -- Eben Moglen referring to SCO
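
The distinction drawn above between the order of evaluation and the number of evaluations can be observed with a tie that counts fetches. The following is a minimal sketch, not code from the thread: the `CountingScalar` class and its helper subs are hypothetical names, and the counts it reports are a property of the perl binary it runs on rather than something the documentation promises.

```perl
use strict;
use warnings;

my $fetches = 0;

# Hypothetical tie class that counts how often FETCH is called.
package CountingScalar;
sub TIESCALAR { my ($class, $value) = @_; return bless \$value, $class }
sub FETCH     { my ($self) = @_; $fetches++; return $$self }
sub STORE     { my ($self, $value) = @_; $$self = $value }

package main;

tie my $t, 'CountingScalar', 'ab';

# $t is written twice in the source here.
$fetches = 0;
my $concat = $t . $t;
my $concat_fetches = $fetches;

# $t is written once in the source here.
$fetches = 0;
my $repeat = $t x 2;
my $repeat_fetches = $fetches;

print "concat: $concat ($concat_fetches fetches)\n";
print "repeat: $repeat ($repeat_fetches fetches)\n";
```

Both expressions build the same string, so only a tie that instruments access, as in the example Dave gives, can tell them apart.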

@p5pRT
Copy link
Author

p5pRT commented Jul 15, 2010

From @rurban

2010/7/15 demerphq <demerphq@​gmail.com>​:

On 15 July 2010 01​:12, Dave Mitchell <davem@​iabyn.com> wrote​:

On Wed, Jul 14, 2010 at 09​:39​:35AM +0200, demerphq wrote​:

Whomever decided that

  $a . $a

is specified when $a is tied and returns a different value each fetch
had forgotten this fact.

You'll have to argue that with Hugo then!

I'm also with Yves here.
Hugo destroyed language semantics with 5.6.2, if I read the pp_hot.c
patch right.

$a . $a must evaluate $a twice, and not only once,
even if it saves coding lines for the construct
my $b = $a; $b . $b
which should be done if you want to evaluate $a only once.
We don't want to surprise users in favor of having to write less code.

In our case we need​:
1. document this special evaluation rule for the pp_concat op
("if both sides of . refer to the same tied variable, the tied access
is only done once, contrary to the obvious")
or preferred​:
2. revert the pp_concat patch from Hugo,
  so that $a.$a evaluates mg_get twice again for $a

This is what you would expect from reading $a . $a.
If you want to evaluate it once, do it once. Typical semantics would be
{ my $b = $a; $b . $b }

But we certainly need a testcase for this mg_get sideeffect.

commit 8d6d96c
Author​:     Hugo van der Sanden <hv@​crypt.org>
AuthorDate​: Sat May 26 18​:05​:12 2001 +0100
Commit​:     Jarkko Hietaniemi <jhi@​iki.fi>
CommitDate​: Sat May 26 22​:31​:46 2001 +0000

   Re​: 5.6.*, bleadperl​: bugs in pp_concat
   Message-Id​: <200105261605.RAA12295@​crypt.compulink.co.uk>

I'm not sure what your point is? Simply because Hugo wrote/pushed a
patch that somehow proves something? I don't think so. Just because a
committer didn't think through the full ramifications of a patch, or
even knew of the ramifications but still went through with it on the
grounds of providing "least worst" behaviour doesn't make that patch
law over long existing documentation.

The documentation for ++ is pretty clear.

If the concatenation of a tied variable that mutates is well
specified, then it would mean that one can take a construct documented
to have unspecified behaviour wrap it up in a tie to resolve the
unspecifiedness, which seems to me to be simply absurd.

Thus the onus is not on me to show why this is unspecified, as the
docs say it is, the onus instead is on those who disagree with the
documentation to find a way to get out of this logical absurdity.

I have to say that I'm struggling to see why what you just posted
doesn't essentially boil down to a position that the docs are
meaningless and that whatever is committed is right. If so then you
might as well stop fixing those "bugs" as they aren't really "bugs"
then are they? I'm pretty sure you don't think this, so why do you
think that this patch is different?
--
Reini Urban
http​://phpwiki.org/           http​://murbreak.at/

@p5pRT
Copy link
Author

p5pRT commented Jul 15, 2010

From @iabyn

On Thu, Jul 15, 2010 at 01​:28​:58PM +0200, Reini Urban wrote​:

2010/7/15 demerphq <demerphq@​gmail.com>​:

On 15 July 2010 01​:12, Dave Mitchell <davem@​iabyn.com> wrote​:

On Wed, Jul 14, 2010 at 09​:39​:35AM +0200, demerphq wrote​:

Whomever decided that

  $a . $a

is specified when $a is tied and returns a different value each fetch
had forgotten this fact.

You'll have to argue that with Hugo then!

I'm also with Yves here.
Hugo destroyed language semantics with 5.6.2, if I read the pp_hot.c
patch right.

No I think you're reading it the wrong way. Hugo's patch ensures that $a
is evaluated *twice* in $a.$a.

--
Standards (n). Battle insignia or tribal totems.

@p5pRT
Copy link
Author

p5pRT commented Jul 15, 2010

From @rurban

2010/7/15 Dave Mitchell <davem@​iabyn.com>​:

On Thu, Jul 15, 2010 at 01​:28​:58PM +0200, Reini Urban wrote​:

2010/7/15 demerphq <demerphq@​gmail.com>​:

On 15 July 2010 01​:12, Dave Mitchell <davem@​iabyn.com> wrote​:

On Wed, Jul 14, 2010 at 09​:39​:35AM +0200, demerphq wrote​:

Whomever decided that

  $a . $a

is specified when $a is tied and returns a different value each fetch
had forgotten this fact.

You'll have to argue that with Hugo then!

I'm also with Yves here.
Hugo destroyed language semantics with 5.6.2, if I read the pp_hot.c
patch right.

No I think you're reading it the wrong way. Hugo's patch ensures that $a
is evaluated *twice* in $a.$a.

Great! I see it now in the two gmagic tests.
So I'm not with Yves anymore and nothing needs to be done there.

I still have no time to finish my simple optimizer rule for Nick's
original report, so I attach my latest approach. Maybe someone else
might want to finish and test it. This weekend I'm away.
--
Reini

@p5pRT
Copy link
Author

p5pRT commented Jul 15, 2010

From @rurban

#! perl

=head1 DESCRIPTION

Optimize (and ... NO) to null if there is no gvsv/padsv before it; otherwise
replace it with (dor $x) or an explicit SvGETMAGIC. (and NO) is always false,
but every SV evaluated before the constant must still have its mg_get called.

=head1 EXAMPLE1 gvsv

    $ perl -MO=Concise,-exec -e'if ($a and "x" eq "y") { print $s;}'
    1  <0> enter
    2  <;> nextstate(main 3 -e:1) v:{
    3  <$> gvsv(*a) s
    4  <|> and(other->5) sK/1
    5      <$> const(SPECIAL sv_no) s
    6  <|> and(other->7) vK/1
    7      <0> pushmark s
    8      <$> gvsv(*s) s
    9      <@> print vK
    a  <@> leave[1 ref] vKP/REFC

can be optimized to

    1  <0> enter
    2  <;> nextstate(main 3 -e:1) v:{
    3  <$> gvsv(*a) s
    4  <1> dor vK/1
    a  <@> leave[1 ref] vKP/REFC

=head1 EXAMPLE2 padsv

    $ perl -MO=Concise,-exec -e'my $a; if ($a and "x" eq "y") { print $s;}'

    1  <0> enter
    2  <;> nextstate(main 1 -e:1) v:{
    3  <0> padsv[$a:1,4] vM/LVINTRO
...
    4  <;> nextstate(main 4 -e:1) v:{
    5  <0> padsv[$a:1,4] s
    6  <|> and(other->7) sK/1
    7      <$> const[SPECIAL sv_no] s
    8  <|> and(other->9) vK/1
    9      <0> pushmark s
    a      <#> gvsv[*s] s
    b      <@> print vK
    c  <@> leave[1 ref] vKP/REFC

can be optimized to

    1  <0> enter
    2  <;> nextstate(main 1 -e:1) v:{
    3  <0> padsv[$a:1,3] vM/LVINTRO
...
    4  <;> nextstate(main 2 -e:1) v:{
    5  <0> padsv[$a:1,3] s
    6  <1> dor vK/1
    7  <@> leave[1 ref] vKP/REFC

=head1 EXAMPLE3 ok

    $ perl -MO=Concise,-exec -e'if ("x" eq "y" and $a) { print $s;}'

is already optimized to

    1  <0> enter
    2  <;> nextstate(main 3 -e:1) v:{
    3  <@> leave[1 ref] vKP/REFC

=cut

use optimizer;
use B::Generate;

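use optimizer callback => sub {
  my $o = shift;
  # ${$op} is the address of the underlying op; it is 0 for a null op.
  # In the listings above the const is the "other" branch of the first
  # "and" (its right-hand side), so follow ->other rather than ->next.
  if (($o->name eq 'gvsv' or $o->name eq 'padsv')
      and ${$o->next} and $o->next->name eq 'and'
      and ${$o->next->other} and $o->next->other->name eq 'const'
      # compare the underlying SVs, not the B wrapper objects
      and ${$o->next->other->sv} == ${B::sv_no()}
     )
  {
    # change o->next to dor and nullify the rest
  }
};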
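
The constraint this rule has to respect, that the read of the variable must survive even though the block is dead, can be checked from plain Perl. This is a sketch under stated assumptions: the `NoisyScalar` class and the `@events` array are hypothetical names used only for illustration, and a lexical stands in for the `$a` of the original report.

```perl
use strict;
use warnings;

my @events;

# Hypothetical tie class whose FETCH has a visible side effect.
package NoisyScalar;
sub TIESCALAR { my ($class, $value) = @_; return bless \$value, $class }
sub FETCH     { my ($self) = @_; push @events, 'FETCH'; return $$self }
sub STORE     { my ($self, $value) = @_; $$self = $value }

package main;

tie my $flag, 'NoisyScalar', 1;

# The right-hand side is constant-folded to false, so the block can never
# run, yet $flag still has to be read.
if ($flag and "Pie" eq "Good") {
    push @events, 'BLOCK';
}

print "@events\n";
```

Any optimisation that frees the ops for the block should leave this script's output unchanged, because the block is unreachable while the fetch is not.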

@p5pRT
Copy link
Author

p5pRT commented Jul 15, 2010

From @demerphq

On 15 July 2010 13​:24, Dave Mitchell <davem@​iabyn.com> wrote​:

On Thu, Jul 15, 2010 at 10​:31​:51AM +0200, demerphq wrote​:

On 15 July 2010 01​:12, Dave Mitchell <davem@​iabyn.com> wrote​:

On Wed, Jul 14, 2010 at 09​:39​:35AM +0200, demerphq wrote​:

Whomever decided that

  $a . $a

is specified when $a is tied and returns a different value each fetch
had forgotten this fact.

You'll have to argue that with Hugo then!

commit 8d6d96c
Author​:     Hugo van der Sanden <hv@​crypt.org>
AuthorDate​: Sat May 26 18​:05​:12 2001 +0100
Commit​:     Jarkko Hietaniemi <jhi@​iki.fi>
CommitDate​: Sat May 26 22​:31​:46 2001 +0000

   Re​: 5.6.*, bleadperl​: bugs in pp_concat
   Message-Id​: <200105261605.RAA12295@​crypt.compulink.co.uk>

I'm not sure what your point is? Simply because Hugo wrote/pushed a
patch that somehow proves something?

I wasn't making a point, I was just providing information.

I don't think so. Just because a
committer didn't think through the full ramifications of a patch, or
even knew of the ramifications but still went through with it on the
grounds of providing "least worst" behaviour doesn't make that patch
law over long existing documentation.

The documentation for ++ is pretty clear.

If the concatenation of a tied variable that mutates is well
specified, then it would mean that one can take a construct documented
to have unspecified behaviour wrap it up in a tie to resolve the
unspecifiedness, which seems to me to be simply absurd.

Thus the onus is not on me to show why this is unspecified, as the
docs say it is, the onus instead is on those who disagree with the
documentation to find a way to get out of this logical absurdity.

I have to say that I'm struggling to see why what you just posted
doesn't essentially boil down to a position that the docs are
meaningless and that whatever is committed is right. If so then you
might as well stop fixing those "bugs" as they aren't really "bugs"
then are they? I'm pretty sure you don't think this, so why do you
think that this patch is different?

Just to make it clear, I didn't post that patch to prove a point one way
or another, I just dug it up as a point of info so that people could, if
interested, examine it, look at the reasoning behind it (e.g. the p5p
discussion if any), and draw whatever conclusions they want. For the
record, I haven't read the original 2001 p5p thread, and haven't drawn any
conclusions.

My apologies for making the incorrect inference. Clarification understood.

However, for my opinions for the topic in hand...

as regards tiedness, there are actually two orthogonal issues of
correctness. The first is which order in which the two $a's in $a.$a are
evaluated; the second is how many times $a is evaluated. It is quite
possible for the order not to be defined, but still for the fact that $a
is evaluated twice to be defined.

Yes, agreed. I don't have any issue with calling tie twice, and would
expect that it happens. I just wouldn't expect it to happen in a
particular order, nor that that patch makes the expression /defined/.

For example, someone might be using tie
to instrument the number of accesses to a variable. Having said that, tied
hash elements only have mg_get called once on them until reset by a
mg_set, and I recently extended that mechanism to tied arrays too.

What does this mean exactly?

On the other hand, it may not be documented or specified, but I think most
people would expect that in the following, f() is called before g()​:
   f() . g()

Hmm. I don't know that I would. If we want this to be the case then
IMO we should document it.

Finally, my feeling is that any 'no magic;' scopes aren't really viable in
terms of providing enough guarantees of side-effects for aggressive
optimisation while still providing perly behaviour.

Ok, thanks. What is the main problem as you see it?

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Jul 16, 2010

From @hvds

Dave Mitchell <davem@​iabyn.com> wrote​:
:However, for my opinions for the topic in hand...
:
:as regards tiedness, there are actually two orthogonal issues of
:correctness. The first is which order in which the two $a's in $a.$a are
:evaluated; the second is how many times $a is evaluated. It is quite
:possible for the order not to be defined, but still for the fact that $a
:is evaluated twice to be defined. For example, someone might be using tie
:to instrument the number of accesses to a variable.

This agrees with my thinking - I do not care a jot about the order of
evaluation for this case, but I would be unhappy about any change to
the number of times magic is invoked unless there were first strong
evidence presented that substantial improvements (to speed or something
else) would justify the change.

Hugo

@p5pRT
Copy link
Author

p5pRT commented Jul 20, 2010

From @jandubois

On Fri, 16 Jul 2010, hv@​crypt.org wrote​:

Dave Mitchell <davem@​iabyn.com> wrote​:
:However, for my opinions for the topic in hand...
:
:as regards tiedness, there are actually two orthogonal issues of
:correctness. The first is which order in which the two $a's in $a.$a are
:evaluated; the second is how many times $a is evaluated. It is quite
:possible for the order not to be defined, but still for the fact that $a
:is evaluated twice to be defined. For example, someone might be using tie
:to instrument the number of accesses to a variable.

This agrees with my thinking - I do not care a jot about the order of
evaluation for this case, but I would be unhappy about any change to
the number of times magic is invoked unless there were first strong
evidence presented that substantial improvements (to speed or something
else) would justify the change.

Could you explain _why_ you would care about invoking magic twice, but
don't care about the order of evaluation?

And could you also explain why it makes sense that $a.$a has to invoke
magic twice, while $a x 2 will only call it once?

Cheers,
-Jan

@p5pRT
Copy link
Author

p5pRT commented Jul 20, 2010

From @nwc10

On Tue, Jul 20, 2010 at 01​:12​:29PM -0700, Jan Dubois wrote​:

On Fri, 16 Jul 2010, hv@​crypt.org wrote​:

Dave Mitchell <davem@​iabyn.com> wrote​:
:However, for my opinions for the topic in hand...
:
:as regards tiedness, there are actually two orthogonal issues of
:correctness. The first is which order in which the two $a's in $a.$a are
:evaluated; the second is how many times $a is evaluated. It is quite
:possible for the order not to be defined, but still for the fact that $a
:is evaluated twice to be defined. For example, someone might be using tie
:to instrument the number of accesses to a variable.

This agrees with my thinking - I do not care a jot about the order of
evaluation for this case, but I would be unhappy about any change to
the number of times magic is invoked unless there were first strong
evidence presented that substantial improvements (to speed or something
else) would justify the change.

Could you explain _why_ you would care about invoking magic twice, but
don't care about the order of evaluation?

And could you also explain why it makes sense that $a.$a has to invoke
magic twice, while $a x 2 will only call it once?

On this part, I believe that I agree with Hugo, because my answer is​:

I read $a . $a as equivalent to $x . $y, where it happens that $x and $y
alias the same value. $a was *written* twice by the programmer, so as there
are two references to it, it gets accessed *exactly* twice.

Whereas $a x 2 has $a *written* once by the programmer, so as there is only
one reference to it, it gets accessed *exactly* once.

Basically, I view tie as active data, with an implied contract that it will
be called once for each semantic read, and that this should be honoured.

Hence I don't view $a . $a and $a x 2 as identical and interchangeable - if
the programmer wanted the other, he/she should have written the other.
Yes, this means that the compiler can't perform strength reduction or other
optimisations in the general case. But I'm thinking of this from a
perspective of "hooks exist to intercept the actions of the runtime"
therefore the compiler isn't *allowed* to consider that transformations
that are semantically valid for passive data are generally valid, because
Perl *allows* active data.

(Overload, on the other hand, I view as should-be-idempotent. I see its role
as different. overload is expression of values. tie is a system to implement
side effects)

Nicholas Clark

@p5pRT
Copy link
Author

p5pRT commented Jul 20, 2010

From @jandubois

On Tue, 20 Jul 2010, Nicholas Clark wrote​:

I read $a . $a as equivalent to $x . $y, where it happens that $x and $y
alias the same value. $a was *written* twice by the programmer, so as there
are two references to it, it gets accessed *exactly* twice.

In that case I think you'll find plenty of places where it is accessed
more often than you expect. E.g. ++$a might access $a once if it is
just SVp_IOK, or twice if it is just SVp_NOK, because sv_inc() will first
try to see if it can't convert the NV to an IV, triggering an additional
FETCH call​:

  flags = SvFLAGS(sv);
  if ((flags & (SVp_NOK|SVp_IOK)) == SVp_NOK) {
      /* It's (privately or publicly) a float, but not tested as an
         integer, so test it to see. */
      (void) SvIV(sv);
      flags = SvFLAGS(sv);
  }

It is easy to guarantee that each tied variable is fetched at least once
for each time it is mentioned in the source code, but it is extremely
hard to guarantee that it isn't called more often​: Any innocent looking
SvIV(), SvNV() or SvPV() call anywhere in the core may trigger an
additional call to FETCH a tied variable.

I prefer to view this as an inefficiency, not as a bug, because I think
FETCH should be side-effect free.

Cheers,
-Jan

@p5pRT
Copy link
Author

p5pRT commented Jul 20, 2010

From @jandubois

Eirik Berg Hanssen wrote​:

On Tue, Jul 20, 2010 at 10​:12 PM, Jan Dubois <jand@​activestate.com> wrote​:

And could you also explain why it makes sense that $a.$a has to invoke
magic twice, while $a x 2 will only call it once?

  For the same reason that f().f() calls &f twice, while f() x 2 will only call it once?

Yes, but f() may have side-effects, whereas $a shouldn't have any (IMO).

As I wrote earlier to Nicholas, FETCH may be called more often than you
expect anyway, which already implicitly forbids side effects (unless you
consider calling it more than strictly needed a bug):


sub foo::TIESCALAR { bless \my $x => "foo" }
sub foo::FETCH { print "FETCH\n"; ${$_[0]} }
sub foo::STORE { print "STORE\n"; ${$_[0]} = $_[1] }

tie $a, "foo";

print "NV\n"; $a = 1.;
print "Inc\n"; ++$a;
print "IV\n"; $a = 1;
print "Inc\n"; ++$a;


NV
STORE
Inc
FETCH
FETCH
STORE
IV
STORE
Inc
FETCH
STORE


So FETCH is called twice when $a is a floating point number and only
once when it is an integer.

But even assuming there is a canonical number of times FETCH should
be called, what is that number for "$b = $a++;"? Should it be 2, because
the expression is just a shorthand for "$b = $a; $a = $a + 1;"? Or is
it fine to allow Perl to optimize this into a single access?

perlop.pod implies that there should be 2 accesses​:

  "++" and "--" work as in C. That is, if placed before a variable, they
  increment or decrement the variable by one before returning the value,
  and if placed after, increment or decrement after returning the value.

There is nothing there stating that the returned value can be re-used to
increment the variable, so by your logic it will have to be fetched again.
Which is not how it is currently implemented (reasonably again, IMO).

Cheers,
-Jan

@p5pRT
Copy link
Author

p5pRT commented Jul 21, 2010

From ebhanssen@cpan.org

On Tue, Jul 20, 2010 at 10​:12 PM, Jan Dubois <jand@​activestate.com> wrote​:

And could you also explain why it makes sense that $a.$a has to invoke
magic twice, while $a x 2 will only call it once?

  For the same reason that f().f() calls &f twice, while f() x 2 will only
call it once?

Eirik

@p5pRT
Copy link
Author

p5pRT commented Jul 29, 2010

From @Abigail

On Tue, Jul 13, 2010 at 06​:55​:33PM +0100, Ben Morrow wrote​:

Quoth nick@​ccl4.org (Nicholas Clark)​:

On Tue, Jul 13, 2010 at 01:11:11PM +0000, Ævar Arnfjörð Bjarmason wrote:

Of course we can't liberally change things that are documented to be
undefined as liberally as a C compiler would, because there's only one
perl(1) but multiple cc(1)'s.

But whatever we call it, that's the key problem. There is only one
implementation, and as that implementation strives hard to internally avoid
C undefined behaviour, its output will be deterministic, in some fashion.
Hence people come to rely on the current behaviour of the implementation,
documented or not.

Quite. And

  print $i++, $i++;

has DWIM forever (probably since perl 1). I'm not saying we *cannot*
change it, just that any change needs to be either only within the scope
of a lexical pragma or to go through a full deprecation cycle with
mandatory warnings before it changes.

Actually, I added the explicit statement about undefinedness in the
documentation, because not all expressions containing multiple $i ++
did DWIM for everyone, leading to lots of (pointless) discussions on
what the 'right' value of complicated statements was supposed to be.

  $i = $j = 1;
  print $i ++, $i ++; # Prints 12.
  print ++ $j, ++ $j; # Prints 33. Both are really DWIM?

Abigail
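
One way to see where the 12 and the 33 come from is sketched below. It assumes what current perl implementations happen to do and the documentation does not promise: post-increment hands back a copy of the old value, while pre-increment hands back the variable itself. `join` stands in for `print` so the results can be captured, and the variable names are only illustrative.

```perl
use strict;
use warnings;

my ($i, $j) = (1, 1);

# Both argument lists are fully evaluated before join reads them.
my $post = join '', $i++, $i++;   # each $i++ yields a copy of the old value
my $pre  = join '', ++$j, ++$j;   # each ++$j yields $j itself

print "$post\n";
print "$pre\n";

# Copying each pre-incremented value (here by concatenating an empty
# string) makes the two arguments independent of $k's final value.
my $k = 1;
my $copied = join '', ++$k . '', ++$k . '';
print "$copied\n";
```

Under that assumption both arguments in the pre-increment case are the same variable, which is read only after the second increment has happened.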

@p5pRT
Copy link
Author

p5pRT commented Jul 29, 2010

From @Abigail

On Tue, Jul 20, 2010 at 02​:41​:34PM -0700, Jan Dubois wrote​:

On Tue, 20 Jul 2010, Nicholas Clark wrote​:

I read $a . $a as equivalent to $x . $y, where it happens that $x and $y
alias the same value. $a was *written* twice by the programmer, so as there
are two references to it, it gets accessed *exactly* twice.

In that case I think you'll find plenty of places where it is accessed
more often than you expect. E.g. ++$a might access $a once if it is
just SVp_IOK, or twice if it is just SVp_NOK, because sv_inc() will first
try to see if it can't convert the NV to an IV, triggering an additional
FETCH call​:

  flags = SvFLAGS(sv);
  if ((flags & (SVp_NOK|SVp_IOK)) == SVp_NOK) {
      /* It's (privately or publicly) a float, but not tested as an
         integer, so test it to see. */
      (void) SvIV(sv);
      flags = SvFLAGS(sv);
  }

It is easy to guarantee that each tied variable is fetched at least once
for each time it is mentioned in the source code, but it is extremely
hard to guarantee that it isn't called more often: Any innocent looking
SvIV(), SvNV() or SvPV() call anywhere in the core may trigger an
additional call to FETCH a tied variable.

I prefer to view this as an inefficiency, not as a bug, because I think
FETCH should be side-effect free.

If FETCH is to be side-effect free, many of the interesting cases for using
ties disappear.

Abigail

@p5pRT
Author

p5pRT commented Jul 29, 2010

From @rgarcia

On 29 July 2010 15:42, Abigail <abigail@abigail.be> wrote:

I prefer to view this as an inefficiency, not as a bug, because I think
FETCH should be side-effect free.

If FETCH is to be side-effect free, many of the interesting cases for using
ties disappear.

At this point, someone usually mentions monads.

@p5pRT
Author

p5pRT commented Aug 6, 2010

From @ap

* Eric Brine <ikegami@adaelis.com> [2010-07-12 06:55]:

On Sun, Jul 11, 2010 at 10:22 PM, David Golden <xdaveg@gmail.com> wrote:

I'm suggesting that we disclaim any implicit guarantee that the
compiler won't optimize away expressions that have side effects when
evaluated.

Without that guarantee,

  my $x = f()
      or DEBUG && warn(...);
  return $x;

would be buggy. Dunno if that matters

Why?

As far as I can tell, the compiler would statically determine
that it can optimise the `DEBUG && warn` part down to `!1` when
`DEBUG` is false. This seems correct to me.

It would also statically determine that an `or !1` clause in that
statement does nothing, and so fold it away altogether.

That would leave the code looking like

  my $x = f();
  return $x;

when `DEBUG` is false.

Which seems 100% on the mark to me.

Am I missing something in your objection?

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>

@p5pRT
Author

p5pRT commented Aug 6, 2010

From @ikegami

Indeed. I made a booboo when I wrote that.

@p5pRT
Author

p5pRT commented Jul 18, 2013

From @nthykier

Hi,

Attached is a prototype branch that enables some branch elimination
while retaining any OPs that may have side effects.

It can kill branches in if-else cases. Examples:

  $ ./perl -Ilib -MO=Deparse \
  -e 'if ($a && "Pie" eq "Good") {print "True\n" }' \
  -e ' else { print "False\n" }'
  unless ($a and !1) {
      print "False\n";
  }

  $ ./perl -Ilib -MO=Deparse \
  -e 'if ($a || "Pie" ne "Good") {print "True\n" }' \
  -e ' else { print "False\n" }'
  if ($a or 1) {
      print "True\n";
  }

The patch adds two op_private flags for OP_AND and OP_OR, which
record whether the expression will always evaluate to TRUE or to
FALSE. If newCONDOP detects that its condition is one of these two OPs
with one of these flags set, it will kill the dead branch.

However, newLOGOP does not itself use these flags to eliminate dead
branches. Accordingly, the original test case is still:

  $ ./perl -Ilib -MO=Deparse -e 'if ($a && "Pie" ne "Good") {print}'
  if ($a and 1) {
      print $_;
  }

The only problem here is that folding the OP_AND/OP_OR expressions
should probably be deferred to peep optimisation. Otherwise, we might fold:

  $a && "Pie" ne "Good" => $a

before newCONDOP has a chance to see it (and thus lose the ability to
eliminate the dead branch).

~Niels

@p5pRT
Author

p5pRT commented Jul 18, 2013

From @nthykier

op-prune-cond-branch.diff
diff --git a/op.c b/op.c
index d5323a0..a5675d3 100644
--- a/op.c
+++ b/op.c
@@ -5976,6 +5976,39 @@ S_new_logop(pTHX_ I32 type, I32 flags, OP** firstp, OP** otherp)
     first->op_next = (OP*)logop;
     first->op_sibling = other;
 
+    if (type == OP_AND || type == OP_OR)
+    {
+        /*  Look for stuff like: a() and (b() or 1)
+
+            The truth value of this expression is entirely decidable,
+            but we cannot eliminate the expression (a() and b() might
+            have side effects).
+         */
+        U8 rhs_value = 0;
+        cstop = search_const(other);
+        /* Check if the RHS is known to be FALSE */
+        if (cstop)
+        {
+            if (SvTRUE(((SVOP*)cstop)->op_sv))
+                rhs_value = OPpLOGOP_CONST_TRUE;
+            else
+                rhs_value = OPpLOGOP_CONST_FALSE;
+        }
+        if ((other->op_type == OP_AND || other->op_type == OP_OR)
+            && (other->op_private & OPpLOGOP_CONST_MASK))
+        {
+            rhs_value = other->op_private & OPpLOGOP_CONST_MASK;
+        }
+
+        if (rhs_value)
+        {
+            if (type == OP_AND && rhs_value == OPpLOGOP_CONST_FALSE)
+                logop->op_private |= rhs_value;
+            if (type == OP_OR && rhs_value == OPpLOGOP_CONST_TRUE)
+                logop->op_private |= rhs_value;
+        }
+    }
+
     CHECKOP(type,logop);
 
     o = newUNOP(prepend_not ? OP_NOT : OP_NULL, 0, (OP*)logop);
@@ -6043,6 +6076,47 @@ Perl_newCONDOP(pTHX_ I32 flags, OP *first, OP *trueop, OP *falseop)
 	    live->op_private |= OPpCONST_FOLDED;
 	return live;
     }
+    o = first;
+    while (o && o->op_type == OP_NULL && (o->op_flags & OPf_KIDS))
+        o = cUNOPo->op_first;
+
+    if (o && (o->op_type == OP_AND || o->op_type == OP_OR)
+        && (o->op_private & OPpLOGOP_CONST_MASK)) {
+        const bool left = (o->op_private & OPpLOGOP_CONST_MASK)
+            == OPpLOGOP_CONST_TRUE;
+        OP *live = left ? trueop : falseop;
+        OP *const dead = left ? falseop : trueop;
+        /* Rewrite:
+
+               if (a() and <false>) {<dead>} else { b() } =>
+                  (a() and <false>) or { b() }
+
+               if (a() or <true>) { b() } else {<dead>} =>
+                  (a() or <true>) and { b() }
+
+           (NB: <false> and <true> are not limited to simple OP_CONST)
+
+         */
+        OPCODE combine_op = o->op_type == OP_AND ? OP_OR : OP_AND;
+
+	if (PL_madskills) {
+	    /* This is all dead code when PERL_MAD is not defined.  */
+	    live = newUNOP(OP_NULL, 0, live);
+	    op_getmad(first, live, 'C');
+	    op_getmad(dead, live, left ? 'e' : 't');
+	} else {
+	    op_free(dead);
+	}
+	if (live->op_type == OP_LEAVE)
+	    live = newUNOP(OP_NULL, OPf_SPECIAL, live);
+	else if (live->op_type == OP_MATCH || live->op_type == OP_SUBST
+	      || live->op_type == OP_TRANS || live->op_type == OP_TRANSR)
+	    /* Mark the op as being unbindable with =~ */
+	    live->op_flags |= OPf_SPECIAL;
+	else if (live->op_type == OP_CONST)
+	    live->op_private |= OPpCONST_FOLDED;
+	return newLOGOP(combine_op, 0, first, live);
+    }
     NewOp(1101, logop, 1, LOGOP);
     logop->op_type = OP_COND_EXPR;
     logop->op_ppaddr = PL_ppaddr[OP_COND_EXPR];
diff --git a/op.h b/op.h
index 5d1a771..b055965 100644
--- a/op.h
+++ b/op.h
@@ -176,6 +176,12 @@ Deprecated.  Use C<GIMME_V> instead.
 #define OPpASSIGN_BACKWARDS	64	/* Left & right switched. */
 #define OPpASSIGN_CV_TO_GV	128	/* Possible optimisation for constants. */
 
+
+/* Private for OP_AND and OP_OR */
+#define OPpLOGOP_CONST_TRUE	2	/* Result is always true, but op has side-effect*/
+#define OPpLOGOP_CONST_FALSE	4	/* Result is always false, but op has side-effect*/
+#define OPpLOGOP_CONST_MASK	(2|4)	/* Mask covering both of the flags above */
+
 /* Private for OP_MATCH and OP_SUBST{,CONT} */
 #define OPpRUNTIME		64	/* Pattern coming in on the stack */
 

@toddr
Member

toddr commented Feb 4, 2020

Are we planning on applying this? If not should it be closed?

@toddr toddr added the "Closable?" label (We might be able to close this ticket, but we need to check with the reporter) Feb 4, 2020
@iabyn
Contributor

iabyn commented Feb 5, 2020 via email

@toddr toddr removed the "Closable?" label (We might be able to close this ticket, but we need to check with the reporter) Feb 5, 2020
@toddr toddr closed this as completed Feb 5, 2020