Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ref and other ops are inefficient in boolean context #10706

Closed
p5pRT opened this issue Oct 8, 2010 · 14 comments
Closed

ref and other ops are inefficient in boolean context #10706

p5pRT opened this issue Oct 8, 2010 · 14 comments

Comments

@p5pRT
Copy link

p5pRT commented Oct 8, 2010

Migrated from rt.perl.org#78288 (status was 'resolved')

Searchable as RT78288$

@p5pRT
Copy link
Author

p5pRT commented Oct 8, 2010

From @timbunce

Created by @timbunce

[16​:05] timbunce​: code like "if ref $foo" is far more expensive than it needs to be because ref() still creates a new SVPV when called in a boolean context.
[16​:08] xdg​: timbunce, could you please open a ticket on that? I'm sure someone can optimize that.
[16​:10] • Nicholas wonders if "if ref $foo" can be spotted with much the same code as "if %hash" is
[16​:10] Zefram​: yes it can, at op check time
[16​:11] timbunce​: How can internals determine "boolean context"?
[16​:11] vincent​: at op check time
[16​:11] timbunce​: there are probably other ops that could do less work in boolean context
[16​:11] Nicholas​: offhand, "I don't know", but I know that "if %hash" does this

[16​:11] Zefram​: but ref($foo)'s truth value isn't as simple as most people expect, so it's approximately never what a programmer actually wants
[16​:12] timbunce​: Zefram​: true, but that doesn't stop people writing "if ref $foo" often
[16​:13] Zefram​: I'd be more inclined to use check-time logic to emit a warning for it
[16​:14] Zefram​: "Dubious use of reference-type operator in truth-value context"
[16​:14] timbunce​: Zefram​: I think you'd have angry hordes after you.
[16​:14] timbunce​: better done in perlcritic
[16​:14] PerlJam​: timbunce​: +1
[16​:14] mst​: Zefram​: but then Scalar​::Defer won't work anymore
[16​:14] Zefram​: Scalar​::Defer is an abomination
[16​:15] mst​: Zefram​: an abomination that's heavily used in the pumpking's production code
[16​:16] timbunce​: mst Zefram​: I think Scalar​::Defer could still work. The pp_ref code just needs to know if it should return sv_yes or sv_no

In boolean context, pp_ref would check the pv returned by sv_reftype in
the same way that SvTRUE does (for "" and "0") and the push &sv_yes or &sv_no.

Perl Info

Flags:
    category=core
    severity=low

Site configuration information for perl 5.12.2:

Configured by timbo at Sun Oct  3 16:42:33 IST 2010.

Summary of my perl5 (revision 5 version 12 subversion 2) configuration:
  Commit id: 7a3b65c9d99f69553fffe01f73d49fe9abf95120
  Platform:
    osname=darwin, osvers=10.4.0, archname=darwin-thread-multi-2level
    uname='darwin timac.local 10.4.0 darwin kernel version 10.4.0: fri apr 23 18:28:53 pdt 2010; root:xnu-1504.7.4~1release_i386 i386 '
    config_args='-des -Doptimize=-g -DEBUGGING=both -Dusethreads -Dusemultiplicity -Dusesitecustomize -Dusedevel -Uversiononly -Dprefix=/usr/local/perl512-dev'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-common -DPERL_DARWIN -no-cpp-precomp -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -I/opt/local/include',
    optimize='-g',
    cppflags='-no-cpp-precomp -fno-common -DPERL_DARWIN -no-cpp-precomp -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -I/opt/local/include'
    ccversion='', gccversion='4.2.1 (Apple Inc. build 5664)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags =' -fstack-protector -L/usr/local/lib -L/opt/local/lib'
    libpth=/usr/local/lib /opt/local/lib /usr/lib
    libs=-lgdbm -ldbm -ldl -lm -lutil -lc
    perllibs=-ldl -lm -lutil -lc
    libc=/usr/lib/libc.dylib, so=dylib, useshrplib=false, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags=' -bundle -undefined dynamic_lookup -L/usr/local/lib -L/opt/local/lib -fstack-protector'

Locally applied patches:
    


@INC for perl 5.12.2:
    /usr/local/perl512-dev/lib/site_perl/5.12.2/darwin-thread-multi-2level
    /usr/local/perl512-dev/lib/site_perl/5.12.2
    /usr/local/perl512-dev/lib/5.12.2/darwin-thread-multi-2level
    /usr/local/perl512-dev/lib/5.12.2
    /usr/local/perl512-dev/lib/site_perl/5.12.0/darwin-thread-multi-2level
    /usr/local/perl512-dev/lib/site_perl/5.12.0
    /usr/local/perl512-dev/lib/site_perl
    .


Environment for perl 5.12.2:
    DYLD_LIBRARY_PATH=:/usr/local/pgsql/lib/:/opt/local/lib/mysql5/mysql
    HOME=/Users/timbo
    LANG=en_IE.UTF-8
    LANGUAGE (unset)
    LC_ALL=en_IE.UTF-8
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/local/pgsql/bin:/Users/timbo/bin:/usr/local/perl512-dev/bin:/Users/timbo/perl6/rakudo/parrot_install/bin:/usr/local/mysql/bin:/usr/local/bin:/opt/local/bin:/opt/local/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin
    PERLCRITIC=/Users/timbo/.setdev/perlcriticrc
    PERLTIDY=/Users/timbo/.setdev/perltidyrc
    PERL_BADLANG (unset)
    SHELL=/bin/bash

@p5pRT
Copy link
Author

p5pRT commented Jan 1, 2017

From @jkeenan

On Fri, 08 Oct 2010 13​:29​:35 GMT, timbo wrote​:

This is a bug report for perl from Tim.Bunce@​pobox.com,
generated with the help of perlbug 1.39 running under perl 5.12.2.

-----------------------------------------------------------------
[Please describe your issue here]

[16​:05] timbunce​: code like "if ref $foo" is far more expensive than
it needs to be because ref() still creates a new SVPV when called in a
boolean context.
[16​:08] xdg​: timbunce, could you please open a ticket on that? I'm
sure someone can optimize that.
[16​:10] • Nicholas wonders if "if ref $foo" can be spotted with much
the same code as "if %hash" is
[16​:10] Zefram​: yes it can, at op check time
[16​:11] timbunce​: How can internals determine "boolean context"?
[16​:11] vincent​: at op check time
[16​:11] timbunce​: there are probably other ops that could do less work
in boolean context
[16​:11] Nicholas​: offhand, "I don't know", but I know that "if %hash"
does this

[16​:11] Zefram​: but ref($foo)'s truth value isn't as simple as most
people expect, so it's approximately never what a programmer actually
wants
[16​:12] timbunce​: Zefram​: true, but that doesn't stop people writing
"if ref $foo" often
[16​:13] Zefram​: I'd be more inclined to use check-time logic to emit a
warning for it
[16​:14] Zefram​: "Dubious use of reference-type operator in truth-value
context"
[16​:14] timbunce​: Zefram​: I think you'd have angry hordes after you.
[16​:14] timbunce​: better done in perlcritic
[16​:14] PerlJam​: timbunce​: +1
[16​:14] mst​: Zefram​: but then Scalar​::Defer won't work anymore
[16​:14] Zefram​: Scalar​::Defer is an abomination
[16​:15] mst​: Zefram​: an abomination that's heavily used in the
pumpking's production code
[16​:16] timbunce​: mst Zefram​: I think Scalar​::Defer could still work.
The pp_ref code just needs to know if it should return sv_yes or sv_no

In boolean context, pp_ref would check the pv returned by sv_reftype
in
the same way that SvTRUE does (for "" and "0") and the push &sv_yes or
&sv_no.

Tim Bunce proposed this approach more than six years ago.

Can anyone respond?

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Copy link
Author

p5pRT commented Jan 1, 2017

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Jan 3, 2017

From @iabyn

On Sat, Dec 31, 2016 at 04​:09​:30PM -0800, James E Keenan via RT wrote​:

On Fri, 08 Oct 2010 13​:29​:35 GMT, timbo wrote​:

This is a bug report for perl from Tim.Bunce@​pobox.com,
generated with the help of perlbug 1.39 running under perl 5.12.2.

-----------------------------------------------------------------
[Please describe your issue here]

[16​:05] timbunce​: code like "if ref $foo" is far more expensive than
it needs to be because ref() still creates a new SVPV when called in a
boolean context.
[16​:08] xdg​: timbunce, could you please open a ticket on that? I'm
sure someone can optimize that.
[16​:10] • Nicholas wonders if "if ref $foo" can be spotted with much
the same code as "if %hash" is
[16​:10] Zefram​: yes it can, at op check time
[16​:11] timbunce​: How can internals determine "boolean context"?
[16​:11] vincent​: at op check time
[16​:11] timbunce​: there are probably other ops that could do less work
in boolean context
[16​:11] Nicholas​: offhand, "I don't know", but I know that "if %hash"
does this

[16​:11] Zefram​: but ref($foo)'s truth value isn't as simple as most
people expect, so it's approximately never what a programmer actually
wants
[16​:12] timbunce​: Zefram​: true, but that doesn't stop people writing
"if ref $foo" often
[16​:13] Zefram​: I'd be more inclined to use check-time logic to emit a
warning for it
[16​:14] Zefram​: "Dubious use of reference-type operator in truth-value
context"
[16​:14] timbunce​: Zefram​: I think you'd have angry hordes after you.
[16​:14] timbunce​: better done in perlcritic
[16​:14] PerlJam​: timbunce​: +1
[16​:14] mst​: Zefram​: but then Scalar​::Defer won't work anymore
[16​:14] Zefram​: Scalar​::Defer is an abomination
[16​:15] mst​: Zefram​: an abomination that's heavily used in the
pumpking's production code
[16​:16] timbunce​: mst Zefram​: I think Scalar​::Defer could still work.
The pp_ref code just needs to know if it should return sv_yes or sv_no

In boolean context, pp_ref would check the pv returned by sv_reftype
in
the same way that SvTRUE does (for "" and "0") and the push &sv_yes or
&sv_no.

Tim Bunce proposed this approach more than six years ago.

Can anyone respond?

I've just pushed the commit shown below as davem/boolref.
It shows that the concept is sound, but I suggest not merging into blead
until after 5.26.0 is released​: even just setting an extra private flag on
an op may require various B-ish CPAN modules to be updated.

The repeep() code in my commit is separate from similar code that puts
rv2hv into boolean context in things like if (%h) {...}. I suspect they
want merging into a single function. I also noticed there's a bug in the
rv2hv stuff​: if (!%h) {...} *doesn't* get optimised.

I'm open to suggestions for other ops which could benefit from being
special-cased in boolean context.

commit 9108e50
Author​: David Mitchell <davem@​iabyn.com>
AuthorDate​: Tue Jan 3 12​:08​:08 2017 +0000
Commit​: David Mitchell <davem@​iabyn.com>
CommitDate​: Tue Jan 3 13​:15​:42 2017 +0000

  add an OPpREF_BOOL flag to OP_REF
 
  RT #78288
 
  When ref() is used in a boolean context, it's not necessary to return
  the name of the package which an object is blessed into; instead a simple
  truth value can be returned, which is faster.
 
  Note that it has to cope with the subtlety of an object blessed into the
  class "0", which should return false.
 
  Porting/bench.pl shows for the expression !ref($r), approximately​:
  unchanged for a non-reference $r
  doubling of speed for a reference $r
  tripling of speed for a blessed reference $r
 
  I haven't attempted to use a provisional boolean flag for when the
  expression's context isn't known at compile time (e.g. when used in the
  last expression in a sub), unlike the similar mechanism for rv2hv etc
  which has both OPpTRUEBOOL and OPpMAYBE_TRUEBOOL flags.

--
Fire extinguisher (n) a device for holding open fire doors.

@p5pRT
Copy link
Author

p5pRT commented Jan 6, 2017

From @iabyn

On Tue, Jan 03, 2017 at 01​:22​:08PM +0000, Dave Mitchell wrote​:

I've just pushed the commit shown below as davem/boolref.
It shows that the concept is sound, but I suggest not merging into blead
until after 5.26.0 is released​: even just setting an extra private flag on
an op may require various B-ish CPAN modules to be updated.

The repeep() code in my commit is separate from similar code that puts
rv2hv into boolean context in things like if (%h) {...}. I suspect they
want merging into a single function. I also noticed there's a bug in the
rv2hv stuff​: if (!%h) {...} *doesn't* get optimised.

I'm open to suggestions for other ops which could benefit from being
special-cased in boolean context.

I've since done this a different way. First I've pushed the following set
of commits to blead​:

  commit bec5a1b
  Merge​: 12e1fa6 b243b19
  Author​: David Mitchell <davem@​iabyn.com>
  AuthorDate​: Fri Jan 6 16​:28​:50 2017 +0000
  Commit​: David Mitchell <davem@​iabyn.com>
  CommitDate​: Fri Jan 6 16​:28​:50 2017 +0000

  [MERGE] redo boolean context
 
  Overhaul the stuff that flags an op as being in boolean context (currently
  just padhv and rv2hv). Make the mechanism in rpeep() general, so that
  other ops can be easily added in future, and add a generic testing
  framework for such ops in t/perf/optree.t.
 
  This alters the amount on situations recognised as being in boolean
  context (mainly increasing them).

which fixes some bugs and adds a general framework for supporting boolean-
context ops.

Then I've pushed the following branch as davem/boolref2 which I intend to
merge sometime after 5.26​:

  commit a83d533
  Author​: David Mitchell <davem@​iabyn.com>
  AuthorDate​: Fri Jan 6 14​:59​:54 2017 +0000
  Commit​: David Mitchell <davem@​iabyn.com>
  CommitDate​: Fri Jan 6 16​:32​:21 2017 +0000

  make OP_REF support boolean context
 
  RT #78288
 
  When ref() is used in a boolean context, it's not necessary to return
  the name of the package which an object is blessed into; instead a simple
  truth value can be returned, which is faster.
 
  Note that it has to cope with the subtlety of an object blessed into the
  class "0", which should return false.
 
  Porting/bench.pl shows for the expression !ref($r), approximately​:
  unchanged for a non-reference $r
  doubling of speed for a reference $r
  tripling of speed for a blessed reference $r
 
  This commit builds on the mechanism already used to set the OPpTRUEBOOL
  and OPpMAYBE_TRUEBOOL flags on padhv and rv2hv ops when used in boolean
  context.

I've had a quick look though our ops list to see what other ops might
benefit from special-case handling in boolean context. There aren't
actually all that many. OP_KEYS and OP_VALUES could probably be tidied up a
bit and length($utf8_string) would probably benefit from not needing to
do a bytes -> chars length conversion. Other than one or two other minor
cases, not a lot leaped out at me. Its possible that ops which return
integer values might be able to do it more efficiently if they just return
PL_sv_yes/no rather than having to set a PADTMP to an integer value.

Anyway, I'll look further into doing other ops after 5.26.

--
"Emacs isn't a bad OS once you get used to it.
It just lacks a decent editor."

@p5pRT
Copy link
Author

p5pRT commented Jun 5, 2017

From @iabyn

On Fri, Jan 06, 2017 at 04​:44​:27PM +0000, Dave Mitchell wrote​:

On Tue, Jan 03, 2017 at 01​:22​:08PM +0000, Dave Mitchell wrote​:

I've just pushed the commit shown below as davem/boolref.
It shows that the concept is sound, but I suggest not merging into blead
until after 5.26.0 is released​: even just setting an extra private flag on
an op may require various B-ish CPAN modules to be updated.

The repeep() code in my commit is separate from similar code that puts
rv2hv into boolean context in things like if (%h) {...}. I suspect they
want merging into a single function. I also noticed there's a bug in the
rv2hv stuff​: if (!%h) {...} *doesn't* get optimised.

I'm open to suggestions for other ops which could benefit from being
special-cased in boolean context.

I've since done this a different way. First I've pushed the following set
of commits to blead​:

commit bec5a1ba1ddeb36f189da30307757fead2933e93
Merge&#8203;: 12e1fa6 b243b19
Author&#8203;:     David Mitchell \<davem@&#8203;iabyn\.com>
AuthorDate&#8203;: Fri Jan 6 16&#8203;:28&#8203;:50 2017 \+0000
Commit&#8203;:     David Mitchell \<davem@&#8203;iabyn\.com>
CommitDate&#8203;: Fri Jan 6 16&#8203;:28&#8203;:50 2017 \+0000

\[MERGE\] redo boolean context

Overhaul the stuff that flags an op as being in boolean context \(currently
just padhv and rv2hv\)\. Make the mechanism in rpeep\(\) general\, so that
other ops can be easily added in future\, and add a generic testing
framework for such ops in t/perf/optree\.t\.

This alters the amount on situations recognised as being in boolean
context \(mainly increasing them\)\.

which fixes some bugs and adds a general framework for supporting boolean-
context ops.

Then I've pushed the following branch as davem/boolref2 which I intend to
merge sometime after 5.26​:

commit a83d533d7b264b0df884decfc144af10b79c7f24
Author&#8203;:     David Mitchell \<davem@&#8203;iabyn\.com>
AuthorDate&#8203;: Fri Jan 6 14&#8203;:59&#8203;:54 2017 \+0000
Commit&#8203;:     David Mitchell \<davem@&#8203;iabyn\.com>
CommitDate&#8203;: Fri Jan 6 16&#8203;:32&#8203;:21 2017 \+0000

make OP\_REF support boolean context

RT \#78288

When ref\(\) is used in a boolean context\, it's not necessary to return
the name of the package which an object is blessed into; instead a simple
truth value can be returned\, which is faster\.

Note that it has to cope with the subtlety of an object blessed into the
class "0"\, which should return false\.

Porting/bench\.pl shows for the expression \!ref\($r\)\, approximately&#8203;:
    unchanged         for a non\-reference $r
    doubling of speed for a reference $r
    tripling of speed for a blessed reference $r

This commit builds on the mechanism already used to set the OPpTRUEBOOL
and OPpMAYBE\_TRUEBOOL flags on padhv and rv2hv ops when used in boolean
context\.

which I've now pushed as v5.27.0-127-gba75e9a

--
"But Sidley Park is already a picture, and a most amiable picture too.
The slopes are green and gentle. The trees are companionably grouped at
intervals that show them to advantage. The rill is a serpentine ribbon
unwound from the lake peaceably contained by meadows on which the right
amount of sheep are tastefully arranged." -- Lady Croom, "Arcadia"

@p5pRT
Copy link
Author

p5pRT commented Jul 27, 2017

From @iabyn

On Fri, Jan 06, 2017 at 04​:44​:27PM +0000, Dave Mitchell wrote​:

I've had a quick look though our ops list to see what other ops might
benefit from special-case handling in boolean context. There aren't
actually all that many. OP_KEYS and OP_VALUES could probably be tidied up a
bit and length($utf8_string) would probably benefit from not needing to
do a bytes -> chars length conversion. Other than one or two other minor
cases, not a lot leaped out at me. Its possible that ops which return
integer values might be able to do it more efficiently if they just return
PL_sv_yes/no rather than having to set a PADTMP to an integer value.

Anyway, I'll look further into doing other ops after 5.26.

Which I've now done and just pushed with this merge commit​:

commit c1a6686
Merge​: 0283ad9 cd5acdd
Author​: David Mitchell <davem@​iabyn.com>
AuthorDate​: Thu Jul 27 11​:30​:50 2017 +0100
Commit​: David Mitchell <davem@​iabyn.com>
CommitDate​: Thu Jul 27 11​:30​:50 2017 +0100

  [MERGE] various boolean-related optimisations
 
  This branch contains about 50 commits, which collectively optimise
  various aspects of perl's behaviour when detailing with boolean values
  or ops that are called in boolean context.
 
  The main changes are​:
 
  * A &PL_sv_zero variable has been added. This is a new per-interpreter
  immortal SV, very similar to &PL_sv_no, except that it has a string value
  of "0" rather than "". As well as being directly usable in cases where
  code might otherwise need to do newSViv(0), it has a more subtle use in
  ops that handle boolean context directly. For example in
 
  sub f {
  ....;
  if (%h) { .... }
  }
 
  the 'if' statement is compiled using OP_AND, so is equivalent to
 
  %h && do { .... }
 
  If %h is empty, then the result of the boolean expression should be 0
  rather than &PL_sv_no, and this value gets returned to the caller, which
  may expect a scalar result​: and what it expects won't be known until run
  time. So by returning &PL_sv_yes and &PL_sv_zero rather than yes and no,
  we increase the number of places where it is safe to return a boolean
  value.
 
  A downside of &PL_sv_zero is that if assigned to a variable, that variable
  gets int, num and string values rather than just an int value.
 
  * SvTRUE() is now more efficient.
 
  This macro is called in places like pp_and, pp_not etc. It has a long list
  of conditions which it goes through to determine the truthiness of an SV,
  such as whether it has a string value, and if so whether the length is
  zero, or the length is 1 and the string's value is "0". It turns out that
  the immortals like &PL_sv_yes fare really badly here​: they have to go
  through nearly every check to finally determine their value. To get round
  this, I have made it very quick to check whether an SV is one of the
  immortals, and if so whether it is true. This has been done by ensuring
  that PL_sv_undef, PL_sv_no, PL_sv_zero and PL_sv_yes are all contiguous in
  memory, so that a quick single address comparison is enough to determine
  immortality, and then comparing the address against &PL_sv_yes is enough
  to determine whether its true.
 
  In particular in non-multiplicity builds, PL_sv_undef etc have been
  replaced with the array PL_sv_immortals[4], with PL_sv_undef #defed to
  PL_sv_immortals[0] etc.
 
  Also, the SvOK() macro has been made more efficient by restoring the POK
  flag on REGEXP svs and and PVLVs which hold a regex. This removes the two
  extra checks that SvOK() had to do each time. This has been done by
  changing the way that PVLV's-holding-a-regex are implemented. The downside
  of this change is that ReANY() now includes a single conditional. To
  ameliorate that, places like pp_match() have been tweaked to only fetch
  ReANY() once where possible.
 
  * the OP_KEYS op is now optimised away in void and scalar context.
  Since a hash in scalar context was changed so that it no longer returns a
  bucket count but instead just a key count, '%h' and 'keys %h' in
  void/boolean/scalar context are now very similar. So for 'keys %h', rather
  than calling pp_padhv+pp_keys, just call pp_padhv with a OPpPADHV_ISKEYS
  flag set. Similarly for pp_rv2hv. As well as skipping an extra op call,
  this brings the existing boolean-context optimisations of '%h' to 'keys
  %h' too. In particular, 'keys %tied' in boolean context now calls SCALAR()
  if available, or FIRSTKEY() otherwise, rather than iterating through the
  whole hash.
 
  I have also given OP_RV2HV a targ so that it can return integer values
  more efficiently.
 
  * Various integer-returning ops are now flagged when in boolean context,
  which means at runtime they can just return &PL_sv_yes/&PL_sv_zero rather
  than setting a targ to an integer value, or for ops without targs, having
  to create a new integer-valued mortal. As well as being quicker to return
  a value, this works well with SvTRUE() which now recognises immortals
  quickly. Also for ops like length() and pos(), it doesn't need to convert
  between byte and char offsets; the fact that the offset is non-zero is
  sufficient.
 
  These ops are​:
 
  OP_AASSIGN
  OP_GREPWHILE
  OP_LENGTH
  OP_PADAV
  OP_POS
  OP_RV2AV
  OP_SUBST
 
  Also, index() doesn't return a boolean value, but for no match it returns
  -1. So for code like
 
  if (index(...) != -1) { ... }
 
  optimise away the OP_CONST and the OP_EQ and flag the index op to return a
  boolean value.
 
  * Speed up OP_ITER
 
  OP_ITER is called for every iteration of a for loop or similar. Its job is
  iterate the loop variable once, then return &PL_sv_yes or &PL_sv_no
  depending on whether it's the last iteration. OP_ITER is always followed
  by OP_AND, which examines the truth value on the stack, and returns
  op_next or op_other accordingly. Now, pp_iter() just asserts that
  PL_op->op_next is an OP_AND, and returns PL_op->op_next->op_next or
  PL_op->op_next->op_other directly, skipping the PL_sv_yes/no push/pop and
  eliminating the call to pp_and().
 
  As part of these changes, I have moved pp_padav(), pp_padhv() from pp.c
  to pp_hot.c, moved some common code into a new function
  S_padhv_rv2hv_common(), created a new (non-API) function Perl_hv_pushkv()
  which pushes a hash's keys or values or both onto the stack, and reduced
  the number of callers of Perl_do_kv() (which was acting as both a pp
  function for several ops and as a general-purpose function too).
 
  Of the 360 or so tests in t/perf/benchmarks, the following number of
  tests had their COND field changed from 100% to the following ranges​:
 
  36 @​ 96.55% .. 99.99%
  245 @​ 100.00% .. 100.99%
  28 @​ 101.00% .. 109.99%
  7 @​ 110.00% .. 119.99%
  10 @​ 120.00% .. 129.99%
  29 @​ 130.00% .. 199.99%
  4 @​ 200.00% .. 299.99%
  1 @​ 314.29%
 
  so about 10% of tests became marginally slower - usually due to one extra
  conditional in an op to test for a private BOOL flag or ReANY(); about 70%
  of tests were almost unaffected, while 20% of tests showed improvement,
  most with considerable improvement, and a few with spectacular improvement.
  (The 314% is for an empty @​lexical tested in boolean context).

--
"I do not resent criticism, even when, for the sake of emphasis,
it parts for the time with reality".
  -- Winston Churchill, House of Commons, 22nd Jan 1941.

@p5pRT
Copy link
Author

p5pRT commented Jul 27, 2017

From @cpansprout

On Thu, 27 Jul 2017 03​:50​:56 -0700, davem wrote​:

A downside of &PL_sv_zero is that if assigned to a variable, that
variable
gets int, num and string values rather than just an int value.

Why does that need to be the case? Why cannot PL_sv_zero have just the SvIOK flag on?

Does it matter that \(%h && $foo) now returns a reference to a read-only value when %h is empty? (If so, that can be solved by turning on PADTMP on PL_sv_zero.)

Also, index() doesn't return a boolean value, but for no match it
returns
-1. So for code like

if (index(...) != -1) { ... }

optimise away the OP_CONST and the OP_EQ and flag the index op to
return a
boolean value.

index(...) >= 0 is also quite common.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Jul 27, 2017

From @cpansprout

On Thu, 27 Jul 2017 16​:14​:18 -0700, sprout wrote​:

On Thu, 27 Jul 2017 03​:50​:56 -0700, davem wrote​:

A downside of &PL_sv_zero is that if assigned to a variable, that
variable
gets int, num and string values rather than just an int value.

Why does that need to be the case? Why cannot PL_sv_zero have just
the SvIOK flag on?

Does it matter that \(%h && $foo) now returns a reference to a read-
only value when %h is empty? (If so, that can be solved by turning on
PADTMP on PL_sv_zero.)

I think it does matter. grep(1,@​list_with_one_item) returning a mutable 1 and grep(1,@​list_with_zero_items) returning a read-only variable is the kind of icky inconsistency that I tried hard to eliminate.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Jul 28, 2017

From @iabyn

On Thu, Jul 27, 2017 at 04​:22​:47PM -0700, Father Chrysostomos via RT wrote​:

On Thu, 27 Jul 2017 16​:14​:18 -0700, sprout wrote​:

On Thu, 27 Jul 2017 03​:50​:56 -0700, davem wrote​:

A downside of &PL_sv_zero is that if assigned to a variable, that
variable
gets int, num and string values rather than just an int value.

Why does that need to be the case? Why cannot PL_sv_zero have just
the SvIOK flag on?

because the first time you do, e.g.

  @​a=();
  say "size=" . @​a;

PL_sv_zero gets upgraded to an SvPOK() PVIV anyway.

Does it matter that \(%h && $foo) now returns a reference to a read-
only value when %h is empty? (If so, that can be solved by turning on
PADTMP on PL_sv_zero.)

I think it does matter. grep(1,@​list_with_one_item) returning a mutable
1 and grep(1,@​list_with_zero_items) returning a read-only variable is
the kind of icky inconsistency that I tried hard to eliminate.

Won't setting PADTMP cause PL_sv_zero's PVX buffer to be stolen? Or is
that not done on a R/O var?

--
A walk of a thousand miles begins with a single step...
then continues for another 1,999,999 or so.

@p5pRT
Copy link
Author

p5pRT commented Jul 28, 2017

From @cpansprout

On Fri, 28 Jul 2017 04​:35​:26 -0700, davem wrote​:

On Thu, Jul 27, 2017 at 04​:22​:47PM -0700, Father Chrysostomos via RT wrote​:

On Thu, 27 Jul 2017 16​:14​:18 -0700, sprout wrote​:

On Thu, 27 Jul 2017 03​:50​:56 -0700, davem wrote​:

A downside of &PL_sv_zero is that if assigned to a variable, that
variable
gets int, num and string values rather than just an int value.

Why does that need to be the case? Why cannot PL_sv_zero have just
the SvIOK flag on?

because the first time you do, e.g.

@&#8203;a=\(\);
say "size=" \. @&#8203;a;

PL_sv_zero gets upgraded to an SvPOK() PVIV anyway.

Does it matter that \(%h && $foo) now returns a reference to a read-
only value when %h is empty? (If so, that can be solved by turning on
PADTMP on PL_sv_zero.)

I think it does matter. grep(1,@​list_with_one_item) returning a mutable
1 and grep(1,@​list_with_zero_items) returning a read-only variable is
the kind of icky inconsistency that I tried hard to eliminate.

Won't setting PADTMP cause PL_sv_zero's PVX buffer to be stolen? Or is
that not done on a R/O var?

It is not done on a R/O var.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Aug 4, 2017

From @iabyn

On Fri, Jul 28, 2017 at 10​:09​:09AM -0700, Father Chrysostomos via RT wrote​:

On Fri, 28 Jul 2017 04​:35​:26 -0700, davem wrote​:

On Thu, Jul 27, 2017 at 04​:22​:47PM -0700, Father Chrysostomos via RT wrote​:

On Thu, 27 Jul 2017 16​:14​:18 -0700, sprout wrote​:

On Thu, 27 Jul 2017 03​:50​:56 -0700, davem wrote​:

A downside of &PL_sv_zero is that if assigned to a variable, that
variable
gets int, num and string values rather than just an int value.

Why does that need to be the case? Why cannot PL_sv_zero have just
the SvIOK flag on?

because the first time you do, e.g.

@&#8203;a=\(\);
say "size=" \. @&#8203;a;

PL_sv_zero gets upgraded to an SvPOK() PVIV anyway.

Does it matter that \(%h && $foo) now returns a reference to a read-
only value when %h is empty? (If so, that can be solved by turning on
PADTMP on PL_sv_zero.)

I think it does matter. grep(1,@​list_with_one_item) returning a mutable
1 and grep(1,@​list_with_zero_items) returning a read-only variable is
the kind of icky inconsistency that I tried hard to eliminate.

Won't setting PADTMP cause PL_sv_zero's PVX buffer to be stolen? Or is
that not done on a R/O var?

It is not done on a R/O var.

Now done.

commit 02960b5
Author​: David Mitchell <davem@​iabyn.com>
AuthorDate​: Fri Aug 4 15​:17​:44 2017 +0100
Commit​: David Mitchell <davem@​iabyn.com>
CommitDate​: Fri Aug 4 15​:17​:44 2017 +0100

  set SVs_PADTMP flag on PL_sv_zero
 
  Where an op in scalar but not boolean context returns &PL_sv_zero as a
  more efficient way of doing sv_2mortal(newSViv(0)), the returned value
  must be mutable. For example
 
  my @​a = ();
  my $r = \ scalar grep $_ == 1, @​a;
  $$r += 10;
 
  By setting the SVs_PADTMP flag, this forces pp_srefgen() and similar to
  make a mortal copy of &PL_sv_zero.
 
  This kind of defeats the original optimisation, but the copy only kicks
  in under certain circumstances, whereas the newSViv(0) approach would
  create a new mortal every time.
 
  See RT #78288 for where FC suggested the problem and the solution.

--
Diplomacy is telling someone to go to hell in such a way that they'll
look forward to the trip

@p5pRT
Copy link
Author

p5pRT commented Aug 5, 2017

From @iabyn

On Thu, Jul 27, 2017 at 04​:14​:18PM -0700, Father Chrysostomos via RT wrote​:

Also, index() doesn't return a boolean value, but for no match it
returns
-1. So for code like

if (index(...) != -1) { ... }

optimise away the OP_CONST and the OP_EQ and flag the index op to
return a
boolean value.

index(...) >= 0 is also quite common.

I've now added comparison variant like that with v5.27.2-102-g400ffcf.

--
The Enterprise's efficient long-range scanners detect a temporal vortex
distortion in good time, allowing it to be safely avoided via a minor
course correction.
  -- Things That Never Happen in "Star Trek" #21

@p5pRT
Copy link
Author

p5pRT commented Nov 28, 2017

@iabyn - Status changed from 'open' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant