Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

grep {/PATTERN/} is slow #7579

Open
p5pRT opened this issue Nov 4, 2004 · 16 comments
Open

grep {/PATTERN/} is slow #7579

p5pRT opened this issue Nov 4, 2004 · 16 comments

Comments

@p5pRT
Copy link

p5pRT commented Nov 4, 2004

Migrated from rt.perl.org#32331 (status was 'open')

Searchable as RT32331$

@p5pRT
Copy link
Author

p5pRT commented Nov 4, 2004

From @Abigail

Created by @Abigail

I would expect

  grep {EXPR} LIST

to be somewhat slower than

  grep EXPR, LIST

For instance, the following benchmark shows the block version to
be 5% slower​:

  use Benchmark qw /cmpthese/;

  our @​array = 101 .. 200;

  cmpthese -2 => {
  grep_eq_expr => '$​::a1 = grep $_ == 50, @​​::array',
  grep_eq_blck => '$​::a2 = grep {$_ == 50} @​​::array',
  };

  __END__
  Rate grep_eq_blck grep_eq_expr
  grep_eq_blck 28902/s -- -5%
  grep_eq_expr 30282/s 5% --

If, however, the expression is a regexp, things change drastically​:

  use Benchmark qw /cmpthese/;

  our @​array = 101 .. 200;

  cmpthese -2 => {
  grep_re_expr => '$​::b1 = grep /^50$/, @​​::array',
  grep_re_blck => '$​::b2 = grep {/^50$/} @​​::array',
  };

  __END__
  Rate grep_re_blck grep_re_expr
  grep_re_blck 6455/s -- -69%
  grep_re_expr 20848/s 223% --

This difference in speed surprises me.

It's not the fact that it's a regexp that's run inside a block,
as a similar for loop doesn't show the huge difference​:

  use Benchmark qw /cmpthese/;

  our @​array = 101 .. 200;

  cmpthese -2 => {
  for_eq_expr => '$​::c1 = 0; $_ == 50 and $​::c1 ++ for @​​::array',
  for_eq_blck => '$​::c2 = 0; for (@​​::array) {$_ == 50 and $​::c2 ++}',
  };

  __END__
  Rate for_eq_blck for_eq_expr
  for_eq_blck 23640/s -- -5%
  for_eq_expr 24992/s 6% --

  use Benchmark qw /cmpthese/;

  our @​array = 101 .. 200;

  cmpthese -2 => {
  for_re_expr => '$​::d1 = 0; /^50$/ and $​::d1 ++ for @​​::array',
  for_re_blck => '$​::d2 = 0; for (@​​::array) {/^50$/ and $​::d2 ++}',
  };

  __END__
  Rate for_re_blck for_re_expr
  for_re_blck 19091/s -- -2%
  for_re_expr 19545/s 2% --

In both cases, the block version is slower, but by a small margin.

So, I suspect a bug somewhere.

I'll be leaving for my honeymoon and YAPC​::AU tomorrow morning, so it might
be a while before I'll be able to reply to any questions.

Abigail

Perl Info

Flags:
    category=core
    severity=low

Site configuration information for perl v5.8.5:

Configured by abigail at Tue Aug  3 21:00:04 CEST 2004.

Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
  Platform:
    osname=linux, osvers=2.4.18-bf2.4, archname=i686-linux-64int-ld
    uname='linux alexandra 2.4.18-bf2.4 #1 son apr 14 09:53:28 cest 2002 i686 unknown '
    config_args='-des -Dusemorebits -Uversiononly -Dmydomain=.abigail.nl -Dcf_email=abigail@abigail.nl -Dperladmin=abigail@abigail.nl -Doptimize=-g -Dcc=gcc -Dprefix=/opt/perl'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=define use64bitall=undef uselongdouble=define
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-g',
    cppflags='-DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include'
    ccversion='', gccversion='3.0.4', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long long', ivsize=8, nvtype='long double', nvsize=12, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -ldl -lm -lcrypt -lutil -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.2.5.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.2.5'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:
    defined-or
    no-syntax-warnings


@INC for perl v5.8.5:
    /home/abigail/Perl
    /opt/perl/lib/5.8.5/i686-linux-64int-ld
    /opt/perl/lib/5.8.5
    /opt/perl/lib/site_perl/5.8.5/i686-linux-64int-ld
    /opt/perl/lib/site_perl/5.8.5
    /opt/perl/lib/site_perl/5.8.4/i686-linux-64int-ld
    /opt/perl/lib/site_perl/5.8.4
    /opt/perl/lib/site_perl/5.8.3/i686-linux-64int-ld
    /opt/perl/lib/site_perl/5.8.3
    /opt/perl/lib/site_perl/5.8.2/i686-linux-64int-ld
    /opt/perl/lib/site_perl/5.8.2
    /opt/perl/lib/site_perl/5.8.1/i686-linux-64int-ld
    /opt/perl/lib/site_perl/5.8.1
    /opt/perl/lib/site_perl/5.8.0/i686-linux-64int-ld
    /opt/perl/lib/site_perl/5.8.0
    /opt/perl/lib/site_perl
    .


Environment for perl v5.8.5:
    HOME=/home/abigail
    LANG=C
    LANGUAGE (unset)
    LD_LIBRARY_PATH=/home/abigail/Lib:/usr/local/lib:/usr/lib:/lib:/usr/X11R6/lib
    LOGDIR (unset)
    PATH=/home/abigail/Bin:/opt/perl/bin:/usr/local/bin:/usr/local/X11/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/X11R6/bin:/usr/games:/usr/share/texmf/bin:/opt/Acrobat/bin:/opt/java/blackdown/j2sdk1.3.1/bin:/usr/local/games/bin
    PERL5LIB=/home/abigail/Perl
    PERLDIR=/opt/perl
    PERL_BADLANG (unset)
    SHELL=/bin/bash

@p5pRT
Copy link
Author

p5pRT commented Nov 11, 2004

From tassilo.von.parseval@rwth-aachen.de

On Thu, Nov 04, 2004 at 11​:55​:31PM +0000 abigail@​abigail.nl (via RT) wrote​:

# New Ticket Created by abigail@​abigail.nl
# Please include the string​: [perl #32331]
# in the subject line of all future correspondence about this issue.
# <URL​: http​://rt.perl.org​:80/rt3/Ticket/Display.html?id=32331 >

This is a bug report for perl from abigail@​abigail.nl,
generated with the help of perlbug 1.35 running under perl v5.8.5.

-----------------------------------------------------------------
[Please enter your report here]

I would expect

grep \{EXPR\} LIST

to be somewhat slower than

grep  EXPR\, LIST

For instance, the following benchmark shows the block version to
be 5% slower​:

use Benchmark qw /cmpthese/;

our @&#8203;array = 101 \.\. 200;

cmpthese \-2 => \{
    grep\_eq\_expr => '$&#8203;::a1 = grep  $\_ == 50\, @&#8203;&#8203;::array'\,
    grep\_eq\_blck => '$&#8203;::a2 = grep \{$\_ == 50\} @&#8203;&#8203;::array'\,
\};

\_\_END\_\_
                Rate grep\_eq\_blck grep\_eq\_expr
grep\_eq\_blck 28902/s           \-\-          \-5%
grep\_eq\_expr 30282/s           5%           \-\-

If, however, the expression is a regexp, things change drastically​:

use Benchmark qw /cmpthese/;

our @&#8203;array = 101 \.\. 200;

cmpthese \-2 => \{
    grep\_re\_expr => '$&#8203;::b1 = grep  /^50$/\, @&#8203;&#8203;::array'\,
    grep\_re\_blck => '$&#8203;::b2 = grep \{/^50$/\} @&#8203;&#8203;::array'\,
\};

\_\_END\_\_
                Rate grep\_re\_blck grep\_re\_expr
grep\_re\_blck  6455/s           \-\-         \-69%
grep\_re\_expr 20848/s         223%           \-\-

This difference in speed surprises me.

[...]

Just for the record​: I can confirm this. The problem is still in blead
(tested on #23466).

Tassilo
--
$_=q#",}])!JAPH!qq(tsuJ[{@​"tnirp}3..0}_$;//​::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{rehtonabus})!JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?&lt;=sub).+q#q!'"qq.\t$&amp;."'!#+sexisexiixesixeseg;y~\n~~dddd;eval

@p5pRT
Copy link
Author

p5pRT commented Nov 11, 2004

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Dec 3, 2004

From ajs@ajs.com

I've tried this with 5.6.1, 5.8.3 and 5.8.5, all with the same result. I
modified the program several ways. It appears to be that the regexp is
being re-parsed for every invocation of the block. I have no idea why,
but I see, for example, that pre-parsing the regexp results in slightly
better performance. That is​:

@p5pRT
Copy link
Author

p5pRT commented Dec 3, 2004

From ajs@ajs.com

Sorry, I submitted before I was done. My final version of the test​:

  use Benchmark qw /cmpthese/;

  our @​array = 101 .. 20000;
  our $b1;
  our $b2;

  cmpthese -2 => {
  grep_re_expr => '$​::b1 = grep /50/, @​​::array',
  grep_re_blck => '$​::b1 = grep {/50/} @​​::array',
  grep_re_subr => '$​::b2 = grep {is50($_)} @​​::array',
  };

  die unless $b1 == $b2;

  sub is50 {
  return ($_[0] =~ /50/);
  }

I also tried​:

  our $p = qr{^50$};

and then something like​:

  grep_re_blck => '$​::b1 = grep {/$p/} @​​::array',

which was very slightly faster (relative to the expr version with the
pre-compiled regexp), so this makes me suspect repeated re-parsing of
the regexp.

@p5pRT
Copy link
Author

p5pRT commented Apr 30, 2012

From @Hugmeir

On Fri Dec 03 11​:22​:39 2004, ajs wrote​:

Sorry, I submitted before I was done. My final version of the test​:

use Benchmark qw /cmpthese/;

our @&#8203;array = 101 \.\. 20000;
our $b1;
our $b2;

cmpthese \-2 => \{
    grep\_re\_expr => '$&#8203;::b1 = grep  /50/\, @&#8203;&#8203;::array'\,
    grep\_re\_blck => '$&#8203;::b1 = grep \{/50/\} @&#8203;&#8203;::array'\,
    grep\_re\_subr => '$&#8203;::b2 = grep \{is50\($\_\)\} @&#8203;&#8203;::array'\,
\};

die unless $b1 == $b2;

sub is50 {
return ($_[0] =~ /50/);
}

I also tried​:

our $p = qr{^50$};

and then something like​:

grep_re_blck => '$​::b1 = grep {/$p/} @​​::array',

which was very slightly faster (relative to the expr version with the
pre-compiled regexp), so this makes me suspect repeated re-parsing of
the regexp.

This is still present in blead; But as far as I can tell, the slowdown
is because the block version needs to create and enter/leave a scope,
while the expression version has no such penalty; So unless I'm wrong, I
think this has to be marked as WontFix.

@p5pRT
Copy link
Author

p5pRT commented Apr 30, 2012

From @cpansprout

On Sun Apr 29 18​:49​:32 2012, Hugmeir wrote​:

On Fri Dec 03 11​:22​:39 2004, ajs wrote​:

Sorry, I submitted before I was done. My final version of the test​:

use Benchmark qw /cmpthese/;

our @&#8203;array = 101 \.\. 20000;
our $b1;
our $b2;

cmpthese \-2 => \{
    grep\_re\_expr => '$&#8203;::b1 = grep  /50/\, @&#8203;&#8203;::array'\,
    grep\_re\_blck => '$&#8203;::b1 = grep \{/50/\} @&#8203;&#8203;::array'\,
    grep\_re\_subr => '$&#8203;::b2 = grep \{is50\($\_\)\} @&#8203;&#8203;::array'\,
\};

die unless $b1 == $b2;

sub is50 {
return ($_[0] =~ /50/);
}

I also tried​:

our $p = qr{^50$};

and then something like​:

grep_re_blck => '$​::b1 = grep {/$p/} @​​::array',

which was very slightly faster (relative to the expr version with the
pre-compiled regexp), so this makes me suspect repeated re-parsing of
the regexp.

This is still present in blead; But as far as I can tell, the slowdown
is because the block version needs to create and enter/leave a scope,
while the expression version has no such penalty; So unless I'm wrong, I
think this has to be marked as WontFix.

But the enter/leave shouldn’t make this much difference​:

The first script, which uses $_==50​:

$ pbpaste|perl5.15.9
  Rate grep_eq_blck grep_eq_expr
grep_eq_blck 42400/s -- -4%
grep_eq_expr 44245/s 4% --

The second script, which uses /^50$/​:

$ pbpaste|perl5.15.9
  Rate grep_re_blck grep_re_expr
grep_re_blck 9481/s -- -69%
grep_re_expr 30488/s 222% --

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented May 1, 2012

From @epa

Some say that

  grep { /PATTERN } @​x

is better style than

  grep /PATTERN/, @​x

(Perl Best Practices and perlcritic BuiltinFunctions​::RequireBlockGrep)

To keep these people happy it would be good to put in an optimization for the
former code to keep it just as fast as the latter.

If speeding it up is a WONTFIX, then perhaps perlcritic's rules need to change.

--
Ed Avis <eda@​waniasset.com>

@p5pRT
Copy link
Author

p5pRT commented May 1, 2012

From @nwc10

On Tue, May 01, 2012 at 10​:48​:53AM +0000, Ed Avis wrote​:

Some say that

grep \{ /PATTERN \} @&#8203;x

is better style than

grep /PATTERN/\, @&#8203;x

(Perl Best Practices and perlcritic BuiltinFunctions​::RequireBlockGrep)

To keep these people happy it would be good to put in an optimization for the
former code to keep it just as fast as the latter.

If speeding it up is a WONTFIX, then perhaps perlcritic's rules need to change.

It is a safe general-case transformation? Is there any behaviour one can
put into a pattern that would notice the lack of scope exit after each
run? ie the difference at steps 8, 9 and b in the second case​:

$ perl -MO=Concise,-exec -e 'grep /PATTERN/, @​x'
1 <0> enter
2 <;> nextstate(main 1 -e​:1) v​:{
3 <0> pushmark s
4 <$> gv(*x) s
5 <1> rv2av[t1] lKM/1
6 <@​> grepstart K
7 <|> grepwhile(other->8)[t2] vK
8 </> match(/"PATTERN"/) s/RTIME
  goto 7
9 <@​> leave[1 ref] vKP/REFC
-e syntax OK

$ perl -MO=Concise,-exec -e 'grep {/PATTERN/} @​x'
1 <0> enter
2 <;> nextstate(main 2 -e​:1) v​:{
3 <0> pushmark s
4 <$> gv(*x) s
5 <1> rv2av[t1] lKM/1
6 <@​> grepstart K*
7 <|> grepwhile(other->8)[t2] vK
8 <0> enter s
9 <;> nextstate(main 1 -e​:1) v​:{
a </> match(/"PATTERN"/) s/RTIME
b <@​> leave sKP
  goto 7
c <@​> leave[1 ref] vKP/REFC
-e syntax OK

The idea of an optimisation seems nice. But I don't know if it's trivially
safe. Or whether it would need to check the pattern for complexity, and
only work on "simple" things.

Nicholas Clark

@p5pRT
Copy link
Author

p5pRT commented May 1, 2012

From @iabyn

On Tue, May 01, 2012 at 10​:48​:53AM +0000, Ed Avis wrote​:

Some say that

grep \{ /PATTERN \} @&#8203;x

is better style than

grep /PATTERN/\, @&#8203;x

(Perl Best Practices and perlcritic BuiltinFunctions​::RequireBlockGrep)

To keep these people happy it would be good to put in an optimization for the
former code to keep it just as fast as the latter.

If speeding it up is a WONTFIX, then perhaps perlcritic's rules need to change.

It definitely isn't yet a WONTFIX​: there's something specifically about
using a regex (as opposed to just == say) that makes it *much* more slower
in a block than would be expected.

Until the reason is diagnosed, we can't really decide what to do about
it (but I'm not intending to look at it just yet).

--
Modern art​:
  "That's easy, I could have done that!"
  "Ah, but you didn't!"

@p5pRT
Copy link
Author

p5pRT commented May 1, 2012

From @Abigail

On Tue, May 01, 2012 at 04​:14​:12AM -0700, Dave Mitchell via RT wrote​:

On Tue, May 01, 2012 at 10​:48​:53AM +0000, Ed Avis wrote​:

Some say that

grep \{ /PATTERN \} @&#8203;x

is better style than

grep /PATTERN/\, @&#8203;x

(Perl Best Practices and perlcritic BuiltinFunctions​::RequireBlockGrep)

To keep these people happy it would be good to put in an optimization for the
former code to keep it just as fast as the latter.

If speeding it up is a WONTFIX, then perhaps perlcritic's rules need to change.

It definitely isn't yet a WONTFIX​: there's something specifically about
using a regex (as opposed to just == say) that makes it *much* more slower
in a block than would be expected.

Note that the slowdown isn't a "regexp in a block". It's a
"regexp in a grep block". As the original bug report shows, the
difference between for-expr and for-block are much less (with 5.15.9,
the difference between for-expr and for-block is less than 10% on
my machine).

With map, I get an inbetween value (using 5.15.9)​:

  use Benchmark qw /cmpthese/;

  our @​array = 1 .. 200;

  cmpthese -2 => {
  map_re_blck => '$​::d1 = map {/^50$/ ? $_ : ()} @​​::array',
  map_re_expr => '$​::d2 = map /^50$/ ? $_ : (), @​​::array',
  };

  __END__
  Rate map_re_blck map_re_expr
  map_re_blck 8756/s -- -39%
  map_re_expr 14422/s 65% --

Also, if the matches are failing, the difference between grep-block
and grep-expr is bigger (using 5.15.9)​:

  use Benchmark qw /cmpthese/;

  our @​array1 = 5000 .. 5099;
  our @​array2 = 4000 .. 4099;

  cmpthese -2 => {
  grep_re_blck1 => '$​::d1 = grep {/^50/} @​​::array1',
  grep_re_expr1 => '$​::d2 = grep /^50/, @​​::array1',
  grep_re_blck2 => '$​::d1 = grep {/^50/} @​​::array2',
  grep_re_expr2 => '$​::d2 = grep /^50/, @​​::array2',
  };

  __END__
  Rate grep_re_blck1 grep_re_blck2 grep_re_expr1 grep_re_expr2
  grep_re_blck1 5879/s -- -32% -51% -81%
  grep_re_blck2 8635/s 47% -- -28% -72%
  grep_re_expr1 12041/s 105% 39% -- -61%
  grep_re_expr2 30629/s 421% 255% 154% --

Until the reason is diagnosed, we can't really decide what to do about
it (but I'm not intending to look at it just yet).

The bug is quite old (I've known about it for longer than the bug report
is old -- it is as least as old as 5.005), so there's no urgency. OTOH,
"grep {/PATTERN/}" is a often used idiom.

Abigail

@richardleach
Copy link
Contributor

As @nwc observed, with $::b1 = grep /^50$/, @::array, the interesting part of the op tree seems to be just:

l                    <|> grepwhile(other->m)[t12] sK ->n
...
m                             </> match(/"^50$"/) s ->l

With $::b1 = grep {/^50$/} @::array, a lot more goes on in each iteration:

l                    <|> grepwhile(other->m)[t12] sK ->q
...
p                                <@> leave sKP ->l
m                                   <0> enter s ->n
n                                   <;> nextstate(main 6 -e:1) v:{ ->o
o                                   </> match(/"^50$"/) s ->p

Using the wrapper of our @​array = 101 .. 200; for (1..1_000_000) { $​::b1 = grep EXPR-or-BLOCK @​​::array }, perf record and perf report on blead show >1% of run time contributors as the following for the first form:

  20.66%  miniperl  miniperl           [.] Perl_pp_grepwhile
  18.93%  miniperl  miniperl           [.] Perl_pp_match
  18.48%  miniperl  miniperl           [.] Perl_re_intuit_start
  18.21%  miniperl  miniperl           [.] Perl_leave_scope
   8.71%  miniperl  miniperl           [.] Perl_regexec_flags
   4.93%  miniperl  miniperl           [.] Perl_save_vptr
   3.36%  miniperl  miniperl           [.] Perl_sv_2pv_flags
   3.27%  miniperl  miniperl           [.] Perl_pop_scope

and this for the second:

  12.70%  miniperl  miniperl           [.] Perl_pp_match
  12.26%  miniperl  miniperl           [.] Perl_sv_setsv_flags
  11.22%  miniperl  miniperl           [.] Perl_pp_grepwhile
  10.96%  miniperl  miniperl           [.] Perl_sv_upgrade
   8.39%  miniperl  miniperl           [.] Perl_re_intuit_start
   7.46%  miniperl  miniperl           [.] Perl_leave_scope
   5.27%  miniperl  miniperl           [.] Perl_pp_enter
   5.15%  miniperl  miniperl           [.] Perl_leave_adjust_stacks
   5.14%  miniperl  miniperl           [.] Perl_pp_leave
   4.85%  miniperl  miniperl           [.] Perl_sv_clear
   3.54%  miniperl  miniperl           [.] Perl_regexec_flags
   3.23%  miniperl  miniperl           [.] Perl_sv_free2
   2.98%  miniperl  miniperl           [.] Perl_pp_nextstate
   1.53%  miniperl  miniperl           [.] Perl_save_vptr
   1.39%  miniperl  miniperl           [.] Perl_sv_2pv_flags
   1.08%  miniperl  miniperl           [.] Perl_pop_scope

A good chunk of the extra runtime seems to be due to Perl_leave_adjust_stacks creating, upgrading and assigning to a new mortal SV on every iteration and something (maybe the additional nextstate) clearing and freeing that SV.

Neither of $​::b2 = grep {/^50$/} @​​::array or $::b1 = grep $_ == 50, @::array do this. The relevant bits of the op tree are basically identical:

l                    <|> grepwhile(other->m)[t13] sK ->p
...
o                             <2> eq sK/2 ->l
-                                <1> ex-rv2sv sK/1 ->n
m                                   <#> gvsv[*_] s ->n
n                                <$> const[IV 50] s ->o

vs:

l                    <|> grepwhile(other->m)[t13] sK ->p
...
o                                   <2> eq sK/2 ->l
-                                      <1> ex-rv2sv sK/1 ->n
m                                         <#> gvsv[*_] s ->n
n                                      <$> const[IV 50] s ->o

@richardleach
Copy link
Contributor

A good chunk of the extra runtime seems to be due to Perl_leave_adjust_stacks creating, upgrading and assigning to a new mortal SV on every iteration and something (maybe the additional nextstate) clearing and freeing that SV.

The SV for which the mortal copy is being made is &PL_sv_no. It seems like it shouldn't be necessary to make a mortal copy of this?

@richardleach
Copy link
Contributor

Using Abigail's benchmark against blead, things look tighter than they used to be:

                  Rate grep_re_blck1 grep_re_blck2 grep_re_expr1 grep_re_expr2
grep_re_blck1 118606/s            --           -6%          -53%          -62%
grep_re_blck2 126029/s            6%            --          -50%          -59%
grep_re_expr1 250863/s          112%           99%            --          -19%
grep_re_expr2 310597/s          162%          146%           24%            --

#20800 would still help a lot:

                   Rate grep_re_blck1 grep_re_blck2 grep_re_expr1 grep_re_expr2
grep_re_blck1 170393/s            --          -14%          -32%          -45%
grep_re_blck2 198189/s           16%            --          -21%          -36%
grep_re_expr1 252057/s           48%           27%            --          -19%
grep_re_expr2 312076/s           83%           57%           24%            --  

The remaining overhead then seems down to the enter, leave, extra nextstate, and lesser action in Perl_leave_adjust_stacks.

@richardleach
Copy link
Contributor

grep {/^50$/} gets an ENTER+LEAVE pair while grep { $ == 50} gets a SCOPE in Perl_op_scope, with the difference being whether or not the LINESEQ has the PARENS flag.

1    lineseq LISTOP(0x55c8a0130718) ===> [0x0]
     PARENT ===> [0x0]
     FLAGS = (UNKNOWN,KIDS,PARENS,SLABBED)
     |
2    +--nextstate COP(0x55c8a01300d0) ===> [SELF]
     |   FLAGS = (VOID,SLABBED,MORESIB)
     |   LINE = 1
     |   PACKAGE = "main"
     |   HINTS = 00000100
     |   SEQ = 3
     |
3    +--match PMOP(0x55c8a0130758) ===> [0x0]
         FLAGS = (UNKNOWN,SLABBED)
         PMf_PRE /^50$/
         PMFLAGS = ()

vs

1    lineseq LISTOP(0x561034011630) ===> [0x0]
     PARENT ===> [0x0]
     FLAGS = (UNKNOWN,KIDS,SLABBED)
     |
2    +--nextstate COP(0x561034011670) ===> [SELF]
     |   FLAGS = (VOID,SLABBED,MORESIB)
     |   LINE = 1
     |   PACKAGE = "main"
     |   SEQ = 3
     |
3    +--eq BINOP(0x5610340116d0) ===> 4 [gv 0x5610340110f8]
         FLAGS = (SCALAR,KIDS,SLABBED)
         PRIVATE = (0x2)
         |
5        +--rv2sv UNOP(0x561034011748) ===> 6 [const 0x561034011710]
         |   FLAGS = (SCALAR,KIDS,SLABBED,MORESIB)
         |   PRIVATE = (0x1)
         |   |
4        |   +--gv SVOP(0x5610340110f8) ===> 5 [rv2sv 0x561034011748]
         |       FLAGS = (SCALAR,SLABBED)
         |       GV = main::_ (0x561033fe6738)
         |
6        +--const SVOP(0x561034011710) ===> 3 [eq 0x5610340116d0]
             FLAGS = (SCALAR,SLABBED)
             SV = IV(50)

I don't know why the difference in the LINESEQs...

@richardleach
Copy link
Contributor

richardleach commented Feb 12, 2023

I don't know why the difference in the LINESEQs...

OPf_PARENS is applied to grep {/^50$/} by S_voidnonfinal because (PL_hints & HINT_BLOCK_SCOPE) is true, whereas it is not true for grep { $_ == 0}.

(Updated from original comment which suggested that it was set by the parser.)

The flag seems to be set by Perl_pmruntime in the following snippet, but the rationale for doing so is not outlined in any comments:

    PL_hints |= HINT_BLOCK_SCOPE;
    pm = cPMOPo;
    assert(floor==0 || (pm->op_pmflags & PMf_HAS_CV));

    if (is_compiletime) {

Commenting out that line does allow miniperl to produce a simpler OP tree for grep {^50$/}:

      +--scope LISTOP(0x556104efd018) ===> 16 [null 0x556104efcfb0]
           FLAGS = (SCALAR,KIDS,SLABBED)
           |   
           +--null (ex-nextstate) COP(0x556104efc9d0) ===> 12 [match 0x556104efd058]
           |   FLAGS = (VOID,SLABBED,MORESIB)
           |   LINE = 1
           |   PACKAGE = "main"
           |   SEQ = 4294967248
           |   
           +--match PMOP(0x556104efd058) ===> 11 [grepwhile 0x556104efcec8]
               FLAGS = (SCALAR,SLABBED)
               PMf_PRE /^50$/
               PMFLAGS = ()

but perl then won't build:

$ make perl
./miniperl -Ilib make_patchnum.pl
Updating 'git_version.h' and 'lib/Config_git.pl'
./miniperl -Ilib configpm
written lib/Config.pod
updated lib/Config.pm
updated lib/Config_heavy.pl
./miniperl -Ilib make_ext.pl ext/ExtUtils-Miniperl/pm_to_blib  MAKE="make" LIBPERL_A=libperl.a
Running pm_to_blib for ext/ExtUtils-Miniperl directly
miniperl: sv.c:3757: S_glob_assign_glob: Assertion `isGV_with_GP(_gvgp)' failed.
Aborted
make: *** [makefile:603: ext/ExtUtils-Miniperl/pm_to_blib] Error 134

Maybe someone more familiar with the regex engine would be able to figure out if there are circumstances in which Perl_pmruntime does not need to set HINT_BLOCK_SCOPE?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants