Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance regression for the .. operator in v5.21.5 #14334

Closed
p5pRT opened this issue Dec 14, 2014 · 8 comments
Closed

Performance regression for the .. operator in v5.21.5 #14334

p5pRT opened this issue Dec 14, 2014 · 8 comments

Comments

@p5pRT
Copy link

p5pRT commented Dec 14, 2014

Migrated from rt.perl.org#123430 (status was 'resolved')

Searchable as RT123430$

@p5pRT
Copy link
Author

p5pRT commented Dec 14, 2014

From @avar

Created by @avar

sri reported an issue with the .. operator being massively slower in
blead than in v5.20.0. I bisected it down to a performance regression
in v5.21.4-51-g14d9114
(http​://perl5.git.perl.org/perl.git/commitdiff/v5.21.4-51-g14d9114)
first released with v5.21.5.

Before that commit (one v5.21.4-50-g39ff6c3, commit earlier), on a
non-debugging perl (same config /machine as pasted below, but
obviously different perl versions) we'd get this​:
 
  $ time ./miniperl -e 'my $str = "sub { my \$foo;"; $str .= "for (1..3){ \$foo .= \"test\"; }" x 3000; $str .= "}"; eval $str for 1 .. 100'
  real 0m1.588s
  user 0m1.576s
  sys 0m0.008s

And now​:

  $ time ./miniperl -e 'my $str = "sub { my \$foo;"; $str .= "for (1..3){ \$foo .= \"test\"; }" x 3000; $str .= "}"; eval $str for 1 .. 100'
  real 0m16.757s
  user 0m16.740s
  sys 0m0.004s

Interestingly it seems from looking at that patch that some bugfix for
the flip-flop operator is spilling into a slowdown for the range
operator. If we write this as a C-style loop then before we'd get​:

  $ time ./miniperl -e 'my $str = "sub { my \$foo;"; $str .= "for (\$_ = 1; \$_ <= 3; \$_++){ \$foo .= \"test\"; }" x 3000; $str .= "}"; eval $str for 1 .. 100'
  real 0m2.457s
  user 0m2.424s
  sys 0m0.028s

And after​:

  $ time ./miniperl -e 'my $str = "sub { my \$foo;"; $str .= "for (\$_ = 1; \$_ <= 3; \$_++){ \$foo .= \"test\"; }" x 3000; $str .= "}"; eval $str for 1 .. 100'
  real 0m2.749s
  user 0m2.720s
  sys 0m0.032s

The increase in time between those two runs is probably means nothing
and just results from having one sample, but I haven't looked into it.

In addition to the slowdown with the .. range operator, the
.. flip-flop operator also got much slower, presumably due to the
copying we have to do now to fix the bug FC was addressing in that
commit.

Same commits & perl configuration, before​:

  $ time ./miniperl -e 'my $str = "sub { my \$foo;"; $str .= "for (\$_ = 1; \$_ <= 3; \$_++){ \$foo .= scalar(0..0); }" x 3000; $str .= "}"; eval $str for 1 .. 100'
  real 0m3.059s
  user 0m3.008s
  sys 0m0.048s

After​:
 
  $ time ./miniperl -e 'my $str = "sub { my \$foo;"; $str .= "for (\$_ = 1; \$_ <= 3; \$_++){ \$foo .= scalar(0..0); }" x 3000; $str .= "}"; eval $str for 1 .. 100'
  real 1m2.791s
  user 1m2.624s
  sys 0m0.128s

Perl Info

Flags:
    category=core
    severity=medium

Site configuration information for perl 5.21.7:

Configured by avar at Sun Dec 14 18:39:01 UTC 2014.

Summary of my perl5 (revision 5 version 21 subversion 7) configuration:
  Commit id: 33f91fd56bd8de8ca69a82535ffaa3d3ce791e59
  Platform:
    osname=linux, osvers=3.14-2-amd64, archname=x86_64-linux-thread-multi
    uname='linux u.nix.is 3.14-2-amd64 #1 smp debian 3.14.13-2 (2014-07-24) x86_64 gnulinux '
    config_args='-Dusethreads -Dprefix=~/perl5/installed -Dusedevel -des'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2',
    optimize='-O2',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
    ccversion='', gccversion='4.2.1 Compatible Debian Clang 3.5.0 (tags/RELEASE_350/final)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678, doublekind=3
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16, longdblkind=3
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/local/lib /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
    libs=-lnsl -ldb -ldl -lm -lcrypt -lutil -lc -lpthread
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc -lpthread
    libc=libc-2.19.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.19'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong'



@INC for perl 5.21.7:
    /home/avar/perl5/installed/lib/site_perl/5.21.7/x86_64-linux-thread-multi
    /home/avar/perl5/installed/lib/site_perl/5.21.7
    /home/avar/perl5/installed/lib/5.21.7/x86_64-linux-thread-multi
    /home/avar/perl5/installed/lib/5.21.7
    .


Environment for perl 5.21.7:
    HOME=/home/avar
    LANG=en_US.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/avar/local/bin:/home/avar/perl5/installed/bin:/home/v-perlbrew/perl5/perlbrew/bin:/home/v-perlbrew/perl5/perlbrew/perls/current/bin:/home/avar/local/bin:/home/avar/g/misc-scripts:/home/avar/bin:/usr/local/bin:/usr/bin:/bin:/usr/games
    PERLDOC=-MPod::Text::Ansi
    PERL_BADLANG (unset)
    SHELL=/bin/bash

@p5pRT
Copy link
Author

p5pRT commented Dec 15, 2014

From @cpansprout

On Sun Dec 14 10​:54​:28 2014, avar wrote​:

This is a bug report for perl from avar@​cpan.org,
generated with the help of perlbug 1.40 running under perl 5.21.7.

-----------------------------------------------------------------
[Please describe your issue here]

sri reported an issue with the .. operator being massively slower in
blead than in v5.20.0. I bisected it down to a performance regression
in v5.21.4-51-g14d9114
(http​://perl5.git.perl.org/perl.git/commitdiff/v5.21.4-51-g14d9114)
first released with v5.21.5.

Before that commit (one v5.21.4-50-g39ff6c3, commit earlier), on a
non-debugging perl (same config /machine as pasted below, but
obviously different perl versions) we'd get this​:

$ time ./miniperl -e 'my $str = "sub { my \$foo;"; $str .= "for
(1..3){ \$foo .= \"test\"; }" x 3000; $str .= "}"; eval $str for 1 ..
100'
real 0m1.588s
user 0m1.576s
sys 0m0.008s

And now​:

$ time ./miniperl -e 'my $str = "sub { my \$foo;"; $str .= "for
(1..3){ \$foo .= \"test\"; }" x 3000; $str .= "}"; eval $str for 1 ..
100'
real 0m16.757s
user 0m16.740s
sys 0m0.004s

Interestingly it seems from looking at that patch that some bugfix for
the flip-flop operator is spilling into a slowdown for the range
operator.

It’s the same operator internally; it just behaves differently depending on the context.

In the context of foreach(...), a range operator is created at first, and then it gets taken apart when it turns out to be the only thing in foreach’s list.

I notice your test case is solely testing the timing of compilation. At run-time I would not expect to see any difference, unless you write foreach(scalar($foo..$bar)) or \scalar($foo..$bar).

The only real compile-time difference is the use of pad_add_name_pvn instead of pad_alloc, so it may be worth looking into why lexical variable creation is much slower than target allocation.

In addition to the slowdown with the .. range operator, the
.. flip-flop operator also got much slower, presumably due to the
copying we have to do now to fix the bug FC was addressing in that
commit.

Again, you are not even running code, just compiling it. Furthermore, you don’t use .. in any lvalue context, so no copying would happen.

Same commits & perl configuration, before​:

$ time ./miniperl -e 'my $str = "sub { my \$foo;"; $str .= "for (\$_ =
1; \$_ <= 3; \$_++){ \$foo .= scalar(0..0); }" x 3000; $str .= "}";
eval $str for 1 .. 100'
real 0m3.059s
user 0m3.008s
sys 0m0.048s

After​:

$ time ./miniperl -e 'my $str = "sub { my \$foo;"; $str .= "for (\$_ =
1; \$_ <= 3; \$_++){ \$foo .= scalar(0..0); }" x 3000; $str .= "}";
eval $str for 1 .. 100'
real 1m2.791s
user 1m2.624s
sys 0m0.128s

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Dec 15, 2014

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Dec 15, 2014

From @avar

Thanks for the clarification. When sri noted this on #p5p he mentioned
that this showed up as a slowdown in a Mojo​::Template benchmark, this
is likely to significantly slowdown similar generated templating code
elsewhere.

On Mon, Dec 15, 2014 at 1​:20 AM, Father Chrysostomos via RT
<perlbug-followup@​perl.org> wrote​:

On Sun Dec 14 10​:54​:28 2014, avar wrote​:

This is a bug report for perl from avar@​cpan.org,
generated with the help of perlbug 1.40 running under perl 5.21.7.

-----------------------------------------------------------------
[Please describe your issue here]

sri reported an issue with the .. operator being massively slower in
blead than in v5.20.0. I bisected it down to a performance regression
in v5.21.4-51-g14d9114
(http​://perl5.git.perl.org/perl.git/commitdiff/v5.21.4-51-g14d9114)
first released with v5.21.5.

Before that commit (one v5.21.4-50-g39ff6c3, commit earlier), on a
non-debugging perl (same config /machine as pasted below, but
obviously different perl versions) we'd get this​:

$ time ./miniperl -e 'my $str = "sub { my \$foo;"; $str .= "for
(1..3){ \$foo .= \"test\"; }" x 3000; $str .= "}"; eval $str for 1 ..
100'
real 0m1.588s
user 0m1.576s
sys 0m0.008s

And now​:

$ time ./miniperl -e 'my $str = "sub { my \$foo;"; $str .= "for
(1..3){ \$foo .= \"test\"; }" x 3000; $str .= "}"; eval $str for 1 ..
100'
real 0m16.757s
user 0m16.740s
sys 0m0.004s

Interestingly it seems from looking at that patch that some bugfix for
the flip-flop operator is spilling into a slowdown for the range
operator.

It’s the same operator internally; it just behaves differently depending on the context.

In the context of foreach(...), a range operator is created at first, and then it gets taken apart when it turns out to be the only thing in foreach’s list.

I notice your test case is solely testing the timing of compilation. At run-time I would not expect to see any difference, unless you write foreach(scalar($foo..$bar)) or \scalar($foo..$bar).

The only real compile-time difference is the use of pad_add_name_pvn instead of pad_alloc, so it may be worth looking into why lexical variable creation is much slower than target allocation.

In addition to the slowdown with the .. range operator, the
.. flip-flop operator also got much slower, presumably due to the
copying we have to do now to fix the bug FC was addressing in that
commit.

Again, you are not even running code, just compiling it. Furthermore, you don’t use .. in any lvalue context, so no copying would happen.

Same commits & perl configuration, before​:

$ time ./miniperl -e 'my $str = "sub { my \$foo;"; $str .= "for (\$_ =
1; \$_ <= 3; \$_++){ \$foo .= scalar(0..0); }" x 3000; $str .= "}";
eval $str for 1 .. 100'
real 0m3.059s
user 0m3.008s
sys 0m0.048s

After​:

$ time ./miniperl -e 'my $str = "sub { my \$foo;"; $str .= "for (\$_ =
1; \$_ <= 3; \$_++){ \$foo .= scalar(0..0); }" x 3000; $str .= "}";
eval $str for 1 .. 100'
real 1m2.791s
user 1m2.624s
sys 0m0.128s

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Dec 15, 2014

From @cpansprout

On Sun Dec 14 16​:20​:57 2014, sprout wrote​:

On Sun Dec 14 10​:54​:28 2014, avar wrote​:

This is a bug report for perl from avar@​cpan.org,
generated with the help of perlbug 1.40 running under perl 5.21.7.

-----------------------------------------------------------------
[Please describe your issue here]

sri reported an issue with the .. operator being massively slower in
blead than in v5.20.0. I bisected it down to a performance regression
in v5.21.4-51-g14d9114
(http​://perl5.git.perl.org/perl.git/commitdiff/v5.21.4-51-g14d9114)
first released with v5.21.5.

Before that commit (one v5.21.4-50-g39ff6c3, commit earlier), on a
non-debugging perl (same config /machine as pasted below, but
obviously different perl versions) we'd get this​:

$ time ./miniperl -e 'my $str = "sub { my \$foo;"; $str .= "for
(1..3){ \$foo .= \"test\"; }" x 3000; $str .= "}"; eval $str for 1 ..
100'
real 0m1.588s
user 0m1.576s
sys 0m0.008s

And now​:

$ time ./miniperl -e 'my $str = "sub { my \$foo;"; $str .= "for
(1..3){ \$foo .= \"test\"; }" x 3000; $str .= "}"; eval $str for 1 ..
100'
real 0m16.757s
user 0m16.740s
sys 0m0.004s

Interestingly it seems from looking at that patch that some bugfix
for
the flip-flop operator is spilling into a slowdown for the range
operator.

It’s the same operator internally; it just behaves differently
depending on the context.

In the context of foreach(...), a range operator is created at first,
and then it gets taken apart when it turns out to be the only thing in
foreach’s list.

I notice your test case is solely testing the timing of compilation.
At run-time I would not expect to see any difference, unless you write
foreach(scalar($foo..$bar)) or \scalar($foo..$bar).

The only real compile-time difference is the use of pad_add_name_pvn
instead of pad_alloc, so it may be worth looking into why lexical
variable creation is much slower than target allocation.

In addition to the slowdown with the .. range operator, the
.. flip-flop operator also got much slower, presumably due to the
copying we have to do now to fix the bug FC was addressing in that
commit.

Again, you are not even running code, just compiling it. Furthermore,
you don’t use .. in any lvalue context, so no copying would happen.

The reason for the slowdown is that, by putting thousands of ..’s in the same sub, you are creating a HUGE pad that takes a long time to search, since pad lookup is linear. (I want to replace pad lookup with a binary search if possible, but it would be very tricky to get right, because things are not always in order.)

The thing that usually saves the day for pathological cases is an optimisation I added in 5.20 (v5.19.2-231-g7db6405) that records the maximum named slot. I did that because constants added to the pad under threaded builds get relocated to the pad at the end of the sub’s compilation, so the pad may double in size, causing pad lookup for string eval in that sub to be twice as slow.

However, 5.18.3 appears just as fast as 5.20, so my initial guess as to the cause may be completely wrong....

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Dec 15, 2014

From @cpansprout

On Sun Dec 14 16​:48​:29 2014, avar wrote​:

Thanks for the clarification. When sri noted this on #p5p he mentioned
that this showed up as a slowdown in a Mojo​::Template benchmark, this
is likely to significantly slowdown similar generated templating code
elsewhere.

Aha, generated code! The bane of pad lookup algorithms. :-)

See the commit message of v5.19.2-231-g7db6405.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Dec 15, 2014

From @cpansprout

On Sun Dec 14 16​:50​:17 2014, sprout wrote​:

The reason for the slowdown is that, by putting thousands of ..’s in
the same sub, you are creating a HUGE pad that takes a long time to
search, since pad lookup is linear. (I want to replace pad lookup
with a binary search if possible, but it would be very tricky to get
right, because things are not always in order.)

The thing that usually saves the day for pathological cases is an
optimisation I added in 5.20 (v5.19.2-231-g7db6405) that records the
maximum named slot. I did that because constants added to the pad
under threaded builds get relocated to the pad at the end of the sub’s
compilation, so the pad may double in size, causing pad lookup for
string eval in that sub to be twice as slow.

However, 5.18.3 appears just as fast as 5.20, so my initial guess as
to the cause may be completely wrong....

Fixed in dd5d1b8.

The reason for the slowdown was that the name pad was getting extended by flipflop/range targets, whereas previously it was not. It’s the name pad that gets searched when symbols are looked up.

In 5.20 and earlier (and current bleadperl as of dd5d1b8), you can see the same slowdown if you add a ‘my $x’ to your generated code, because then the name pad gets extended (numbers from 5.18.4)​:

$ time ./miniperl -e 'my $str = "sub { my \$foo;"; $str .= "for (1..3){ \$foo .= \"test\"; }" x 3000; $str .= "}"; eval $str for 1 .. 100'

real 0m2.280s
user 0m2.261s
sys 0m0.015s
$ time ./miniperl -e 'my $str = "sub { my \$foo;"; $str .= "for (1..3){ \$foo .= \"test\"; } {my\$x}" x 3000; $str .= "}"; eval $str for 1 .. 100'

real 1m0.014s
user 0m59.872s
sys 0m0.033s

which is really not ideal. I’m just trying to avoid making things slower than before.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Dec 15, 2014

@cpansprout - Status changed from 'open' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant