Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"for $a..$b -> $i { ... }" loops are sometimes much slower than c-style loops #6150

Open
p6rt opened this issue Mar 11, 2017 · 9 comments
Open
Labels

Comments

@p6rt
Copy link

p6rt commented Mar 11, 2017

Migrated from rt.perl.org#130982 (status was 'open')

Searchable as RT130982$

@p6rt
Copy link
Author

p6rt commented Mar 11, 2017

From @mscha

Perl6-style simple a-to-b loops are often much slower than the
corresponding C-style loops, especially when dealing with embedded loops.

My "real life" example (as far as Project Euler is real life) is​:
http://pastebin.com/SVYAyA5z
It takes about 55 seconds on my machine with C-style loops, and about 88
seconds with Perl6-style loops. (Comment the "loop" statements and
uncomment the corresponding "for" statements.)

A reduced example is at the bottom of this report. It outputs

111.04240975
95.56877319

on my machine.

I'm using the latest Rakudo Star, 2017.01.

See​: https://irclog.perlgeek.de/perl6/2017-03-11#i_14245536 and below.


#!/usr/bin/env perl6

sub perl6-loop($n)
{
  my $start = now;
  my $sum = 0;
  for 1..$n -> $i {
  for 1..$i -> $j {
  $sum += $i+$j;
  }
  }
  say now - $start;
}

sub c-loop($n)
{
  my $start = now;
  my $sum = 0;
  loop (my $i = 1; $i <= $n; $i++) {
  loop (my $j = 1; $j <= $i; $j++) {
  $sum += $i+$j;
  }
  }
  say now - $start;
}

sub MAIN($n = 10⁴)
{
  perl6-loop($n);
  c-loop($n);
}

@p6rt
Copy link
Author

p6rt commented Mar 12, 2017

From @LLFourn

If you think that discrepancy is impressive you're going to love this. I
added a version to your example using native ints​:
https://gist.github.com/LLFourn/8c3e895e789fab957355ce23c9420133

bash-3.2$ perl6 native-int-perf.p6
perl6-loop​: 84.8739988
c-loop​: 67.65849241 (1.25 times faster)
native-loop​: 0.4981954 (135.81 times faster)

(!)

On Sun, Mar 12, 2017 at 2​:08 AM Michael Schaap <perl6-bugs-followup@​perl.org>
wrote​:

# New Ticket Created by Michael Schaap
# Please include the string​: [perl #​130982]
# in the subject line of all future correspondence about this issue.
# <URL​: https://rt-archive.perl.org/perl6/Ticket/Display.html?id=130982 >

Perl6-style simple a-to-b loops are often much slower than the
corresponding C-style loops, especially when dealing with embedded loops.

My "real life" example (as far as Project Euler is real life) is​:
http://pastebin.com/SVYAyA5z
It takes about 55 seconds on my machine with C-style loops, and about 88
seconds with Perl6-style loops. (Comment the "loop" statements and
uncomment the corresponding "for" statements.)

A reduced example is at the bottom of this report. It outputs

111.04240975
95.56877319

on my machine.

I'm using the latest Rakudo Star, 2017.01.

See​: https://irclog.perlgeek.de/perl6/2017-03-11#i_14245536 and below.


#!/usr/bin/env perl6

sub perl6-loop($n)
{
  my $start = now;
  my $sum = 0;
  for 1..$n -> $i {
  for 1..$i -> $j {
  $sum += $i+$j;
  }
  }
  say now - $start;
}

sub c-loop($n)
{
  my $start = now;
  my $sum = 0;
  loop (my $i = 1; $i <= $n; $i++) {
  loop (my $j = 1; $j <= $i; $j++) {
  $sum += $i+$j;
  }
  }
  say now - $start;
}

sub MAIN($n = 10⁴)
{
  perl6-loop($n);
  c-loop($n);
}

@p6rt
Copy link
Author

p6rt commented Mar 12, 2017

The RT System itself - Status changed from 'new' to 'open'

@p6rt
Copy link
Author

p6rt commented Mar 12, 2017

From @geekosaur

On Sun, Mar 12, 2017 at 12​:48 AM, Lloyd Fournier <lloyd.fourn@​gmail.com>
wrote​:

perl6-loop​: 84.8739988
c-loop​: 67.65849241 (1.25 times faster)
native-loop​: 0.4981954 (135.81 times faster)

Still quite a lot of optimization to be done on that front. WRT native int,
one of the issues is needing to track when boxing is/isn't needed,
otherwise nativizing can be a pessimization because of constant reboxing.
The optimizer isn't up to tracking boxing like that yet.

--
brandon s allbery kf8nh sine nomine associates
allbery.b@​gmail.com ballbery@​sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

@p6rt
Copy link
Author

p6rt commented Nov 19, 2017

From @MasterDuke17

On Sun, 12 Mar 2017 07​:27​:37 -0700, allbery.b@​gmail.com wrote​:

On Sun, Mar 12, 2017 at 12​:48 AM, Lloyd Fournier <lloyd.fourn@​gmail.com>
wrote​:

perl6-loop​: 84.8739988
c-loop​: 67.65849241 (1.25 times faster)
native-loop​: 0.4981954 (135.81 times faster)

Still quite a lot of optimization to be done on that front. WRT native int,
one of the issues is needing to track when boxing is/isn't needed,
otherwise nativizing can be a pessimization because of constant reboxing.
The optimizer isn't up to tracking boxing like that yet.

FWIW.

$ perl6 native-int-perf.p6
perl6-loop​: 117.55178620
c-loop​: 156.7308145 (0.75 times faster)
native-loop​: 1.2542871 (124.96 times faster)

$ perl6 --version
This is Rakudo version 2017.10-211-g2f0da94c3 built on MoarVM version 2017.10-82-ge876f1484

@p6rt
Copy link
Author

p6rt commented Nov 20, 2017

From @LLFourn

For comparison to march on the same comp​:
bash-3.2$ perl6 perf.p6
perl6-loop​: 63.0037058
c-loop​: 76.86853305 (0.82 times faster)
native-loop​: 0.2170930 (354.08 times faster)

perl6 loops are faster. c style loops are slower. Native loops are even
faster relative to the others (for me).

We can probably close this RT. Seeing as the issue in the title has been
addressed kinda :P

LL

On Sun, Nov 19, 2017 at 1​:05 PM Daniel Green via RT <
perl6-bugs-followup@​perl.org> wrote​:

On Sun, 12 Mar 2017 07​:27​:37 -0700, allbery.b@​gmail.com wrote​:

On Sun, Mar 12, 2017 at 12​:48 AM, Lloyd Fournier <lloyd.fourn@​gmail.com>
wrote​:

perl6-loop​: 84.8739988
c-loop​: 67.65849241 (1.25 times faster)
native-loop​: 0.4981954 (135.81 times faster)

Still quite a lot of optimization to be done on that front. WRT native
int,
one of the issues is needing to track when boxing is/isn't needed,
otherwise nativizing can be a pessimization because of constant reboxing.
The optimizer isn't up to tracking boxing like that yet.

FWIW.

$ perl6 native-int-perf.p6
perl6-loop​: 117.55178620
c-loop​: 156.7308145 (0.75 times faster)
native-loop​: 1.2542871 (124.96 times faster)

$ perl6 --version
This is Rakudo version 2017.10-211-g2f0da94c3 built on MoarVM version
2017.10-82-ge876f1484

@p6rt
Copy link
Author

p6rt commented Nov 20, 2017

From @ronaldxs

What about a native perl6 range loop? Couldn't there be some way for Perl 6 / Rakudo to generate code competitive on a small range with the "native-loop" example?

perl6 -e '
  {
  my int ($a, $one, $three) = (42, 1, 3);
  for ^10_000_000 { $a += $one + $a%$three };
  say now - ENTER now;
  say $a
  }
  {
  # 50% slower with loop invariant $limit
  my int ($a, $one, $three) = (42, 1, 3);
  my int $limit = 10_000_000; # my removing "int" does not effect perf
  for ^$limit { $a += $one + $a%$three };
  say now - ENTER now;
  say $a
  }
  {
  my int ($a, $one, $three, $limit) = (42, 1, 3, 10_000_000);
  loop (my int $i = 0; $i < $limit; $i++) { $a += $one + $a%$three };
  say now - ENTER now;
  say $a
  }'

...program prints...

2.10023098 # perl6 loop
15000042
3.2119384 # attempt at native perl 6 loop
15000042
0.7915038 # native loop
15000042

@p6rt
Copy link
Author

p6rt commented Nov 22, 2017

From @timo

On Mon, 20 Nov 2017 12​:13​:47 -0800, ronaldxs wrote​:

What about a native perl6 range loop? Couldn't there be some way for
Perl 6 / Rakudo to generate code competitive on a small range with the
"native-loop" example?

perl6 -e '
{
my int ($a, $one, $three) = (42, 1, 3);
for ^10_000_000 { $a += $one + $a%$three };
say now - ENTER now;
say $a
}'

This code will actually be turned into a `loop` loop by the optimizer at compile time. If you type the block to have a signature of `-> int $_ --> Nil` it'll get faster.

The optimizer doesn't do anything like that for the `^$limit` case, however.

@p6rt
Copy link
Author

p6rt commented Nov 23, 2017

From @lizmat

On 22 Nov 2017, at 19​:31, Timo Paulssen via RT <perl6-bugs-followup@​perl.org> wrote​:
On Mon, 20 Nov 2017 12​:13​:47 -0800, ronaldxs wrote​:

What about a native perl6 range loop? Couldn't there be some way for
Perl 6 / Rakudo to generate code competitive on a small range with the
"native-loop" example?

perl6 -e '
{
my int ($a, $one, $three) = (42, 1, 3);
for ^10_000_000 { $a += $one + $a%$three };
say now - ENTER now;
say $a
}'

This code will actually be turned into a `loop` loop by the optimizer at compile time. If you type the block to have a signature of `-> int $_ --> Nil` it'll get faster.

So, is there a reason we don’t do this on all ^10000 type loops that don’t expect the return value to be used? Because in my benchmarks, this is 3x as fast! With only 25% of the allocations, so a lot less GC churn!

The optimizer doesn't do anything like that for the `^$limit` case, however.

That feels more that it needs a smarter spesh to me?

Liz

@p6rt p6rt added the perf label Jan 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant