Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re: Speed of f(), $o->m(), etc. in 03,57,58,61 [Was: File::Spec::Functions in 5.005_61] #570

Closed
p5pRT opened this issue Sep 20, 1999 · 2 comments

Comments

@p5pRT
Copy link

p5pRT commented Sep 20, 1999

Migrated from rt.perl.org#1422 (status was 'resolved')

Searchable as RT1422$

@p5pRT
Copy link
Author

p5pRT commented Sep 20, 1999

From The RT System itself

I thought so too, so I looked in Benchmark.pm and noticed that the core
logic in Benchmark​::runloop is​:

  ....
  if (ref $c eq 'CODE') {
  $subcode = "sub { for (1 .. $n) { local \$_; package $pack; &\$c; } }";
  $subref = eval $subcode;
  }
  else {
  $subcode = "sub { for (1 .. $n) { local \$_; package $pack; $c;} }";
  $subref = _doeval($subcode);
  }
  croak "runloop unable to compile '$c'​: $@​\ncode​: $subcode\n" if $@​;
  print STDERR "runloop $n '$subcode'\n" if $debug;

  $t0 = Benchmark->new(0);
  &$subref;
  $t1 = Benchmark->new($n);

So

1) For iteration limited tests, it compiles once, then runs. For CODE tests
it repeats some number of times, depending on whether or not the first run
made it. This can magnify errors due to timer granularity :-(.

2) For both CODE ref and string methods, it compiles something, but doesn't
include the compilation time in the results.

The argument against using a CODE ref test is that that extra sub { } call
adds up when doing simple-minded things like this. Here's a collation of
lines from the previous data (0) and new CODE ref based (1) data. As you can
see, the faster tests were hurt more by the added sub { } calls, probably
throwing off the results.

0​: perl v5.00503, 5 seconds per test
1​: perl v5.00503, 5 seconds per test
0​: P​::m(...)​: 6 wallclock secs ( 5.01 usr + 0.00 sys = 5.01 CPU) @​ 283922.55/s (n=1422452)
1​: P​::m(...)​: 6 wallclock secs ( 5.00 usr + 0.00 sys = 5.00 CPU) @​ 207927.80/s (n=1039639)
0​: P​::f(...)​: 6 wallclock secs ( 5.00 usr + 0.00 sys = 5.00 CPU) @​ 425330.20/s (n=2126651)
1​: P​::f(...)​: 5 wallclock secs ( 4.99 usr + 0.02 sys = 5.01 CPU) @​ 271399.00/s (n=1359709)
0​: P->m(...)​: 5 wallclock secs ( 5.00 usr + 0.00 sys = 5.00 CPU) @​ 173600.40/s (n=868002)
1​: P->m(...)​: 5 wallclock secs ( 5.00 usr + 0.00 sys = 5.00 CPU) @​ 131823.20/s (n=659116)
0​: P->f(...)​: 6 wallclock secs ( 5.02 usr + 0.00 sys = 5.02 CPU) @​ 200833.47/s (n=1008184)
1​: P->f(...)​: 5 wallclock secs ( 5.00 usr + 0.00 sys = 5.00 CPU) @​ 158528.20/s (n=792641)
0​: $o->m(...)​: 4 wallclock secs ( 5.25 usr + 0.01 sys = 5.26 CPU) @​ 230727.76/s (n=1213628)
1​: $o->m(...)​: 5 wallclock secs ( 5.00 usr + 0.00 sys = 5.00 CPU) @​ 160615.80/s (n=803079)
0​: $o->f(...)​: 6 wallclock secs ( 5.01 usr + 0.00 sys = 5.01 CPU) @​ 278775.25/s (n=1396664)
1​: $o->f(...)​: 4 wallclock secs ( 5.00 usr + 0.00 sys = 5.00 CPU) @​ 196754.40/s (n=983772)

Here's the speeds of each, sorted by speed on last night's data​:

0(string) 1(CODE)
========= =========
173600.40 131823.20 -24.07%
200833.47 158528.20 -21.06%
230727.76 160615.80 -30.39%
278775.25 196754.40 -29.42%
283922.55 207927.80 -26.77%
425330.20 271399.00 -36.19%

The faster tests slowed down more in general.

I have no real explanation for your numbers, but I think I can
understand one aspect of the results.

5.005_58 introduced Chip's OP_METHOD_NAMED patch, which adds a
ck_method() check during compilation of every method call. If
we just naively go by the number of instructions in the code
path, this should slightly increase the compilation time, but
correspondingly result in reduced run time of P->foo(). So
some of the work that used to be done at run time is done at
compile time, but your benchmark compiles it once every time
through, while also calling the method only once. This would
result in no net advantage.

The OP_METHOD_NAMED feature was why Andreas suggested that I redo
the File​::Spec timings, FWIW. Not sure how the above analysys
affects this, except that compile times are not included in
timings.

First, the 5 second limited sub { } results, then a collation of
those with the first data set.

perl v5.00503, 5 seconds per test
P​::m(...)​: 6 wallclock secs ( 5.00 usr + 0.00 sys = 5.00 CPU) @​ 207927.80/s (n=1039639)
P​::f(...)​: 5 wallclock secs ( 4.99 usr + 0.02 sys = 5.01 CPU) @​ 271399.00/s (n=1359709)
P->m(...)​: 5 wallclock secs ( 5.00 usr + 0.00 sys = 5.00 CPU) @​ 131823.20/s (n=659116)
P->f(...)​: 5 wallclock secs ( 5.00 usr + 0.00 sys = 5.00 CPU) @​ 158528.20/s (n=792641)
$o->m(...)​: 5 wallclock secs ( 5.00 usr + 0.00 sys = 5.00 CPU) @​ 160615.80/s (n=803079)
$o->f(...)​: 4 wallclock secs ( 5.00 usr + 0.00 sys = 5.00 CPU) @​ 196754.40/s (n=983772)
  Rate P​::m(...) P​::f(...) P->m(...) P->f(...) $o->m(...) $o->f(...)
  P​::m(...) 207927/s 0.00
  P​::f(...) 271399/s 30.53 0.00
  P->m(...) 131823/s -36.60 -51.43 0.00
  P->f(...) 158528/s -23.76 -41.59 20.26 0.00
  $o->m(...) 160615/s -22.75 -40.82 21.84 1.32 0.00
  $o->f(...) 196754/s -5.37 -27.50 49.26 24.11 22.50 0.00
perl v5.00557, 5 seconds per test
P​::m(...)​: 4 wallclock secs ( 5.00 usr + 0.00 sys = 5.00 CPU) @​ 218862.60/s (n=1094313)
P​::f(...)​: 6 wallclock secs ( 5.00 usr + 0.00 sys = 5.00 CPU) @​ 293210.80/s (n=1466054)
P->m(...)​: 5 wallclock secs ( 5.00 usr + 0.01 sys = 5.01 CPU) @​ 144481.04/s (n=723850)
P->f(...)​: 8 wallclock secs ( 5.46 usr + 0.02 sys = 5.48 CPU) @​ 181198.54/s (n=992968)
$o->m(...)​: 6 wallclock secs ( 5.09 usr + 0.02 sys = 5.11 CPU) @​ 178125.44/s (n=910221)
$o->f(...)​: 6 wallclock secs ( 5.19 usr + 0.00 sys = 5.19 CPU) @​ 233839.69/s (n=1213628)
  Rate P​::m(...) P​::f(...) P->m(...) P->f(...) $o->m(...) $o->f(...)
  P​::m(...) 218862/s 0.00
  P​::f(...) 293210/s 33.97 0.00
  P->m(...) 144481/s -33.99 -50.72 0.00
  P->f(...) 181198/s -17.21 -38.20 25.41 0.00
  $o->m(...) 178125/s -18.61 -39.25 23.29 -1.70 0.00
  $o->f(...) 233839/s 6.84 -20.25 61.85 29.05 31.28 0.00
perl v5.00558, 5 seconds per test
P​::m(...)​: 4 wallclock secs ( 5.84 usr + 0.00 sys = 5.84 CPU) @​ 187031.68/s (n=1092265)
P​::f(...)​: 6 wallclock secs ( 5.27 usr + 0.00 sys = 5.27 CPU) @​ 276348.01/s (n=1456354)
P->m(...)​: 7 wallclock secs ( 5.30 usr + 0.01 sys = 5.31 CPU) @​ 137132.96/s (n=728176)
P->f(...)​: 6 wallclock secs ( 5.10 usr + 0.00 sys = 5.10 CPU) @​ 183573.92/s (n=936227)
$o->m(...)​: 6 wallclock secs ( 5.69 usr + 0.01 sys = 5.70 CPU) @​ 169081.23/s (n=963763)
$o->f(...)​: 4 wallclock secs ( 5.29 usr + 0.02 sys = 5.31 CPU) @​ 237345.76/s (n=1260306)
  Rate P​::m(...) P​::f(...) P->m(...) P->f(...) $o->m(...) $o->f(...)
  P​::m(...) 187031/s 0.00
  P​::f(...) 276348/s 47.75 0.00
  P->m(...) 137132/s -26.68 -50.38 0.00
  P->f(...) 183573/s -1.85 -33.57 33.87 0.00
  $o->m(...) 169081/s -9.60 -38.82 23.30 -7.89 0.00
  $o->f(...) 237345/s 26.90 -14.11 73.08 29.29 40.37 0.00
perl v5.00561, 5 seconds per test
P​::m(...)​: 6 wallclock secs ( 5.00 usr + 0.00 sys = 5.00 CPU) @​ 195422.40/s (n=977112)
P​::f(...)​: 6 wallclock secs ( 5.00 usr + 0.01 sys = 5.01 CPU) @​ 282215.97/s (n=1413902)
P->m(...)​: 4 wallclock secs ( 5.03 usr + 0.00 sys = 5.03 CPU) @​ 134974.35/s (n=678921)
P->f(...)​: 6 wallclock secs ( 5.01 usr + 0.00 sys = 5.01 CPU) @​ 186203.99/s (n=932882)
$o->m(...)​: 5 wallclock secs ( 5.00 usr + 0.01 sys = 5.01 CPU) @​ 171269.66/s (n=858061)
$o->f(...)​: 4 wallclock secs ( 5.00 usr + 0.00 sys = 5.00 CPU) @​ 259902.00/s (n=1299510)
  Rate P​::m(...) P​::f(...) P->m(...) P->f(...) $o->m(...) $o->f(...)
  P​::m(...) 195422/s 0.00
  P​::f(...) 282215/s 44.41 0.00
  P->m(...) 134974/s -30.93 -52.17 0.00
  P->f(...) 186203/s -4.72 -34.02 37.96 0.00
  $o->m(...) 171269/s -12.36 -39.31 26.89 -8.02 0.00
  $o->f(...) 259901/s 32.99 -7.91 92.56 39.58 51.75 0.00

[barries@​jester perls]$ ./flip data
P​::m(...)​:
  Rate 03 57 58 61
  03 207927/s 0.00
  57 218862/s 5.26 0.00
  58 187031/s -10.05 -14.54 0.00
  61 195422/s -6.01 -10.71 4.49 0.00

P​::f(...)​:
  Rate 03 57 58 61
  03 271399/s 0.00
  57 293210/s 8.04 0.00
  58 276348/s 1.82 -5.75 0.00
  61 282215/s 3.99 -3.75 2.12 0.00

P->m(...)​:
  Rate 03 57 58 61
  03 131823/s 0.00
  57 144481/s 9.60 0.00
  58 137132/s 4.03 -5.09 0.00
  61 134974/s 2.39 -6.58 -1.57 0.00

P->f(...)​:
  Rate 03 57 58 61
  03 158528/s 0.00
  57 181198/s 14.30 0.00
  58 183573/s 15.80 1.31 0.00
  61 186203/s 17.46 2.76 1.43 0.00

$o->m(...)​:
  Rate 03 57 58 61
  03 160615/s 0.00
  57 178125/s 10.90 0.00
  58 169081/s 5.27 -5.08 0.00
  61 171269/s 6.63 -3.85 1.29 0.00

$o->f(...)​:
  Rate 03 57 58 61
  03 196754/s 0.00
  57 233839/s 18.85 0.00
  58 237345/s 20.63 1.50 0.00
  61 259901/s 32.09 11.15 9.50 0.00

#####################################################################

  0 = '' based tests
  1 = CODE ref based tests

[barries@​jester perls]$ ./cft fkipped.p fkipped
0​: P​::m(...)​:
1​: P​::m(...)​:
0​: Rate 03 57 58 61
1​: Rate 03 57 58 61
0​: 03 283922/s 0.00
1​: 03 207927/s 0.00
0​: 57 318413/s 12.15 0.00
1​: 57 218862/s 5.26 0.00
0​: 58 259137/s -8.73 -18.62 0.00
1​: 58 187031/s -10.05 -14.54 0.00
0​: 61 272521/s -4.02 -14.41 5.16 0.00
1​: 61 195422/s -6.01 -10.71 4.49 0.00
0​:
1​:
0​: P​::f(...)​:
1​: P​::f(...)​:
0​: Rate 03 57 58 61
1​: Rate 03 57 58 61
0​: 03 425330/s 0.00
1​: 03 271399/s 0.00
0​: 57 489074/s 14.99 0.00
1​: 57 293210/s 8.04 0.00
0​: 58 445731/s 4.80 -8.86 0.00
1​: 58 276348/s 1.82 -5.75 0.00
0​: 61 438763/s 3.16 -10.29 -1.56 0.00
1​: 61 282215/s 3.99 -3.75 2.12 0.00
0​:
1​:
0​: P->m(...)​:
1​: P->m(...)​:
0​: Rate 03 57 58 61
1​: Rate 03 57 58 61
0​: 03 173600/s 0.00
1​: 03 131823/s 0.00
0​: 57 186648/s 7.52 0.00
1​: 57 144481/s 9.60 0.00
0​: 58 175053/s 0.84 -6.21 0.00
1​: 58 137132/s 4.03 -5.09 0.00
0​: 61 171981/s -0.93 -7.86 -1.75 0.00
1​: 61 134974/s 2.39 -6.58 -1.57 0.00
0​:
1​:
0​: P->f(...)​:
1​: P->f(...)​:
0​: Rate 03 57 58 61
1​: Rate 03 57 58 61
0​: 03 200833/s 0.00
1​: 03 158528/s 0.00
0​: 57 264087/s 31.50 0.00
1​: 57 181198/s 14.30 0.00
0​: 58 248774/s 23.87 -5.80 0.00
1​: 58 183573/s 15.80 1.31 0.00
0​: 61 243321/s 21.16 -7.86 -2.19 0.00
1​: 61 186203/s 17.46 2.76 1.43 0.00
0​:
1​:
0​: $o->m(...)​:
1​: $o->m(...)​:
0​: Rate 03 57 58 61
1​: Rate 03 57 58 61
0​: 03 230727/s 0.00
1​: 03 160615/s 0.00
0​: 57 251558/s 9.03 0.00
1​: 57 178125/s 10.90 0.00
0​: 58 223504/s -3.13 -11.15 0.00
1​: 58 169081/s 5.27 -5.08 0.00
0​: 61 230123/s -0.26 -8.52 2.96 0.00
1​: 61 171269/s 6.63 -3.85 1.29 0.00
0​:
1​:
0​: $o->f(...)​:
1​: $o->f(...)​:
0​: Rate 03 57 58 61
1​: Rate 03 57 58 61
0​: 03 278775/s 0.00
1​: 03 196754/s 0.00
0​: 57 343551/s 23.24 0.00
1​: 57 233839/s 18.85 0.00
0​: 58 359682/s 29.02 4.70 0.00
1​: 58 237345/s 20.63 1.50 0.00
0​: 61 386196/s 38.53 12.41 7.37 0.00
1​: 61 259901/s 32.09 11.15 9.50 0.00
0​:
1​:

[barries@​jester perls]$

- Barrie

@p5pRT
Copy link
Author

p5pRT commented Apr 22, 2003

@iabyn - Status changed from 'stalled' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant