Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

COW speedup lost after e8c6a474 #13878

Open
p5pRT opened this issue May 28, 2014 · 26 comments
Open

COW speedup lost after e8c6a474 #13878

p5pRT opened this issue May 28, 2014 · 26 comments

Comments

@p5pRT
Copy link

p5pRT commented May 28, 2014

Migrated from rt.perl.org#121975 (status was 'open')

Searchable as RT121975$

@p5pRT
Copy link
Author

p5pRT commented May 28, 2014

From @rjbs

I have this test program​:

  $x = "x" x 1_000_000;
  $y = $x for 1..1_000_000;

On my box (OS X, will attach perl -V shortly) this program takes 90s to run on 5.18 and 5.20. It takes a fraction of a second on 5.19.11.

The change in speed occurs at​:

  commit e8c6a47
  Author​: Yves Orton <demerphq@​gmail.com>
  Date​: Sun May 11 12​:37​:33 2014 +0200

  Implement "max waste" thresholds to avoid problems with COW and deliberately overallocated pvs

Matthew Horsfall reports no slowdown on his Ubuntu box.

Obviously it'd be great to get this sorted out for 5.20.1.

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented May 28, 2014

From @rjbs

Summary of my perl5 (revision 5 version 20 subversion 0) configuration​:
 
  Platform​:
  osname=darwin, osvers=13.2.0, archname=darwin-2level
  uname='darwin walrus.local 13.2.0 darwin kernel version 13.2.0​: thu apr 17 23​:03​:13 pdt 2014; root​:xnu-2422.100.13~1release_x86_64 x86_64 '
  config_args='-Dprefix=/Users/rjbs/.plenv/versions/20.0 -de -Dusedevel -A'eval​:scriptdir=/Users/rjbs/.plenv/versions/20.0/bin''
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=undef, usemultiplicity=undef
  use64bitint=define, use64bitall=define, uselongdouble=undef
  usemymalloc=n, bincompat5005=undef
  Compiler​:
  cc='cc', ccflags ='-fno-common -DPERL_DARWIN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -I/opt/local/include',
  optimize='-O3',
  cppflags='-fno-common -DPERL_DARWIN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -I/opt/local/include'
  ccversion='', gccversion='4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)', gccosandvers=''
  intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
  ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
  alignbytes=8, prototype=define
  Linker and Libraries​:
  ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags =' -fstack-protector -L/usr/local/lib -L/opt/local/lib'
  libpth=/usr/local/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/5.1/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib /usr/lib /opt/local/lib
  libs=-lgdbm -ldbm -ldl -lm -lutil -lc
  perllibs=-ldl -lm -lutil -lc
  libc=, so=dylib, useshrplib=false, libperl=libperl.a
  gnulibc_version=''
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
  cccdlflags=' ', lddlflags=' -bundle -undefined dynamic_lookup -L/usr/local/lib -L/opt/local/lib -fstack-protector'

Characteristics of this binary (from libperl)​:
  Compile-time options​: HAS_TIMES PERLIO_LAYERS PERL_DONT_CREATE_GVSV
  PERL_HASH_FUNC_ONE_AT_A_TIME_HARD PERL_MALLOC_WRAP
  PERL_NEW_COPY_ON_WRITE PERL_PRESERVE_IVUV
  PERL_USE_DEVEL USE_64_BIT_ALL USE_64_BIT_INT
  USE_LARGE_FILES USE_LOCALE USE_LOCALE_COLLATE
  USE_LOCALE_CTYPE USE_LOCALE_NUMERIC USE_PERLIO
  USE_PERL_ATOF
  Built under darwin
  Compiled at May 26 2014 15​:17​:02
  %ENV​:
  PERLDOC="-n/opt/local/bin/groff"
  PERL_AUTOINSTALL="--skipdeps"
  PERL_MAILERS="sendmail​:/Users/rjbs/bin/sendmail"
  @​INC​:
  /Users/rjbs/.plenv/versions/20.0/lib/perl5/site_perl/5.20.0/darwin-2level
  /Users/rjbs/.plenv/versions/20.0/lib/perl5/site_perl/5.20.0
  /Users/rjbs/.plenv/versions/20.0/lib/perl5/5.20.0/darwin-2level
  /Users/rjbs/.plenv/versions/20.0/lib/perl5/5.20.0
  .

@p5pRT
Copy link
Author

p5pRT commented May 28, 2014

From @rjbs

On Wed May 28 09​:04​:19 2014, rjbs wrote​:

I have this test program​:

$x = "x" x 1_000_000;
$y = $x for 1..1_000_000;

On my box (OS X, will attach perl -V shortly) this program takes 90s
to run on 5.18 and 5.20. It takes a fraction of a second on 5.19.11.

The change in speed occurs at​:

commit e8c6a47
Author​: Yves Orton <demerphq@​gmail.com>
Date​: Sun May 11 12​:37​:33 2014 +0200

Implement "max waste" thresholds to avoid problems with COW and
deliberately overallocated pvs

Matthew Horsfall reports no slowdown on his Ubuntu box.

Obviously it'd be great to get this sorted out for 5.20.1.

I foolishly put this bug in the wrong place to start, and am replying to it here simply to trigger a mail to p5p.

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented May 28, 2014

From @jkeenan

On Wed May 28 09​:04​:19 2014, rjbs wrote​:

I have this test program​:

$x = "x" x 1_000_000;
$y = $x for 1..1_000_000;

On my box (OS X, will attach perl -V shortly) this program takes 90s
to run on 5.18 and 5.20. It takes a fraction of a second on 5.19.11.

[snip]

Matthew Horsfall reports no slowdown on his Ubuntu box.

Obviously it'd be great to get this sorted out for 5.20.1.

I am not getting a slowdown on either of two Linux machines. In what follows I use only released versions, compiled from source without threading.

#1​: current laptop
###
$ uname -a
Linux zareason 3.13.0-27-generic #50-Ubuntu SMP Thu May 15 18​:06​:16 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

$ which perl
/home/jkeenan/perl5/perlbrew/perls/perl-5.20.0/bin/perl

$ time perl ~/learn/perl/rjbs_speed.pl

real 0m0.061s
user 0m0.058s
sys 0m0.003s

$ perlbrew switch perl-5.18.2

$ which perl
/home/jkeenan/perl5/perlbrew/perls/perl-5.18.2/bin/perl

$ time perl ~/learn/perl/rjbs_speed.pl

real 1m17.144s
user 1m17.167s
sys 0m0.045s
###

#2​: older Linode
###
$ uname -a
Linux li11-226 2.6.18.8-linode22 #1 SMP Tue Nov 10 16​:12​:12 UTC 2009 i686 GNU/Linux

$ perl -v

This is perl 5, version 20, subversion 0 (v5.20.0) built for i686-linux

$ time perl rjbs_speed.pl

real 0m0.194s
user 0m0.144s
sys 0m0.002s

$ time /usr/local/bin/perl5.18.1 rjbs_speed.pl

real 1m43.419s
user 1m42.182s
sys 0m0.002s
###

Could there be an OS-specific aspect to this problem?

Thank you very much.
Jim Keenan

@p5pRT
Copy link
Author

p5pRT commented May 28, 2014

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented May 28, 2014

From @jkeenan

On Wed May 28 13​:33​:40 2014, jkeenan wrote​:

On Wed May 28 09​:04​:19 2014, rjbs wrote​:

I have this test program​:

$x = "x" x 1_000_000;
$y = $x for 1..1_000_000;

On my box (OS X, will attach perl -V shortly) this program takes 90s
to run on 5.18 and 5.20. It takes a fraction of a second on 5.19.11.

[snip]

Matthew Horsfall reports no slowdown on his Ubuntu box.

Obviously it'd be great to get this sorted out for 5.20.1.

I am not getting a slowdown on either of two Linux machines. In what
follows I use only released versions, compiled from source without
threading.

[snip Linux results ]

Could there be an OS-specific aspect to this problem?

Yes, there could be. I was able to reproduce rjbs's results on Darwin -- a very old Darwin! (I had to modify rjbs's program to take a command-line argument for number of iterations.)

I compared 5.18.0 and 5.20.0, both built from source. There did not appear to be significant differences. Performance degraded rapidly as the number of iterations increased; I control-C-ed runs of 300,000.

However, I then checked out the commit immediately preceding the one rjbs cited and built a perl from that. Using that, rjbs's program went like lightning. I was able to do a run of 3,000,000 with no problem.

Please see attached file. Let's get reports from other OSes.

Thank you very much.
Jim Keenan

@p5pRT
Copy link
Author

p5pRT commented May 28, 2014

From @jkeenan

# 5.18.0
$ time /usr/local/bin/perl5.18.0 rjbs_speed.pl 100000
real 0m3.379s
user 0m3.239s
sys 0m0.035s

$ time /usr/local/bin/perl5.18.0 rjbs_speed.pl 150000
real 0m7.591s
user 0m7.323s
sys 0m0.071s

$ time /usr/local/bin/perl5.18.0 rjbs_speed.pl 200000
real 0m38.278s
user 0m36.926s
sys 0m0.374s

# v5.19.11-45-gc8180b0; see perl -V below
$ time ./perl -Ilib ~/learn/perl/rjbs_speed.pl 100000
real 0m0.181s
user 0m0.063s
sys 0m0.020s

$ time ./perl -Ilib ~/learn/perl/rjbs_speed.pl 150000
real 0m0.164s
user 0m0.092s
sys 0m0.021s

$ time ./perl -Ilib ~/learn/perl/rjbs_speed.pl 200000
real 0m0.185s
user 0m0.122s
sys 0m0.022s

$ time ./perl -Ilib ~/learn/perl/rjbs_speed.pl 3000000
real 0m1.953s
user 0m1.759s
sys 0m0.038s

# 5.20.0
$ time perl rjbs_speed.pl 100000
real 0m3.352s
user 0m3.228s
sys 0m0.034s

$ time perl rjbs_speed.pl 150000
real 0m7.800s
user 0m7.532s
sys 0m0.068s

$ time perl rjbs_speed.pl 200000
real 0m35.307s
user 0m34.186s
sys 0m0.323s

#####
$ ./perl -Ilib -V
Summary of my perl5 (revision 5 version 19 subversion 12) configuration​:
  Commit id​: c8180b0
  Platform​:
  osname=darwin, osvers=8.11.0, archname=darwin-2level
  uname='darwin macintosh-9.local 8.11.0 darwin kernel version 8.11.0​: wed oct 10 18​:26​:00 pdt 2007; root​:xnu-792.24.17~1release_ppc power macintosh powerpc '
  config_args='-des -Dusedevel'

@p5pRT
Copy link
Author

p5pRT commented May 28, 2014

From @demerphq

On 28 May 2014 23​:53, James E Keenan via RT <perlbug-followup@​perl.org>wrote​:

On Wed May 28 13​:33​:40 2014, jkeenan wrote​:

On Wed May 28 09​:04​:19 2014, rjbs wrote​:

I have this test program​:

$x = "x" x 1_000_000;
$y = $x for 1..1_000_000;

On my box (OS X, will attach perl -V shortly) this program takes 90s
to run on 5.18 and 5.20. It takes a fraction of a second on 5.19.11.

[snip]

Matthew Horsfall reports no slowdown on his Ubuntu box.

Obviously it'd be great to get this sorted out for 5.20.1.

I am not getting a slowdown on either of two Linux machines. In what
follows I use only released versions, compiled from source without
threading.

[snip Linux results ]

Could there be an OS-specific aspect to this problem?

Yes, there could be. I was able to reproduce rjbs's results on Darwin --
a very old Darwin! (I had to modify rjbs's program to take a command-line
argument for number of iterations.)

I compared 5.18.0 and 5.20.0, both built from source. There did not
appear to be significant differences. Performance degraded rapidly as the
number of iterations increased; I control-C-ed runs of 300,000.

However, I then checked out the commit immediately preceding the one rjbs
cited and built a perl from that. Using that, rjbs's program went like
lightning. I was able to do a run of 3,000,000 with no problem.

Can you add something like

use Devel​::Peek;
Dump($x);

and then send us the last couple of lines. It will print a lot of "x"s and
then print something like this​:

  CUR = 1000000
  LEN = 1000002
  COW_REFCNT = 1

Id like to see that. (We need to patch Devel​::Peek dump to be a little
less annoying with long strings)

Yves

@p5pRT
Copy link
Author

p5pRT commented May 28, 2014

From @rjbs

* demerphq <demerphq@​gmail.com> [2014-05-28T19​:11​:21]

use Devel​::Peek;
Dump($x);

and then send us the last couple of lines. It will print a lot of "x"s and
then print something like this​:

I would be delighted to do so!

(type type type)

I ran it piped to 'grep -v xxxxxx' and I get, on perl 5.20.0​:

  ~$ perl terastring.pl |& grep -v xxxxxx
  SV = PV(0x7fd923004070) at 0x7fd92302d0f0
  REFCNT = 1
  FLAGS = (POK,pPOK)
  CUR = 1000000
  LEN = 1003520

When I use perl 5.19.11​:

  ~$ perl terastring.pl |& grep -v xxxxxx
  SV = PV(0x7ffe73802e70) at 0x7ffe73023cf0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  CUR = 1000000
  LEN = 1003520
  COW_REFCNT = 1

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented May 29, 2014

From @jkeenan

On Wed May 28 16​:12​:07 2014, demerphq wrote​:

Can you add something like

use Devel​::Peek;
Dump($x);

and then send us the last couple of lines. It will print a lot of "x"s
and
then print something like this​:

CUR = 1000000
LEN = 1000002
COW_REFCNT = 1

Id like to see that. (We need to patch Devel​::Peek dump to be a little
less annoying with long strings)

Yves

Results on older Darwin appear similar to what rjbs got. See file attached.

@p5pRT
Copy link
Author

p5pRT commented May 29, 2014

From @jkeenan

$ /usr/local/bin/perl5.18.0 121975-rjbs_speed.pl 100000 2> xyz; grep -v xxxxxx xyz;rm xyz
SV = PV(0x1803558) at 0x180f810
  REFCNT = 1
  FLAGS = (POK,pPOK)
  CUR = 100000
  LEN = 102400

$ ./perl -v | head -2 | tail -1
This is perl 5, version 19, subversion 12 (v5.19.12 (v5.19.11-45-gc8180b0)) built for darwin-2level
$ ./perl -Ilib ~/learn/perl/p5p/121975-rjbs_speed.pl 100000 2>xyz;grep -v xxxxxx xyz;rm xyz
SV = PV(0x2803528) at 0x2810110
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  CUR = 100000
  LEN = 102400
  COW_REFCNT = 1

$ /usr/local/bin/perl5.20.0 121975-rjbs_speed.pl 100000 2> xyz; grep -v xxxxxx xyz;rm xyz
SV = PV(0x1803560) at 0x1810180
  REFCNT = 1
  FLAGS = (POK,pPOK)
  CUR = 100000
  LEN = 102400

@p5pRT
Copy link
Author

p5pRT commented May 29, 2014

From @demerphq

On 29 May 2014 01​:31, Ricardo Signes <perl.p5p@​rjbs.manxome.org> wrote​:

* demerphq <demerphq@​gmail.com> [2014-05-28T19​:11​:21]

use Devel​::Peek;
Dump($x);

and then send us the last couple of lines. It will print a lot of "x"s
and
then print something like this​:

I would be delighted to do so!

(type type type)

I ran it piped to 'grep -v xxxxxx' and I get, on perl 5.20.0​:

~$ perl terastring.pl |& grep -v xxxxxx
SV = PV(0x7fd923004070) at 0x7fd92302d0f0
REFCNT = 1
FLAGS = (POK,pPOK)
CUR = 1000000
LEN = 1003520

When I use perl 5.19.11​:

~$ perl terastring.pl |& grep -v xxxxxx
SV = PV(0x7ffe73802e70) at 0x7ffe73023cf0
REFCNT = 1
FLAGS = (POK,IsCOW,pPOK)
CUR = 1000000
LEN = 1003520
COW_REFCNT = 1

So there is the reason. For some reason the slow builds are preallocating a
lot more bytes than the fast build is, and by doing so it exceeds
the SV_COW_MAX_WASTE_THRESHOLD or SV_COWBUF_MAX_WASTE_THRESHOLD of 80,
causing the string to not be COW'ed.

Obviously you *could* raise these thresholds, for instance to 4000 and this
script should be "fast" again.

However I think the real problem we need to get to the bottom of is why it
is preallocating so much extra space on the slow builds.

On Linux we see this​:

$ cat ../t2.pl
$x = "x" x 1_000_000;
$y = $x for 1..1_000_000;
use Devel​::Peek;
Dump($x);

$ time ./perl -Ilib ../t2.pl 2>&1 | grep -v xxxx
SV = PV(0x1a23d80) at 0x1a42808
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  CUR = 1000000
  LEN = 1000002
  COW_REFCNT = 1

real 0m0.139s
user 0m0.132s
sys 0m0.004s

Note the LEN-CUR == 2 compared to your run where LEN-CUR==3520. If we look
at James Keenans results​:

SV = PV(0x1803558) at 0x180f810
  REFCNT = 1
  FLAGS = (POK,pPOK)
  CUR = 100000
  LEN = 102400

We can see it is LEN-CUR==2400.

I suspect the Linux results are because of this​:

./perl -Ilib -V​:d_malloc_size
d_malloc_size='undef';

Which affects the following logic in sv.c​:

#ifdef Perl_safesysmalloc_size
  /* Do this here, do it once, do it right, and then we will never get
  called back into sv_grow() unless there really is some growing
  needed. */
  SvLEN_set(sv, Perl_safesysmalloc_size(s));
#else
  SvLEN_set(sv, newlen);
#endif

In Linux we compile the second branch, and in OSX and BSD's I think that
d_malloc_size is defined, so they do the first branch, and on linux we do
the second branch.

When we do something like

my $x= "x" x 1000000;

we end up doing SvGROW(sv,1000000);

Assuming $x was originally empty we will try to malloc() K+2 bytes, one
byte extra to guarantee room to null terminate the buffer, and one extra
byte for the cow-refcount should this string become COW.

On linux this ends up with SvLEN()==1000002 and SvCUR()==1000000.

On systems where Perl_safesysmalloc_size() is defined we set the SvLEN() to
the size that malloc() has actually allocated for us. This means we can
avoid some realloc() calls later (which would be fast anyway), but
crucially it means we can't tell the difference between "deliberately
preallocated" and "preallocated by virtue of the whims of malloc".

Basically there is a big difference when Perl or a perl user decides to
deliberately preallocate (in which case we probably care about the
wastage), and when malloc() does it on our behalf (in which case we
probably don't [insert hand waving here]).

There are multiple optimisations at multiple levels operating and clashing
here​:

1. Deliberate preallocation of a buffer - useful on OS'es with a braindead
realloc() and in code like sv_gets() where you avoid a bunch of checks by
preallocating for the "worst case" when processing block-read data.
2. Unwarranted chumminess with malloc() on systems with d_malloc_size
3. COW.

So one option would be to stop using Perl_safesysmalloc_size(). Another
would be to somehow make the wastage checks Perl_safesysmalloc_size()
aware, probaby by disabling some of the checks if it is available, or maybe
by somehow using the function as part of the wastage checks.

Regardless I feel like there is a bit of a clash of concepts here that we
need to think through properly before COW is really going to settle down.

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented May 29, 2014

From @demerphq

On 29 May 2014 11​:11, demerphq <demerphq@​gmail.com> wrote​:

So one option would be to stop using Perl_safesysmalloc_size().

As this is the easiest to do to prove this is actually the problem I have
pushed​:

commit ce861ea
Author​: Yves Orton <demerphq@​gmail.com>
Date​: Thu May 29 11​:33​:56 2014 +0200

  temporary fix for [perl #121975] COW speedup lost after e8c6a47

  Disable use of Perl_safesysmalloc_size by default. Only use it when
PERL_USE_MALLOC_SIZE is defined.

  Using Perl_safesysmalloc_size() perl cannot tell a deliberately
preallocated buffer
  which we probably dont want to COW from a malloc() preallocated buffer
where we probably
  dont care. Hopefully this fixes the slowdown observed on some platforms.

How this affects performance as a whole on the affected platforms I do not
know. I suspect not much.

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented May 29, 2014

From @bulk88

On Wed May 28 13​:33​:40 2014, jkeenan wrote​:

$ time perl ~/learn/perl/rjbs_speed.pl

Where is rjbs_speed.pl? Its not attached to this ticket.

--
bulk88 ~ bulk88 at hotmail.com

@p5pRT
Copy link
Author

p5pRT commented May 29, 2014

From @rjbs

* demerphq <demerphq@​gmail.com> [2014-05-29T05​:36​:13]

commit ce861ea

As you suspected​:

  ~/code/perl5$ time ./perl -I lib ~/terastring.pl |& grep -v xx
  SV = PV(0x7f7f53805098) at 0x7f7f53831538
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  CUR = 1000000
  LEN = 1000002
  COW_REFCNT = 1
  ./perl -I lib ~/terastring.pl 2>&1 0.20s user 0.01s system 98% cpu 0.206 total
  grep -v xx 0.00s user 0.00s system 2% cpu 0.206 total

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented May 29, 2014

From @rjbs

* bulk88 via RT <perlbug-followup@​perl.org> [2014-05-29T07​:49​:08]

On Wed May 28 13​:33​:40 2014, jkeenan wrote​:

$ time perl ~/learn/perl/rjbs_speed.pl

Where is rjbs_speed.pl? Its not attached to this ticket.

It's the two-line program described in the ticket. With Devel​::Peek added​:

  # terastring.perl
  use Devel​::Peek;
  $x = "x" x 1_000_000;
  $y = $x for 1..1_000_000;

  Dump($x);

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented May 29, 2014

From @jkeenan

On 5/29/14 7​:49 AM, bulk88 via RT wrote​:

On Wed May 28 13​:33​:40 2014, jkeenan wrote​:

$ time perl ~/learn/perl/rjbs_speed.pl

Where is rjbs_speed.pl? Its not attached to this ticket.

It was just rjbs's 2-liner put into a program, then provision for
command-line argument added, then later the Devel​::Peek stuff added.
Nothing special. Attached.

@p5pRT
Copy link
Author

p5pRT commented May 29, 2014

From @jkeenan

121975-rjbs_speed.pl

@p5pRT
Copy link
Author

p5pRT commented May 29, 2014

From @jhi

On Thursday-201405-29, 8​:02, Ricardo Signes wrote​:

* demerphq <demerphq@​gmail.com> [2014-05-29T05​:36​:13]

commit ce861ea

As you suspected​:

~/code/perl5$ time ./perl -I lib ~/terastring.pl |& grep -v xx
SV = PV(0x7f7f53805098) at 0x7f7f53831538
REFCNT = 1
FLAGS = (POK,IsCOW,pPOK)
CUR = 1000000
LEN = 1000002
COW_REFCNT = 1
./perl -I lib ~/terastring.pl 2>&1 0.20s user 0.01s system 98% cpu 0.206 total
grep -v xx 0.00s user 0.00s system 2% cpu 0.206 total

FWIW, in an OS X 10.9.3 (I tweaked the for loop to do the copy "only"
100K times, since I got bored waiting)​:

* with blead

time ./perl -Ilib /tmp/ts.pl 2>&1 |grep -v xxx
SV = PV(0x7ff5d1804070) at 0x7ff5d182d108
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  CUR = 1000000
  LEN = 1000002
  COW_REFCNT = 1
r 0.20s u 0.14s s 0.01s 73% "./perl -Ilib /tmp/ts.pl 2>&1"
r 0.20s u 0.00s s 0.01s 4% "grep -v xxx"

* with 5.20.0

time ./perl -Ilib /tmp/ts.pl 2>&1 |grep -v xxx
SV = PV(0x7fcefb004070) at 0x7fcefb02d108
  REFCNT = 1
  FLAGS = (POK,pPOK)
  CUR = 1000000
  LEN = 1003520
r 11.71s u 11.56s s 0.04s 99% "./perl -Ilib /tmp/ts.pl 2>&1"
r 11.71s u 0.00s s 0.01s 0% "grep -v xxx"

* with 5.18.2 (from macports)

time perl /tmp/ts.pl 2>&1 |grep -v xxx
SV = PV(0x7f9802004070) at 0x7f98020286b8
  REFCNT = 1
  FLAGS = (POK,pPOK)
  CUR = 1000000
  LEN = 1003520
r 11.30s u 11.24s s 0.04s 99% "perl /tmp/ts.pl 2>&1"
r 11.30s u 0.00s s 0.01s 0% "grep -v xxx"

@p5pRT
Copy link
Author

p5pRT commented May 29, 2014

From @bulk88

On Wed May 28 13​:33​:40 2014, jkeenan wrote​:

Could there be an OS-specific aspect to this problem?

Thank you very much.
Jim Keenan

strawberry 5.18, VC blead 5.21, AP 5.10


Owner@​delld620 ~
$ "C​:/sperl/perl/bin/perl.exe" -v

This is perl 5, version 18, subversion 0 (v5.18.0) built for MSWin32-x86-multi-thread-64int

Copyright 1987-2013, Larry Wall

Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl". If you have access to the
Internet, point your browser at http​://www.perl.org/, the Perl Home Page.

Owner@​delld620 ~
$ time "C​:/sperl/perl/bin/perl.exe" "C​:\perl521\rjbsspeed.pl" 100000

real 0m2.203s
user 0m0.015s
sys 0m0.031s

Owner@​delld620 ~
$ "C​:/perl521/bin/perl.exe" -v

This is perl 5, version 21, subversion 1 (v5.21.1 (v5.21.0-29-g1abbcfa)) built for MSWin32-x86-multi-thread
(with 4 registered patches, see perl -V for more detail)

Copyright 1987-2014, Larry Wall

Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl". If you have access to the
Internet, point your browser at http​://www.perl.org/, the Perl Home Page.

Owner@​delld620 ~
$ time "C​:/perl521/bin/perl.exe" "C​:\perl521\rjbsspeed.pl" 100000

real 0m0.094s
user 0m0.046s
sys 0m0.000s

Owner@​delld620 ~
$ "C​:/perl/bin/perl.exe" -v

This is perl, v5.10.0 built for MSWin32-x86-multi-thread
(with 5 registered patches, see perl -V for more detail)

Copyright 1987-2007, Larry Wall

Binary build 1003 [285500] provided by ActiveState http​://www.ActiveState.com
Built May 13 2008 16​:52​:49

Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl". If you have access to the
Internet, point your browser at http​://www.perl.org/, the Perl Home Page.

Owner@​delld620 ~
$ time "C​:/perl/bin/perl.exe" "C​:\perl521\rjbsspeed.pl" 100000

real 0m2.188s
user 0m0.015s
sys 0m0.031s

Owner@​delld620 ~
$


--
bulk88 ~ bulk88 at hotmail.com

@p5pRT
Copy link
Author

p5pRT commented May 29, 2014

From @jkeenan

On Thu May 29 05​:03​:15 2014, perl.p5p@​rjbs.manxome.org wrote​:

* demerphq <demerphq@​gmail.com> [2014-05-29T05​:36​:13]

commit ce861ea

As you suspected​:

~/code/perl5$ time ./perl -I lib ~/terastring.pl |& grep -v xx
SV = PV(0x7f7f53805098) at 0x7f7f53831538
REFCNT = 1
FLAGS = (POK,IsCOW,pPOK)
CUR = 1000000
LEN = 1000002
COW_REFCNT = 1
./perl -I lib ~/terastring.pl 2>&1 0.20s user 0.01s system 98% cpu
0.206 total
grep -v xx 0.00s user 0.00s system 2% cpu 0.206 total

Confirmable on the older Darwin as well​:

###
$ ./perl -v |head -2 |tail -1
This is perl 5, version 21, subversion 1 (v5.21.1 (v5.21.0-95-gc02d599)) built for darwin-2level

$ time ./perl -Ilib ~/learn/perl/p5p/121975-rjbs_speed.pl 150000
real 0m0.160s
user 0m0.092s
sys 0m0.021s

$ time ./perl -Ilib ~/learn/perl/p5p/121975-rjbs_speed.pl 200000
real 0m0.194s
user 0m0.125s
sys 0m0.022s

$ time ./perl -Ilib ~/learn/perl/p5p/121975-rjbs_speed.pl 3000000
real 0m1.917s
user 0m1.810s
sys 0m0.038s
###

And with the Devel​::Peek check​:

###
$ ./perl -Ilib ~/learn/perl/p5p/121975-rjbs_speed.pl 150000 2>xyz;grep -v xxxxxx xyz;rm xyz
SV = PV(0x2803528) at 0x2810110
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  CUR = 150000
  LEN = 150004
  COW_REFCNT = 1
###

@p5pRT
Copy link
Author

p5pRT commented Sep 8, 2014

From @cpansprout

On Thu May 29 02​:12​:26 2014, demerphq wrote​:

Obviously you *could* raise these thresholds, for instance to 4000 and this
script should be "fast" again.

However I think the real problem we need to get to the bottom of is why it
is preallocating so much extra space on the slow builds.

How about making the threshold a percentage, rather than a fixed number of bytes?

Regardless I feel like there is a bit of a clash of concepts here that we
need to think through properly before COW is really going to settle down.

Well, at least 5.20 was no slower than 5.18 in that case, and faster in others. :-)

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Feb 26, 2016

From @mauke

On Thu May 29 02​:36​:38 2014, demerphq wrote​:

On 29 May 2014 11​:11, demerphq <demerphq@​gmail.com> wrote​:

So one option would be to stop using Perl_safesysmalloc_size().

As this is the easiest to do to prove this is actually the problem I have
pushed​:

commit ce861ea
Author​: Yves Orton <demerphq@​gmail.com>
Date​: Thu May 29 11​:33​:56 2014 +0200

temporary fix for \[perl \#121975\] COW speedup lost after e8c6a474

Disable use of Perl\_safesysmalloc\_size by default\. Only use it when

PERL_USE_MALLOC_SIZE is defined.

Using Perl\_safesysmalloc\_size\(\) perl cannot tell a deliberately

preallocated buffer
which we probably dont want to COW from a malloc() preallocated buffer
where we probably
dont care. Hopefully this fixes the slowdown observed on some platforms.

How this affects performance as a whole on the affected platforms I do not
know. I suspect not much.

Any news? Should this ticket stay open? (It's listed in perl5201delta.)

@p5pRT
Copy link
Author

p5pRT commented Feb 26, 2016

From @jkeenan

On Fri Feb 26 10​:31​:01 2016, mauke- wrote​:

On Thu May 29 02​:36​:38 2014, demerphq wrote​:

On 29 May 2014 11​:11, demerphq <demerphq@​gmail.com> wrote​:

So one option would be to stop using Perl_safesysmalloc_size().

As this is the easiest to do to prove this is actually the problem I
have
pushed​:

commit ce861ea
Author​: Yves Orton <demerphq@​gmail.com>
Date​: Thu May 29 11​:33​:56 2014 +0200

temporary fix for [perl #121975] COW speedup lost after e8c6a47

Disable use of Perl_safesysmalloc_size by default. Only use it when
PERL_USE_MALLOC_SIZE is defined.

Using Perl_safesysmalloc_size() perl cannot tell a deliberately
preallocated buffer
which we probably dont want to COW from a malloc() preallocated
buffer
where we probably
dont care. Hopefully this fixes the slowdown observed on some
platforms.

How this affects performance as a whole on the affected platforms I
do not
know. I suspect not much.

Any news? Should this ticket stay open? (It's listed in
perl5201delta.)

On the older Darwin on which I reported, Yves' patch, while it did not get back the speeds reported up through v5.19.11-45-gc8180b0, did improve performance considerably. So I don't need to have the ticket remain open ... but let's see what rjbs has to say, particularly on a more up-to-date Darwin.

Thank you very much.

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Copy link
Author

p5pRT commented Feb 29, 2016

From @iabyn

On Fri, Feb 26, 2016 at 03​:56​:36PM -0800, James E Keenan via RT wrote​:

On the older Darwin on which I reported, Yves' patch, while it did not
get back the speeds reported up through v5.19.11-45-gc8180b0, did
improve performance considerably. So I don't need to have the ticket
remain open ... but let's see what rjbs has to say, particularly on a
more up-to-date Darwin.

I think the ticket should stay open. There's still a whole bunch of
stuff related to COW and thresholds that could do with some love.
It's on my todo list.

--
You're only as old as you look.

@p5pRT
Copy link
Author

p5pRT commented Mar 5, 2016

From @rjbs

On Mon Feb 29 02​:11​:53 2016, davem wrote​:

On Fri, Feb 26, 2016 at 03​:56​:36PM -0800, James E Keenan via RT wrote​:

On the older Darwin on which I reported, Yves' patch, while it did not
get back the speeds reported up through v5.19.11-45-gc8180b0, did
improve performance considerably. So I don't need to have the ticket
remain open ... but let's see what rjbs has to say, particularly on a
more up-to-date Darwin.

I think the ticket should stay open. There's still a whole bunch of
stuff related to COW and thresholds that could do with some love.
It's on my todo list.

I'd probably like to see those become their own tickets, but as it stands I'm happy to let you own this issue and the structure of the tickets to handle it. ;)

--
--
rjbs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant