Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slowdown in m//g on COW strings of certain lengths #15266

Open
p5pRT opened this issue Apr 7, 2016 · 19 comments
Open

Slowdown in m//g on COW strings of certain lengths #15266

p5pRT opened this issue Apr 7, 2016 · 19 comments

Comments

@p5pRT
Copy link

p5pRT commented Apr 7, 2016

Migrated from rt.perl.org#127855 (status was 'open')

Searchable as RT127855$

@p5pRT
Copy link
Author

p5pRT commented Apr 7, 2016

From sschoeling@linet-services.de

Created by sschoeling@linet-services.de

The following oneliner creates a large string, creates a COW view with offset 1
and m//g walks through it​:

  perl -e 'my $f = "0" x (4e6+1); $g = substr($f, 1); 1 while $g =~ m/0/g;'

On my machines this takes 0.1s-2s depending on CPU and perl version.

Changing the string length to 4e6 will hang. 4e6-1 is fine again.

This seems to be the case for all lengths that are multiples of 256. It's not
noticable for strings with 100k bytes or less, but grows quadratically from there.

Some times from this machine​:
128000​: 0.687
128001​: 0.016

256000​: 3.819
256001​: 0.028

512000​: 15.614
512001​: 0.055

perl versions​:
5.18.2 seems to be good
5.19.11 is the earliest buggy version I have available
every version after 5.20 up to 5.22.0 is bad.

I don't currently have a 5.23 compiled. Sorry.

Perl Info

Flags:
    category=core
    severity=low

Site configuration information for perl 5.22.0:

Configured by sschoeling at Tue Jun  2 10:19:46 CEST 2015.

Summary of my perl5 (revision 5 version 22 subversion 0) configuration:
  Commit id: 70f63a4c7dba89e8e48b44de7978faae4319e693
  Platform:
    osname=linux, osvers=3.13.0-53-generic, archname=x86_64-linux
    uname='linux plum-chiew 3.13.0-53-generic #89-ubuntu smp wed may 20 10:34:39 utc 2015 x86_64 x86_64 x86_64 gnulinux '
    config_args=''
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.8.2', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678, doublekind=3
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16, longdblkind=3
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.8/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
    libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    libc=libc-2.19.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.19'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'



@INC for perl 5.22.0:
    /home/sschoeling/lib/perl5/site_perl/5.22.0/x86_64-linux
    /home/sschoeling/lib/perl5/site_perl/5.22.0
    /home/sschoeling/lib/perl5/5.22.0/x86_64-linux
    /home/sschoeling/lib/perl5/5.22.0
    /home/sschoeling/lib/perl5/site_perl/5.18.0
    /home/sschoeling/lib/perl5/site_perl
    .


Environment for perl 5.22.0:
    HOME=/home/sschoeling
    LANG=en_US.UTF-8
    LANGUAGE=en_US
    LC_ADDRESS=de_DE.UTF-8
    LC_IDENTIFICATION=de_DE.UTF-8
    LC_MEASUREMENT=de_DE.UTF-8
    LC_MONETARY=de_DE.UTF-8
    LC_NAME=de_DE.UTF-8
    LC_NUMERIC=de_DE.UTF-8
    LC_PAPER=de_DE.UTF-8
    LC_TELEPHONE=de_DE.UTF-8
    LC_TIME=de_DE.UTF-8
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/opt/sschoeling/perlbrew/bin:/home/sschoeling/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
    PERLBREW_BASHRC_VERSION=0.66
    PERLBREW_HOME=/home/sschoeling/.perlbrew
    PERLBREW_ROOT=/opt/sschoeling/perlbrew
    PERL_BADLANG (unset)
    SHELL=/bin/bash

@p5pRT
Copy link
Author

p5pRT commented Apr 8, 2016

From @jkeenan

On Thu Apr 07 12​:20​:45 2016, sschoeling@​linet-services.de wrote​:

This is a bug report for perl from sschoeling@​linet-services.de,
generated with the help of perlbug 1.40 running under perl 5.22.0.

-----------------------------------------------------------------
[Please describe your issue here]

The following oneliner creates a large string, creates a COW view with
offset 1
and m//g walks through it​:

perl -e 'my $f = "0" x (4e6+1); $g = substr($f, 1); 1 while $g =~
m/0/g;'

On my machines this takes 0.1s-2s depending on CPU and perl version.

Changing the string length to 4e6 will hang. 4e6-1 is fine again.

This seems to be the case for all lengths that are multiples of 256.
It's not
noticable for strings with 100k bytes or less, but grows quadratically
from there.

Some times from this machine​:
128000​: 0.687
128001​: 0.016

256000​: 3.819
256001​: 0.028

512000​: 15.614
512001​: 0.055

perl versions​:
5.18.2 seems to be good
5.19.11 is the earliest buggy version I have available
every version after 5.20 up to 5.22.0 is bad.

I don't currently have a 5.23 compiled. Sorry.

Confirmed on blead.

#####
$ ./perl -v | head -2 | tail -1
This is perl 5, version 23, subversion 10 (v5.23.10 (v5.23.9-56-g87c118b)) built for x86_64-linux

[perl] 6 $ time ./perl -e 'my $f = "0" x (4e6+1); $g = substr($f, 1); 1 while $g =~ m/0/g;'

real 0m0.302s
user 0m0.298s
sys 0m0.004s
[perl] 7 $ time ./perl -e 'my $f = "0" x (4e6); $g = substr($f, 1); 1 while $g =~ m/0/g;'
^C

real 0m14.689s
user 0m14.688s
sys 0m0.008s
[perl] 8 $ time ./perl -e 'my $f = "0" x (4e6-1); $g = substr($f, 1); 1 while $g =~ m/0/g;'

real 0m0.294s
user 0m0.290s
sys 0m0.004s
[perl] 9 $ time ./perl -e 'my $f = "0" x (4e6+256+1); $g = substr($f, 1); 1 while $g =~ m/0/g;'

real 0m0.293s
user 0m0.290s
sys 0m0.004s
[perl] 10 $ time ./perl -e 'my $f = "0" x (4e6+256-1); $g = substr($f, 1); 1 while $g =~ m/0/g;'

real 0m0.301s
user 0m0.297s
sys 0m0.004s
[perl] 11 $ time ./perl -e 'my $f = "0" x (4e6+256); $g = substr($f, 1); 1 while $g =~ m/0/g;'
^C

real 0m7.211s
user 0m7.194s
sys 0m0.016s
[perl] 12 $ time ./perl -e 'my $f = "0" x (4e6-256-1); $g = substr($f, 1); 1 while $g =~ m/0/g;'

real 0m0.287s
user 0m0.283s
sys 0m0.004s
[perl] 13 $ time ./perl -e 'my $f = "0" x (4e6-256+1); $g = substr($f, 1); 1 while $g =~ m/0/g;'

real 0m0.294s
user 0m0.287s
sys 0m0.008s
[perl] 14 $ time ./perl -e 'my $f = "0" x (4e6-256); $g = substr($f, 1); 1 while $g =~ m/0/g;'
^C

real 0m6.437s
user 0m6.432s
sys 0m0.008s

#####

--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Copy link
Author

p5pRT commented Apr 8, 2016

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Apr 8, 2016

From @demerphq

On 8 April 2016 at 12​:41, James E Keenan via RT
<perlbug-followup@​perl.org> wrote​:

On Thu Apr 07 12​:20​:45 2016, sschoeling@​linet-services.de wrote​:

This is a bug report for perl from sschoeling@​linet-services.de,
generated with the help of perlbug 1.40 running under perl 5.22.0.

-----------------------------------------------------------------
[Please describe your issue here]

The following oneliner creates a large string, creates a COW view with
offset 1
and m//g walks through it​:

perl -e 'my $f = "0" x (4e6+1); $g = substr($f, 1); 1 while $g =~
m/0/g;'

On my machines this takes 0.1s-2s depending on CPU and perl version.

Changing the string length to 4e6 will hang. 4e6-1 is fine again.

This seems to be the case for all lengths that are multiples of 256.
It's not
noticable for strings with 100k bytes or less, but grows quadratically
from there.

Some times from this machine​:
128000​: 0.687
128001​: 0.016

256000​: 3.819
256001​: 0.028

512000​: 15.614
512001​: 0.055

perl versions​:
5.18.2 seems to be good
5.19.11 is the earliest buggy version I have available
every version after 5.20 up to 5.22.0 is bad.

I don't currently have a 5.23 compiled. Sorry.

Confirmed on blead.

#####
$ ./perl -v | head -2 | tail -1
This is perl 5, version 23, subversion 10 (v5.23.10 (v5.23.9-56-g87c118b)) built for x86_64-linux

[perl] 6 $ time ./perl -e 'my $f = "0" x (4e6+1); $g = substr($f, 1); 1 while $g =~ m/0/g;'

real 0m0.302s
user 0m0.298s
sys 0m0.004s
[perl] 7 $ time ./perl -e 'my $f = "0" x (4e6); $g = substr($f, 1); 1 while $g =~ m/0/g;'
^C

real 0m14.689s
user 0m14.688s
sys 0m0.008s
[perl] 8 $ time ./perl -e 'my $f = "0" x (4e6-1); $g = substr($f, 1); 1 while $g =~ m/0/g;'

real 0m0.294s
user 0m0.290s
sys 0m0.004s
[perl] 9 $ time ./perl -e 'my $f = "0" x (4e6+256+1); $g = substr($f, 1); 1 while $g =~ m/0/g;'

real 0m0.293s
user 0m0.290s
sys 0m0.004s
[perl] 10 $ time ./perl -e 'my $f = "0" x (4e6+256-1); $g = substr($f, 1); 1 while $g =~ m/0/g;'

real 0m0.301s
user 0m0.297s
sys 0m0.004s
[perl] 11 $ time ./perl -e 'my $f = "0" x (4e6+256); $g = substr($f, 1); 1 while $g =~ m/0/g;'
^C

real 0m7.211s
user 0m7.194s
sys 0m0.016s
[perl] 12 $ time ./perl -e 'my $f = "0" x (4e6-256-1); $g = substr($f, 1); 1 while $g =~ m/0/g;'

real 0m0.287s
user 0m0.283s
sys 0m0.004s
[perl] 13 $ time ./perl -e 'my $f = "0" x (4e6-256+1); $g = substr($f, 1); 1 while $g =~ m/0/g;'

real 0m0.294s
user 0m0.287s
sys 0m0.008s
[perl] 14 $ time ./perl -e 'my $f = "0" x (4e6-256); $g = substr($f, 1); 1 while $g =~ m/0/g;'
^C

real 0m6.437s
user 0m6.432s
sys 0m0.008s

#####

I bet this is related to making $& work properly, and possibly how
temps are freed.

My bet is in the slow cases we are copying the string every character.

We used to not copy the subject of a //g in scalar context to make
things like this fast. When COW was introduced we switched to using it
to using COW to copy.

This fixed bugs like this​:

$foo="bar";
$ok= $foo=~/(...)/g;
print $1; # prints "bar"
$foo="bah";
print $1; # prints "bah"

My initial thought is somewhere this is going wrong. That it happens
on strings whose length is a multiple of 256 suggests it even more. We
can only have 255 references to a string before it has to be copied,
AND we store the refcount in the *last* char in the string.

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Apr 8, 2016

From @demerphq

On 8 April 2016 at 12​:53, demerphq <demerphq@​gmail.com> wrote​:

On 8 April 2016 at 12​:41, James E Keenan via RT
<perlbug-followup@​perl.org> wrote​:

On Thu Apr 07 12​:20​:45 2016, sschoeling@​linet-services.de wrote​:

This is a bug report for perl from sschoeling@​linet-services.de,
generated with the help of perlbug 1.40 running under perl 5.22.0.

-----------------------------------------------------------------
[Please describe your issue here]

The following oneliner creates a large string, creates a COW view with
offset 1
and m//g walks through it​:

perl -e 'my $f = "0" x (4e6+1); $g = substr($f, 1); 1 while $g =~
m/0/g;'

On my machines this takes 0.1s-2s depending on CPU and perl version.

Changing the string length to 4e6 will hang. 4e6-1 is fine again.

This seems to be the case for all lengths that are multiples of 256.
It's not
noticable for strings with 100k bytes or less, but grows quadratically
from there.

Some times from this machine​:
128000​: 0.687
128001​: 0.016

256000​: 3.819
256001​: 0.028

512000​: 15.614
512001​: 0.055

perl versions​:
5.18.2 seems to be good
5.19.11 is the earliest buggy version I have available
every version after 5.20 up to 5.22.0 is bad.

I don't currently have a 5.23 compiled. Sorry.

Confirmed on blead.

#####
$ ./perl -v | head -2 | tail -1
This is perl 5, version 23, subversion 10 (v5.23.10 (v5.23.9-56-g87c118b)) built for x86_64-linux

[perl] 6 $ time ./perl -e 'my $f = "0" x (4e6+1); $g = substr($f, 1); 1 while $g =~ m/0/g;'

real 0m0.302s
user 0m0.298s
sys 0m0.004s
[perl] 7 $ time ./perl -e 'my $f = "0" x (4e6); $g = substr($f, 1); 1 while $g =~ m/0/g;'
^C

real 0m14.689s
user 0m14.688s
sys 0m0.008s
[perl] 8 $ time ./perl -e 'my $f = "0" x (4e6-1); $g = substr($f, 1); 1 while $g =~ m/0/g;'

real 0m0.294s
user 0m0.290s
sys 0m0.004s
[perl] 9 $ time ./perl -e 'my $f = "0" x (4e6+256+1); $g = substr($f, 1); 1 while $g =~ m/0/g;'

real 0m0.293s
user 0m0.290s
sys 0m0.004s
[perl] 10 $ time ./perl -e 'my $f = "0" x (4e6+256-1); $g = substr($f, 1); 1 while $g =~ m/0/g;'

real 0m0.301s
user 0m0.297s
sys 0m0.004s
[perl] 11 $ time ./perl -e 'my $f = "0" x (4e6+256); $g = substr($f, 1); 1 while $g =~ m/0/g;'
^C

real 0m7.211s
user 0m7.194s
sys 0m0.016s
[perl] 12 $ time ./perl -e 'my $f = "0" x (4e6-256-1); $g = substr($f, 1); 1 while $g =~ m/0/g;'

real 0m0.287s
user 0m0.283s
sys 0m0.004s
[perl] 13 $ time ./perl -e 'my $f = "0" x (4e6-256+1); $g = substr($f, 1); 1 while $g =~ m/0/g;'

real 0m0.294s
user 0m0.287s
sys 0m0.008s
[perl] 14 $ time ./perl -e 'my $f = "0" x (4e6-256); $g = substr($f, 1); 1 while $g =~ m/0/g;'
^C

real 0m6.437s
user 0m6.432s
sys 0m0.008s

#####

I bet this is related to making $& work properly, and possibly how
temps are freed.

My bet is in the slow cases we are copying the string every character.

We used to not copy the subject of a //g in scalar context to make
things like this fast. When COW was introduced we switched to using it
to using COW to copy.

This fixed bugs like this​:

$foo="bar";
$ok= $foo=~/(...)/g;
print $1; # prints "bar"
$foo="bah";
print $1; # prints "bah"

My initial thought is somewhere this is going wrong. That it happens
on strings whose length is a multiple of 256 suggests it even more. We
can only have 255 references to a string before it has to be copied,
AND we store the refcount in the *last* char in the string.

We are copying the subject string every 0 in the string. I dont know
why yet. Possibly because substr doesnt overallocate.

norole​:yorton@​Styx​:blead​:/git_tree/perl$ time ./perl -e 'my $f = ("0"
x (100)) . ("1" x (4e6 -100)); $g = substr($f, 1); 1 while $g =~
m/0/g;'

real 0m0.075s
user 0m0.063s
sys 0m0.012s
norole​:yorton@​Styx​:blead​:/git_tree/perl$ time ./perl -e 'my $n=1000;
my $f = ("0" x ($n)) . ("1" x (4e6 - $n)); $g = substr($f, 1); 1 while
$g =~ m/0/g;'

real 0m0.579s
user 0m0.570s
sys 0m0.012s
norole​:yorton@​Styx​:blead​:/git_tree/perl$ time ./perl -e 'my $n=10000;
my $f = ("0" x ($n)) . ("1" x (4e6 - $n)); $g = substr($f, 1); 1 while
$g =~ m/0/g;'

real 0m5.647s
user 0m5.638s
sys 0m0.008s

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Apr 8, 2016

From @demerphq

On 8 April 2016 at 13​:06, demerphq <demerphq@​gmail.com> wrote​:

We are copying the subject string every 0 in the string. I dont know
why yet. Possibly because substr doesnt overallocate.

This was the reason. Perl_sv_setpvn() was only overallocating 1 byte
(for the null), and not 2 bytes which could hold the refcount.

I have pushed three patches for this, the last changes SvGROW() to
overallocate by 2 always under COW. IMO this should fix a lot of
things, and theory should not break anything. If it turns out practice
and theory differ we can revert bcc9f60 and go back to the more
specific fix. Assuming we dont revert in a future commit we can
probably cleannup a lot of calls to SvGROW, which almost always add 1
or 2 to a base value, which would not longer be needed.

commit bcc9f60
Author​: Yves Orton <demerphq@​gmail.com>
Date​: Fri Apr 8 21​:25​:20 2016 +0200

  More generalized fix for #127855, overallocate in SvGROW and not
just sv_grow()

  If we overallocate in SvGROW() and not just sv_grow() we can ensure
  we have more cases where we can COW. We need to ensure we always
  have room for a null and a reference count.

commit e19cb11
Author​: Yves Orton <demerphq@​gmail.com>
Date​: Fri Apr 8 20​:46​:43 2016 +0200

  fix #127855, in Perl_sv_setpvn() we have to overallocate to enable COW

  We need to overallocate by 2 to do COW strings. One for the null,
  one for the refcount.

commit d14d585
Author​: Yves Orton <demerphq@​gmail.com>
Date​: Fri Apr 8 21​:17​:34 2016 +0200

  test for #127855 - Slowdown in m//g on COW strings of certain lengths

Yves

@p5pRT
Copy link
Author

p5pRT commented Apr 9, 2016

From @jkeenan

On Fri Apr 08 12​:43​:02 2016, demerphq wrote​:

On 8 April 2016 at 13​:06, demerphq <demerphq@​gmail.com> wrote​:

We are copying the subject string every 0 in the string. I dont know
why yet. Possibly because substr doesnt overallocate.

This was the reason. Perl_sv_setpvn() was only overallocating 1 byte
(for the null), and not 2 bytes which could hold the refcount.

I have pushed three patches for this, the last changes SvGROW() to
overallocate by 2 always under COW. IMO this should fix a lot of
things, and theory should not break anything. If it turns out practice
and theory differ we can revert bcc9f60 and go back to the more
specific fix. Assuming we dont revert in a future commit we can
probably cleannup a lot of calls to SvGROW, which almost always add 1
or 2 to a base value, which would not longer be needed.

commit bcc9f60
Author​: Yves Orton <demerphq@​gmail.com>
Date​: Fri Apr 8 21​:25​:20 2016 +0200

More generalized fix for \#127855\, overallocate in SvGROW and not

just sv_grow()

If we overallocate in SvGROW\(\) and not just sv\_grow\(\) we can ensure
we have more cases where we can COW\. We need to ensure we always
have room for a null and a reference count\.

commit e19cb11
Author​: Yves Orton <demerphq@​gmail.com>
Date​: Fri Apr 8 20​:46​:43 2016 +0200

fix \#127855\, in Perl\_sv\_setpvn\(\) we have to overallocate to enable COW

We need to overallocate by 2 to do COW strings\. One for the null\,
one for the refcount\.

commit d14d585
Author​: Yves Orton <demerphq@​gmail.com>
Date​: Fri Apr 8 21​:17​:34 2016 +0200

test for \#127855 \- Slowdown in m//g on COW strings of certain lengths

Yves

These patches worked for me on both Linux/x86_64 and older Darwin/PPC.
--
James E Keenan (jkeenan@​cpan.org)

@p5pRT
Copy link
Author

p5pRT commented Apr 9, 2016

From @iabyn

On Fri, Apr 08, 2016 at 09​:42​:22PM +0200, demerphq wrote​:

commit d14d585
Author​: Yves Orton <demerphq@​gmail.com>
Date​: Fri Apr 8 21​:17​:34 2016 +0200

test for \#127855 \- Slowdown in m//g on COW strings of certain lengths

I took the libery of pushing a couple of updates to that​:

commit 31f8924
Author​: David Mitchell <davem@​iabyn.com>
AuthorDate​: Sat Apr 9 13​:17​:21 2016 +0100
Commit​: David Mitchell <davem@​iabyn.com>
CommitDate​: Sat Apr 9 13​:37​:56 2016 +0100

  new perf test in pat.t​: avoid timing failure
 
  A new performance test in re/pat.t added by v5.23.9-58-gd14d585 could
  occasionally fails to a timing issue. It was checking that the the
  test took less than 1 sec to run; but since the clock usually has a
  granularity of 1 sec, it could fail if the test ran over a tick boundary.
 
  Change the condition to <= 1 sec, and increase the time the test takes on
  a bad perl - it was taking 4sec on my system; it now takes about 14sec,
  so there's less chance of a bad perl passing.

M t/re/pat.t

commit 68b940a
Author​: David Mitchell <davem@​iabyn.com>
AuthorDate​: Sat Apr 9 16​:43​:20 2016 +0100
Commit​: David Mitchell <davem@​iabyn.com>
CommitDate​: Sat Apr 9 16​:43​:20 2016 +0100

  move perf test from re/pat.t to re/speed.t
 
  These days we generally put re tests which rely on timing in a separate
  file, t/re/speed.t; move a recently-added test there.

M t/re/pat.t
M t/re/speed.t

--
Indomitable in retreat, invincible in advance, insufferable in victory
  -- Churchill on Montgomery

@p5pRT
Copy link
Author

p5pRT commented Apr 11, 2016

From @demerphq

On 9 April 2016 at 17​:50, Dave Mitchell <davem@​iabyn.com> wrote​:

On Fri, Apr 08, 2016 at 09​:42​:22PM +0200, demerphq wrote​:

commit d14d585
Author​: Yves Orton <demerphq@​gmail.com>
Date​: Fri Apr 8 21​:17​:34 2016 +0200

test for \#127855 \- Slowdown in m//g on COW strings of certain lengths

I took the libery of pushing a couple of updates to that​:

Sounds good. Sorry I was not aware of/failed to notice re/speed.t

commit 31f8924
Author​: David Mitchell <davem@​iabyn.com>
AuthorDate​: Sat Apr 9 13​:17​:21 2016 +0100
Commit​: David Mitchell <davem@​iabyn.com>
CommitDate​: Sat Apr 9 13​:37​:56 2016 +0100

new perf test in pat\.t&#8203;: avoid timing failure

A new performance test in re/pat\.t added by v5\.23\.9\-58\-gd14d585 could
occasionally fails to a timing issue\. It was checking that the the
test took less than 1 sec to run; but since the clock usually has a
granularity of 1 sec\, it could fail if the test ran over a tick boundary\.

Change the condition to \<= 1 sec\, and increase the time the test takes on
a bad perl \- it was taking 4sec on my system; it now takes about 14sec\,
so there's less chance of a bad perl passing\.

M t/re/pat.t

commit 68b940a
Author​: David Mitchell <davem@​iabyn.com>
AuthorDate​: Sat Apr 9 16​:43​:20 2016 +0100
Commit​: David Mitchell <davem@​iabyn.com>
CommitDate​: Sat Apr 9 16​:43​:20 2016 +0100

move perf test from re/pat\.t to re/speed\.t

These days we generally put re tests which rely on timing in a separate
file\, t/re/speed\.t; move a recently\-added test there\.

M t/re/pat.t
M t/re/speed.t

--
Indomitable in retreat, invincible in advance, insufferable in victory
-- Churchill on Montgomery

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Apr 28, 2016

From @iabyn

On Fri, Apr 08, 2016 at 09​:42​:22PM +0200, demerphq wrote​:

On 8 April 2016 at 13​:06, demerphq <demerphq@​gmail.com> wrote​:

We are copying the subject string every 0 in the string. I dont know
why yet. Possibly because substr doesnt overallocate.

This was the reason. Perl_sv_setpvn() was only overallocating 1 byte
(for the null), and not 2 bytes which could hold the refcount.

I have pushed three patches for this, the last changes SvGROW() to
overallocate by 2 always under COW. IMO this should fix a lot of
things, and theory should not break anything. If it turns out practice
and theory differ we can revert bcc9f60 and go back to the more
specific fix. Assuming we dont revert in a future commit we can
probably cleannup a lot of calls to SvGROW, which almost always add 1
or 2 to a base value, which would not longer be needed.

commit bcc9f60
Author​: Yves Orton <demerphq@​gmail.com>
Date​: Fri Apr 8 21​:25​:20 2016 +0200

More generalized fix for \#127855\, overallocate in SvGROW and not

just sv_grow()

If we overallocate in SvGROW\(\) and not just sv\_grow\(\) we can ensure
we have more cases where we can COW\. We need to ensure we always
have room for a null and a reference count\.

This is causing the breakage described in
  [perl #127915] $=x~0 segfaults Perl 5.24.0-RC1-2-gde1d2c7

Basically by making SvGROW(sv,len) include a calculation involving
(len+2), the calculation can wrap for large len's, resulting in a small
or skipped realloc and trampled buffers.

I suggest for 5.24 we just revert this commit, and readdress the issue
more thoroughly afterwards. I already have SV growing on my Big List of
Things to Look At Properly Something.

--
"I do not resent criticism, even when, for the sake of emphasis,
it parts for the time with reality".
  -- Winston Churchill, House of Commons, 22nd Jan 1941.

@p5pRT
Copy link
Author

p5pRT commented Apr 28, 2016

From @demerphq

On 28 April 2016 at 11​:20, Dave Mitchell <davem@​iabyn.com> wrote​:

On Fri, Apr 08, 2016 at 09​:42​:22PM +0200, demerphq wrote​:

On 8 April 2016 at 13​:06, demerphq <demerphq@​gmail.com> wrote​:

We are copying the subject string every 0 in the string. I dont know
why yet. Possibly because substr doesnt overallocate.

This was the reason. Perl_sv_setpvn() was only overallocating 1 byte
(for the null), and not 2 bytes which could hold the refcount.

I have pushed three patches for this, the last changes SvGROW() to
overallocate by 2 always under COW. IMO this should fix a lot of
things, and theory should not break anything. If it turns out practice
and theory differ we can revert bcc9f60 and go back to the more
specific fix. Assuming we dont revert in a future commit we can
probably cleannup a lot of calls to SvGROW, which almost always add 1
or 2 to a base value, which would not longer be needed.

commit bcc9f60
Author​: Yves Orton <demerphq@​gmail.com>
Date​: Fri Apr 8 21​:25​:20 2016 +0200

More generalized fix for \#127855\, overallocate in SvGROW and not

just sv_grow()

If we overallocate in SvGROW\(\) and not just sv\_grow\(\) we can ensure
we have more cases where we can COW\. We need to ensure we always
have room for a null and a reference count\.

This is causing the breakage described in
[perl #127915] $=x~0 segfaults Perl 5.24.0-RC1-2-gde1d2c7

Basically by making SvGROW(sv,len) include a calculation involving
(len+2), the calculation can wrap for large len's, resulting in a small
or skipped realloc and trampled buffers.

I suggest for 5.24 we just revert this commit, and readdress the issue
more thoroughly afterwards. I already have SV growing on my Big List of
Things to Look At Properly Something.

That will leave the original bug fixed right? Will we have this
problem in sv_grow() as well?

Yves
--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented May 2, 2016

From @iabyn

On Thu, Apr 28, 2016 at 11​:30​:17AM +0200, demerphq wrote​:

On 28 April 2016 at 11​:20, Dave Mitchell <davem@​iabyn.com> wrote​:

On Fri, Apr 08, 2016 at 09​:42​:22PM +0200, demerphq wrote​:

On 8 April 2016 at 13​:06, demerphq <demerphq@​gmail.com> wrote​:

We are copying the subject string every 0 in the string. I dont know
why yet. Possibly because substr doesnt overallocate.

This was the reason. Perl_sv_setpvn() was only overallocating 1 byte
(for the null), and not 2 bytes which could hold the refcount.

I have pushed three patches for this, the last changes SvGROW() to
overallocate by 2 always under COW. IMO this should fix a lot of
things, and theory should not break anything. If it turns out practice
and theory differ we can revert bcc9f60 and go back to the more
specific fix. Assuming we dont revert in a future commit we can
probably cleannup a lot of calls to SvGROW, which almost always add 1
or 2 to a base value, which would not longer be needed.

commit bcc9f60
Author​: Yves Orton <demerphq@​gmail.com>
Date​: Fri Apr 8 21​:25​:20 2016 +0200

More generalized fix for \#127855\, overallocate in SvGROW and not

just sv_grow()

If we overallocate in SvGROW\(\) and not just sv\_grow\(\) we can ensure
we have more cases where we can COW\. We need to ensure we always
have room for a null and a reference count\.

This is causing the breakage described in
[perl #127915] $=x~0 segfaults Perl 5.24.0-RC1-2-gde1d2c7

Basically by making SvGROW(sv,len) include a calculation involving
(len+2), the calculation can wrap for large len's, resulting in a small
or skipped realloc and trampled buffers.

I suggest for 5.24 we just revert this commit, and readdress the issue
more thoroughly afterwards. I already have SV growing on my Big List of
Things to Look At Properly Something.

That will leave the original bug fixed right? Will we have this
problem in sv_grow() as well?

Turns out that reverting just the one commit introduced a subtle bug with

  my $s = some_string;
  $s = "$s";

Where pp_stringify() and sv_copypv() (which it calls) don't check whether
the target SV is the same as the src. Normally this doesn't matter, apart
from a wasted copy onto itself. However, once SvGROW starts
over-allocating, the SV;s buffer gets realloced, and then the copy is done
from a freed buffer. This doesn't seem to occur wiuth both commits
present, since the general over-allocation in SvGROW() means that $s
already has plenty of head room when sv_setpvn() calls SvGROW(sv, len+1)
and so doen't get realloced. With just the sv_setpvn() fix, it *can* get
re-alloced.

However, I now think both commits are wrong. There is already a mechanism
in sv_grow() to allocate +1 for COW, it just wasn't being triggered
when SvLEN's lower 8 bits were all zeros, as explained by the following
commit, which is part of the branch smoke-me/davem/cow1. This branch​:

* reverts your two commits
* adds a test for $s = "$s"
* adds the commit below.

Rik, I think our two options at this stage are​:

a) Just revert the two commits for 5.24.0. This means that a slowdown
  which appeared in 5.20.0 on /.../g for certain long strings will still
  be present in 5.24.0.
b) merge my branch, which fixes that slowdown, but of course introduces
  the risk of further breakage (unlikely as I would of course think that
  is).

--
A major Starfleet emergency breaks out near the Enterprise, but
fortunately some other ships in the area are able to deal with it to
everyone's satisfaction.
  -- Things That Never Happen in "Star Trek" #13

@p5pRT
Copy link
Author

p5pRT commented May 2, 2016

From @demerphq

On 2 May 2016 at 16​:26, Dave Mitchell <davem@​iabyn.com> wrote​:

On Thu, Apr 28, 2016 at 11​:30​:17AM +0200, demerphq wrote​:

On 28 April 2016 at 11​:20, Dave Mitchell <davem@​iabyn.com> wrote​:

On Fri, Apr 08, 2016 at 09​:42​:22PM +0200, demerphq wrote​:

On 8 April 2016 at 13​:06, demerphq <demerphq@​gmail.com> wrote​:

We are copying the subject string every 0 in the string. I dont know
why yet. Possibly because substr doesnt overallocate.

This was the reason. Perl_sv_setpvn() was only overallocating 1 byte
(for the null), and not 2 bytes which could hold the refcount.

I have pushed three patches for this, the last changes SvGROW() to
overallocate by 2 always under COW. IMO this should fix a lot of
things, and theory should not break anything. If it turns out practice
and theory differ we can revert bcc9f60 and go back to the more
specific fix. Assuming we dont revert in a future commit we can
probably cleannup a lot of calls to SvGROW, which almost always add 1
or 2 to a base value, which would not longer be needed.

commit bcc9f60
Author​: Yves Orton <demerphq@​gmail.com>
Date​: Fri Apr 8 21​:25​:20 2016 +0200

More generalized fix for \#127855\, overallocate in SvGROW and not

just sv_grow()

If we overallocate in SvGROW\(\) and not just sv\_grow\(\) we can ensure
we have more cases where we can COW\. We need to ensure we always
have room for a null and a reference count\.

This is causing the breakage described in
[perl #127915] $=x~0 segfaults Perl 5.24.0-RC1-2-gde1d2c7

Basically by making SvGROW(sv,len) include a calculation involving
(len+2), the calculation can wrap for large len's, resulting in a small
or skipped realloc and trampled buffers.

I suggest for 5.24 we just revert this commit, and readdress the issue
more thoroughly afterwards. I already have SV growing on my Big List of
Things to Look At Properly Something.

That will leave the original bug fixed right? Will we have this
problem in sv_grow() as well?

Turns out that reverting just the one commit introduced a subtle bug with

my $s = some\_string;
$s = "$s";

Where pp_stringify() and sv_copypv() (which it calls) don't check whether
the target SV is the same as the src. Normally this doesn't matter, apart
from a wasted copy onto itself. However, once SvGROW starts
over-allocating, the SV;s buffer gets realloced, and then the copy is done
from a freed buffer. This doesn't seem to occur wiuth both commits
present, since the general over-allocation in SvGROW() means that $s
already has plenty of head room when sv_setpvn() calls SvGROW(sv, len+1)
and so doen't get realloced. With just the sv_setpvn() fix, it *can* get
re-alloced.

However, I now think both commits are wrong. There is already a mechanism
in sv_grow() to allocate +1 for COW, it just wasn't being triggered
when SvLEN's lower 8 bits were all zeros, as explained by the following
commit, which is part of the branch smoke-me/davem/cow1. This branch​:

* reverts your two commits
* adds a test for $s = "$s"
* adds the commit below.

While I am fine with whatever you think is best for right now I have
some concerns that I really feel do need to be addressed.

Reverting would return us to the following definition​:

# define SvGROW(sv,len) (SvLEN(sv) < (len) ? sv_grow(sv,len) : SvPVX(sv))

We compare with the bare 'len' that the caller supplies, which means
that all code that uses SvGROW(sv, wanted_len+1) will produce buffers
which are not COWable. I think that is a lot of code.

It seems to me that it is a mistake to expect the caller to keep track
of the space required for the COW refcount, and to keep track of the
space required for NULL. What happens for instance if we choose to
switch the refcount to a different type, where the required
extra-space might be more than one byte? It seems to me that in a
perfect world we should be able to overallocate however much we choose
and it shouldn't break things.

It also seems odd that logic that causes us to overallocate more often
would produce a segfault. Do you have any insight on that?

Rik, I think our two options at this stage are​:
encounter
a) Just revert the two commits for 5.24.0. This means that a slowdown
which appeared in 5.20.0 on /.../g for certain long strings will still
be present in 5.24.0.
b) merge my branch, which fixes that slowdown, but of course introduces
the risk of further breakage (unlikely as I would of course think that
is).

I am fine with either.

BTW, I think its worth noting the general history of this bug. It is
very very old. For a long time /g in scalar context did not copy its
subject var, because we considered this bug more important than the
bugs that come from not copying the subject var in a match (such as
capture vars being wrong, or even accessing outside of the string).
Later on we introduced COW, and (as I am sure you remember ;-) fixed
the /g in scalar context problem by using COW. But now we are
vulnerable to the same historic performance issues with any string
that is not COWable.

It really seems to me that when COW was added we modified sv_grow()
but failed to update SvGROW() correspondingly, and that possibly
SvGROW() has been serving more than one purpose that just happened to
be well served by a single piece of logic, but that post-COW it needs
to be split in two.

cheers,
Yves
--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented May 2, 2016

From @iabyn

On Mon, May 02, 2016 at 07​:43​:39PM +0200, demerphq wrote​:

Reverting would return us to the following definition​:

# define SvGROW(sv,len) (SvLEN(sv) < (len) ? sv_grow(sv,len) : SvPVX(sv))

We compare with the bare 'len' that the caller supplies, which means
that all code that uses SvGROW(sv, wanted_len+1) will produce buffers
which are not COWable. I think that is a lot of code.

Something creating a new string, "abc"\0 say, will call SvGROW(sv, 3+1)
which will call sv_grow(), which will bump the size to (at least) 3+1+1,
giving space for the COW refcount.

Something extending an existing string will cause sv_grow() to be called
(and thus space given for the COW refcount), unless there is already spare
space in the PVX buffer. Only in the case where the extra string content's
length exactly matches the spare space in the PVX will there not be room
for a future COW refcount.

So I think this is case is (relatively) rare.

It seems to me that it is a mistake to expect the caller to keep track
of the space required for the COW refcount.

We're not (in general). sv_grow does it no behalf of the caller.

, and to keep track of the space required for NULL.

It's clearly documented since 5.004 that SvGROW's caller has to do the +1,
so I think we're ok there.

It also seems odd that logic that causes us to overallocate more often
would produce a segfault. Do you have any insight on that?

It's specifically the 'x' operator, which is where it easy to grow very
large strings. If you do ('a' x ~0) then adding extra to the length when
its already max causes the value ti wrap and only a small buffer to be
allocated. pp_repeat then fills what it thinks is a successfully allocated
large buffer with 'a's and SEGVs. In particular the code in pp_repeat
expects that itd safe to pass a max int as an arg to SvGROW() without
it wrapping and failing. Obviously pp_repeat's expectations could be
modified, or an SvGROW which does +1 could be written more defensively
to not wrap.

Rik, I think our two options at this stage are​:
encounter
a) Just revert the two commits for 5.24.0. This means that a slowdown
which appeared in 5.20.0 on /.../g for certain long strings will still
be present in 5.24.0.
b) merge my branch, which fixes that slowdown, but of course introduces
the risk of further breakage (unlikely as I would of course think that
is).

I am fine with either.

I vote for (b), if that's ok with Rik.

BTW, I think its worth noting the general history of this bug. It is
very very old. For a long time /g in scalar context did not copy its
subject var, because we considered this bug more important than the
bugs that come from not copying the subject var in a match (such as
capture vars being wrong, or even accessing outside of the string).
Later on we introduced COW, and (as I am sure you remember ;-) fixed
the /g in scalar context problem by using COW. But now we are
vulnerable to the same historic performance issues with any string
that is not COWable.

It really seems to me that when COW was added we modified sv_grow()
but failed to update SvGROW() correspondingly, and that possibly
SvGROW() has been serving more than one purpose that just happened to
be well served by a single piece of logic, but that post-COW it needs
to be split in two.

For post-5.24, I want to consider more radical options, for example
using a mechanism similar to OOK (or even to hijack OOK) to store
the COW refcount at the beginning of the string then make PVX() point to
the second byte of the malloced buffer.

--
My get-up-and-go just got up and went.

@p5pRT
Copy link
Author

p5pRT commented May 2, 2016

From @cpansprout

On Mon May 02 11​:44​:13 2016, davem wrote​:

On Mon, May 02, 2016 at 07​:43​:39PM +0200, demerphq wrote, quoting Dave Mitchell​:

b) merge my branch, which fixes that slowdown, but of course
introduces
the risk of further breakage (unlikely as I would of course
think that
is).

I am fine with either.

I vote for (b), if that's ok with Rik.

I think (b) is a good idea. The patch looks safe to me.

For post-5.24, I want to consider more radical options, for example
using a mechanism similar to OOK (or even to hijack OOK) to store
the COW refcount at the beginning of the string then make PVX() point
to
the second byte of the malloced buffer.

That was Nicholas Clark’s original proposal, but others pointed out that on some systems word-aligned string comparison was much faster than non-aligned, and that on such systems mallocked strings would be aligned. That was why I chose the end of the buffer.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented May 3, 2016

From @demerphq

On 2 May 2016 at 20​:43, Dave Mitchell <davem@​iabyn.com> wrote​:

On Mon, May 02, 2016 at 07​:43​:39PM +0200, demerphq wrote​:

Reverting would return us to the following definition​:

# define SvGROW(sv,len) (SvLEN(sv) < (len) ? sv_grow(sv,len) : SvPVX(sv))

We compare with the bare 'len' that the caller supplies, which means
that all code that uses SvGROW(sv, wanted_len+1) will produce buffers
which are not COWable. I think that is a lot of code.

Something creating a new string, "abc"\0 say, will call SvGROW(sv, 3+1)
which will call sv_grow(), which will bump the size to (at least) 3+1+1,
giving space for the COW refcount.

Something creating a new string "abc" wont call SvGROW at all. It will
call something like newSV().

Something extending an existing string will cause sv_grow() to be called
(and thus space given for the COW refcount), unless there is already spare
space in the PVX buffer. Only in the case where the extra string content's
length exactly matches the spare space in the PVX will there not be room
for a future COW refcount.

I don't see how you say that. Any time they use either of the last two
bytes available the string will become unCOWable.

SvLEN=10, Requested = 9, Uncowable.
SVLEN=10, Requested = 10, Uncowable.

So I think this is case is (relatively) rare.

I want to see data that agrees. I think its a lot less rare than you think.

I think that lots of code does concatenate ops on SV's and will thus
occasionally produce unCOWable strings. IMO we should NEVER produce an
unCOWable string.

It seems to me that it is a mistake to expect the caller to keep track
of the space required for the COW refcount.

We're not (in general). sv_grow does it no behalf of the caller.

No. SvGROW() doesnt know about it, so it forces the user to know about it.

, and to keep track of the space required for NULL.

It's clearly documented since 5.004 that SvGROW's caller has to do the +1,
so I think we're ok there.

Just because we documented it doesnt mean its smart or right. I dont
think its either.

It also seems odd that logic that causes us to overallocate more often
would produce a segfault. Do you have any insight on that?

It's specifically the 'x' operator, which is where it easy to grow very
large strings. If you do ('a' x ~0) then adding extra to the length when
its already max causes the value ti wrap and only a small buffer to be
allocated.

Ok, thanks. So then we can fix SvGROW to not do that.

pp_repeat then fills what it thinks is a successfully allocated
large buffer with 'a's and SEGVs. In particular the code in pp_repeat
expects that itd safe to pass a max int as an arg to SvGROW() without
it wrapping and failing. Obviously pp_repeat's expectations could be
modified, or an SvGROW which does +1 could be written more defensively
to not wrap.

I think we should do the latter.

It occurs to me that the wrapping problem is a good example of why the
caller should not have to account for the trailing null, as it means
they have to manage the wrapping problem *as well*, and as far as I
can tell none of our code that uses SvGROW does.

Rik, I think our two options at this stage are​:
encounter
a) Just revert the two commits for 5.24.0. This means that a slowdown
which appeared in 5.20.0 on /.../g for certain long strings will still
be present in 5.24.0.
b) merge my branch, which fixes that slowdown, but of course introduces
the risk of further breakage (unlikely as I would of course think that
is).

I am fine with either.

I vote for (b), if that's ok with Rik.

BTW, I think its worth noting the general history of this bug. It is
very very old. For a long time /g in scalar context did not copy its
subject var, because we considered this bug more important than the
bugs that come from not copying the subject var in a match (such as
capture vars being wrong, or even accessing outside of the string).
Later on we introduced COW, and (as I am sure you remember ;-) fixed
the /g in scalar context problem by using COW. But now we are
vulnerable to the same historic performance issues with any string
that is not COWable.

It really seems to me that when COW was added we modified sv_grow()
but failed to update SvGROW() correspondingly, and that possibly
SvGROW() has been serving more than one purpose that just happened to
be well served by a single piece of logic, but that post-COW it needs
to be split in two.

For post-5.24, I want to consider more radical options, for example
using a mechanism similar to OOK (or even to hijack OOK) to store
the COW refcount at the beginning of the string then make PVX() point to
the second byte of the malloced buffer.

FWIW, I am not a fan of this idea. I was looking into making the
refcount a varint, and using as much space at the end as possible for
the refcount so that we can reference a string more than 255 times.
Consider when we hit the max count, we could do a realloc to add some
bytes for a larger refcount. If the realloc returns the original
pointer we are good. If it doesnt then we do what we currently do.

I would be less against the idea if we were to raise the space
allocation for the refcount so it could hold say an integer.

I really do not consider the debate about SvGROW settled yet. I really
do not think it works right now, and I feel like we are sweeping it
under the rug here.

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented May 3, 2016

From @demerphq

On 3 May 2016 at 08​:11, demerphq <demerphq@​gmail.com> wrote​:

I really do not consider the debate about SvGROW settled yet. I really
do not think it works right now, and I feel like we are sweeping it
under the rug here.

Er, I apologise, that could be interpreted it differently than I intended.

I think its fine to ignore this for now, and settle it in a future release.

I just feel like we should understand things better before we say that
SvGROW is correct in the face of COW, and the more I think about this
bug the more I think we can not say that. I just spent a while
reviewing a lot of SvGROW calls, and the more I review the more
uncomfortable I am about whether we are doing things right or not.

Anyway, for now lets just revert my patches and go with Daves solution
to the original bug.

cheers,
Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented May 3, 2016

From @iabyn

On Mon, May 02, 2016 at 02​:01​:21PM -0700, Father Chrysostomos via RT wrote​:

For post-5.24, I want to consider more radical options, for example
using a mechanism similar to OOK (or even to hijack OOK) to store
the COW refcount at the beginning of the string then make PVX() point
to
the second byte of the malloced buffer.

That was Nicholas Clark’s original proposal, but others pointed out that
on some systems word-aligned string comparison was much faster than
non-aligned, and that on such systems mallocked strings would be
aligned. That was why I chose the end of the buffer.

Ah, that's an interesting point.

--
print+qq&$}$"$/$s$,$a$d$g$s$@​$.$q$,$​:$.$q$^$,$@​$a$$;$.$q$m&if+map{m,^\d{0\,},,${$​::{$'}}=chr($"+=$&amp;||1)}q&10m22,42}6​:17a22.3@​3;^2dg3q/s"&=~m*\d\*.*g

@p5pRT
Copy link
Author

p5pRT commented May 3, 2016

From @iabyn

[ to be read in the context of your followup-email, that we're now just
discussing post-5.24 stuff ]

On Tue, May 03, 2016 at 08​:11​:04AM +0200, demerphq wrote​:

On 2 May 2016 at 20​:43, Dave Mitchell <davem@​iabyn.com> wrote​:

On Mon, May 02, 2016 at 07​:43​:39PM +0200, demerphq wrote​:

Reverting would return us to the following definition​:

# define SvGROW(sv,len) (SvLEN(sv) < (len) ? sv_grow(sv,len) : SvPVX(sv))

We compare with the bare 'len' that the caller supplies, which means
that all code that uses SvGROW(sv, wanted_len+1) will produce buffers
which are not COWable. I think that is a lot of code.

Something creating a new string, "abc"\0 say, will call SvGROW(sv, 3+1)
which will call sv_grow(), which will bump the size to (at least) 3+1+1,
giving space for the COW refcount.

Something creating a new string "abc" wont call SvGROW at all. It will
call something like newSV().

And newSV() calls sv_grow() to allocate an initial buffer. The point I was
making is that most "normal" string creations, automatically get the
extra +1 for the cow ref in addition to the +1 NUL.

Something extending an existing string will cause sv_grow() to be called
(and thus space given for the COW refcount), unless there is already spare
space in the PVX buffer. Only in the case where the extra string content's
length exactly matches the spare space in the PVX will there not be room
for a future COW refcount.

I don't see how you say that. Any time they use either of the last two
bytes available the string will become unCOWable.

SvLEN=10, Requested = 9, Uncowable.
SVLEN=10, Requested = 10, Uncowable.

Consider the following PVX buffers, where X represents an spare byte in
the allocated buffer​:

"abc\0"
  No spare space in the buffer; any attempt to concatenate 1 or more
  bytes will trigger a call to sv_grow(), which will allocate a spare
  byte for COW.

"abc\0X"
  Concatenating 1 byte will convert the string to "abcd\0", with no
  space for COW.
  Concatenating 2 or more bytes will trigger a call to sv_grow(), which
  will allocate a spare byte for COW.

"abc\0XX"
  Concatenating 1 byte will still leave a spare byte for COW.
  Concatenating 2 bytes will convert the string to "abcde\0", with no
  space for COW.
  Concatenating 3 or more bytes will trigger a call to sv_grow(), which
  will allocate a spare byte for COW.

So in general only concatenating a number of bytes exactly equal to the
number of spare bytes in the buffer will may the string non-COWable.

For example, with this code​:

  use Devel​::Peek;
  my $s = 'x' x $ARGV[0];
  Dump $s;
  $s .= $ARGV[1];
  Dump $s;

$ perl5238 ~/tmp/p 7 '' 2>&1 | egrep 'CUR|LEN'
  CUR = 7
  LEN = 10
  CUR = 7
  LEN = 10
$ perl5238 ~/tmp/p 7 'x' 2>&1 | egrep 'CUR|LEN'
  CUR = 7
  LEN = 10
  CUR = 8
  LEN = 10
$ perl5238 ~/tmp/p 7 'xx' 2>&1 | egrep 'CUR|LEN'
  CUR = 7
  LEN = 10
  CUR = 9
  LEN = 10
$ perl5238 ~/tmp/p 7 'xxx' 2>&1 | egrep 'CUR|LEN'
  CUR = 7
  LEN = 10
  CUR = 10
  LEN = 24

It's only the 'xx' concat that leaves an uncowable SV.

I think that lots of code does concatenate ops on SV's and will thus
occasionally produce unCOWable strings. IMO we should NEVER produce an
unCOWable string.

Its a worthy goal, we just need to decide whether the means to achieve it
are reasonable.

, and to keep track of the space required for NULL.

It's clearly documented since 5.004 that SvGROW's caller has to do the +1,
so I think we're ok there.

Just because we documented it doesnt mean its smart or right. I dont
think its either.

Well we're kind of stuck with it. We could introduce a new SvGROW0 macro
say, that does its own +1, then change core to use it, and define SvGROW
in terms of it. At least that way any wrap-arounds could be handled
consistently within the macro rather than requiring every caller to check.

FWIW, I am not a fan of this idea. I was looking into making the
refcount a varint, and using as much space at the end as possible for
the refcount so that we can reference a string more than 255 times.
Consider when we hit the max count, we could do a realloc to add some
bytes for a larger refcount. If the realloc returns the original
pointer we are good. If it doesnt then we do what we currently do.

I would be less against the idea if we were to raise the space
allocation for the refcount so it could hold say an integer.

I think the cases where a string buffer is COW-shared more than 255 times
are fairly rare, and the cost is only that COW copies one in 256 times.
I'm not sure that the extra complexity of a variable length ref count
is worth it.

One thing I definitely want to do post 5.24 is introduce a startup-time
probe of the OS's malloc/realloc algorithm; from what I've seen when I
first raised this on the list and got people to run a little test program
on various platforms, is that all small allocs are internally of size A +
Bi for i=0,1,2.., regardless of the request size, but where A and B vary
across platforms and are determined when the perl executable first starts.
For example on my 64-git glibc, A=24 and B=32 (IIRC). Knowing A and B
means that sv_grow(), av_extend(), EXTEND() etc can all over-allocate in
an intelligent way.

--
All wight. I will give you one more chance. This time, I want to hear
no Wubens. No Weginalds. No Wudolf the wed-nosed weindeers.
  -- Life of Brian

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants