Slowdown in 5.10.0 regexes with atomic patterns #9751

p5pRT · 2009-05-28T15:07:40Z

Migrated from rt.perl.org#66106 (status was 'open')

Searchable as RT66106$

p5pRT · 2009-05-28T15:07:41Z

From @nwc10

Moritz Lenz mailed p5p in 479DE845.20803@casella.verplant.org
http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2008-01/msg01314.html

Following up on a perlmonks thread: https://www.perlmonks.org/?node_id=664545

Summary: some regexps with atomic groups are much slower on 5.10.0 than
on 5.8.

Dave notes:

+ve superlinear cache issue

p5pRT · 2009-06-18T17:56:32Z

From p5p@spam.wizbit.be

Bisect from the thread: (from andreas)

----Program----

use strict;
use warnings;
my $str = "bea" x 100;
my $re = qr/(?:be|ea|a)/;
use Benchmark qw(timeit);
use Time::HiRes qw(time);
my $start = time;
timeit(500, sub { die if $str =~ m/(?>$re+)\d/ });
my $x = time - $start;
print $x, "\n";

----Output of .../p3BBpDI/perl-5.9.3@27902/bin/perl----
0.653790950775146

----EOF ($?='0')----
----Output of .../pcL5RMw/perl-5.9.3@27903/bin/perl----
23.4941339492798

----EOF ($?='0')----

Change 27903 by davem@davem-cyril on 2006/04/19 13:56:07

regmatch(): make IFMATCH use PUSH_STACK rather than fake
recursion

p5pRT · 2009-06-18T17:56:32Z

p5p@spam.wizbit.be - Status changed from 'new' to 'open'

p5pRT · 2009-07-11T09:04:13Z

From @schwern

This is a duplicate of 58280.

p5pRT · 2009-07-14T10:37:13Z

From @nwc10

On Sat Jul 11 02:04:13 2009, schwern wrote:

This is a duplicate of 58280.

It does seem to be related. Comparing a perl built immediately before
change 27903 with 27903, I see slowdowns from both bug's test cases.

$ /home/nick/Sandpit/snap5.9.x-perl-5.8.0-7777-g20f9f80/bin/perl5.9.4 58280
Rate regex regex_opt eq
regex 426178/s -- -3% -79%
regex_opt 441264/s 4% -- -78%
eq 1998449/s 369% 353% --
$ /home/nick/Sandpit/snap5.9.x-perl-5.8.0-7778-gdad7902/bin/perl5.9.4 58280
Rate regex_opt regex eq
regex_opt 450584/s -- -10% -89%
regex 503130/s 12% -- -88%
eq 4251069/s 843% 745% --

$ /home/nick/Sandpit/snap5.9.x-perl-5.8.0-7777-g20f9f80/bin/perl5.9.4 66106
0.914209127426147
$ /home/nick/Sandpit/snap5.9.x-perl-5.8.0-7778-gdad7902/bin/perl5.9.4 66106
17.3551239967346

(perls admittedly not ideal, as built with -DDEBUGGING and without any
optimisation. To repeat this, you'll need to ./Configure and fix things
up roughly like this:

./Configure ... -Dnoextensions=IPC/SysV && perl -i -ne 'print unless
/command-line/' makefile x2p/makefile && make all

change 27093 is
http://perl5.git.perl.org/perl.git/commit/dad790286e318c5c7f4b6ccd52b4fd512c87c763

regmatch(): make IFMATCH use PUSH_STACK rather than fake recursion

)

Nicholas Clark

p5pRT · 2009-07-15T15:02:40Z

From @hvds

"Nicholas Clark via RT" <perlbug-followup@perl.org> wrote:
:On Sat Jul 11 02:04:13 2009, schwern wrote:
:> This is a duplicate of 58280.
:
:It does seem to be related. Comparing a perl built immediately before
:change 27903 with 27903, I see slowdowns from both bug's test cases.
:
:
:$ /home/nick/Sandpit/snap5.9.x-perl-5.8.0-7777-g20f9f80/bin/perl5.9.4 58280
: Rate regex regex_opt eq
:regex 426178/s -- -3% -79%
:regex_opt 441264/s 4% -- -78%
:eq 1998449/s 369% 353% --
:$ /home/nick/Sandpit/snap5.9.x-perl-5.8.0-7778-gdad7902/bin/perl5.9.4 58280
: Rate regex_opt regex eq
:regex_opt 450584/s -- -10% -89%
:regex 503130/s 12% -- -88%
:eq 4251069/s 843% 745% --
:
:$ /home/nick/Sandpit/snap5.9.x-perl-5.8.0-7777-g20f9f80/bin/perl5.9.4 66106
:0.914209127426147
:$ /home/nick/Sandpit/snap5.9.x-perl-5.8.0-7778-gdad7902/bin/perl5.9.4 66106
:17.3551239967346
:
:(perls admittedly not ideal, as built with -DDEBUGGING and without any
:optimisation. To repeat this, you'll need to ./Configure and fix things
:up roughly like this:
:
:./Configure ... -Dnoextensions=IPC/SysV && perl -i -ne 'print unless
:/command-line/' makefile x2p/makefile && make all
:
:change 27093 is
:http://perl5.git.perl.org/perl.git/commit/dad790286e318c5c7f4b6ccd52b4fd512c87c763
:
:regmatch(): make IFMATCH use PUSH_STACK rather than fake recursion
:
:)

I investigated this a bit, using the test cases:

   perl -Dr -we '$s="aab"x shift;die if $s=~/(?:aa|[bc])+\d/' 4
   perl -Dr -we '$s="aab"x shift;die if $s=~/(?>(?:aa|[bc])+)\d/' 4

.. and the two perls 5.8.8 and blead.[1]

The -Dr differences show that in all 4 cases we turn on superlinear caching:
whilem: Detected a super-linear match, switching on caching...
but in the slow case, we never spot a match in the cache even when we should.
I would guess the writer and reader of the cache are for some reason looking
in different places, but I have not yet determined that.

I'll look through the code to try to understand why this fails, but
I don't have any time left today so I'm sharing what I've found so far.

Hugo

[1] The artificial '[bc]' is to avoid TRIE conversion, so as to keep the
two perl versions acting as similarly as possible.[2]

[2] In this case tries are also a big lose: 'aa|b' takes 50-100% longer
than 'aa|[bc]' in all 4 cases. I'll open a bug for this - I think for 5.12
there could be value in attempting to develop heuristics to determine
whether BRANCH or TRIE would be faster, so as to avoid the conversion
where it would hurt.

p5pRT · 2009-07-16T19:12:32Z

From @hvds

hv@crypt.org wrote:
:I investigated this a bit, using the test cases:
: perl -Dr -we '$s="aab"x shift;die if $s=/(?:aa|[bc])+\d/' 4
: perl -Dr -we '$s="aab"x shift;die if $s=/(?>(?:aa|[bc])+)\d/' 4
:.. and the two perls 5.8.8 and blead.[1]
:
:The -Dr differences show that in all 4 cases we turn on superlinear caching:
: whilem: Detected a super-linear match, switching on caching...
:but in the slow case, we never spot a match in the cache even when we should.
:I would guess the writer and reader of the cache are for some reason looking
:in different places, but I have not yet determined that.

Ah no, it's a bit trickier than that.

I've confused myself quite a bit, so some of the below may be wrong.
But this is what I *think* I've found.

1. The superlinear cache is an optimisation specific to a single opcode,
CURLYX, which handles subpatterns of the form /A{min,max}B/.

1a. In the first program the B pattern is /\d/, while in the second it
is /<nothing>/ due to the bracketing of the (?>...). But that appears to
be a red herring.

2. Once superlinearity is detected, and the cache is enabled, cache bits
are set each time on backtracking from a failure.

3. When wrapped in the cut operator (?>...), the backtracking doesn't
happen, so failures never get to be recorded in the cache.

This is why the second program exhibits exponential behaviour in blead.

4. In 5.8.8, we supported caching of either failure or success - we
picked whichever we saw first as the type to cache. In some manner
I don't understand, the second program caches successes under 5.8.8,
and thus avoids the slowdown we get in blead.

5. That dual-mode cache was removed in patch #28571), because as far
as we could tell it was always either wrong or useless.

I don't have a clear idea how to fix this: I feel the right answer may
be to put the cache-setting logic also into the SUCCEED stack unwinding,
but as currently written I think that would only be possible by going
with the slower DEBUG version of the unwinding code.

Dave, I think most of the current version of this code is yours - does
this make sense?

Hugo

p5pRT · 2009-12-16T02:51:13Z

From @obra

~dupe of 58280. Removing as a 5.12 blocker.

toddr · 2020-02-04T22:44:16Z

This continues to be the case in 5.30. Is this something still worth fixing @iabyn or @hvds ?

$>perl foo
            Rate   atomic1 posessive   atomic2    normal
atomic1    367/s        --       -0%      -96%      -96%
posessive  369/s        0%        --      -96%      -96%
atomic2   8253/s     2146%     2139%        --      -14%
normal    9635/s     2522%     2513%       17%        --

toddr · 2020-02-04T22:44:46Z

#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use Benchmark qw(cmpthese);

my $str = "bea" x 100;
my $re = qr/(?:be|ea|a)/;
cmpthese(-2, {
    atomic1   => sub { die if $str =~ m/(?>$re+)\d/ },
    atomic2   => sub { die if $str =~ m/(?>$re)+\d/ },
    normal    => sub { die if $str =~ m/$re+\d/ },
    posessive => sub { die if $str =~ m/$re++\d/ },
});
__END__

demerphq · 2020-02-05T04:02:17Z

On Tue, 4 Feb 2020 at 23:44, Todd Rinaldo ***@***.***> wrote: #!/usr/bin/perluse strict;use warnings;use 5.010;use Benchmark qw(cmpthese); my $str = "bea" x 100;my $re = qr/(?:be|ea|a)/; cmpthese(-2, { atomic1 => sub { die if $str =~ m/(?>$re+)\d/ }, atomic2 => sub { die if $str =~ m/(?>$re)+\d/ }, normal => sub { die if $str =~ m/$re+\d/ }, posessive => sub { die if $str =~ m/$re++\d/ }, }); Before we can address this ticket we need to remove the confusion from it.

Possessive and atomic1 are literally identical. They produce exactly the same regexp program, any observed difference in their performance is a timing artifact or a byproduct of this code recompiling the regex over and over during the benchmark. So that leaves three patterns to consider. normal, atomic1, normal, atomic2. The next thing to do is run them under C<use re 'debug';> and compare their output. $ perl re.pl normal > normal.out 2>&1 $ perl re.pl atomic1 > atomic1.out 2>&1 $ perl re.pl atomic2 > atomic2.out 2>&1 To do this it helps to pass the output thorugh a normalizing filter so that irrelevant differences dont confuse things: perl -lpe's/$\d+$//; s/\d+://; s/\|\s+([A-Z])/|$1/; s/\|\s*\d+\|/ \| /;' normal.out > normal.cln perl -lpe's/$\d+$//; s/\d+://; s/\|\s+([A-Z])/|$1/; s/\|\s*\d+\|/ \| /;' atomic1.out > atomic1.cln perl -lpe's/$\d+$//; s/\d+://; s/\|\s+([A-Z])/|$1/; s/\|\s*\d+\|/ \| /;' atomic2.out > atomic2.cln Then you can diff. If you look at the three cases normal and atomic2 are very similar. The (?>...) essentially doesn't do anything in the atomic2 pattern, and our benchmark shows that the performance is not very different. atomic2 8253/s normal 9635/s And we can see this in the diff as well: ```diff -Compiling REx "(?^:(?:be|ea|a))+\d" +Compiling REx "(?>(?^:(?:be|ea|a)))+\d" Final program: CURLYX[0]{1,INFTY} + SUSPEND TRIE-EXACT[abe] <be> <ea> <a> + SUCCEED + TAIL WHILEM[1/1] NOTHING POSIXU[\d] END minlen 2 -Matching REx "(?^:(?:be|ea|a))+\d" against "beabeabeabeabeabeabeabeabeabeabeabeabeabeabeabeabeabeabeabea"... +Matching REx "(?>(?^:(?:be|ea|a)))+\d" against "beabeabeabeabeabeabeabeabeabeabeabeabeabeabeabeabeabeabeabea"... 0 <> <beabeabeab> | CURLYX[0]{1,INFTY} 0 <> <beabeabeab> | WHILEM[1/1] | WHILEM: matched 0 out of 1..65535 + 0 <> <beabeabeab> | SUSPEND 0 <> <beabeabeab> | TRIE-EXACT[abe] 0 <> <beabeabeab> | TRIE: State: 1 Accepted: N TRIE: Charid: 1 CP: 62 After State: 2 1 <b> <eabeabeabe> | TRIE: State: 2 Accepted: N TRIE: Charid: 2 CP: 65 After State: 3 @@ -29,16 +33,22 @@ | TRIE: got 1 possible matches | TRIE matched word #1, continuing | TRIE: only one match left, short-circuiting: #1 <be> + 2 <be> <abeabeabea> | SUCCEED + | SUCCEED: subpattern success... ``` So basically the slowdown is from executing the extra SUSPEND/SUCCEED pair, which is pretty much expected. Maybe SUSPEND can be optimized a bit, but there isn't much to see here. So that means we need to compare normal and atomic1, which reveals far too many differences to include in this mail. One of the *key* differences is that in the normal case we see this: WHILEM: (cache) already tried at this position... which we do not see at all from atomic1. So, something about SUSPEND disables the super-linear cache. If there is a bug we should focus on this. On the other hand, these are extremely stupid patterns. They are unanchored, which is part of the reason the super-linear cache is important. Every time we fail to match, the regex engine advances the start point by a character and tries it again. The recursive regexp engine design we use is not capable (without *great* effort which is unlikely to be every provided) to determine that anything that matches (be|ea|a) *cannot* *ever* match \d. Now, we humans are quite capable of this analytical task with almost no effort at all, which reveals that this pattern would be *greatly* helped by a (*SKIP) directive. So, yes, there is a bug here we should figure out. Why does SUSPEND break/disable the super-linear cache? But ultimately the use of SUSPEND/possessive match here is simply not the right thing to do, and doesn't help that much with an unanchored pattern anyway. $ perl re_bm.pl Rate possessive atomic1 atomic2 normal skip possessive 660/s -- -0% -95% -96% -96% atomic1 663/s 0% -- -95% -96% -96% atomic2 13460/s 1939% 1929% -- -15% -26% normal 15761/s 2287% 2276% 17% -- -14% skip 18266/s 2667% 2654% 36% 16% -- $ cat re_bm.pl #!/usr/bin/perl use strict; use warnings; use 5.010; use Benchmark qw(cmpthese); my $str = "bea" x 100; my $re = qr/(?:be|ea|a)/; cmpthese(-2, { atomic1 => sub { die if $str =~ m/(?>$re+)\d/o }, possessive => sub { die if $str =~ m/$re++\d/o }, atomic2 => sub { die if $str =~ m/(?>$re)+\d/o }, normal => sub { die if $str =~ m/$re+\d/o }, skip => sub { die if $str =~ m/$re(*SKIP)\d/o }, }); $ cat re.pl #!/usr/bin/perl use strict; use warnings; use re 'debug'; my $str = "bea" x 100; my $re = qr/(?:be|ea|a)/; my $type= shift || "normal"; print "type= $type\n"; if ($type eq "atomic1") { $str =~ m/(?>$re+)\d/; } elsif ($type eq "atomic2"){ $str =~ m/(?>$re)+\d/; } elsif ($type eq "normal") { $str =~ m/$re+\d/; } elsif ($type eq "possessive") { $str =~ m/$re++\d/; } elsif ($type eq "skip") { $str =~ m/$re+(*SKIP)\d/; } $ cat re_bm_qr.pl #!/usr/bin/perl use strict; use warnings; use 5.010; use Benchmark qw(cmpthese); my $str = "bea" x 100; my $re = qr/(?:be|ea|a)/; my $atomic1= qr/(?>$re+)\d/; my $atomic2= qr/(?>$re)+\d/; my $possessive= qr/$re++\d/; my $normal= qr/$re+\d/; my $skip= qr/$re+(*SKIP)\d/; cmpthese(-2, { atomic1 => sub { die if $str =~ m/$atomic1/o }, possessive => sub { die if $str =~ m/$possessive/o }, atomic2 => sub { die if $str =~ m/$atomic2/o }, normal => sub { die if $str =~ m/$normal/o }, skip => sub { die if $str =~ m/$skip/o }, }); $ perl re_bm_qr.pl Rate possessive atomic1 atomic2 normal skip possessive 682/s -- -0% -95% -96% -99% atomic1 684/s 0% -- -95% -96% -99% atomic2 14166/s 1976% 1972% -- -15% -87% normal 16665/s 2343% 2338% 18% -- -84% skip 106189/s 15464% 15433% 650% 537% -- $ cat re_bm_qr_anch.pl #!/usr/bin/perl use strict; use warnings; use 5.010; use Benchmark qw(cmpthese); my $str = "bea" x 100; my $re = qr/(?:be|ea|a)/; my $atomic1= qr/^(?>$re+)\d/; my $atomic2= qr/^(?>$re)+\d/; my $possessive= qr/^$re++\d/; my $normal= qr/^$re+\d/; my $skip= qr/^$re+(*SKIP)\d/; cmpthese(-2, { atomic1 => sub { die if $str =~ m/$atomic1/o }, possessive => sub { die if $str =~ m/$possessive/o }, atomic2 => sub { die if $str =~ m/$atomic2/o }, normal => sub { die if $str =~ m/$normal/o }, skip => sub { die if $str =~ m/$skip/o }, }); $ perl re_bm_qr_anch.pl Rate atomic2 normal possessive atomic1 skip atomic2 63150/s -- -19% -37% -39% -41% normal 78442/s 24% -- -22% -25% -26% possessive 100604/s 59% 28% -- -4% -6% atomic1 104259/s 65% 33% 4% -- -2% skip 106684/s 69% 36% 6% 2% -- Cheers, yves

…

-- perl -Mre=debug -e "/just|another|perl|hacker/"

khwilliamson · 2022-04-14T12:52:44Z

I'll summarize all of the above with this quote from @demerphq from it

So, yes, there is a bug here we should figure out. Why does SUSPEND
break/disable the super-linear cache?

But ultimately the use of SUSPEND/possessive match here is simply not the
right thing to do, and doesn't help that much with an unanchored pattern
anyway.

demerphq · 2022-04-14T14:42:52Z

Although I probably should eat my words on this:

The recursive regexp engine design we use is not capable (without great effort which is unlikely to be every provided) to determine that anything that matches (be|ea|a) cannot ever match \d

It occurs me to me reading this today that if the code is TRIE'd then we can. We can't generally, in the sense that if 'be', 'ea' and 'a' are arbitrary patterns, but as written we /could/ tell that they cant match \d. Not sure how much it would help, and if the extra compile time would justify it. But in theory provided the alternation can be converted to a TRIE we could.

But it seems to me reading this now that we should be investigating the core point, why does SUSPEND disable the super linear cache?

iabyn · 2022-04-17T08:19:01Z

On Thu, Apr 14, 2022 at 07:43:03AM -0700, Yves Orton wrote: But it seems to me reading this now that we should be investigating the core point, why does SUSPEND disable the super linear cache?

I can't speak for SUSPEND directly, but in general the cache is disabled for the types of pattern elements which can break the assumption that if a particular CURLYX/WHILEM with infinite upper bound failed once at position N in the string, then it will fail at that position any other time we might try. So if we ever reach that combo of CURLYX number (up to 16 of them) and position in the string (as indicated by a bit set in the S-L cache), then there's no point continuing, as we're sure it will fail at some point further along in the pattern. A lot of the "non-regularly" bits of regexes, like \1, breach this assumption, and so their presence disables the cache.

…

-- The crew of the Enterprise encounter an alien life form which is surprisingly neither humanoid nor made from pure energy. -- Things That Never Happen in "Star Trek" #22

p5pRT added Severity Low type-regex labels Oct 18, 2019

toddr added the Closable? We might be able to close this ticket, but we need to check with the reporter label Feb 4, 2020

toddr removed the Closable? We might be able to close this ticket, but we need to check with the reporter label Feb 5, 2020

xenu removed the Severity Low label Dec 29, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Slowdown in 5.10.0 regexes with atomic patterns #9751

Slowdown in 5.10.0 regexes with atomic patterns #9751

p5pRT commented May 28, 2009

p5pRT commented May 28, 2009 •

edited by toddr

p5pRT commented Jun 18, 2009 •

edited by toddr

p5pRT commented Jun 18, 2009

p5pRT commented Jul 11, 2009

p5pRT commented Jul 14, 2009

p5pRT commented Jul 15, 2009 •

edited by hvds

p5pRT commented Jul 16, 2009

p5pRT commented Dec 16, 2009

toddr commented Feb 4, 2020

toddr commented Feb 4, 2020

demerphq commented Feb 5, 2020 via email •

edited by toddr

khwilliamson commented Apr 14, 2022

demerphq commented Apr 14, 2022

iabyn commented Apr 17, 2022 via email

Slowdown in 5.10.0 regexes with atomic patterns #9751

Slowdown in 5.10.0 regexes with atomic patterns #9751

Comments

p5pRT commented May 28, 2009

p5pRT commented May 28, 2009 • edited by toddr

From @nwc10

p5pRT commented Jun 18, 2009 • edited by toddr

From p5p@spam.wizbit.be

p5pRT commented Jun 18, 2009

p5pRT commented Jul 11, 2009

From @schwern

p5pRT commented Jul 14, 2009

From @nwc10

p5pRT commented Jul 15, 2009 • edited by hvds

From @hvds

p5pRT commented Jul 16, 2009

From @hvds

p5pRT commented Dec 16, 2009

From @obra

toddr commented Feb 4, 2020

toddr commented Feb 4, 2020

demerphq commented Feb 5, 2020 via email • edited by toddr

khwilliamson commented Apr 14, 2022

demerphq commented Apr 14, 2022

iabyn commented Apr 17, 2022 via email

p5pRT commented May 28, 2009 •

edited by toddr

p5pRT commented Jun 18, 2009 •

edited by toddr

p5pRT commented Jul 15, 2009 •

edited by hvds

demerphq commented Feb 5, 2020 via email •

edited by toddr