Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed lost on /^(foo|bar|baz)$/ match #9461

Open
p5pRT opened this issue Aug 22, 2008 · 15 comments
Open

Speed lost on /^(foo|bar|baz)$/ match #9461

p5pRT opened this issue Aug 22, 2008 · 15 comments

Comments

@p5pRT
Copy link

p5pRT commented Aug 22, 2008

Migrated from rt.perl.org#58280 (status was 'open')

Searchable as RT58280$

@p5pRT
Copy link
Author

p5pRT commented Aug 22, 2008

From @schwern

Created by @schwern

I was testing out an efficiency claim and discovered that 5.10 and
bleadperl have both lost a lot of performance on /^(foo|bar|baz)$/
type regexes.

#!/usr/bin/perl -w

use strict;
use warnings;

use Benchmark qw(cmpthese);

my $token = "open";

cmpthese(shift || -3, {
  regex => sub { $token =~ m/\A (?​: open | close | read ) \z/xms; },
  regex_opt => sub { $token =~ m/^(?​:open|close|read)$/; },
  eq => sub { $token eq 'open' ||
  $token eq 'close' ||
  $token eq 'read';
  }
});

$ bleadperl ~/tmp/bug.plx 5000000
  Rate regex_opt regex eq
regex_opt 1222494/s -- -5% -85%
regex 1282051/s 5% -- -84%
eq 8064516/s 560% 529% --

$ perl5.8.8 ~/tmp/bug.plx 5000000
  Rate regex_opt regex eq
regex_opt 3048780/s -- -1% -54%
regex 3086420/s 1% -- -53%
eq 6578947/s 116% 113% --

$ perl5.10.0 ~/tmp/bug.plx 5000000
  Rate regex regex_opt eq
regex 1492537/s -- -0% -89%
regex_opt 1497006/s 0% -- -89%
eq 13157895/s 782% 779% --

Can anyone duplicate?

Perl Info

Flags:
    category=core
    severity=low

Site configuration information for perl 5.11.0:

Configured by schwern at Fri Aug 22 15:15:47 PDT 2008.

Summary of my perl5 (revision 5 version 11 subversion 0 patch 34218) configuration:
  Platform:
    osname=darwin, osvers=8.11.1, archname=darwin-thread-multi-2level
    uname='darwin windhund.schwern.org 8.11.1 darwin kernel version 8.11.1: wed oct 10 18:23:28 pdt 2007; root:xnu-792.25.20~1release_i386 i386 i386 macbook1,1 darwin '
    config_args='-de -Dprefix=/usr/local/perl/blead -Dusedevel -Duseithreads -Dccflags=-I/sw/include -Dldflags=-L/sw/lib -Dperladmin=schwern@pobox.com -Dcf_email=schwern@pobox.com -Dmyhostname=windhund -Dmydomain=.schwern.org -Dlibpth=/usr/local/lib /sw/lib /opt/local/lib /usr/lib -Uversiononly -Uinstallusrbinperl'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-I/sw/include -fno-common -DPERL_DARWIN -no-cpp-precomp -fno-strict-aliasing -pipe -I/usr/local/include -I/opt/local/include',
    optimize='-O3',
    cppflags='-no-cpp-precomp -I/sw/include -fno-common -DPERL_DARWIN -no-cpp-precomp -fno-strict-aliasing -pipe -I/usr/local/include -I/opt/local/include'
    ccversion='', gccversion='4.0.1 (Apple Computer, Inc. build 5370)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags ='-L/sw/lib -L/usr/local/lib -L/opt/local/lib'
    libpth=/usr/local/lib /sw/lib /opt/local/lib /usr/lib
    libs=-lgdbm -ldbm -ldb -ldl -lm -lc
    perllibs=-ldl -lm -lc
    libc=/usr/lib/libc.dylib, so=dylib, useshrplib=false, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags='-L/sw/lib -bundle -undefined dynamic_lookup -L/usr/local/lib -L/opt/local/lib'

Locally applied patches:
    DEVEL


@INC for perl 5.11.0:
    /sw/lib/perl5/darwin-thread-multi-2level
    /sw/lib/perl5
    /sw/lib/perl5/darwin
    /usr/local/perl/blead/lib/5.11.0/darwin-thread-multi-2level
    /usr/local/perl/blead/lib/5.11.0
    /usr/local/perl/blead/lib/site_perl/5.11.0/darwin-thread-multi-2level
    /usr/local/perl/blead/lib/site_perl/5.11.0
    /usr/local/perl/blead/lib/site_perl/5.10.0
    /usr/local/perl/blead/lib/site_perl
    .


Environment for perl 5.11.0:
    DYLD_LIBRARY_PATH (unset)
    HOME=/Users/schwern
    LANG (unset)
    LANGUAGE (unset)
    LC_CTYPE=en_US.UTF-8
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/Users/schwern/bin:/usr/local/bin:/opt/local/bin:/opt/local/sbin:/sw/bin:/sw/sbin:/Users/schwern/bin:/usr/local/bin:/usr/local/bin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/X11R6/bin
    PERL5LIB=/sw/lib/perl5:/sw/lib/perl5/darwin
    PERL_BADLANG (unset)
    SHELL=/bin/bash

@p5pRT
Copy link
Author

p5pRT commented Aug 27, 2008

From @smpeters

On Fri Aug 22 15​:28​:29 2008, schwern wrote​:

This is a bug report for perl from schwern@​pobox.com,
generated with the help of perlbug 1.39 running under perl 5.11.0.

-----------------------------------------------------------------
[Please describe your issue here]

I was testing out an efficiency claim and discovered that 5.10 and
bleadperl have both lost a lot of performance on /^(foo|bar|baz)$/
type regexes.

#!/usr/bin/perl -w

use strict;
use warnings;

use Benchmark qw(cmpthese);

my $token = "open";

cmpthese(shift || -3, {
regex => sub { $token =~ m/\A (?​: open | close | read )
\z/xms; },
regex_opt => sub { $token =~ m/^(?​:open|close|read)$/; },
eq => sub { $token eq 'open' ||
$token eq 'close' ||
$token eq 'read';
}
});

$ bleadperl ~/tmp/bug.plx 5000000
Rate regex_opt regex eq
regex_opt 1222494/s -- -5% -85%
regex 1282051/s 5% -- -84%
eq 8064516/s 560% 529% --

$ perl5.8.8 ~/tmp/bug.plx 5000000
Rate regex_opt regex eq
regex_opt 3048780/s -- -1% -54%
regex 3086420/s 1% -- -53%
eq 6578947/s 116% 113% --

$ perl5.10.0 ~/tmp/bug.plx 5000000
Rate regex regex_opt eq
regex 1492537/s -- -0% -89%
regex_opt 1497006/s 0% -- -89%
eq 13157895/s 782% 779% --

Can anyone duplicate?

Yes, I can duplicate. The first is 5.8.8. The second is bleadperl.

Steve Peters
steve@​fisharerojo.org

steve@​picard​:~/perl-current$ perl bench.pl 5000000
  Rate regex_opt regex eq
regex_opt 1886792/s -- -0% -61%
regex 1893939/s 0% -- -61%
eq 4807692/s 155% 154% --
steve@​picard​:~/perl-current$ ./perl -Ilib bench.pl
  Rate regex_opt regex eq
regex_opt 1016789/s -- -2% -85%
regex 1035642/s 2% -- -84%
eq 6569214/s 546% 534% --

@p5pRT
Copy link
Author

p5pRT commented Aug 27, 2008

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Aug 27, 2008

From nospam-abuse@bloodgate.com

Moin,

On Wednesday 27 August 2008 05​:16​:12 Steve Peters via RT wrote​:

On Fri Aug 22 15​:28​:29 2008, schwern wrote​:

Can anyone duplicate?

Yes, I can duplicate. The first is 5.8.8. The second is bleadperl.

What I find interesting is that the "eq" case got faster :)

All the best,

Tels

--
Signed on Wed Aug 27 19​:13​:34 2008 with key 0x93B84C15.
Get one of my photo posters​: http​://bloodgate.com/posters
PGP key on http​://bloodgate.com/tels.asc or per email.

"Never offend people with style when you can offend them with
substance."

  -- Sam Brown

@p5pRT
Copy link
Author

p5pRT commented Dec 29, 2008

From @demerphq

Given the pattern involved at first look it seems this is related to the
TRIE functionality.

With perl 5.8​:

demerphq@​dromedary​:blead​:~/perl$ perl bug58280.pl
  Rate regex regex_opt eq
regex 2254593/s -- -3% -60%
regex_opt 2334045/s 4% -- -58%
eq 5618599/s 149% 141% --

With blead​:

demerphq@​dromedary​:blead​:~/perl$ ./perl -Ilib bug58280.pl
  Rate regex regex_opt eq
regex 705456/s -- -1% -86%
regex_opt 709993/s 1% -- -86%
eq 5202675/s 637% 633% --

This is with the trie optimisation disabled​:
demerphq@​dromedary​:blead​:~/perl$ ./perl -Ilib bug58280.pl
  Rate regex regex_opt eq
regex 1260232/s -- -1% -76%
regex_opt 1275183/s 1% -- -76%
eq 5272268/s 318% 313% --

From this I assume that the majority of the slowdown comes from the
setup time for doing a match using the new non-recursive process and not
from the TRIE.

So then what happens when we change the token being matched? After all
the benchmark is​:

'open'=~/^(open|close|read)$/

Which is a benchmark which is virtually designed to make the old
alternation implementation look good. So what happens when we switch it
to "read"?

With 5.8 we see the expected slowdown​:

demerphq@​dromedary​:blead​:~/perl$ perl bug58280.pl
  Rate regex regex_opt eq
regex 1846732/s -- -11% -23%
regex_opt 2071288/s 12% -- -14%
eq 2406042/s 30% 16% --

With blead we see the expected unchanged performance of a trie​:

demerphq@​dromedary​:blead​:~/perl$ ./perl -Ilib bug58280.pl
  Rate regex regex_opt eq
regex 723692/s -- -1% -62%
regex_opt 730718/s 1% -- -62%
eq 1904342/s 163% 161% --

So now, if we add a bunch more terms to the test​:

#!/usr/bin/perl -w

use strict;
use warnings;

use Benchmark qw(cmpthese);

my $token = "read";

cmpthese(shift || -3, {
  regex => sub { $token =~ m/\A (?​: open | close | foo | bar | baz
| bop | dizzy | blitzen | rocker | mod | punk | read ) \z/xms; },
  regex_opt => sub { $token =~
m/^(?​:open|close|foo|bar|baz|bop|dizzy|blitzen|rocker|mod|punk|read)$/; },
  'eq' => sub {
  $token eq 'open' ||
  $token eq 'close' ||
  $token eq 'foo' ||
  $token eq 'bar' ||
  $token eq 'baz' ||
  $token eq 'bop' ||
  $token eq 'dizzy' ||
  $token eq 'blitzen' ||
  $token eq 'rocker' ||
  $token eq 'mod' ||
  $token eq 'punk' ||
  $token eq 'read';
  }
});

We see perl 5.8's performance continue to fall off​:

demerphq@​dromedary​:blead​:~/perl$ perl bug58280.pl
  Rate eq regex regex_opt
eq 874033/s -- -35% -35%
regex 1348988/s 54% -- -0%
regex_opt 1353136/s 55% 0% --

And we see perl 5.10's performance again stay more or less static​:

demerphq@​dromedary​:blead​:~/perl$ ./perl -Ilib bug58280.pl
  Rate eq regex_opt regex
eq 601754/s -- -14% -14%
regex_opt 696555/s 16% -- -1%
regex 703210/s 17% 1% --

Also, the performance difference of the 'eq' cases suggests that perl as
a whole is a bit slower in perl5.10, this is nothing new afaik, perl has
been getting somewhat slower with each release.

Anyway, my conclusion is that this isnt really a bug. Its a place where
the changes in the regex engine result in a loss of speed, but it is
cherry picked to do so. Sure we could probably do some work to make this
class of pattern not use the trie logic because the number of options
are so low, and because the pattern is fully anchored, but that isn't
going to save much. The end result will afaict still be half as fast
simply because setting up the non-recursive engine is more expensive
than setting up the recursive one. But of course this comes at a trade
off, the new engine wont SEGV, and the new engine won't see a linear
falloff in performance due to large alternations. So we trade speed for
stability and predictable performance. Its hard to say that is a bug.

I view this more as a notice that we could spend some time making regex
startup less expensive, if we can do so without losing features. The
TRIE aspect IMO in this case is a lesser concern. Although i admit it
could be optimised. The compressed scheme we use is massive overkill for
the type of pattern we have in this bug. It is designed with huge
transition tables in mind, not the more common smaller ones that would
come from the pattern in this bug.

Yves

@p5pRT
Copy link
Author

p5pRT commented Jul 8, 2009

From @schwern

FWIW it got worse. perl5.10.1 here is the latest maint5.10.

$ time perl5.10.0 ~/tmp/bench.plx 5000000
  Rate regex regex_opt eq
regex 1366120/s -- -0% -86%
regex_opt 1369863/s 0% -- -86%
eq 9615385/s 604% 602% --

real 0m21.322s
user 0m20.494s
sys 0m0.176s

$ time perl5.10.1 ~/tmp/bench.plx 5000000
  Rate regex_opt regex eq
regex_opt 1075269/s -- -6% -85%
regex 1146789/s 7% -- -84%
eq 7246377/s 574% 532% --

real 0m23.734s
user 0m23.284s
sys 0m0.188s

Even so, I don't think this should be holding up 5.10.1.

@p5pRT
Copy link
Author

p5pRT commented Dec 16, 2009

From @obra

On Tue Jul 07 19​:34​:55 2009, schwern wrote​:

FWIW it got worse. perl5.10.1 here is the latest maint5.10.

$ time perl5.10.0 ~/tmp/bench.plx 5000000
Rate regex regex_opt eq
regex 1366120/s -- -0% -86%
regex_opt 1369863/s 0% -- -86%
eq 9615385/s 604% 602% --

real 0m21.322s
user 0m20.494s
sys 0m0.176s

$ time perl5.10.1 ~/tmp/bench.plx 5000000
Rate regex_opt regex eq
regex_opt 1075269/s -- -6% -85%
regex 1146789/s 7% -- -84%
eq 7246377/s 574% 532% --

real 0m23.734s
user 0m23.284s
sys 0m0.188s

Even so, I don't think this should be holding up 5.10.1.

I'm removing this performance issue as a 5.12 blocker as it's not a
regression from 5.10 and yves makes a reasonable NOTABUG argument in the
ticket history

@p5pRT
Copy link
Author

p5pRT commented May 2, 2012

From @jkeenan

On Tue Jul 07 19​:34​:55 2009, schwern wrote​:

FWIW it got worse. perl5.10.1 here is the latest maint5.10.

FWIW, here are some comparisons​:

Linux/i386​:

$ /usr/bin/perl 58280.pl 5000000
Perl version​: 5.010000
  Rate regex regex_opt eq
regex 1689189/s -- -7% -75%
regex_opt 1818182/s 8% -- -73%
eq 6666667/s 295% 267% --

$ /usr/local/bin/perl5.12.0 58280.pl 5000000
Perl version​: 5.012000
  Rate regex regex_opt eq
regex 1824818/s -- -1% -85%
regex_opt 1845018/s 1% -- -85%
eq 12195122/s 568% 561% --

$ perl 58280.pl 5000000
Perl version​: 5.014000
  Rate regex_opt regex eq
regex_opt 2793296/s -- -6% -77%
regex 2976190/s 7% -- -75%
eq 11904762/s 326% 300% --

Darwin/PPC​:

$ /usr/bin/perl 58280.pl 5000000
Perl version​: 5.008006
  Rate regex regex_opt eq
regex 1059322/s -- -15% -65%
regex_opt 1250000/s 18% -- -59%
eq 3067485/s 190% 145% --

$ /usr/local/bin/perl5.12.0 58280.pl 5000000
Perl version​: 5.012000
  Rate regex regex_opt eq
regex 314663/s -- -4% -86%
regex_opt 326797/s 4% -- -85%
eq 2232143/s 609% 583% --

$ perl 58280.pl 5000000
Perl version​: 5.014002
  Rate regex_opt regex eq
regex_opt 677507/s -- -4% -86%
regex 706215/s 4% -- -85%
eq 4672897/s 590% 562% --

It appears that things improved on both OSes between 5.12 and 5.14.

Does this ticket have to remain open?

Thank you very much.
Jim Keenan

@p5pRT
Copy link
Author

p5pRT commented May 2, 2012

From @iabyn

On Tue, May 01, 2012 at 06​:22​:00PM -0700, James E Keenan via RT wrote​:

On Tue Jul 07 19​:34​:55 2009, schwern wrote​:

FWIW it got worse. perl5.10.1 here is the latest maint5.10.
FWIW, here are some comparisons​:
[snip]
It appears that things improved on both OSes between 5.12 and 5.14.

Does this ticket have to remain open?

It shows that things have got better, but still not as good as under
5.8.x. I think it would be worth doing some profiling to see if there's
anything further that can be tweaked, or whether its just the inevitable
slowdown due to improved utf8 handling, trie code etc etc.

In the meantime I suggest keeping the ticket open.

--
Music lesson​: a symbiotic relationship whereby a pupil's embellishments
concerning the amount of practice performed since the last lesson are
rewarded with embellishments from the teacher concerning the pupil's
progress over the corresponding period.

@p5pRT
Copy link
Author

p5pRT commented Mar 30, 2014

From @khwilliamson

Here are the 5.19.10+ blead numbers on a Linux box​:

  Rate regex regex_opt eq
regex 7650757/s -- -1% -84%
regex_opt 7717895/s 1% -- -84%
eq 48383994/s 532% 527% --

And if we change $token to 'read', we get

  Rate regex regex_opt eq
regex 7748530/s -- -2% -62%
regex_opt 7899967/s 2% -- -61%
eq 20146179/s 160% 155% --

--
Karl Williamson

@p5pRT
Copy link
Author

p5pRT commented Mar 30, 2014

From [Unknown Contact. See original ticket]

Here are the 5.19.10+ blead numbers on a Linux box​:

  Rate regex regex_opt eq
regex 7650757/s -- -1% -84%
regex_opt 7717895/s 1% -- -84%
eq 48383994/s 532% 527% --

And if we change $token to 'read', we get

  Rate regex regex_opt eq
regex 7748530/s -- -2% -62%
regex_opt 7899967/s 2% -- -61%
eq 20146179/s 160% 155% --

--
Karl Williamson

@p5pRT
Copy link
Author

p5pRT commented Apr 1, 2014

From @iabyn

On Sat, Mar 29, 2014 at 08​:29​:46PM -0700, Karl Williamson via RT wrote​:

Here are the 5.19.10+ blead numbers on a Linux box​:

            Rate     regex regex\_opt        eq

regex 7650757/s -- -1% -84%
regex_opt 7717895/s 1% -- -84%
eq 48383994/s 532% 527% --

And if we change $token to 'read', we get

            Rate     regex regex\_opt        eq

regex 7748530/s -- -2% -62%
regex_opt 7899967/s 2% -- -61%
eq 20146179/s 160% 155% --

I think the important thing is how much slower the trie optimisation is for
simple alternations that wouldn't benefit. Comparing a range of perl
versions on the same hardware, I get for the original bug report code
(where $token matches the first alternation)​:

  5.008009 regex 10526316/s
  5.010001 regex 3676471/s
  5.012005 regex 3367003/s
  5.014004 regex 5665722/s
  5.016003 regex 6756757/s
  5.018002 regex 5602241/s
  5.019010 regex 6211180/s

And here's the same with TRIEs disabled (${^RE_TRIE_MAXBUF} = -1)​:

  5.008009 regex 10309278/s
  5.010001 regex 7017544/s
  5.012005 regex 6993007/s
  5.014004 regex 6920415/s
  5.016003 regex 8130081/s
  5.018002 regex 6451613/s
  5.019010 regex 6644518/s

and with TRIEs enabled and $token as 'read', i.e. $token matches the third
alternation​:

  5.008009 regex 8438819/s
  5.010001 regex 3809524/s
  5.012005 regex 3344482/s
  5.014004 regex 5917160/s
  5.016003 regex 6711409/s
  5.018002 regex 5524862/s
  5.019010 regex 6097561/s

and with TRIEs enabled and with Yves modification, i.e. $token matches the
12th alternation​:

  5.008009 regex 4807692/s
  5.010001 regex 3724395/s
  5.012005 regex 3367003/s
  5.014004 regex 5952381/s
  5.016003 regex 6688963/s
  5.018002 regex 5555556/s
  5.019010 regex 6042296/s

What I conclude from the above is that​:

* as with all such micro benchmarks, a certain amount of the variation is
  just due to the luck of the compiler, e.g. instruction cache alignment;
* simple alternations slow down in proportion to the number of
  alternatives that are tried (as you would expect);
* TRIEs match in roughly constant time regardless of which alternation
* matches (as you would expect);
* even non-TRIE alternations are slower in later perls, most likely due to
  all the extra overhead we've accumulated, such as utf8 correctness,
  not blowing the stack on recursion etc.
* TRIEs have got faster in 5.14;
* But TRIEs still have a bigger overhead than plain alts, so it may be
  worth looking at
  * profiling to see if the overhead can be reduced;
  * seeing whether there's a cut-off point where it's not worth
  building a TRIE, e.g. if there are less than N alternatives; and
  there might be other factors such as lengths of the alt strings,
  whether its utf8 etc.

--
The warp engines start playing up a bit, but seem to sort themselves out
after a while without any intervention from boy genius Wesley Crusher.
  -- Things That Never Happen in "Star Trek" #17

@p5pRT
Copy link
Author

p5pRT commented Apr 1, 2014

From @demerphq

On 1 April 2014 17​:40, Dave Mitchell <davem@​iabyn.com> wrote​:

On Sat, Mar 29, 2014 at 08​:29​:46PM -0700, Karl Williamson via RT wrote​:

Here are the 5.19.10+ blead numbers on a Linux box​:

            Rate     regex regex\_opt        eq

regex 7650757/s -- -1% -84%
regex_opt 7717895/s 1% -- -84%
eq 48383994/s 532% 527% --

And if we change $token to 'read', we get

            Rate     regex regex\_opt        eq

regex 7748530/s -- -2% -62%
regex_opt 7899967/s 2% -- -61%
eq 20146179/s 160% 155% --

I think the important thing is how much slower the trie optimisation is for
simple alternations that wouldn't benefit. Comparing a range of perl
versions on the same hardware, I get for the original bug report code
(where $token matches the first alternation)​:

5\.008009 regex 10526316/s
5\.010001 regex  3676471/s
5\.012005 regex  3367003/s
5\.014004 regex  5665722/s
5\.016003 regex  6756757/s
5\.018002 regex  5602241/s
5\.019010 regex  6211180/s

And here's the same with TRIEs disabled (${^RE_TRIE_MAXBUF} = -1)​:

5\.008009 regex 10309278/s
5\.010001 regex  7017544/s
5\.012005 regex  6993007/s
5\.014004 regex  6920415/s
5\.016003 regex  8130081/s
5\.018002 regex  6451613/s
5\.019010 regex  6644518/s

and with TRIEs enabled and $token as 'read', i.e. $token matches the third
alternation​:

5\.008009 regex  8438819/s
5\.010001 regex  3809524/s
5\.012005 regex  3344482/s
5\.014004 regex  5917160/s
5\.016003 regex  6711409/s
5\.018002 regex  5524862/s
5\.019010 regex  6097561/s

and with TRIEs enabled and with Yves modification, i.e. $token matches the
12th alternation​:

5\.008009 regex  4807692/s
5\.010001 regex  3724395/s
5\.012005 regex  3367003/s
5\.014004 regex  5952381/s
5\.016003 regex  6688963/s
5\.018002 regex  5555556/s
5\.019010 regex  6042296/s

What I conclude from the above is that​:

Pretty much the same thing I concluded too. :-)

* as with all such micro benchmarks, a certain amount of the variation is
just due to the luck of the compiler, e.g. instruction cache alignment;
* simple alternations slow down in proportion to the number of
alternatives that are tried (as you would expect);
* TRIEs match in roughly constant time regardless of which alternation
* matches (as you would expect);
* even non-TRIE alternations are slower in later perls, most likely due to
all the extra overhead we've accumulated, such as utf8 correctness,
not blowing the stack on recursion etc.
* TRIEs have got faster in 5.14;
* But TRIEs still have a bigger overhead than plain alts, so it may be
worth looking at
* profiling to see if the overhead can be reduced;

It definitely can be. Unusually we trade speed for space in the trie
algorithm. Something I now regret. Instead of supporting super large
alphabets and running slow we should have just ignored large alphabets
and run fast.

If I get time I will look at changing the trie/aho code to use the
fast representation. I dont have many tuits however.

* seeing whether there's a cut-off point where it's not worth
building a TRIE, e.g. if there are less than N alternatives; and
there might be other factors such as lengths of the alt strings,
whether its utf8 etc.

I also agree here.

Yves

@khwilliamson
Copy link
Contributor

@demerphq, ping

@demerphq
Copy link
Collaborator

I am not sure what to do about this. We use a compressed state transition table that means each transition is more expensive than it could be, but which gracefully handles large alphabet patterns. We actually have two construction methods however, one using a fixed width uncompressed "standard" table, and one using a more complex list representation for large alphabets, at the end of construction we convert both of those formats into a final compressed format.

The idea behind the compressed format is from the "red dragon book". You can see it in action below.

   ./perl -Ilib -Mre=Debug,TRIE,TRIEC,TRIEM -e'qr/(these|are|different|words|that|I|know|and|write|knotted|sentences|from|for|serious|fun)/'
Compiling REx "(these|are|different|words|that|I|know|and|write|knotted|sen"...
  Looking for TRIE'able sequences. Tail node is  57:CLOSE1
  ...
    make_trie start==3, first==3, last==57, tail==57 depth=1
    TRIE(NATIVE): W:15 C:72 Uq:17 Min:1 Max:9
    Compiling trie using table compiler
      Char :    t   h   e   s   a   r   d   i   f   n   w   o   I   k   c   m   u
      State+---------------------------------------------------------------------
         1 :    2   .   .  29   7   .   A   .  32   .  13   .  1A  1B   .   .   . (   8)
         2 :    .   3   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
         3 :    .   .   4   .  18   .   .   .   .   .   .   .   .   .   .   .   . (   2)
         4 :    .   .   .   5   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
         5 :    .   .   6   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
         6 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   1
         7 :    .   .   .   .   .   8   .   .   .  1F   .   .   .   .   .   .   . (   2)
         8 :    .   .   9   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
         9 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   2
         A :    .   .   .   .   .   .   .   B   .   .   .   .   .   .   .   .   . (   1)
         B :    .   .   .   .   .   .   .   .   C   .   .   .   .   .   .   .   . (   1)
         C :    .   .   .   .   .   .   .   .   D   .   .   .   .   .   .   .   . (   1)
         D :    .   .   E   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
         E :    .   .   .   .   .   F   .   .   .   .   .   .   .   .   .   .   . (   1)
         F :    .   .  10   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        10 :    .   .   .   .   .   .   .   .   .  11   .   .   .   .   .   .   . (   1)
        11 :   12   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        12 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   3
        13 :    .   .   .   .   .  21   .   .   .   .   .  14   .   .   .   .   . (   2)
        14 :    .   .   .   .   .  15   .   .   .   .   .   .   .   .   .   .   . (   1)
        15 :    .   .   .   .   .   .  16   .   .   .   .   .   .   .   .   .   . (   1)
        16 :    .   .   .  17   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        17 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   4
        18 :   19   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        19 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   5
        1A :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   6
        1B :    .   .   .   .   .   .   .   .   .  1C   .   .   .   .   .   .   . (   1)
        1C :    .   .   .   .   .   .   .   .   .   .   .  1D   .   .   .   .   . (   1)
        1D :   25   .   .   .   .   .   .   .   .   .  1E   .   .   .   .   .   . (   2)
        1E :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   7
        1F :    .   .   .   .   .   .  20   .   .   .   .   .   .   .   .   .   . (   1)
        20 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   8
        21 :    .   .   .   .   .   .   .  22   .   .   .   .   .   .   .   .   . (   1)
        22 :   23   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        23 :    .   .  24   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        24 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   9
        25 :   26   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        26 :    .   .  27   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        27 :    .   .   .   .   .   .  28   .   .   .   .   .   .   .   .   .   . (   1)
        28 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   A
        29 :    .   .  2A   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        2A :    .   .   .   .   .  38   .   .   .  2B   .   .   .   .   .   .   . (   2)
        2B :   2C   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        2C :    .   .  2D   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        2D :    .   .   .   .   .   .   .   .   .  2E   .   .   .   .   .   .   . (   1)
        2E :    .   .   .   .   .   .   .   .   .   .   .   .   .   .  2F   .   . (   1)
        2F :    .   .  30   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        30 :    .   .   .  31   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        31 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   B
        32 :    .   .   .   .   .  33   .   .   .   .   .  36   .   .   .   .  3D (   3)
        33 :    .   .   .   .   .   .   .   .   .   .   .  34   .   .   .   .   . (   1)
        34 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .  35   . (   1)
        35 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   C
        36 :    .   .   .   .   .  37   .   .   .   .   .   .   .   .   .   .   . (   1)
        37 :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   D
        38 :    .   .   .   .   .   .   .  39   .   .   .   .   .   .   .   .   . (   1)
        39 :    .   .   .   .   .   .   .   .   .   .   .  3A   .   .   .   .   . (   1)
        3A :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .  3B (   1)
        3B :    .   .   .  3C   .   .   .   .   .   .   .   .   .   .   .   .   . (   1)
        3C :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   E
        3D :    .   .   .   .   .   .   .   .   .  3E   .   .   .   .   .   .   . (   1)
        3E :    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   . (   0) W   F
    Alloc: 1242 Orig: 1055 elements, Final:62. Savings of %94.12
    Statecount:3f Lasttrans:3f
      Char : Match Base  Ofs     t   h   e   s   a   r   d   i   f   n   w   o   I   k   c   m   u
      State|-------------------------------------------------------------------------------------------
      #   1|       @  11 + 0[    2   .   .  29   7   .   A   .  32   .  13   .  1A  1B   .   .   .]
      #   2|       @  11 + 1[    .   3   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #   3|       @  1D + 2[    .   .   4   .  18   .   .   .   .   .   .   .   .   .   .   .   .]
      #   4|       @  10 + 3[    .   .   .   5   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #   5|       @  14 + 2[    .   .   6   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #   6| W   1 @   0 
      #   7|       @  1D + 5[    .   .   .   .   .   8   .   .   .  1F   .   .   .   .   .   .   .]
      #   8|       @  16 + 2[    .   .   9   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #   9| W   2 @   0 
      #   A|       @  13 + 7[    .   .   .   .   .   .   .   B   .   .   .   .   .   .   .   .   .]
      #   B|       @  14 + 8[    .   .   .   .   .   .   .   .   C   .   .   .   .   .   .   .   .]
      #   C|       @  18 + 8[    .   .   .   .   .   .   .   .   D   .   .   .   .   .   .   .   .]
      #   D|       @  21 + 2[    .   .   E   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #   E|       @  1F + 5[    .   .   .   .   .   F   .   .   .   .   .   .   .   .   .   .   .]
      #   F|       @  23 + 2[    .   .  10   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  10|       @  1E + 9[    .   .   .   .   .   .   .   .   .  11   .   .   .   .   .   .   .]
      #  11|       @  28 + 0[   12   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  12| W   3 @   0 
      #  13|       @  24 + 5[    .   .   .   .   .  21   .   .   .   .   .  14   .   .   .   .   .]
      #  14|       @  25 + 5[    .   .   .   .   .  15   .   .   .   .   .   .   .   .   .   .   .]
      #  15|       @  25 + 6[    .   .   .   .   .   .  16   .   .   .   .   .   .   .   .   .   .]
      #  16|       @  29 + 3[    .   .   .  17   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  17| W   4 @   0 
      #  18|       @  2D + 0[   19   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  19| W   5 @   0 
      #  1A| W   6 @   0 
      #  1B|       @  25 + 9[    .   .   .   .   .   .   .   .   .  1C   .   .   .   .   .   .   .]
      #  1C|       @  25 + B[    .   .   .   .   .   .   .   .   .   .   .  1D   .   .   .   .   .]
      #  1D|       @  31 + 0[   25   .   .   .   .   .   .   .   .   .  1E   .   .   .   .   .   .]
      #  1E| W   7 @   0 
      #  1F|       @  2C + 6[    .   .   .   .   .   .  20   .   .   .   .   .   .   .   .   .   .]
      #  20| W   8 @   0 
      #  21|       @  2C + 7[    .   .   .   .   .   .   .  22   .   .   .   .   .   .   .   .   .]
      #  22|       @  34 + 0[   23   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  23|       @  33 + 2[    .   .  24   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  24| W   9 @   0 
      #  25|       @  36 + 0[   26   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  26|       @  35 + 2[    .   .  27   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  27|       @  32 + 6[    .   .   .   .   .   .  28   .   .   .   .   .   .   .   .   .   .]
      #  28| W   A @   0 
      #  29|       @  37 + 2[    .   .  2A   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  2A|       @  37 + 5[    .   .   .   .   .  38   .   .   .  2B   .   .   .   .   .   .   .]
      #  2B|       @  3A + 0[   2C   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  2C|       @  3B + 2[    .   .  2D   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  2D|       @  35 + 9[    .   .   .   .   .   .   .   .   .  2E   .   .   .   .   .   .   .]
      #  2E|       @  31 + E[    .   .   .   .   .   .   .   .   .   .   .   .   .   .  2F   .   .]
      #  2F|       @  3F + 2[    .   .  30   .   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  30|       @  3F + 3[    .   .   .  31   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  31| W   B @   0 
      #  32|       @  3E + 5[    .   .   .   .   .  33   .   .   .   .   .  36   .   .   .   0  3D]
      #  33|       @  39 + B[    .   .   .   .   .   .   .   .   .   .   .  34   .   .   .   .   .]
      #  34|       @  36 + F[    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .  35   .]
      #  35| W   C @   0 
      #  36|       @  41 + 5[    .   .   .   .   .  37   .   .   .   .   .   .   .   .   .   .   .]
      #  37| W   D @   0 
      #  38|       @  40 + 7[    .   .   .   .   .   .   .  39   .   .   .   .   .   .   .   .   .]
      #  39|       @  3D + B[    .   .   .   .   .   .   .   .   .   .   .  3A   .   .   .   .   .]
      #  3A|       @  3A +10[    .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .  3B]
      #  3B|       @  48 + 3[    .   .   .  3C   .   .   .   .   .   .   .   .   .   .   .   .   .]
      #  3C| W   E @   0 
      #  3D|       @  43 + 9[    .   .   .   .   .   .   .   .   .  3E   .   .   .   .   .   .   .]
      #  3E| W   F @   0 
    word_info N:(prev,len)= 1:(0,5) 2:(0,3) 3:(0,9) 4:(0,5) 5:(0,4) 6:(0,1) 7:(0,4) 8:(0,3) 9:(0,5) 10:(0,7) 11:(0,9) 12:(0,4) 13:(0,3) 14:(0,7) 15:(0,3)
Stclass Failtable (63 states): 0, 0, 1, 1, 1, 41, 42, 1, 1, 1, 1, 1, 50, 50, 1, 1, 1, 1, 2, 1, 1, 1, 10, 41, 7, 2, 1, 1, 1, 1, 19, 1, 10, 1, 1, 2, 1, 2, 2, 1, 10, 1, 1, 1, 2, 1, 1, 1, 1, 41, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 41, 1, 1
Final program:
   1: OPEN1 (3)
   3:   TRIEC-EXACT<S:1/62 W:15 L:1/9 C:72/17>[Iadfkstw] (57)
        <these> 
        <are> 
        <different> 
        <words> 
        <that> 
        <I> 
        <know> 
        <and> 
        <write> 
        <knotted> 
        <sentences> 
        <from> 
        <for> 
        <serious> 
        <fun> 
  57: CLOSE1 (59)
  59: END (0)
stclass AHOCORASICKC-EXACT<S:1/62 W:15 L:1/9 C:72/17>[Iadfkstw] minlen 1 
Freeing REx: "(these|are|different|words|that|I|know|and|write|knotted|sen"...

We build a map table that converts the unique characters into ids, which are then represented by columns in the tables.

The first table shows the "native construction" (tabular), the second shows the same table in compressed form.

The idea behind the compression is to overlay states on top of each other so they can use each other "fail transitions" for their own "real transitions", by using a "check" value for each transition. When a state looks at a transition that it doesn't own, it knows it is actually a fail transition. This means each state has its own offset into the transition blob. Notice that in the typical state transition table there are many states that only have one transition, in fact besides the initial state most of the state transitions are fails.

This means that where the typical dfa loop would be something like:

while (state && str < str_end) {
   input = *str++;
   state = trans[state * num_states + input];
}

we have code more like this:

while (state && str < str_end) {
   input= chr_map[*str++];
   if (!input) break;
   else input--;
   ofs = state_offset[state];
   if (trans[ofs + input].check != state) break;
   state = trans[ofs+input].state;
}

I would guess this extra bookkeeping accounts for some of the slowdown, but it is also what makes it possible to have patterns with thousands of distinct characters in them without blowing out ram too much. The other thing is each trie opcode has a fairly high setup cost in the regex engine itself. We have to initialize various vars to have the right tables, etc. It is certainly possible we could rethink things so that we can use a more efficient loop, and reduce the setup costs. I am not sure I am up to the task in the short term. But i thought would dump this here for feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants