Slow global pattern match with input from utf8 #13455

Closed
p5pRT opened this issue Dec 4, 2013 · 5 comments

p5pRT commented Dec 4, 2013

Migrated from rt.perl.org#120692 (status was 'resolved')

Searchable as RT120692$

p5pRT commented Dec 4, 2013

From @hknutzen

There is a massive slowdown in global pattern match with Perl v5.19.6.

Create test data with this shell command line:
$ for i in $(seq 1 40000) ; do echo -n ab; done > abab

$ perlbrew use perl-5.18.0
$ /usr/bin/time -f '%Us' perl -Ci -e '$in = <>;while ($in =~ m/\Ga+ba+b/g) {}' abab
0.02s
$ perlbrew use perl-5.19.5
$ /usr/bin/time -f '%Us' perl -Ci -e '$in = <>;while ($in =~ m/\Ga+ba+b/g) {}' abab
0.03s
$ perlbrew use perl-5.19.6
$ /usr/bin/time -f '%Us' perl -Ci -e '$in = <>;while ($in =~ m/\Ga+ba+b/g) {}' abab
3.50s
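
For anyone re-running the report, the steps above can be bundled into one script. This is only a sketch under a few assumptions: `yes | head | tr` stands in for the `echo -n` loop (whose behaviour varies between shells), the match counter and the use of Perl's `times` builtin for user CPU time are additions for illustration, and the Perl step is skipped if no `perl` is on the PATH.

```shell
#!/bin/sh
# Build the input from the report: "ab" repeated 40000 times with no
# trailing newline, 80000 bytes in total.
yes ab | head -n 40000 | tr -d '\n' > abab

# Run the reported loop. Counting matches is an addition so the result is
# visible; user CPU time comes from Perl's own times() rather than from
# /usr/bin/time, which is not installed everywhere.
if command -v perl >/dev/null 2>&1; then
    perl -Ci -e '$in = <>; $n++ while $in =~ m/\Ga+ba+b/g; ($u) = times; printf "%d matches, %.2fs user\n", $n, $u' abab
fi
```

Switching interpreters with `perlbrew use` before each run, as in the report, gives one timing line per version.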

p5pRT commented Dec 5, 2013

From @wolfsage

On Wed, Dec 4, 2013 at 2:20 PM, Heinz Knutzen <perlbug-followup@perl.org> wrote:

> There is a massive slowdown in global pattern match with Perl v5.19.6.
> [...]

Fire number two and you've hit the fix for your previous RT ticket
about a similar issue
(https://rt-archive.perl.org/perl5/Ticket/Display.html?id=120446)

  $ for i in $(seq 1 40000) ; do echo -n ab; done > abab
  $ ../perl-1/Porting/bisect.pl -j 4 --start=v5.19.5 --end=v5.19.6 --timeout=2 -- ./perl -Ci -e '$in = <>; while ($in =~ m/\Ga+ba+b/g) {}' ~/abab
  HEAD is now at 0b2c2a8 RT #120446: /\Ga/ running slowly on long strings
  bad - Timeout when running ./perl -Ci -e $in = <>; while ($in =~ m/\Ga+ba+b/g) {} /home/mhorsfall/abab
  0b2c2a8 is the first bad commit
  commit 0b2c2a8
  Author: David Mitchell <davem@iabyn.com>
  Date: Tue Nov 5 12:29:07 2013 +0000

  RT #120446: /\Ga/ running slowly on long strings

  This commit reverts my commit cf44e60
  (except for the tests), which incorrectly disabled fix-string intuiting
  in the presence of anchored \G. I thought that the old behaviour was
  logically incorrect, but it wasn't (or at least I don't see it that way
  now, and none of the tests I added at the time fail under the old regime).

  :100644 100644 ab2c18ed8013b6051a5e41231f45f3ddda9065f1 7f84fcb43e82cb26301973ac251ce8cb1d5c2ea1 M regexec.c
  :040000 040000 9f53368d22ff2cfb02b58235673cd51c480b352a c4e89c9eae1ea58316d56d6cfbac00ba29ed36f9 M t
  bisect run success
  That took 792 seconds

Sorry Dave :|

-- Matthew Horsfall (alh)
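
The bisect run above treats a revision as bad when the test command exceeds the 2-second limit (the "bad - Timeout when running" line). The same pass/fail idea can be sketched outside the bisect tool with GNU coreutils `timeout`, which exits with status 124 when the limit is hit. The helper name `finishes_within` is hypothetical, and this is not how `Porting/bisect.pl` itself is implemented.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the command finishes within LIMIT seconds,
# fail if GNU timeout had to kill it (exit status 124).
# A command that fails quickly for some other reason still counts as
# "finished", so this checks speed only, not correctness.
finishes_within() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    [ $? -ne 124 ]
}

# Demonstration with sleep standing in for a fast and a slow build.
if finishes_within 2 sleep 0; then echo good; else echo bad; fi
if finishes_within 1 sleep 3; then echo good; else echo bad; fi
```

With a built `./perl`, the one-liner from the report would take the place of `sleep`, for example `finishes_within 2 ./perl -Ci -e '...' abab`.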

p5pRT commented Dec 5, 2013

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented Feb 9, 2014

From @tonycoz

On Thu Dec 05 12:51:48 2013, alh wrote:

> Fire number two and you've hit the fix for your previous RT ticket
> about a similar issue
> (https://rt-archive.perl.org/perl5/Ticket/Display.html?id=120446)
> [...]
> 0b2c2a8 is the first bad commit
> [...]

I think this is fixed with Dave's "fix and refactor re_intuit_start()" merge.

Some timing runs, 3 runs each:

5.18.0:

0.08s
0.08s
0.08s

blead@ v5.19.8-311-g74d8bc5

236.86s
236.89s
236.83s

blead@ v5.19.8-430-g3a3ee36

0.10s
0.10s
0.10s

So it's a little slower, but not the massive slowdown that this bug refers to.

Tony
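
To put the timings above in proportion, the ratios against the 5.18.0 baseline can be worked out directly. This is a small sketch using `awk` for the arithmetic; the helper name `ratio` is hypothetical, and the inputs are the first run of each set as posted.

```shell
#!/bin/sh
# Hypothetical helper: print how many times slower A is than B.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

ratio 236.86 0.08   # blead before the fix vs 5.18.0
ratio 0.10 0.08     # blead after the fix vs 5.18.0
```

On these numbers the regressed blead is roughly 3000 times slower than 5.18.0, while the fixed blead is about a quarter slower, which matches the "a little slower" reading.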

p5pRT commented Feb 10, 2014

@iabyn - Status changed from 'open' to 'resolved'

p5pRT closed this as completed Feb 10, 2014