A multiline regex that starts with /^/m is much slower than the corresponding one that starts with /\n/ #16094
Comments
From @shlomif

A multiline regex that starts with /^/m is much slower than the corresponding

The code can be reduced to the following:

my $nomatch = <<EOF;
my $match = <<EOF;
$_ = ($nomatch x 10_000) . $match;

$ time perl5260o ~/tmp/p 0; time perl5260o ~/tmp/p 1
real 0m0.004s
real 0m0.691s

It's probably down to this in regexec_flags():

/* note that with PREGf_IMPLICIT, intuit can only fail

Basically the '^' causes it to (fruitlessly) run intuit at the start of

I may need to revisit that decision. The whole 'pick the next viable start

===========

Please look into fixing it in a future version.
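The heredoc bodies and the actual pattern from the reduced test case are not preserved in this thread, so the following is only a rough sketch of the same shape: many copies of a non-matching line followed by one matching line, timed against a /^.../m pattern and a /\n.../ pattern. The filler line, the target line, both patterns and the helper name time_match are hypothetical stand-ins, and this sketch may well not reproduce the reported slowdown on any given perl.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical stand-ins: the original heredoc contents are not shown
# in the report, so these lines are assumptions for illustration only.
my $nomatch = "some line that does not match\n";
my $match   = "TARGET line\n";

# Same shape as the reduced test case: many non-matching copies, one match.
my $text = ($nomatch x 10_000) . $match;

# Two patterns meant to find the same line; both are assumptions,
# not the regexes from the original report.
my %patterns = (
    caret   => qr/^TARGET/m,
    newline => qr/\nTARGET/,
);

# Run a single match and return (found flag, elapsed wall-clock seconds).
sub time_match {
    my ($re, $str) = @_;
    my $start = time;
    my $found = $str =~ $re ? 1 : 0;
    return ($found, time - $start);
}

for my $name (sort keys %patterns) {
    my ($found, $elapsed) = time_match($patterns{$name}, $text);
    printf "%-8s found=%d  %.6fs\n", $name, $found, $elapsed;
}
```

Swapping in the real pattern and input lines, and running the script under the perl builds being compared, would show whether the two forms diverge in the way the timings above suggest.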
Searching a file with 200 thousand lines for certain patterns using the following:
The file contains 48 matches.
@joevt can you provide a way for us to replicate your findings, without having to attach a very large file? Perhaps by pointing to some corpus already available on the web that has similar timings, or by constructing the corpus from multiple copies of something much smaller? This perlmonks node shows an example of how that might be possible. Note that #17427 also reports an issue quite similar to this ticket, but is not specific to
@hvds I'm having difficulty trying to contrive a method to create a file that reproduces the results since I don't know what about the file is causing the problem. Is 400K zipped too large? The only problem with I found that
It's probably a bit large for here, but you can email it to me - my email address is under Hugo in AUTHORS. Hopefully some analysis should let me create a more succinct reproducer.
Migrated from rt.perl.org#131822 (status was 'new')
Searchable as RT131822$