Report information
Id: 123071
Status: resolved
Priority: 0/
Queue: perl5

Owner: Nobody
Requestors: horos11 [at] gmail.com
Cc:
AdminCc:

Operating System: (no value)
PatchStatus: (no value)
Severity: low
Type: unknown
Perl Version: (no value)
Fixed In: 5.22.0



Subject: substitution loop issue with long strings
CC: horos11 [...] gmail.com
I’m creating a ticket for this, so it is easier to track.

In <CAH+_n-4R1tO7nxEu6ehRkNZfO+0reMdcU7wGK-YMF3SqHchoOw@mail.gmail.com> Edward Peschko wrote:
> I'm getting the following problem when doing a substitution on a large string:
>
> Substitution loop at ...
>
> Is there a way to override this error? As it is, its annoying because
> acts as a de-facto built in limitation on the size of strings that you
> can substitute.
>
> Is this fixed in the latest versions of perl?

And in <CAH+_n-4-aiGqxHDOdwd9NB-xbGkGfaEMOjsGyeSSZKUEi6R-mw@mail.gmail.com> he wrote:
> Its very easy to reproduce:
>
> local($/) = undef;
> open(FD, "very_large_file.txt"); # say with the alphabet printed over and over, one per line, 2 GB in total size
> my $line = <FD>;
> close(FD);
>
> do a substitution where the size of substitution is greater than the
> thing its replacing, ie:
>
> $line =~ s#a#bbb#sg;
>
> and you'll get 'Substitution loop at ... line ...'
>
> And no - the 'substitution loop' description as described in perldiag
> doesn't apply. Any replacement string doesn't work (where it is longer
> than the original). There are only a finite number of 'a's in the
> source string - so my guess is what is happening is perl is keeping
> some counter of substitutions, and that counter is overflowing.

That’s exactly what’s happening. The sbu_iters and sbu_maxiters members defined in cop.h are of type I32.

(And this bug is *old*. Perl 1 had a fixed limit of 10000. Perl 4 started calculating the maximum number of iterations based on the string length, fixing the bug, but in such a way that when 64-bit systems came along it resurfaced. So since Perl 4 the bug is as old as 64-bit systems.)

We could fix this by changing those two struct members to SSize_t. But if that would enlarge the struct subst/struct blk union defined in cop.h, it might be worthwhile considering skipping the check altogether for long strings. After all, if substitution loops, it is because of a bug in perl; and if that bug does occur then it is likely to happen regardless of the length of the string. (Right?) So it will be caught even if the check is skipped for long strings.

Now, to work around the bug, you would have to do a while() loop instead of substituting all at once. But that will still fail in 5.18 and earlier, because it was not until 5.20 that the regular expression engine gained support for strings longer than 2GB.

Another thing you could do is split your string into smaller strings and concatenate them afterwards. But only you can tell whether that will work for your code.

--
Father Chrysostomos
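
The split-and-concatenate workaround suggested above might look roughly like the following. This is only a sketch: `substitute_in_chunks` and its chunk size are hypothetical names chosen for illustration, and it assumes the pattern can never match across a chunk boundary (true for a single-character pattern such as the `a` in the report, not true in general).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of perl): run s/a/bbb/g on fixed-size
# pieces of a string and join the results, so that no single
# substitution has to iterate over the whole string at once.
# ASSUMPTION: the pattern never spans a chunk boundary.
sub substitute_in_chunks {
    my ($string_ref, $chunk_size) = @_;
    my @pieces;
    my $length = length $$string_ref;
    for (my $offset = 0; $offset < $length; $offset += $chunk_size) {
        # The final chunk may be shorter than $chunk_size.
        my $chunk = substr($$string_ref, $offset, $chunk_size);
        $chunk =~ s/a/bbb/g;
        push @pieces, $chunk;
    }
    return join '', @pieces;
}

# Tiny stand-in for the 2 GB input described in the report.
my $line   = "abc\n" x 5;
my $result = substitute_in_chunks(\$line, 3);
print $result;
```

For a real multi-gigabyte input the chunk size would be far larger, and the caveat from the message still applies: only the author of the code can tell whether the pattern is safe to apply piecewise.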
RT-Send-CC: perl5-porters [...] perl.org
On Mon Oct 27 21:40:18 2014, sprout wrote:
> I’m creating a ticket for this, so it is easier to track.
>
[snip]
>
> That’s exactly what’s happening. The sbu_iters and sbu_maxiters
> members defined in cop.h are of type I32.
>
[snip]
>
> We could fix this by changing those two struct members to SSize_t.
> But if that would enlarge the struct subst/struct blk union defined in
> cop.h, it might be worthwhile considering skipping the check
> altogether for long strings.

Father C, which of these two alternatives do you think we should pursue? (Or, are there others?)

> After all, if substitution loops, it is
> because of a bug in perl; and if that bug does occur then it is likely
> to happen regardless of the length of the string. (Right?) So it
> will be caught even if the check is skipped for long strings.
>

Thank you very much.

--
James E Keenan (jkeenan@cpan.org)
RT-Send-CC: perl5-porters [...] perl.org
On Fri Nov 14 19:01:36 2014, jkeenan wrote:
> On Mon Oct 27 21:40:18 2014, sprout wrote:
> > I’m creating a ticket for this, so it is easier to track.
> >
> [snip]
> >
> > That’s exactly what’s happening. The sbu_iters and sbu_maxiters
> > members defined in cop.h are of type I32.
> >
> [snip]
> >
> > We could fix this by changing those two struct members to SSize_t.
> > But if that would enlarge the struct subst/struct blk union defined in
> > cop.h, it might be worthwhile considering skipping the check
> > altogether for long strings.
>
> Father C, which of these two alternatives do you think we should
> pursue? (Or, are there others?)

That would depend on whether using 64-bit values to record the iterations enlarges the struct. I haven’t checked yet.

--
Father Chrysostomos
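
To make the 32-bit versus 64-bit difference concrete, here is a small sketch of what happens when a length-derived iteration limit is squeezed into a signed 32-bit value. The formula in `iteration_limit` is a hypothetical stand-in (the thread says only that the maximum is calculated from the string length), `as_i32` is an illustrative helper, and the script assumes a 64-bit perl.

```perl
use strict;
use warnings;

# Reinterpret a non-negative integer as a signed 32-bit value
# (two's complement), mimicking storage in an I32.
# ASSUMPTION: running on a perl with 64-bit integers.
sub as_i32 {
    my ($n) = @_;
    my $low = $n & 0xFFFF_FFFF;                    # keep the low 32 bits
    return $low >= 2_147_483_648 ? $low - 4_294_967_296 : $low;
}

# HYPOTHETICAL limit formula, for illustration only.
sub iteration_limit {
    my ($length) = @_;
    return 2 * $length + 10;
}

# String lengths of 1 MB, 1 GiB and 2 GiB.
for my $length (1_000_000, 1_073_741_824, 2_147_483_648) {
    my $limit = iteration_limit($length);
    printf "length %d: 64-bit limit %d, 32-bit limit %d\n",
        $length, $limit, as_i32($limit);
}
```

Under these assumptions the 32-bit limit turns negative or very small once the string is large enough, so a counter of perfectly legitimate substitutions would exceed it almost at once, which is consistent with the "Substitution loop" error in the report.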
RT-Send-CC: perl5-porters [...] perl.org
Fixed in 3c6ef0a. This turned out to be the same bug as #103260.

--
Father Chrysostomos
Subject: Your ticket against Perl 5 has been resolved
Thanks for submitting this ticket.

The issue should be resolved with the release today of Perl v5.22. If you find that the problem persists, feel free to reopen this ticket.

--
Karl Williamson for the Perl 5 porters team