(?>) causes wrongness on long string #9539
Comments
From zefram@fysh.org
Created by zefram@fysh.org
$ perl -lwe '$a="xyzt"x1000000; print $a =~ /\A(?>[a-z])*\z/ ? "yes" : "no"'
It gives the correct answer "yes" if the input is repeated only 1000
Perl Info
|
From @nwc10
Andreas, would you be able to work out which change in 5.8.x caused this?
On Tue, Oct 21, 2008 at 04:18:18AM -0700, Zefram wrote:
Verified under blead, maint-5.10.x *and* maint-5.8.x
Gah. That's a regression. And there was I thinking that I'd nailed them all.
$ ./perl -lwe '$a="xyzt"x8191; print $a =~ /\A(?>[a-z])*\z/ ? "yes" : "no"'
What's so special about 32768 here? (the length of the failing string)
Nicholas Clark |
The RT System itself - Status changed from 'new' to 'open' |
From @Abigail
On Tue, Oct 21, 2008 at 01:51:09PM +0100, Nicholas Clark wrote:
Hard coded constant from a way back, I think, when the regexp engine was
$ perl -Mre=debug -cE '/\A(?>[a-z])*\z/'
Abigail |
From @nwc10
On Tue, Oct 21, 2008 at 01:51:09PM +0100, Nicholas Clark wrote:
Don't worry. Vincent said on IRC that it would probably be change 29916 and
So, next question. What's the fix? It doesn't quite revert cleanly. Plus I
Nicholas Clark
Change 29916 by nicholas@nicholas-saigo on 2007/01/22 16:26:58
Integrate:
Affected files ...
... //depot/maint-5.8/perl/embed.fnc#176 integrate
Differences ...
==== //depot/maint-5.8/perl/embed.fnc#176 (text) ====
@@ -1261,7 +1261,6 @@
==== //depot/maint-5.8/perl/embed.h#131 (text+w) ====
@@ -1291,7 +1291,6 @@
==== //depot/maint-5.8/perl/proto.h#165 (text+w) ====
@@ -1845,9 +1845,6 @@
-STATIC I32 S_regrepeat_hard(pTHX_ regnode *p, I32 max, I32 *lp)
==== //depot/maint-5.8/perl/regexec.c#68 (text) ====
@@ -3399,7 +3399,9 @@
@@ -3467,7 +3496,7 @@
@@ -3511,19 +3538,20 @@
-/*
/* |
From perl@profvince.com
From how I understand the problem, maxwanted is REG_INFTY (i.e. 32767) Vincent. |
From @iabyn
On Tue, Oct 21, 2008 at 05:09:28PM +0200, Vincent Pit wrote:
Without looking too closely, I presume that the "maxwanted == REG_INFTY in
I guess things might be a bit more complex for blead/5.10.x
-- |
From perl@profvince.com
Yes, this makes sense. Patch attached, ok with maint-5.8@34560. |
From perl@profvince.com
perl-re-long-fix.patch
--- regexec.c 2008-09-22 15:53:35.000000000 +0200
+++ regexec.c 2008-10-21 16:43:43.000000000 +0200
@@ -3192,7 +3192,8 @@
PL_reginput = locinput;
maxwanted = minmod ? ln : n;
if (maxwanted) {
- while (PL_reginput < PL_regeol && matches < maxwanted) {
+ while (PL_reginput < PL_regeol && (maxwanted == REG_INFTY
+ || matches < maxwanted)) {
if (!regmatch(scan))
break;
/* on first match, determine length, l */
--- t/op/pat.t 2008-09-22 15:53:35.000000000 +0200
+++ t/op/pat.t 2008-10-21 16:49:52.000000000 +0200
@@ -3845,6 +3845,12 @@
'IsPunct agrees with [:punct:] with explicit Latin1');
}
+# [perl #60034]
+{
+ my $a = "xyzt" x 8192;
+ ok($a =~ /\A(?>[a-z])*\z/, '(?>) does not cause wrongness on long string');
+}
+
# Test counter is at bottom of file. Put new tests above here.
#-------------------------------------------------------------------
# Keep the following tests last -- they may crash perl
@@ -3860,4 +3866,4 @@
# Put new tests above the dotted line about a page above this comment
-BEGIN{print "1..1273\n"};
+BEGIN{print "1..1274\n"};
|
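The effect of the patch can be sketched outside the engine. The code below is a minimal illustration, not the regexec.c code: `match_one`, `repeat_prepatch`, `repeat_postpatch` and `demo` are hypothetical names, and the only things taken from the thread are the value `REG_INFTY == 32767` and the shape of the loop condition before and after the patch. It assumes each iteration consumes exactly one character, as `(?>[a-z])` does.

```c
#include <assert.h>
#include <stddef.h>

/* Value quoted in the thread: the engine's "no upper bound" marker. */
#define REG_INFTY 32767

/* Hypothetical stand-in for regmatch() on [a-z]: accepts one lowercase letter. */
static int match_one(char c)
{
    return c >= 'a' && c <= 'z';
}

/* Loop shape before the patch: REG_INFTY is treated as an ordinary limit,
 * so the repeat count can never exceed 32767. */
static size_t repeat_prepatch(const char *s, size_t len, long maxwanted)
{
    size_t matches = 0;
    while (matches < len && (long)matches < maxwanted) {
        if (!match_one(s[matches]))
            break;
        matches++;
    }
    return matches;
}

/* Loop shape after the patch: REG_INFTY means "no limit at all". */
static size_t repeat_postpatch(const char *s, size_t len, long maxwanted)
{
    size_t matches = 0;
    while (matches < len
           && (maxwanted == REG_INFTY || (long)matches < maxwanted)) {
        if (!match_one(s[matches]))
            break;
        matches++;
    }
    return matches;
}

/* Usage: "xyzt" x 8192 is 32768 characters, one more than REG_INFTY. */
static void demo(void)
{
    static char input[32768];
    size_t i;

    for (i = 0; i < sizeof input; i++)
        input[i] = "xyzt"[i % 4];

    /* Old loop stops one character short, so a following \z cannot match. */
    assert(repeat_prepatch(input, sizeof input, REG_INFTY) == 32767);
    /* New loop consumes the whole string. */
    assert(repeat_postpatch(input, sizeof input, REG_INFTY) == 32768);
}
```

In this sketch an explicit finite bound still behaves the same in both versions; only the REG_INFTY case changes, which is consistent with the new test using a string of exactly 8192 repetitions.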
From zefram@fysh.org
I wrote:
I discovered a variant on this bug. If the string is represented in
$ perl -lwe '$a="xyzt"x1000000; utf8::upgrade($a); print $a =~ /\A(?>[a-z])*\z/ ? "yes" : "no"'
-zefram |
From @nwc10
On Thu, Oct 23, 2008 at 12:40:36AM +0200, Vincent Pit wrote:
Is adding that
^^^^^^^^^^^^^^^^^^^^^^
Given that it seems correct for 5.8.x, isn't it correct for all?
Nicholas Clark |
From perl@profvince.com
In the spirit, it should be. However, the code has changed since 5.10 |
From @nwc10
On Thu, Oct 23, 2008 at 12:40:36AM +0200, Vincent Pit wrote:
Thanks, applied as change 34580.
On Thu, Oct 23, 2008 at 07:07:36PM +0200, Vincent Pit wrote:
Gosh yes. It's not the same code now, is it?
I can merge the tests as TODOs into blead.
Nicholas Clark |
From perl@profvince.com
|
From @nwc10
On Sat, Oct 25, 2008 at 10:49:49AM +0200, Vincent Pit wrote:
Done as change 34581.
Nicholas Clark |
@smpeters - Status changed from 'open' to 'resolved' |
From zefram@fysh.org
Steve Peters via RT wrote:
This has not been resolved. A patch went into 5.8.9 for one form of this
-zefram |
From @iabyn
On Sat, Oct 25, 2008 at 10:13:14AM +0100, Nicholas Clark wrote:
Now fixed in blead, and tests unTODOed,
-- |
From @demerphq
2008/10/21 via RT Zefram <perlbug-followup@perl.org>:
Should we be concerned that this pattern makes no sense? I mean it should be
/\A(?>[a-z]*)\z/
and in fact, we should be able to statically optimize this so that it
I think this is an interesting class of pattern, one that could be
If it was part of a larger pattern that required the RNFA engine to
So ultimately there will ALWAYS be the requirement for some sort of
IMO 32 bits is not enough, it needs to be a 64 bit value. I'm not
Yves
-- |
From @nwc10
On Mon, Mar 23, 2009 at 10:01:11AM +0100, demerphq wrote:
Not if the platform doesn't have a 64 bit type
Yes. And, I think I need a signature, or at least a block comment, roughly
"This seems like a good idea. Note I've not thought through all the
(Following on from an unrelated private conversation with Yves)
Nicholas Clark |
From @iabyn
On Mon, Mar 23, 2009 at 10:01:11AM +0100, demerphq wrote:
I presume the regexp above just happens to exercise the bug, so it
$ perl5100 -le 'print "ok" if ("ab" x 32767) =~ /^(ab)*$/;'
I'm not really in a position to comment on the rest of your post,
-- |
From @demerphq
2009/3/23 Nicholas Clark <nick@ccl4.org>:
So how do we decide what that is?
Ohh, did I annoy you with that conversation? I didn't mean to.
And no, it's a regex thing, I'll take care of it if I can, not expecting
Yves
-- |
From @nwc10
On Mon, Mar 23, 2009 at 10:26:20PM +0100, demerphq wrote:
Well, it's axiomatic in C89 that it's called "long". I suspect that size_t will do the job perfectly.
No, you didn't. It was more that you'd know where I was coming from if I
Rah!
Nicholas Clark
* and also cough, atof(), cough |
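As a rough illustration of the size_t suggestion above, the sketch below counts repeats in a `size_t` and uses an out-of-band marker for "unbounded". None of this is engine code: `REPEAT_UNBOUNDED` and `count_repeats` are hypothetical names, and the choice of `(size_t)-1` as the marker is an assumption made for the example. The idea it relies on is that `size_t` can hold the size of any in-memory object, so a count of one-character matches runs out of input before it can reach the marker.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "no upper bound" marker: the largest size_t value.
 * Written as (size_t)-1 so that it also compiles as C89. */
#define REPEAT_UNBOUNDED ((size_t)-1)

/* Count leading lowercase letters in s[0..len), stopping at max repeats
 * unless max is REPEAT_UNBOUNDED. The counter has the same width as the
 * string length, so no input length overflows it. */
static size_t count_repeats(const char *s, size_t len, size_t max)
{
    size_t matches = 0;

    while (matches < len
           && (max == REPEAT_UNBOUNDED || matches < max)) {
        if (s[matches] < 'a' || s[matches] > 'z')
            break;
        matches++;
    }
    return matches;
}
```

For example, `count_repeats(buf, 100000, REPEAT_UNBOUNDED)` on a buffer of 100000 lowercase letters returns 100000, while passing `40000` as `max` stops at 40000. Whether the engine's node layout has room for a wider count is a separate question that this sketch does not address.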
From p5p@spam.wizbit.be
setting status to open again (it looks like there are unresolved issues) |
p5p@spam.wizbit.be - Status changed from 'resolved' to 'open' |
From @iabyn
I'm closing this again, as the original bug is now fixed on 5.8.9,
@iabyn - Status changed from 'open' to 'resolved' |
Migrated from rt.perl.org#60034 (status was 'resolved')
Searchable as RT60034