Some bugs in Perl regexp (core Perl issues) #9408
Comments
From til@schubbe.org
Created by til@schubbe.org
til@debian:~ - perl -e 'if ("a1" =~ m/(^|\D)(?=\d)1/) {print "1\n"} else
The first expression should be true, but it isn't. This behaviour changes
The bug exists in v5.8.3 and v5.8.2, too.
Regards
Perl Info
From japhy@pobox.com
On May 12, Til Schubbe said:
Here's the re=debug output for a simpler case:
perlmonk:~ 1779:$ perl -mre=debug -e '"a1" =~ /(a|^)(?=1)1/'
This:
Guessing start of match, REx `(a|^)(?=1)1' against `a1'...
appears to be the culprit. This guessing and moving doesn't occur if /i
The RT System itself - Status changed from 'new' to 'open'
From g+i@gameintellect.com
Hello, I have found some simple and unpleasant bugs in Perl regexp:
print "Match" if 'ab' =~ /^a?(?=b)b/; # Does not match, but it should...
Also you can replace ^ with \A, and ? with *. Here are bugs similar to the above:
print $& if 'ab' =~ /a?(?=b)b/;
Both statements print b, but should print ab. Here is my bug report at ActiveState:
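The patterns reported so far in this thread can be collected into one small check script. This is a minimal sketch and not part of the original ticket: the helper name `matched_text` and the table layout are made up for illustration, and the expected strings are simply what leftmost matching should yield for each pattern. On a perl that contains the fix discussed in this ticket, every line should report ok; an affected perl should report NOT ok for at least some of them.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Each entry: [ pattern, subject, text the whole match ($&) should cover ].
# The patterns are the ones quoted in the reports above.
my @cases = (
    [ qr/^a?(?=b)b/,     'ab', 'ab' ],
    [ qr/\Aa*(?=b)b/,    'ab', 'ab' ],
    [ qr/a?(?=b)b/,      'ab', 'ab' ],
    [ qr/(^|\D)(?=\d)1/, 'a1', 'a1' ],
    [ qr/(a|^)(?=1)1/,   'a1', 'a1' ],
);

# Hypothetical helper: return the matched text, or undef when the
# pattern fails to match. Uses @- and @+ instead of $& itself.
sub matched_text {
    my ($re, $subject) = @_;
    return undef unless $subject =~ $re;
    return substr($subject, $-[0], $+[0] - $-[0]);
}

for my $case (@cases) {
    my ($re, $subject, $want) = @$case;
    my $got = matched_text($re, $subject);
    my $ok  = defined($got) && $got eq $want;
    printf "%-24s against '%s': %s\n", "$re", $subject, $ok ? 'ok' : 'NOT ok';
}
```

Reading the match span from `@-` and `@+` keeps the check independent of `$&`, which matters only for style here; `print $&` as in the report shows the same thing.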
From mail@der-pepe.de
I checked some old perl installations and it appears that the bug has
The RT System itself - Status changed from 'new' to 'open'
From @Abigail
On Mon, Jul 07, 2008 at 11:30:05PM -0700, Serge wrote:
Here are some tests for this bug:
Inline Patch
--- t/op/re_tests.orig 2008-04-11 14:20:20.000000000 +0200
+++ t/op/re_tests 2008-07-08 18:43:39.000000000 +0200
@@ -1344,4 +1344,7 @@
.*?(?:(\w)|(\w))x abx y $1-$2 b-
0{50} 000000000000000000000000000000000000000000000000000 y - -
+# Bug #56690
+^a?(?=b)b ab y $& ab
+^a*(?=b)b ab y $& ab
From g+i@gameintellect.com
Hello Christoph Bussenius, this is another bug in regexp: the special variable $^N does not work, when
print "\$1=$1, \$^N=$^N, \$+=$+" if 'ab' =~ /(\w)/;
Output:
This works well. Now with a quantifier:
print "\$1=$1, \$^N=$^N, \$+=$+" if 'ab' =~ /(\w)+/;
Output:
$^N is undefined! It is a bug.
my $a='bbb';
$a is undefined.
I hope both bugs I found will be fixed in the next Perl build, thanks.
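The two statements above can be run side by side to compare `$1`, `$^N` and `$+` with and without the quantifier. The following is a minimal sketch; the helper name `capture_vars` is made up for illustration, and the original output lines are missing from this thread, so none are reproduced here. On a perl where the report no longer applies, all three variables should agree in both cases.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: run one match and return ($1, $^N, $+),
# with the string 'undef' standing in for an undefined value.
# Returns an empty list when the pattern does not match.
sub capture_vars {
    my ($subject, $re) = @_;
    return unless $subject =~ $re;
    my @vars = ($1, $^N, $+);
    return map { defined($_) ? $_ : 'undef' } @vars;
}

# The two patterns from the report: a plain group and a quantified group.
for my $re (qr/(\w)/, qr/(\w)+/) {
    my ($one, $n, $plus) = capture_vars('ab', $re);
    printf "%-12s \$1=%s, \$^N=%s, \$+=%s\n", "$re", $one, $n, $plus;
}
```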
From @moritz
On Wed Jul 09 07:50:31 2008, g+i@gameintellect.com wrote:
it says
It says
These two are fixed in 5.10.0 already.
Cheers,
The RT System itself - Status changed from 'new' to 'open'
From @smpeters
On Tue, Jul 8, 2008 at 11:48 AM, Abigail <abigail@abigail.be> wrote:
This patch (skipping the tests for now) has been added as change #34116.
Thanks,
Steve Peters
From @demerphq
2008/7/9 via RT Serge <perlbug-followup@perl.org>:
Actually in my book the result of a capture buffer with a quantifier
What should it hold? The content of the last matched thing? Or the content of the thing it didn't match which stopped the
Currently /in this pattern/ it's the latter. Which to me is just as
I doubt this will be fixed for the above mentioned reasons and because
Luckily in most of these situations there is a workaround, you can
'ab'=~/\w*(\w)(?{
Cheers,
From @demerphq
2008/7/11 Moritz Lenz <moritz@casella.verplant.org>:
My point is that both are acceptable behaviours given that the
If you poke around I'm pretty sure you'll find that the behaviour of
I did look into all of this at one time and it was not fun.
But ok, in this context yes I'll grant that it's a bug that $^N and $1
Yves
From @moritz
demerphq wrote:
Yes. Just like $1. Why should it behave any different than $1 if there's
It might seem weird in retrospect to implement $1, $2... this way, but I
From @nwc10
Dave notes: actually exists in 5.8.x too, but looks like a good one to fix for 5.10.1
From @hvds
On Tue Jul 08 14:49:43 2008, abigail@abigail.be wrote:
This is caused by a failure of the start_class optimization in the case
In more detail: at the point study_chunk() attempts to deal with the
So given:
In other words, we need to stack an alternation of ANDs and ORs to cope
A simpler short-term fix is instead to throw up our hands in this
Hugo
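Hugo's detailed explanation did not survive in this thread, so the following is only a toy model of how a first-character set can go wrong when a lookahead's class is ANDed in while an optional prefix is still being ORed. It is a sketch under stated assumptions, not the regcomp.c logic: the names `class_and`, `class_or`, `start_class`, `may_start_with` and the `ANY` marker are all hypothetical, and the pattern modelled is /a?(?=b)b/ from the reports above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy model only. A "start class" is the set of characters a match may
# begin with, held as a hash ref. ANY stands for "no information".
use constant ANY => 'ANY';

sub class_and {    # intersection; ANY acts as the identity
    my ($x, $y) = @_;
    return $y unless ref($x);
    return $x unless ref($y);
    return { map { $_ => 1 } grep { exists $y->{$_} } keys %$x };
}

sub class_or {     # union; ANY absorbs everything
    my ($x, $y) = @_;
    return ANY() unless ref($x) && ref($y);
    return { %$x, %$y };
}

# Assumed reading of /a?(?=b)b/: the optional 'a' leaves {a} accumulated
# in "OR" mode, the lookahead contributes {b}, then the literal 'b' is ORed.
sub start_class {
    my ($reset_in_or_mode) = @_;
    my $class = { a => 1 };                        # after a?
    $class = $reset_in_or_mode
        ? ANY()                                    # give up: no information
        : class_and($class, { b => 1 });           # AND the lookahead in
    return class_or($class, { b => 1 });           # the following literal b
}

sub may_start_with {
    my ($class, $char) = @_;
    return 1 unless ref($class);                   # ANY admits every character
    return exists $class->{$char} ? 1 : 0;
}

for my $reset (0, 1) {
    my $class = start_class($reset);
    printf "%-5s: may a match start with 'a'? %s\n",
        $reset ? 'reset' : 'AND',
        may_start_with($class, 'a') ? 'yes' : 'no';
}
```

In this model the AND variant ends up with {b} only, so a start position at 'a' would be skipped, which is consistent with the reported symptoms (the anchored pattern failing and the unanchored one matching just b). The reset variant gives up the optimization for this case and keeps every start position eligible.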
From @hvds
From p5p@spam.wizbit.be
Binary search:
----Program----
print "ok" if "ab" =~ m/^a?(?=b)b/
----Output of ...IZtxUq/perl-5.005_62@4668/bin/perl----
----EOF ($?='0')----
----EOF ($?='0')----
http://perl5.git.perl.org/perl.git/commit/
apply change#4618 again along with Ilya's patch to fix bugs
p4raw-link: @4622 on //depot/perl: 34baa6c
p4raw-id: //depot/perl@4669
From @hvds
"Hugo van der Sanden via RT" <perlbug-followup@perl.org> wrote:
This patch implements the simple fix, and passes all tests including
Yves: note that I've preserved the 'was' code in this chunk, introduced
Hugo
[1] http://perl5.git.perl.org/perl.git/commit/b515a41db88584b4fd1c30cf890c92d3f9697760
Inline Patch
--- regcomp.c.old 2009-06-18 10:21:11.000000000 +0100
+++ regcomp.c 2009-07-02 11:16:29.000000000 +0100
@@ -3727,11 +3727,22 @@
data->whilem_c = data_fake.whilem_c;
}
if (f & SCF_DO_STCLASS_AND) {
- const int was = (data->start_class->flags & ANYOF_EOS);
-
- cl_and(data->start_class, &intrnl);
- if (was)
- data->start_class->flags |= ANYOF_EOS;
+ if (flags & SCF_DO_STCLASS_OR) {
+ /* OR before, AND after: ideally we would recurse with
+ * data_fake to get the AND applied by study of the
+ * remainder of the pattern, and then derecurse;
+ * *** HACK *** for now just treat as "no information".
+ * See [perl #56690].
+ */
+ cl_init(pRExC_state, data->start_class);
+ } else {
+ /* AND before and after: combine and continue */
+ const int was = (data->start_class->flags & ANYOF_EOS);
+
+ cl_and(data->start_class, &intrnl);
+ if (was)
+ data->start_class->flags |= ANYOF_EOS;
+ }
}
}
#if PERL_ENABLE_POSITIVE_ASSERTION_STUDY
--- t/op/re_tests.old 2009-06-18 10:21:11.000000000 +0100
+++ t/op/re_tests 2009-07-02 11:21:31.000000000 +0100
@@ -1365,8 +1365,8 @@
.*?(?:(\w)|(\w))x abx y $1-$2 b-
0{50} 000000000000000000000000000000000000000000000000000 y - -
-^a?(?=b)b ab B $& ab # Bug #56690
-^a*(?=b)b ab B $& ab # Bug #56690
+^a?(?=b)b ab y $& ab # Bug #56690
+^a*(?=b)b ab y $& ab # Bug #56690
/>\d+$ \n/ix >10\n y $& >10
/>\d+$ \n/ix >1\n y $& >1
/\d+$ \n/ix >10\n y $& 10
From @craigberry
On Thu, Jul 2, 2009 at 5:36 AM, <hv@crypt.org> wrote:
Thanks, applied here: <http://perl5.git.perl.org/perl.git/commitdiff/906cdd2>
@hvds - Status changed from 'open' to 'resolved'
From @cpansprout
This bug (the same as #56690) was fixed by commit 906cdd2.
@cpansprout - Status changed from 'open' to 'resolved'
Migrated from rt.perl.org#56690 (status was 'resolved')
Searchable as RT56690$