Combination of quantifier and capturing group gives incorrect results in Rakudo (nqp) #2500

Closed
p6rt opened this issue Oct 4, 2011 · 5 comments

p6rt commented Oct 4, 2011

Migrated from rt.perl.org#100650 (status was 'resolved')

Searchable as RT100650$

p6rt commented Oct 4, 2011

From @masak

<flussence> I broke nom last night!
<flussence> nom: say ('aabaa' ~~ /\N+ b/).perl
<p6eval> nom 834d9d: OUTPUT«Match.perl(orig => "aabaa", from => 0, to => 3, ast => Mu, list => ().list, hash => EnumMap.new())␤»
<flussence> or I did, somehow
* moritz doesn't see the brokenness offhand
<flussence> nom: say ('aaaaabaaaaa' ~~ /^(<[a..z]>*) b/).perl;
<p6eval> nom 834d9d: OUTPUT«Match.perl(orig => "aaaaabaaaaa", from => 11, to => -3, ast => Mu, list => ().list, hash => EnumMap.new())␤»
<masak> huh!
<masak> that's... wrong.

(Note that the match fails, hence the 'to => -3'. There's clearly a
way for the regex to match, namely to map 'aaaaa' to the capturing
group and then match the 'b' literally.)

<flussence> it seems to not work with anything resembling a character class
<flussence> nom: say ('aaaaabaaaaa' ~~ /^(\N*) b/).perl;
<p6eval> nom 834d9d: OUTPUT«Match.perl(orig => "aaaaabaaaaa", from => 11, to => -3, ast => Mu, list => ().list, hash => EnumMap.new())␤»
<moritz> flussence: it seems that the combination of * and captures somehow breaks
<flussence> nom: say ('aaaaabaaaaa' ~~ /(\N*) b/).perl;
<p6eval> nom 834d9d: OUTPUT«Match.perl(orig => "aaaaabaaaaa", from => 11, to => -3, ast => Mu, list => ().list, hash => EnumMap.new())␤»
<flussence> hm
<moritz> nom: say so ('aaaaabaaaaa' ~~ /^(\N*) b/).perl;
<p6eval> nom 834d9d: OUTPUT«Bool::True␤»
<moritz> nom: say so ('aaaaabaaaaa' ~~ /^(\N*) b/)
<p6eval> nom 834d9d: OUTPUT«Bool::False␤»
<masak> moritz: is there an RT ticket for that yet?
<moritz> masak: I'm not aware of any
* masak submits rakudobug

p6rt commented Oct 4, 2011

From @masak

<flussence> nom: say so ('aaaaabaaaaa' ~~ /^(a*) b/);
<p6eval> nom 7408d6: OUTPUT«Bool::True␤»
<flussence> it works fine for literals, not for char classes... charset.t might fit
<moritz> nom: say so ('aaaaabaaaaa' ~~ /^(.*) b/);
<p6eval> nom 7408d6: OUTPUT«Bool::False␤»
<moritz> nom: say so ('aaaaabaaaaa' ~~ /^(<[a]>*) b/);
<p6eval> nom 7408d6: OUTPUT«Bool::True␤»
<moritz> nom: say so ('aaaaabaaaaa' ~~ /^(<-[b]>*) b/);
<p6eval> nom 7408d6: OUTPUT«Bool::True␤»
<moritz> that kinda convinces me it's not char classes that are the problem, but the fact that it needs to backtrack if that first thing matches an 'a' too
<masak> aye.
<masak> I should've seen that too.
<masak> it's quantifiers and capturing groups that combine to create the bug.
<moritz> nom: say so 'aaaaaba' ~~ / ^ .* b/
<p6eval> nom 7408d6: OUTPUT«Bool::True␤»
<moritz> nom: say so 'aaaaaba' ~~ / ^ (.*) b/
<p6eval> nom 7408d6: OUTPUT«Bool::False␤»
<moritz> nom: say so 'aaaaaba' ~~ / ^ (.)* b/
<p6eval> nom 7408d6: OUTPUT«Bool::True␤»
* moritz agrees with masak's diagnosis
<flussence> nom: say so ('ab' ~~ /^(.*) b/);
<p6eval> nom 7408d6: OUTPUT«Bool::False␤»
* masak adds this to the ticket
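
The cases from the log could be collected into a small regression script along these lines. This is only a sketch, assuming a current Rakudo with the core Test module. The inputs and patterns come from the log; the expected capture text and end positions are worked out by hand from ordinary greedy backtracking (the group gives back characters until the literal b can match) and are not copied from the Rakudo test suite.

```raku
use Test;

# Sketch only: inputs and patterns are from the IRC log above; the expected
# values are derived by hand, assuming greedy backtracking into the group.
# Each entry is: input, pattern, expected text of $0, expected end of match.
my @cases =
    ['aaaaabaaaaa', /^(<[a..z]>*) b/, 'aaaaa', 6],
    ['aaaaabaaaaa', /^(\N*) b/,       'aaaaa', 6],
    ['aaaaabaaaaa', /(\N*) b/,        'aaaaa', 6],
    ['aaaaaba',     /^(.*) b/,        'aaaaa', 6],
    ['ab',          /^(.*) b/,        'a',     2];

for @cases -> ($input, $rx, $capture, $to) {
    my $m = $input ~~ $rx;
    ok $m.so,            "match succeeds for '$input'";
    is ~$m[0], $capture, "group backtracks to '$capture'";
    is $m.to,  $to,      "overall match ends at $to";
}

# Controls that already passed on nom according to the log:
# no group at all, and a group that is itself quantified.
ok 'aaaaaba' ~~ / ^ .* b /,   'uncaptured .* backtracks';
ok 'aaaaaba' ~~ / ^ (.)* b /, 'quantified group backtracks';

done-testing;
```

Keeping the two controls next to the failing cases makes it easier to see that only a quantifier inside a capturing group is affected.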

p6rt commented Oct 4, 2011

@masak - Status changed from 'new' to 'open'

p6rt commented Jan 30, 2012

From @jnthn

Fixed this one a while back.

say ('aaaaabaaaaa' ~~ /^(<[a..z]>*) b/).perl;
Match.new(orig => "aaaaabaaaaa", from => 0, to => 6, ast => Any, list => (Match.new(orig => "aaaaabaaaaa", from => 0, to => 5, ast => Any, list => ().list, hash => EnumMap.new()),).list, hash => EnumMap.new())

And turned on various tests after doing so. So, resolving ticket.

/jnthn

p6rt commented Jan 30, 2012

@jnthn - Status changed from 'open' to 'resolved'

p6rt closed this as completed Jan 30, 2012
p6rt added the Bug label Jan 5, 2020