Sequential alternation || does not respect :ratchet #5804

Closed
p6rt opened this issue Nov 16, 2016 · 7 comments
Labels: regex, testneeded

Comments

p6rt commented Nov 16, 2016

Migrated from rt.perl.org#130117 (status was 'resolved')

Searchable as RT130117$

p6rt commented Nov 16, 2016

From @AlexDaniel

*Code:*
say ‘abcz’ ~~ /:r [‘a’ || ‘abc’ ] ‘z’ /

*Result:*
｢abcz｣

:r (:ratchet) should prevent backtracking (trying different ways to match a
string). However, it seems like it does not work as intended (or even at
all?).

To make the example above as clear as possible: || does not do LTM (longest
token matching), so it will match things from left to right. ‘a’ matches
just fine, so it proceeds. Then ‘z’ will not match, at which point it
should give up because of :r. However, it goes back and tries ‘abc’, which
is exactly what somebody would expect from backtracking, but we are trying
to turn it off.
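
The expectation described above can be checked with a small script. This is only a sketch: the variable names are illustrative, and no result is claimed for the `:r` case with the short branch first, since that depends on whether the Rakudo build in use shows the behavior reported here.

```raku
my $input = 'abcz';

# No :ratchet: after 'a' matches and 'z' fails, the engine may backtrack
# into the alternation and try 'abc', so the overall match succeeds.
my $with-backtracking = ?($input ~~ / [ 'a' || 'abc' ] 'z' /);

# With :ratchet, the report argues that the first successful branch ('a')
# should be final, so this should not match. Affected builds match anyway.
my $ratcheted = ?($input ~~ /:r [ 'a' || 'abc' ] 'z' /);

# Putting the longer branch first matches under :ratchet without any
# backtracking, because 'abc' is tried before 'a'.
my $reordered = ?($input ~~ /:r [ 'abc' || 'a' ] 'z' /);

say "backtracking allowed: $with-backtracking";
say ":r, short branch first: $ratcheted";
say ":r, long branch first: $reordered";
```

If the second line prints `True`, the build shows the behavior this report describes.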

There are several ways to make it clear that backtracking actually happens
when using || with :r (if the example above is not enough):

*Code:*
grammar G { token TOP { [ ‘a’ {say $/.CURSOR.pos} || {say $/.CURSOR.pos}
‘abc’ ] {say ‘here’} ‘z’ } }; say G.parse(‘abcz’)

*Result:*
1
here
0
here
｢abcz｣

*Code​:*
say ‘abcdefghij’ ~~ /:r [.||.||.||.||.||.||.||.] [.||.||.||.||.||.||.||.]
[.||.||.||.||.||.||.||.] [.||.||.||.||.||.||.||.] [.||.||.||.||.||.||.||.]
[.||.||.||.||.||.||.||.] [.||.||.||.||.||.||.||.] [.||.||.||.||.||.||.||.]
[.||.||.||.||.||.||.||.] [.||.||.||.||.||.||.||.] [.||.||.||.||.||.||.||.]
<!> /

*Result​:*
(Takes a really long time to finish because it attempts to try all of the
paths, even though ‘token’ has implicit :r)
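
A rough estimate of why this is slow, as a sketch under stated assumptions: the pattern as transcribed here has eleven groups of eight `.` branches, the input has ten characters, and the engine is assumed to retry every branch when it backtracks. From the first start position, each of the ten characters can then be consumed by any of the eight branches before the eleventh group runs out of input. The variable names are illustrative.

```raku
my $branches = 8;                     # '.' alternatives per group
my $chars    = 'abcdefghij'.chars;    # 10 characters of input

# Ways to pick a branch for each consumed character when every choice
# can be revisited by backtracking.
my $paths-with-backtracking = $branches ** $chars;

# If each group committed to its first successful branch, only a single
# path would be explored from that start position.
my $paths-with-ratchet = 1;

say $paths-with-backtracking;   # 1073741824
say $paths-with-ratchet;        # 1
```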

According to committable, this behavior has been there since the beginning
( https://gist.github.com/Whateverable/814019538d89ec53fca6c09269136ebd ).
Therefore, I am not sure how much code will break because of us fixing this
bug. However, I am hoping to see some performance improvement after we fix
this.

p6rt commented Aug 27, 2017

From @smls

Relevant​:

https://rt-archive.perl.org/perl6/Ticket/Display.html?id=123934#txn-1401917

In short, `||` alternations don't respect `:` in Rakudo, whereas `|` alternations (and other atoms such as quantifiers) do respect it.

Simpler test-case:

  ➜ say "ab" ~~ / [ "ab" | "a" ]: "b" /;
  Nil

  ➜ say "ab" ~~ / [ "ab" || "a" ]: "b" /;
  ｢ab｣

(Remember that `:ratchet` simply adds `:` to every atom.)
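
To illustrate that parenthetical with constructs which, per this comment, already honour `:`, here is a small sketch using a quantifier and the `|` test case quoted above. The variable names are illustrative, and the `||` case is left out because its result depends on the Rakudo version.

```raku
# A backtracking quantifier: a+ first takes all three 'a's, then gives one
# back so that the trailing 'a' can match.
my $plain = ?('aaa' ~~ / ^ a+ a $ /);

# An explicit ':' after the quantifier forbids giving anything back.
my $colon = ?('aaa' ~~ / ^ a+: a $ /);

# :ratchet has the same effect without writing ':' after each atom.
my $ratchet = ?('aaa' ~~ /:r ^ a+ a $ /);

# The '|' case from the comment above: "ab" is chosen as the longest
# alternative, ':' makes that choice final, and the trailing "b" cannot match.
my $ltm = ?('ab' ~~ / [ "ab" | "a" ]: "b" /);

say "plain: $plain, colon: $colon, ratchet: $ratchet, ltm: $ltm";
```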

p6rt commented Aug 27, 2017

The RT System itself - Status changed from 'new' to 'open'

p6rt commented Aug 28, 2017

From @smls

I sent a pull request which fixes this bug​:

Raku/nqp#368

Please review.

p6rt commented Oct 4, 2017

From @smls

On Mon, 28 Aug 2017 11:50:51 -0700, smls75@gmail.com wrote:

> I sent a pull request which fixes this bug:
>
> Raku/nqp#368
>
> Please review.

The PR was merged (and Rakudo's nqp version bumped).

Marking the ticket TESTNEEDED. (Note that some possible tests are listed in the PR description!)

p6rt commented Oct 6, 2017

@smls - Status changed from 'open' to 'resolved'

p6rt closed this as completed Oct 6, 2017

p6rt commented Oct 6, 2017

From @smls

Tests were added here​: Raku/roast@65a762217

p6rt added the regex and testneeded labels Jan 5, 2020