~ operator in regexp reverts capture order, but it should not #2129

Closed
p6rt opened this issue Sep 1, 2010 · 14 comments

Comments

p6rt commented Sep 1, 2010

Migrated from rt.perl.org#77616 (status was 'resolved')

Searchable as RT77616$

p6rt commented Sep 1, 2010

From @bbkr

[17:29] <bbkr> rakudo: say so "abc" ~~ /a ~ (c) (b)/; say $0
[17:29] <p6eval> rakudo dc9900: OUTPUT«1␤b␤»
[17:29] <TimToady> uhhh
[17:30] <TimToady> that seems like a bug to me
[17:32] * bbkr reports

p6rt commented Aug 25, 2011

From @bbkr

[12:14] <bbkr_> nom: say so "abc" ~~ /a ~ (c) (b)/; say $0 # this is weird. in rakudo it captured incorrectly ( #77616 ) in nom it does not capture at all.
[12:14] <p6eval> nom b0da69: OUTPUT«Bool::True␤Any()␤»

p6rt commented Aug 25, 2011

The RT System itself - Status changed from 'new' to 'open'

p6rt commented Oct 21, 2012

From @coke

On Thu Aug 25 03:17:17 2011, bbkr wrote:

> [12:14] <bbkr_> nom: say so "abc" ~~ /a ~ (c) (b)/; say $0 # this is weird. in rakudo it captured incorrectly ( #77616 ) in nom it does not capture at all.
> [12:14] <p6eval> nom b0da69: OUTPUT«Bool::True␤Any()␤»

Current nom behavior:

say so "abc" ~~ /a ~ (c) (b)/; say $0
True
｢b｣

--
Will "Coke" Coleda

p6rt commented Feb 7, 2013

From @FROGGS

what exactly is wrong?

FROGGS> r: say so "abc" ~~ /a ~ (c) (b)/; say $/
<p6eval> rakudo 4fb07b: OUTPUT«True␤｢abc｣␤ 0 => ｢b｣␤ 1 => ｢c｣␤␤»

It's 'b' surrounded by 'a' and 'c'. Since b is in the middle, it is $0, and c is $1.

Rakudo Star matches the same way.

p6rt commented Feb 7, 2013

From @bbkr

<masak> bbkr__​: definitely a bug.
<masak> bbkr__​: the parentheses are numbered by their location in the regex.
<masak> bbkr__​: not by match order.
<PerlJam> bbkr__​: and this has *always* been the case, even before Perl 6 :)

p6rt commented Oct 8, 2014

From @usev6

Current behaviour:

say so "abc" ~~ /a ~ (c) (b)/; say $0, $1
True
｢b｣
｢c｣

As I understand S05, the ~ operator basically rewrites the above regex to /a (b) (c)/, and only then do matching and capturing happen.

So the question seems to be whether subpatterns should be numbered before the regex is rewritten or afterwards. Rakudo seems to do the latter, while it should do the former. (Please correct me if that's not the point. I'm trying to rephrase the bug report to avoid any confusion.)

p6rt commented Oct 8, 2014

From @FROGGS

The capture order is meant to be how *you* read the regex from left to
right. Whatever happens inside the regex engine should stay inside it :o)

So it is meant to capture c first.

p6rt commented Oct 9, 2014

From @masak

Christian Bartolomaeus (>):

> So the question seems to be, whether numbering of subpatterns should be
> done before the regex is rewritten or afterwards. Rakudo seems to do the
> latter while it should be the former. (Please correct me if that's not the point.
> I'm trying to rephrase the bug report to avoid any confusion.)

Sounds right to me. The number of the capture group should depend on
the syntactical order of components in the regex, not on any runtime
execution order.
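
For illustration, a minimal sketch of that rule using the regex from this ticket. It assumes a Rakudo that numbers captures by their position in the source; the variable names are made up for the example and are not from the ticket or the test suite.

```raku
# Sketch only: assumes capture groups are numbered by source position.
# The regex /a ~ (c) (b)/ matches "a", then (b), then (c) in the string,
# but (c) is written first, so it should be capture 0.
my $m = "abc" ~~ /a ~ (c) (b)/;

# Stringify the two positional captures (hypothetical variable names).
my $first  = ~$m[0];   # capture written first in the regex: (c)
my $second = ~$m[1];   # capture written second in the regex: (b)

say "\$0 = $first, \$1 = $second";
```

With syntactic numbering, $0 should hold the text matched by (c) and $1 the text matched by (b), even though b is matched earlier in the string. The behaviour reported in this ticket had the two swapped.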

p6rt commented Oct 9, 2014

From @usev6

Thanks for the feedback. I added a fudged test to S05-metachars/tilde.t with the following commit: Raku/roast@f15d9aed26

p6rt commented May 1, 2015

From @jnthn

On Wed Sep 01 08:34:47 2010, pawel.pabian@implix.com wrote:

> [17:29] <bbkr> rakudo: say so "abc" ~~ /a ~ (c) (b)/; say $0
> [17:29] <p6eval> rakudo dc9900: OUTPUT«1␤b␤»
> [17:29] <TimToady> uhhh
> [17:30] <TimToady> that seems like a bug to me
> [17:32] * bbkr reports

Fixed this, and also unfudged the test in S05-metachars/tilde.t.

@p6rt p6rt closed this as completed May 1, 2015

p6rt commented May 1, 2015

@jnthn - Status changed from 'open' to 'resolved'
