Interpolation of /<$var>/ causes incorrect matches when $var contains alternation. #3316

Open
p6rt opened this issue Jan 17, 2014 · 4 comments

@p6rt

p6rt commented Jan 17, 2014

Migrated from rt.perl.org#121024 (status was 'open')

Searchable as RT121024$

@p6rt

p6rt commented Jan 17, 2014

From @Util

Interpolation of /<$var>/ causes incorrect matches when $var contains alternation.
TimToady weighs in during the last section of the log.

2014-01-17 15:22-15:45
< Util> r: my $s = "abcdefghtttaaccta"; my @pats = /ttta<[agt]>cct/, /z|ttta<[agt]>cct/; for @pats -> $pat { say $s.comb( /$pat/ ); }
<+camelia> rakudo-parrot 82f2fd, rakudo-jvm 82f2fd, rakudo-moar 82f2fd: OUTPUT«tttaacct␤tttaacct␤»
< Util> r: my $s = "abcdefghtttaaccta"; my @pats = "ttta<[agt]>cct", "z|ttta<[agt]>cct"; for @pats -> $pat { say $s.comb( /<$pat>/ ); }
<+camelia> rakudo-parrot 82f2fd, rakudo-jvm 82f2fd, rakudo-moar 82f2fd: OUTPUT«tttaacct␤abcdefgh tttaacct␤»

< Util> Adding 'z|' to the front of the pattern causes crazy matches, but only when the string pattern is interpolated into a re.
< Util> Is this a bug in <$var> interpolation? If so, is it a known bug? If not, how am I mis-reading the results?

< ingy> Util: can you serialize the regex after string interp?
< Util> ingy: .perl method on a compiled regex does not output anything useful. I welcome new knowledge on how else to serialize a regex.
< ingy> no clue. just guessing
< Util> r: my $re = /abc/; say $re.perl;
<+camelia> rakudo-parrot 82f2fd, rakudo-jvm 82f2fd, rakudo-moar 82f2fd: OUTPUT«regex(Mu : Mu *%_) { ... }␤»
< Util> I would love to be able to see what is actually in the $re.

< FROGGS> p: my $r = "z|ttta<[agt]>cct"; say "abcdefghtttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abcdefgh」␤␤»
< FROGGS> p: my $r = "ttta<[agt]>cct"; say "abcdefghtttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「tttaacct」␤␤»
< FROGGS> p: my $r = "ttta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「tttaacct」␤␤»
< FROGGS> p: my $r = "y|ttta<[agt]>cct"; say "abcdefghtttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abcdefgh」␤␤»

< FROGGS> it is like it still matches "tttaacct", but then forgets the position where the match started
< FROGGS> p: my $r = "tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「ttaacct」␤␤»
< FROGGS> p: my $r = "e|tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abcdefg」␤␤»
< FROGGS> see, it always has the same length
< FROGGS> p: my $r = "tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<e|$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«===SORRY!=== Error while compiling /tmp/4mFkdy5iGB␤Unable to parse expression in metachar:sym<assert>; couldn't find final '>' ␤at /tmp/4mFkdy5iGB:1␤------> t]>cct"; say "abcdefghttttaaccta" ~~ /<e⏏|$r>/␤ …»
< FROGGS> in theory it should explode like that
< Util> FROGGS: I observed that as well, but do not know what to make of it.

< FROGGS> p: my $r = "e||tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abcdefg」␤␤»
< Util> p: my $r = "tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /e|<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「e」␤␤»
< FROGGS> p: my $r = "tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /z|<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「ttaacct」␤␤»
< FROGGS> that works well
< Util> FROGGS: FYI, I think that /<e|$var>/ is incorrect syntax, no matter what is in $var
< FROGGS> Util: correct
< FROGGS> and that is why /<$var>/ should explode too if there is just a | in it
< Util> FROGGS: Are you asserting that alternation (|) is never valid in an interpolated regex?
< FROGGS> Util: I would think that you have to add [ ]
< FROGGS> a similar question would be if a + in such an assertion should be valid, and if it should be a quantifier for the thing on the left of the assertion
< Util> I p: my $r = "z|abc"; say "abcde" ~~ /<$r>/
< Util> p: my $r = "z|abc"; say "abcde" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abc」␤␤»
< FROGGS> p: my $r = "z|abc"; say "peterabcde" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「pet」␤␤»
< Util> I would expect your "peterabcde" example to have worked correctly.
< Util> S05 says that /$var/ no longer interpolates like it did in Perl 5. /<$var>/ is how you get the old behavior.
< FROGGS> p: my $r = "[z|abc]"; say "peterabcde" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abc」␤␤»
< FROGGS> see
< Util> I don't see anything *restricting* the old behavior when done as /<$var>/
< Util> (in S05)

< FROGGS> it fails because it needs a group
< Util> FROGGS: Thanks! That may be the breakthrough thought that I needed.
< Util> Just like S05 says that /moose*/ matches multiple 'e', but /'moose'*/ matches multiple 'moose'.

< Util> That might give me a workaround, but the current behavior is still a bug, IMO.
< FROGGS> report it
< Util> Will do. Thanks again!

2014-01-17 18:49-18:52
< TimToady> on / <$foo> /, the assertion is supposed to match as a subgroup, so it should not be necessary to supply [], and bare z|abc should work
< TimToady> we do not provide any way to interpolate a string as a regex without it being a submatch, unless you use EVAL
< japhb> TimToady: I read that discussion as resolving to: <$foo>'s implementation should just implicitly wrap the contents in [].
< TimToady> that would be...unhygienic
< TimToady> that's what a submatch means
< japhb> Fair enough.
< TimToady> well, more like (), in the sense that it hides inner ()
< TimToady> but not in the sense of supplying an outer ()
< TimToady> unless you bind it explicitly
< TimToady> just as if you'd called <.foo>
< japhb> gotcha.
< TimToady> nothing is captured, and the () inside foo are hidden

--
Thank you,
Bruce Gray (Util of PerlMonks)

@p6rt

p6rt commented Oct 22, 2015

From @usev6

FWIW the first two evaluations give identical results now:

$ perl6 -e 'my $s = "abcdefghtttaaccta"; my @pats = /ttta<[agt]>cct/, /z|ttta<[agt]>cct/; for @pats -> $pat { say $s.comb( /$pat/ ); }'
(tttaacct)
(tttaacct)

$ perl6 -e 'my $s = "abcdefghtttaaccta"; my @pats = "ttta<[agt]>cct", "z|ttta<[agt]>cct"; for @pats -> $pat { say $s.comb( /<$pat>/ ); }'
(tttaacct)
(tttaacct)

@p6rt

p6rt commented Oct 22, 2015

@usev6 - Status changed from 'new' to 'open'

@p6rt p6rt added the Bug label Jan 5, 2020