Report information
Id: 121024
Status: open
Priority: 0/
Queue: perl6

Owner: Nobody
Requestors: util <bruce.gray [at] acm.org>
Cc:
AdminCc:

Severity: (no value)
Tag: Bug
Platform: (no value)
Patch Status: (no value)
VM: (no value)



Subject: [BUG] Interpolation of /<$var>/ causes incorrect matches when $var contains alternation.
To: rakudobug [...] perl.org
From: Bruce Gray <bruce.gray [...] acm.org>
CC: Robert Bruce Gray III <bruce.gray [...] acm.org>
Date: Fri, 17 Jan 2014 13:11:10 -0600
Interpolation of /<$var>/ causes incorrect matches when $var contains alternation.
TimToady weighs in during the last section of the log.

2014-01-17 15:22-15:45
< Util> r: my $s = "abcdefghtttaaccta"; my @pats = /ttta<[agt]>cct/, /z|ttta<[agt]>cct/; for @pats -> $pat { say $s.comb( /$pat/ ); }
<+camelia> rakudo-parrot 82f2fd, rakudo-jvm 82f2fd, rakudo-moar 82f2fd: OUTPUT«tttaacct␤tttaacct␤»
< Util> r: my $s = "abcdefghtttaaccta"; my @pats = "ttta<[agt]>cct", "z|ttta<[agt]>cct"; for @pats -> $pat { say $s.comb( /<$pat>/ ); }
<+camelia> rakudo-parrot 82f2fd, rakudo-jvm 82f2fd, rakudo-moar 82f2fd: OUTPUT«tttaacct␤abcdefgh tttaacct␤»
< Util> Adding 'z|' to the front of the pattern causes crazy matches, but only when the string pattern is interpolated into a re.
< Util> Is this a bug in <$var> interpolation? If so, is it a known bug? If not, how am I mis-reading the results?
< ingy> Util: can you serialize the regex after string interp?
< Util> ingy: .perl method on a compiled regex does not output anything useful. I welcome new knowledge on how else to serialize a regex.
< ingy> no clue. just guessing
< Util> r: my $re = /abc/; say $re.perl;
<+camelia> rakudo-parrot 82f2fd, rakudo-jvm 82f2fd, rakudo-moar 82f2fd: OUTPUT«regex(Mu : Mu *%_) { ... }␤»
< Util> I would love to be able to see what is actually in the $re.
< FROGGS> p: my $r = "z|ttta<[agt]>cct"; say "abcdefghtttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abcdefgh」␤␤»
< FROGGS> p: my $r = "ttta<[agt]>cct"; say "abcdefghtttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「tttaacct」␤␤»
< FROGGS> p: my $r = "ttta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「tttaacct」␤␤»
< FROGGS> p: my $r = "y|ttta<[agt]>cct"; say "abcdefghtttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abcdefgh」␤␤»
< FROGGS> it is like it still matches "tttaacct", but then forgets the position where the match started
< FROGGS> p: my $r = "tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「ttaacct」␤␤»
< FROGGS> p: my $r = "e|tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abcdefg」␤␤»
< FROGGS> see, it always has the same length
< FROGGS> p: my $r = "tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<e|$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«===SORRY!=== Error while compiling /tmp/4mFkdy5iGB␤Unable to parse expression in metachar:sym<assert>; couldn't find final '>' ␤at /tmp/4mFkdy5iGB:1␤------> t]>cct"; say "abcdefghttttaaccta" ~~ /<e⏏|$r>/␤ …»
< FROGGS> in theory it should explode like that
< Util> FROGGS: I observed that as well, but do not know what to make of it.
< FROGGS> p: my $r = "e||tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abcdefg」␤␤»
< Util> p: my $r = "tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /e|<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「e」␤␤»
< FROGGS> p: my $r = "tta<[agt]>cct"; say "abcdefghttttaaccta" ~~ /z|<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「ttaacct」␤␤»
< FROGGS> that works well
< Util> FROGGS: FYI, I think that /<e|$var>/ is incorrect syntax, no matter what is in $var
< FROGGS> Util: correct
< FROGGS> and that is why /<$var>/ should explode too it there is just a | in it
< FROGGS> if*
< Util> FROGGS: Are you asserting that alternation (|) is never valid in a interpolated regex?
< FROGGS> Util: it would think that you have to add [ ]
< FROGGS> a similar question would be if a + in such an assertion should be valid, and if it should be a quantifier for the thing on the left of the assertion
< Util> I p: my $r = "z|abc"; say "abcde" ~~ /<$r>/
< Util> p: my $r = "z|abc"; say "abcde" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abc」␤␤»
< FROGGS> p: my $r = "z|abc"; say "peterabcde" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「pet」␤␤»
< Util> I would expect your "peterabce" example to have worked correctly.
< Util> S05 says that /$var/ no longer interpolates like it did in Perl 5. /<$var>/ is how you get the old behavior.
< FROGGS> p: my $r = "[z|abc]"; say "peterabcde" ~~ /<$r>/
<+camelia> rakudo-parrot 82f2fd: OUTPUT«「abc」␤␤»
< FROGGS> see
< Util> I don't see anything *restricting* the old behavior when done as /<$var>/
< Util> (in S05)
< FROGGS> it fails because it needs a group
< Util> FROGGS: Thanks! That may be the breakthrough thought that I needed.
< Util> Just like S05 says that /moose*/ matches multiple 'e', but /'moose'*/ matches multiple 'moose'.
< Util> That might give me a workaround, but the current behavior is still a bug, IMO.
< FROGGS> report it
< Util> Will do. Thanks again!

2014-01-17 18:49-18:52
< TimToady> on / <$foo> /, the assertion is supposed to match as a subgroup, so it should not be necessary to supply [], and bare z|abc should work
< TimToady> we do not provide any way to interpolate a string as a regex without it being a submatch, unless you use EVAL
< japhb> TimToady: I read that discussion as resolving to: <$foo>'s implementation should just implicitly wrap the contents in [].
< TimToady> that would be...unhygienic
< TimToady> that's what a submatch means
< japhb> Fair enough.
< TimToady> well, more like (), in the sense that it hides inner ()
< TimToady> but not in the sense of supplying an outer ()
< TimToady> unless you bind it explicitly
< TimToady> just as if you'd called <.foo>
< japhb> gotcha.
< TimToady> nothing is captured, and the () inside foo are hidden

--
Thank you,
Bruce Gray (Util of PerlMonks)
RT-Send-CC: perl6-compiler [...] perl.org
FWIW the first two evaluations give identical results now:

$ perl6 -e 'my $s = "abcdefghtttaaccta"; my @pats = /ttta<[agt]>cct/, /z|ttta<[agt]>cct/; for @pats -> $pat { say $s.comb( /$pat/ ); }'
(tttaacct)
(tttaacct)

$ perl6 -e 'my $s = "abcdefghtttaaccta"; my @pats = "ttta<[agt]>cct", "z|ttta<[agt]>cct"; for @pats -> $pat { say $s.comb( /<$pat>/ ); }'
(tttaacct)
(tttaacct)
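
The .comb check above only shows the matched text. Since the original symptom was a match of the right length at the wrong start position, a check that also looks at the match offsets may be useful. The following is a minimal sketch, not taken from the ticket: the helper name interpolated-match is hypothetical, and it assumes a Rakudo recent enough to show the fixed behaviour reported above.

```raku
# Hypothetical helper (not part of the ticket): match $target against a
# pattern string interpolated as a regex via <$pattern>.
# Returns the Match object, or Nil when nothing matches.
sub interpolated-match(Str $target, Str $pattern) {
    $target ~~ /<$pattern>/
}

my $s = "abcdefghtttaaccta";

# The two pattern strings from the report: without and with a leading "z|".
for "ttta<[agt]>cct", "z|ttta<[agt]>cct" -> $pat {
    my $m = interpolated-match($s, $pat);
    if $m {
        # Print the matched text with its start and end offsets, so a match
        # of the right length at the wrong position would be visible.
        put $pat, " -> ", ~$m, " [", $m.from, ", ", $m.to, ")";
    }
    else {
        put $pat, " -> no match";
    }
}
```

If the bug is gone, both patterns should report the same text and the same offsets; the buggy builds in the log instead returned text starting at offset 0 for the alternation case.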

