Report information
Id: 130910
Status: open
Priority: 0/
Queue: perl6

Owner: Nobody
Requestors: alex.jakimenko [at] gmail.com
Cc:
AdminCc:

Severity: (no value)
Tag: (no value)
Platform: (no value)
Patch Status: (no value)
VM: (no value)



Subject: [WEIRD] Sometimes parameterized regexes print nonsense about the number of arguments (/<smth(42)>/)
Here are 3 examples that work as expected:

Code:
my regex meh($t) { xy }; say 'xy' ~~ /^ <meh(42)> $/

Result:
「xy」
 meh => 「xy」


Code:
my regex meh($t) { xy }; say 'ab' ~~ /^ <meh(42)> $/

Result:
Nil


Code:
my regex meh($t) { .. }; say 'xy' ~~ /^ <meh(42)> $/

Result:
「xy」
 meh => 「xy」



And here is one that doesn't:

Code:
my regex meh($t) { .. }; say 'xyz' ~~ /^ <meh(42)> $/

Result:
Too few positionals passed; expected 2 arguments but got 1
  in regex meh at <tmp> line 1
  in block <unit> at <tmp> line 1



Why? What second argument is it expecting?
Nil is the right answer here.
In reply to the report of Fri, 03 Mar 2017 20:25:27 -0800 from alex.jakimenko@gmail.com:
Some more examples:

[23:10] <MasterDuke> m: my regex meh($t, $s) { .. }; say 'xyz' ~~ /^ <meh(1)> $/
[23:10] <evalable6> MasterDuke, rakudo-moar 11ee2fe17: OUTPUT: «(exit code 1) Too few positionals passed; expected 3 arguments but got 2␤ in regex meh at /tmp/cCSpiqVwyt line 1␤ in block <unit> at /tmp/cCSpiqVwyt line 1␤»
[23:11] <MasterDuke> m: my regex meh($t, $s) { .. }; say 'xyz' ~~ /^ <meh(1, 2)> $/
[23:11] <evalable6> MasterDuke, rakudo-moar 11ee2fe17: OUTPUT: «(exit code 1) Too few positionals passed; expected 3 arguments but got 1␤ in regex meh at /tmp/FVoRYxVyfG line 1␤ in block <unit> at /tmp/FVoRYxVyfG line 1␤»
[23:11] <MasterDuke> m: my regex meh($t, $s) { .. }; say 'xyz' ~~ /^ <meh(1, 2, 3)> $/
[23:11] <evalable6> MasterDuke, rakudo-moar 11ee2fe17: OUTPUT: «(exit code 1) Too many positionals passed; expected 3 arguments but got 4␤ in regex meh at /tmp/RumqR3rpLh line 1␤ in block <unit> at /tmp/RumqR3rpLh line 1␤»
[23:11] <MasterDuke> very weird
[23:11] <MasterDuke> m: my regex meh($t, $s) { .. }; say 'xyz' ~~ /^ <meh(1, 2, 3, 4)> $/
[23:11] <evalable6> MasterDuke, rakudo-moar 11ee2fe17: OUTPUT: «(exit code 1) Too many positionals passed; expected 3 arguments but got 5␤ in regex meh at /tmp/Mtn6BJJxmP line 1␤ in block <unit> at /tmp/Mtn6BJJxmP line 1␤»
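
One possible reading of the numbers in these messages, offered as an assumption rather than a confirmed diagnosis: a named regex is invoked like a method, so the reported counts include one implicit invocant on top of the declared parameters. The Python sketch below is only arithmetic over that assumption; the function name `reported_counts` and the `args_dropped` flag are made up for illustration and are not part of Rakudo.

```python
from typing import Tuple


def reported_counts(declared: int, passed: int, args_dropped: bool = False) -> Tuple[int, int]:
    """Return (expected, got) as they would appear in the error message.

    Assumption: both numbers count one implicit invocant in addition to
    the explicit arguments. `args_dropped` models a call in which only
    the invocant arrives and the explicit arguments are lost.
    """
    expected = declared + 1                      # declared parameters + invocant
    got = 1 if args_dropped else passed + 1      # explicit arguments + invocant
    return expected, got


# The four IRC examples, all against `my regex meh($t, $s)`:
irc_cases = [
    reported_counts(2, 1),                       # <meh(1)>
    reported_counts(2, 2, args_dropped=True),    # <meh(1, 2)>
    reported_counts(2, 3),                       # <meh(1, 2, 3)>
    reported_counts(2, 4),                       # <meh(1, 2, 3, 4)>
]

# The original report: `my regex meh($t)` called as <meh(42)>
original_case = reported_counts(1, 1, args_dropped=True)
```

Under this reading, three of the four IRC results are ordinary arity errors. Only the call with the correct number of arguments is surprising, and it matches a call in which nothing but the invocant was passed.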
It has to do with backtracking, because:

1) The problem disappears when `:ratchet` mode is enabled in the top-level regex:

    ➜ my regex meh ($t) { . };
    ➜ say "ab" ~~ /^ :ratchet <meh(1)> $ /;
    Nil

2) The problem disappears when the named regex is made a `token`:

    ➜ my token meh ($t) { . };
    ➜ say "ab" ~~ /^ <meh(1)> $ /;
    Nil

Of course, the regex engine could avoid backtracking entirely in that example, but maybe it's just not optimized enough to know that. Here's a different example in which backtracking is actually necessary:

    my regex meh ($t) { { say "meh start"} .+? { say "meh end"} }
    say "abcde" ~~ / ^ <meh(42)> { say '$<meh> = ', $<meh> } $ /;

It outputs:

    meh start
    meh end
    $<meh> = 「abcde」
    Too few positionals passed; expected 2 arguments but got 1
      in regex meh at [...]

Note how the error message appears after having reached the end of the regex for the first time, just before it would have backtracked into `meh` for the first time.

In comparison, when removing the parameterization of `meh`, the example prints the following (Note how it backtracked into `meh` four times, like it should):

    meh start
    meh end
    $<meh> = 「a」
    meh end
    $<meh> = 「ab」
    meh end
    $<meh> = 「abc」
    meh end
    $<meh> = 「abcd」
    meh end
    $<meh> = 「abcde」

In summary, what *appears* to be happening, is this:

- If a named subrule is called with parameters...
- And it matched...
- But then the regex engine wants to backtrack into it...
- Then it "calls" the subrule again, but fails to pass the parameters again.
Sorry, copy-pasto in the second-to-last output listing. It is:

    meh start
    meh end
    $<meh> = 「a」
    Too few positionals passed; expected 2 arguments but got 1
      in regex meh at [...]
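
The suspected mechanism summarized above (a parameterized subrule is re-entered on backtracking without its arguments) can be illustrated with a toy model. This is a sketch in Python, not Rakudo's implementation: `meh` is a hypothetical stand-in for `regex meh($t) { .+? }`, and the two driver functions are invented here purely to contrast "resume the same invocation" with "call again and forget the arguments".

```python
from typing import Iterator, List, Optional, Tuple


def meh(text: str, pos: int, t: int) -> Iterator[int]:
    """Toy stand-in for `regex meh($t) { .+? }`.

    Yields every possible end position, shortest match first. The
    parameter `t` is unused, as in the ticket; it only has to be bound.
    """
    for end in range(pos + 1, len(text) + 1):
        yield end


def match_resuming(text: str, *args: int) -> Tuple[Optional[str], List[str]]:
    """Backtrack by resuming the SAME invocation, so `args` stay bound."""
    tried: List[str] = []
    for end in meh(text, 0, *args):
        tried.append(text[:end])
        if end == len(text):          # models the trailing `$` anchor
            return text[:end], tried
    return None, tried


def match_recalling(text: str, *args: int) -> Optional[str]:
    """Backtrack by calling the subrule afresh WITHOUT its arguments.

    This models the behaviour the ticket suspects, not a known code path.
    """
    first = next(meh(text, 0, *args), None)
    if first is None:
        return None                   # subrule never matched: plain failure
    if first == len(text):
        return text[:first]           # first candidate satisfies `$`
    # The anchor failed, so the engine wants another candidate from `meh`.
    # Re-invoking without `t` fails at argument binding with a TypeError.
    retry = meh(text, 0)              # deliberately missing the argument
    return next((text[:e] for e in retry if e == len(text)), None)
```

In this model, `match_resuming("abcde", 42)` walks through the candidates `a`, `ab`, `abc`, `abcd`, `abcde` and succeeds, while `match_recalling("abcde", 42)` raises an arity error as soon as the first candidate fails the anchor. Inputs that need no backtracking behave the same in both drivers, which mirrors the working examples in the original report.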

