recursive grammars #2888

Open
p6rt opened this issue Sep 4, 2012 · 4 comments
Labels
regex Regular expressions, pattern matching, user-defined grammars, tokens and rules

Comments

@p6rt

p6rt commented Sep 4, 2012

Migrated from rt.perl.org#114748 (status was 'open')

Searchable as RT114748$

@p6rt

p6rt commented Sep 4, 2012

From @gfldex

# grammars seem to get confused when used recursively
use v6;

my $s1 = 'AA foo; BB baz; AA bar;';
my $s2 = 'AA buzz;';

grammar G {
  rule TOP { [ [ <ta> | <tb> ] <.ws> ]+ }
  rule ta { 'AA' <name>';' { say $/<name>.Str; } }
  rule tb { 'BB' <name>';' { G.parse($s2); say $/<name>.Str; } }
  rule name { \w+ }
}

say ?G.parse($s1);


foo
buzz
use of uninitialized value of type Any in string context in regex tb at recursive-grammar.p6:9

bar
True

@p6rt

p6rt commented Sep 4, 2012

From @pmichaud

Apparently the .parse method sets $/. I'm not sure I agree that it
should do this (seems like that should be via explicit assignment), but
RT #107236 seems to claim that the .parse method sets $/ to the result
of the match.

As a result, when you get back from G.parse($s2), the $/ variable has
been bound to the result of the $s2 match. This match doesn't have a
$<name> submatch and thus produces the "use of uninitialized value"
warning. The $s1 match then continues on to match (and say) "bar", and
then True.

So, at least according to current spectests, this grammar is functioning
entirely as expected.

I'll hold off on rejecting the ticket for a bit in case anyone wants to
comment on the behavior of .parse explicitly setting $/.

Pm
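
One way to sidestep the overwritten `$/` described above is to copy the capture into a lexical before the nested call. This is a minimal sketch based on the grammar from the report; the variable `$name` is introduced here for illustration and is not part of the original code.

```raku
use v6;

my $s1 = 'AA foo; BB baz; AA bar;';
my $s2 = 'AA buzz;';

grammar G {
    rule TOP  { [ [ <ta> | <tb> ] <.ws> ]+ }
    rule ta   { 'AA' <name>';' { say $/<name>.Str; } }
    rule tb   {
        'BB' <name>';'
        {
            # Copy the capture out of $/ before the nested parse replaces it.
            my $name = $/<name>.Str;
            G.parse($s2);
            # $/ may now refer to the $s2 match, so use the saved copy.
            say $name;
        }
    }
    rule name { \w+ }
}

say ?G.parse($s1);
```

The nested `G.parse($s2)` still replaces `$/`, but `tb` no longer reads from it afterwards.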


@p6rt

p6rt commented Sep 4, 2012

The RT System itself - Status changed from 'new' to 'open'

@p6rt

p6rt commented Sep 5, 2012

From @gfldex

I would like to add the real-life example that drove me mad until I
understood the problem. I'm trying to parse a DTD. Those nasty buggers can
have external entities that are themselves DTDs. Like so:

<!ENTITY % HTMLlat1 PUBLIC
  "-//W3C//ENTITIES Latin 1 for XHTML//EN"
  "xhtml-lat1.ent">

Parsing a DTD is quite easy in itself. We have macros with names that
need replacement after they are defined. Some macros are defined in
another file. Since it's just a non-cyclic include, I might as well
think recursively -- but .parse does not agree.

It's not so much that I could not get around the overwritten $/, but that
it happens silently, and debugging grammars is not the most enjoyable thing
in the world. Those Haskellists might have a point after all.
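
For include-style recursion like this, another possible approach is to route the nested parse through a small helper so that the outer rule's `$/` is left alone. The sketch below reuses the grammar from the first report; `parse-isolated` is a hypothetical helper name, and the sketch rests on the assumption that a sub has its own `$/` and that `.parse` sets the `$/` of its immediate caller.

```raku
use v6;

my $s1 = 'AA foo; BB baz; AA bar;';
my $s2 = 'AA buzz;';

# Hypothetical helper, not part of the original report. Assumption: this
# sub has its own $/, so the nested .parse sets that one instead of the $/
# belonging to the rule that called the helper.
sub parse-isolated($grammar, Str $text) {
    $grammar.parse($text)
}

grammar G {
    rule TOP  { [ [ <ta> | <tb> ] <.ws> ]+ }
    rule ta   { 'AA' <name>';' { say $/<name>.Str; } }
    rule tb   { 'BB' <name>';' { parse-isolated(G, $s2); say $/<name>.Str; } }
    rule name { \w+ }
}

say ?G.parse($s1);
```

If the assumption holds, `tb` can keep reading `$/<name>` after the nested parse returns, and the helper's return value carries the result of the included document.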


@p6rt added the regex label Jan 5, 2020