<:Digit> apparently matches anything #5999

Closed
p6rt opened this issue Jan 13, 2017 · 3 comments

p6rt commented Jan 13, 2017

Migrated from rt.perl.org#130549 (status was 'resolved')

Searchable as RT130549$


p6rt commented Jan 13, 2017

From @briandfoy

I mistakenly tried to match the Unicode property <:Digit> when I meant
number. It's not one of the properties listed in the table in Regexes[1],
although it is in perluniprops[2] as a Perl 5 extension as a synonym
for XPosixDigit. I didn't mean to use it and I don't particularly care if
Perl 6 supports it. However, I didn't get an error and it appears to
match everything (almost):

  $ perl6 -v
  This is Rakudo version 2016.11 built on MoarVM version 2016.11
  implementing Perl 6.c.
  $ perl6
  To exit type 'exit' or '^D'
  > q/'/ ~~ rx/ <:Digit> /
  「'」
  > q/a/ ~~ rx/ <:Digit> /
  「a」
  > qq/\c[CAT FACE]/ ~~ rx/ <:Digit> /
  「🐱」
  > q/'/ ~~ rx/ <:SomeStupidThingIMadeUp> /
  (Any)

The <:SomeStupidThingIMadeUp> non-existent property fails to match,
which is fine. Regex[1] says:

  ...<:property> , where property can be a short or long Unicode
  General Category name.

"can" is a bit ambiguous since it might mean it "is limited to" or is
"not disallowed". It should probably be the more limited form ("it
must be a") and fail otherwise.

There are about 2,625 characters that don't match in this range:

  my $matches = 0;
  for 0 .. 0x10fffd {
      # Print each code point that fails to match; count the rest.
      unless chr($_) ~~ / <:Digit> / {
          put $_.fmt('%#6X'), ": ", chr($_);
          next;
      }
      $matches++;
  }

  # The range 0 .. 0x10fffd holds 0x10fffd + 1 code points.
  put "Characters not matching: ", 0x10fffd + 1 - $matches;

[1] https://docs.perl6.org/language/regexes#Unicode_properties
[2] http://perldoc.perl.org/perluniprops.html


p6rt commented Nov 30, 2017

From @samcv

This has now been resolved. It was fixed in MoarVM/MoarVM@d2cf724, and the updated Unicode database generated from that change is in MoarVM/MoarVM@356d9db.

Tests were added in Raku/roast@ac52954 and revised in Raku/roast@994f5fe.


p6rt commented Nov 30, 2017

@samcv - Status changed from 'new' to 'resolved'

@p6rt p6rt closed this as completed Nov 30, 2017
@p6rt p6rt added the uni label Jan 5, 2020