Solved on MoarVM / Needs NFC support on JVM | Str.Int confused by diacritics #5418

Closed
p6rt opened this issue Jul 5, 2016 · 6 comments
Labels
JVM Related to Rakudo-JVM

p6rt commented Jul 5, 2016

Migrated from rt.perl.org#128542 (status was 'resolved')

Searchable as RT128542$

p6rt commented Jul 5, 2016

From zefram@fysh.org

The Str.Int coercion method is usually strict about the string content
looking numeric:

"345".Int
345
"3z5".Int
Cannot convert string to number: trailing characters after number in '3^z5' (indicated by ^)
  in block <unit> at <unknown file> line 1

But if the content is non-numeric only by having diacritics on the digits,
the coercion produces a bad result:

"34\x[308]5".Int
3

This is a bug. The "4\x[308]" grapheme (digit four with diaeresis, das
ist Numberwang) should be treated either as an acceptable digit of value
four or as an impermissible character for which an error is signalled.
To silently terminate the digit sequence early is inconsistent with the
intent of the conversion semantics.

This bug only shows up when the string otherwise consists solely
of digits. It does not happen if the string also contains leading or
trailing whitespace, a minus sign, a fractional part, or any of the other
complications that are deliberately permitted in numeric coercion. In all
of those cases, the modified digit is treated sanely as a non-digit,
signalling an error.
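
As a rough illustration of the second option above (treating the modified digit as an impermissible character), here is a minimal sketch in Python. It is not Rakudo's implementation: `clusters` and `strict_int` are hypothetical names, and the grouping of combining marks onto their base character is a simplified stand-in for real grapheme segmentation.

```python
import unicodedata


def clusters(text):
    """Group each base character with the combining marks that follow it.

    Simplified approximation of grapheme clusters (assumption: this is
    not the full Unicode segmentation algorithm).
    """
    out = []
    for ch in text:
        if out and unicodedata.combining(ch):
            out[-1] += ch  # attach the mark to the preceding base character
        else:
            out.append(ch)
    return out


def strict_int(text):
    """Convert a string of decimal digits to int, or raise ValueError.

    Hypothetical helper: every cluster must be exactly one decimal digit,
    so a digit carrying a diacritic is rejected instead of silently
    ending the digit sequence.
    """
    parts = clusters(text)
    if not parts:
        raise ValueError("cannot convert empty string to number")
    value = 0
    for i, part in enumerate(parts):
        if len(part) != 1 or unicodedata.category(part) != "Nd":
            seen = "".join(parts[:i])
            rest = "".join(parts[i:])
            raise ValueError(
                f"cannot convert string to number: "
                f"unexpected character in '{seen}^{rest}' (indicated by ^)"
            )
        value = value * 10 + unicodedata.decimal(part)
    return value
```

With this sketch, `strict_int("345")` yields 345, while `strict_int("34\u03085")` raises an error pointing at the whole modified digit rather than returning 3.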

-zefram

p6rt commented Jul 15, 2016

From @zoffixznet

Question for the [@LARRY] team: should this throw an error or silently ignore the diacritic?

p6rt commented Jul 15, 2016

The RT System itself - Status changed from 'new' to 'open'

p6rt commented Nov 11, 2016

From @zoffixznet

On Tue, 05 Jul 2016 08:34:51 -0700, zefram@fysh.org wrote:

> But if the content is non-numeric only by having diacritics on the
> digits,
> the coercion produces a bad result:
>
> "34\x[308]5".Int
> 3

Thanks for the report. The issue has been fixed on MoarVM. On JVM, we still need proper NFC support for this to work, so I'm leaving the ticket open as a JVM-only thing.

Fixed in rakudo/rakudo@d540fc8
Tests added in Raku/roast@48d32a8

p6rt commented Nov 23, 2016

From @usev6

On Fri, 11 Nov 2016 08:57:19 -0800, cpan@zoffix.com wrote:

> On Tue, 05 Jul 2016 08:34:51 -0700, zefram@fysh.org wrote:
>
> > But if the content is non-numeric only by having diacritics on the
> > digits,
> > the coercion produces a bad result:
> >
> > "34\x[308]5".Int
> > 3
>
> Thanks for the report. The issue has been fixed on MoarVM. On JVM, we
> still need proper NFC support for this to work, so I'm leaving the
> ticket open as a JVM-only thing.
>
> Fixed in rakudo/rakudo@d540fc8
> Tests added in Raku/roast@48d32a8

The test in S32-str/numeric.t actually passes on JVM. I'm going to unfudge that test.

I'm not sure whether there are other cases which do not work on JVM (due to missing NFC support) or whether this ticket can be closed.

p6rt closed this as completed Nov 26, 2016

p6rt commented Nov 26, 2016

@zoffixznet - Status changed from 'open' to 'resolved'

p6rt added the JVM Related to Rakudo-JVM label Jan 5, 2020