\c[BELL] returns the U+0007 control code not U+1F514 BELL #5998

Closed
p6rt opened this issue Jan 11, 2017 · 9 comments
Labels
JVM Related to Rakudo-JVM uni

Comments

@p6rt

p6rt commented Jan 11, 2017

Migrated from rt.perl.org#130542 (status was 'resolved')

Searchable as RT130542$

@p6rt

p6rt commented Jan 11, 2017

From @samcv

Fudged test in S02-literals/char-by-name.t

is "\c[BELL]", "🔔", '\c[BELL] returns 🔔, BELL symbol not the control character'

@p6rt

p6rt commented Jan 13, 2017

From @samcv

This has been fixed on MoarVM as of MoarVM/MoarVM@8161864

BELL now resolves to 🔔 U+1F514 on MoarVM, but this is still broken on the JVM

@p6rt

p6rt commented Jan 14, 2017

From @toolforger

> BELL now resolves to 🔔 U+1F514 on MoarVM, but this is still broken on the JVM

What causes this kind of difference?

@p6rt

p6rt commented Jan 14, 2017

The RT System itself - Status changed from 'new' to 'open'

@p6rt

p6rt commented Jan 14, 2017

From @samcv

On Saturday, 14 January 2017 02.06.57 PST you wrote:

> > BELL now resolves to 🔔 U+1F514 on MoarVM, but this is still broken on the JVM
>
> What causes this kind of difference?

U+0007's Unicode 1 name was BELL, and with version 2 the name was removed.

Unicode 1 names are essentially deprecated and shouldn't be used to identify characters.
Since Unicode version 2, character names are guaranteed never to change, so the Unicode 1 names are a poor thing to rely on for functionality. Name Aliases are stable as well: more may be added, but existing ones will never be changed or removed.

For this reason it has been decided that we should only guarantee standard Unicode names and Name Aliases.

Concerning BELL: in Unicode 1, the U+0007 control code was named BELL. Since Unicode 2, the control codes' names were removed and they were given stable aliases. As proof that Unicode 1 names shouldn't be relied on, the U+1F514 bell symbol is now the character called BELL.
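
To make the name collision above concrete, here is a minimal sketch in Python (chosen only for illustration; it is not Raku or NQP code) using the standard `unicodedata` module. It assumes Python 3.3 or later, where `lookup` understands Name Aliases. The helper `formal_name` and its placeholder string are made up for this example.

```python
import unicodedata

# "BELL" is the formal Unicode name of the bell symbol.
bell_symbol = unicodedata.lookup("BELL")

# "ALERT" is a Name Alias of the U+0007 control code.
alert_control = unicodedata.lookup("ALERT")


def formal_name(ch, default="<no formal name>"):
    """Return the formal Unicode name of ch, or a placeholder if it has none.

    The placeholder text is an assumption of this sketch, not Unicode data.
    """
    return unicodedata.name(ch, default)


for ch in (bell_symbol, alert_control):
    print(f"U+{ord(ch):04X}", formal_name(ch))
```

In this sketch BELL resolves to the U+1F514 symbol, while U+0007 has no formal name at all and is only reachable through an alias such as ALERT.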

Regarding the JVM, it seems to give us back the canonical Unicode names if they exist, and otherwise the Unicode 1 names. There may be some way to get the Name Aliases, but I do not know of one.

I will have to go in manually and add U+1F514 as BELL, and add U+0007 as BEL and ALERT. I recently added a few other Name Aliases to the JVM backend by hand to fix a few roast tests.

See here for the commit that added some Alias Names to JVM: Raku/nqp@0c249e7
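
The manual fix described above could look roughly like the following sketch, written in Python for illustration only. It is not the code from the linked commit: `LEGACY_BACKEND`, `OVERRIDES`, `backend_lookup` and `codepoint_by_name` are hypothetical names, and the legacy table merely imitates a backend that still answers BELL with its Unicode 1 meaning.

```python
import unicodedata

# Hypothetical stand-in for a backend that still knows Unicode 1 names.
LEGACY_BACKEND = {"BELL": 0x0007}

# Manually maintained entries, consulted before the backend is asked.
OVERRIDES = {
    "BELL": 0x1F514,   # formal name of the bell symbol
    "BEL": 0x0007,     # alias of the control code
    "ALERT": 0x0007,   # alias of the control code
}


def backend_lookup(name):
    """Imitate the legacy behaviour: Unicode 1 names win, then formal names."""
    if name in LEGACY_BACKEND:
        return LEGACY_BACKEND[name]
    try:
        ch = unicodedata.lookup(name)
    except KeyError:
        return None
    # lookup() may return a multi-character named sequence; skip those here.
    return ord(ch) if len(ch) == 1 else None


def codepoint_by_name(name):
    """Resolve a name, letting the manual overrides shadow the backend."""
    if name in OVERRIDES:
        return OVERRIDES[name]
    return backend_lookup(name)


print(hex(backend_lookup("BELL")))      # what the unpatched backend answers
print(hex(codepoint_by_name("BELL")))   # what the override table answers
```

The design choice is simply that the override table is checked first, so a stable name or alias always shadows whatever the underlying platform reports.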

Hopefully I have explained this well enough.

@p6rt

p6rt commented Jan 14, 2017

From @toolforger

On 14.01.2017 at 11:29, Samantha McVey wrote:

> See here for the commit that added some Alias Names to JVM: Raku/nqp@0c249e7
>
> Hopefully I have explained this well enough.

I kinda expected this to be implemented in NQP and hence be identical
across implementations, and was just worried that the JVM implementation
would rely on the JDK's Unicode implementation in java.lang.Character
and friends, but I see that this is not the goal.

I am seeing tangential points about this; where should I raise them?
(Synchronous communication like IRC does not work well for me.)

@p6rt

p6rt commented Jan 17, 2017

From @coke

On Sat, Jan 14, 2017 at 6:40 AM, Joachim Durchholz <jo@durchholz.org> wrote:

> On 14.01.2017 at 11:29, Samantha McVey wrote:
>
> > See here for the commit that added some Alias Names to JVM:
> > Raku/nqp@0c249e7
> >
> > Hopefully I have explained this well enough.
>
> I kinda expected this to be implemented in NQP and hence be identical across
> implementations, and was just worried that the JVM implementation would rely
> on the JDK's Unicode implementation in java.lang.Character and friends, but
> I see that this is not the goal.
>
> I am seeing tangential points about this; where should I raise them?
> (Synchronous communication like IRC does not work well for me.)

Depends on the points.

If you want to get the attention of the core developers for a
discussion, IRC is your best bet. We often use it non-synchronously,
but it's the best place to start.

The mailing lists (https://perl6.org/archive/lists/) work as a very
distant second.

To report a bug with rakudo, open a ticket by sending an email to rakudobug@perl.org.

--
Will "Coke" Coleda

@p6rt

p6rt commented Oct 5, 2017

From @samcv

This has now been fixed as of Raku/nqp@deb8cb03e

Tests have been added for this and other non-BMP codepoints. Marking resolved.
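
For context, a non-BMP codepoint is one above U+FFFF, such as U+1F514. The thread does not say what the JVM bug actually was, so the following Python sketch only illustrates the general point that a UTF-16 platform like the JVM stores such a code point as two code units (a surrogate pair). `utf16_units` is a hypothetical helper written for this example.

```python
def utf16_units(codepoint):
    """Return the list of UTF-16 code units that encode one code point."""
    if codepoint < 0x10000:
        # Inside the Basic Multilingual Plane: a single code unit.
        return [codepoint]
    # Outside the BMP: split the 20-bit offset into a surrogate pair.
    offset = codepoint - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return [high, low]


for cp in (0x0007, 0x1F514):
    units = " ".join(f"{u:04X}" for u in utf16_units(cp))
    print(f"U+{cp:04X} -> {units}")
```

The control code fits in one unit, while the bell symbol needs two, which is why tests that cover non-BMP codepoints exercise a different path than tests on BMP characters.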

@p6rt

p6rt commented Oct 5, 2017

@samcv - Status changed from 'open' to 'resolved'

@p6rt p6rt closed this as completed Oct 5, 2017
@p6rt p6rt added JVM Related to Rakudo-JVM uni labels Jan 5, 2020