.perl method mishandling combining characters #4219

Closed
p6rt opened this issue May 5, 2015 · 10 comments

Comments

p6rt commented May 5, 2015

Migrated from rt.perl.org#125110 (status was 'resolved')

Searchable as RT125110$

p6rt commented May 5, 2015

From @dwarring

The .perl method seems to have problems serializing combining characters. I've observed the following for code points 768 to 879 (U+0300 to U+036F), at least:

% perl6-m --version
This is perl6 version 2015.04-168-g1763049 built on MoarVM version 2015.04-62-g052aca0
% perl6-j -e'say .uniname, .chr.perl for 780'
COMBINING CARON""
% perl6-m -e'say .uniname, .chr.perl for 780'
COMBINING CARON""

All characters in this range serialize to an empty string on both the Moar and JVM back-ends.
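
For readers who want to check which characters are involved, here is a minimal sketch. It is written in Python rather than Perl 6 so that it runs independently of any Rakudo version, and it assumes the 768 and 879 in the report are decimal code points (as the `for 780` one-liners suggest). The names `FIRST`, `LAST`, `range_label` and `categories` are illustrative only.

```python
import unicodedata

FIRST, LAST = 768, 879  # decimal code points, as assumed from the report

# The same range written in the usual hexadecimal U+XXXX notation.
range_label = f"U+{FIRST:04X}..U+{LAST:04X}"

# Collect the Unicode general category of every code point in the range.
# "Mn" (nonspacing mark) is the category of combining diacritical marks.
categories = {unicodedata.category(chr(cp)) for cp in range(FIRST, LAST + 1)}

print(range_label)                 # U+0300..U+036F
print(unicodedata.name(chr(780)))  # COMBINING CARON
print(categories)
```

Code point 780 is U+030C, which matches the `COMBINING CARON` name printed by the one-liners above.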

p6rt commented May 10, 2015

From @dwarring

Tests have been added to S02-names-vars/perl.t

The tests fail on Moar but pass on JVM. Moar is not round-tripping EVAL 780.chr.perl, whereas JVM is. It's just difficult to see visually.

IMHO, single combining characters should probably be escaped, e.g.

780.chr.perl should serialize to "\x[30C]".
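
The escaping idea can be sketched as follows. This is a Python illustration under stated assumptions, not Rakudo's implementation: `to_literal` and `from_literal` are hypothetical helpers, the sketch escapes every mark character (general category starting with "M") rather than only a lone or leading one, and it handles only the `\x[...]`, `\"` and `\\` escapes.

```python
import re
import unicodedata

def to_literal(s: str) -> str:
    """Quote s, writing mark characters as \\x[HEX] so they stay visible."""
    out = []
    for ch in s:
        if unicodedata.category(ch).startswith("M"):
            out.append(f"\\x[{ord(ch):X}]")   # e.g. U+030C -> \x[30C]
        elif ch in '"\\':
            out.append("\\" + ch)             # keep the literal well-formed
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'

def from_literal(lit: str) -> str:
    """Inverse of to_literal: decode \\x[HEX] and backslash escapes."""
    body = lit[1:-1]  # strip the surrounding double quotes
    return re.sub(
        r'\\x\[([0-9A-Fa-f]+)\]|\\(.)',
        lambda m: chr(int(m.group(1), 16)) if m.group(1) else m.group(2),
        body,
    )

print(to_literal(chr(780)))  # "\x[30C]"

# Round-trip check on a lone mark, a base letter plus mark, and quoting characters.
for sample in (chr(780), "a" + chr(780), 'q"\\'):
    assert from_literal(to_literal(sample)) == sample
```

With the mark written as an escape, the serialized form no longer depends on how a terminal renders a combining character attached to the opening quote.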

p6rt commented May 26, 2015

From @Mouq

21:05 <Mouq> m: say "\x35A".perl
21:05 <camelia> rakudo-moar c2a57e: OUTPUT«"͚"␤»
21:05 <Mouq> m: say EVAL "\x35A".perl
21:05 <camelia> rakudo-moar c2a57e: OUTPUT«===SORRY!=== Error while compiling EVAL_0␤Bogus statement␤at EVAL_0:1␤------> <BOL>⏏"͚"␤ expecting any of:␤ prefix␤ term␤»

Spotted with Rakudo 2015.5 by Paul Cochrane while attempting to use
HTML::Entity, where the list of entities was being generated from the
official JSON list with ".perl".

p6rt commented May 27, 2015

From @hoelzro

Is this related to https://rt-archive.perl.org/perl6/Ticket/Display.html?id=125255?

p6rt commented May 27, 2015

The RT System itself - Status changed from 'new' to 'open'

p6rt commented May 27, 2015

From @hoelzro

Is this related to https://rt-archive.perl.org/perl6/Ticket/Display.html?id=125110?

p6rt commented May 27, 2015

From @dwarring

@​rob
Yes, that looks like the same problem as this ticket.
- David
On Wed May 27 07​:46​:01 2015, rob@​hoelz.ro wrote​:

Is this related to https://rt-archive.perl.org/perl6/Ticket/Display.html?id=125255?

p6rt commented Jun 29, 2015

From @jnthn

On Sun May 10 14​:57​:52 2015, david.warring wrote​:

Tests have been added to S02-names-vars/perl.t

Failing on Moar, but passing on JVM. Moar is not roundtripping EVAL
780.chr.perl, whereas JVM is. Just difficult to see visually.

Imho single combining characters probably should be escaped, e.g.

780.chr.perl should serialize to "\x[30C]".

Fixed it to do exactly that now. The tests pass and are unfudged.

p6rt commented Jun 29, 2015

@jnthn - Status changed from 'open' to 'resolved'

@p6rt p6rt closed this as completed Jun 29, 2015