Report information
Id: 131801
Status: resolved
Priority: 0/
Queue: perl6

Owner: samantham [at] posteo.net
Requestors: alex.jakimenko [at] gmail.com
Cc:
AdminCc:

Severity: (no value)
Tag: (no value)
Platform: (no value)
Patch Status: (no value)
VM: Moar



Subject: Stranded strings with combiners or ZWJ on borders break my NFG expectations ( (“\x[0305]a” x 2).chars.say )
Code:
say (“\c[COMBINING OVERLINE]a” x 2).chars

Result:
4

Code:
say (“\c[COMBINING OVERLINE]a” ~ “\c[COMBINING OVERLINE]a”).chars

Result:
3

Both should produce the same result (3). What happens here is that the “a” on one side is not being squished into one grapheme with the combiner on the other side.

Please note that combiners are not the only thing that can cause this. Here is the same thing with ZWJ:

Code:
my $x = “\x[2695]\x[FE0F]a\x[1F468]\x[200D]”;
say ($x ~ $x).chars;
say ($x x 2).chars

Result:
5
6

I have a feeling that this is a known issue and that there might be a ticket for this already. However, I couldn't find one.
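The comparison in this report could be wrapped into a small regression check. This is only a sketch: the sub name `same-chars` is hypothetical and not part of Rakudo, and it deliberately compares the two ways of building the string instead of asserting a specific grapheme count.

```raku
# Hypothetical helper (not part of Rakudo): does repeating a string with `x`
# give the same grapheme count as building it by repeated `~` concatenation?
sub same-chars(Str $s, Int $n --> Bool) {
    my $repeated = $s x $n;        # string repetition operator
    my $joined   = [~] $s xx $n;   # list repetition, then reduce with `~`
    $repeated.chars == $joined.chars;
}

# The two inputs from this report: a leading combiner, and a trailing ZWJ.
say same-chars("\c[COMBINING OVERLINE]a", 2);
say same-chars("\x[2695]\x[FE0F]a\x[1F468]\x[200D]", 2);
```

If `x` renormalizes at the joins the way `~` does, both lines should print True. On the build described in this report they would be expected to print False.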
Subject: [UNI] Combiners are counted as separate graphemes ( ("\c[COMBINING ACUTE ACCENT]" x 5).chars )
Submitting so that it does not slip through the cracks.

<AlexDaniel> m: ("\c[COMBINING ACUTE ACCENT]" x 5).chars.say
<camelia> rakudo-moar 636a3c: OUTPUT: «5␤»
<samcv> AlexDaniel, that is a bug
<samcv> i will be fixing it for my grant though
<samcv> it only occurs in certain cases
This is the same issue, but it is interesting nonetheless:

<AlexDaniel> m: dd (0x0F75.chr x 2).uninames
<camelia> rakudo-moar 636a3c: OUTPUT: «("TIBETAN VOWEL SIGN AA", "TIBETAN VOWEL SIGN U", "TIBETAN VOWEL SIGN AA", "TIBETAN VOWEL SIGN U").Seq␤»
<AlexDaniel> m: dd (0x0F75.chr ~ 0x0F75.chr).uninames
<camelia> rakudo-moar 636a3c: OUTPUT: «("TIBETAN VOWEL SIGN AA", "TIBETAN VOWEL SIGN AA", "TIBETAN VOWEL SIGN U", "TIBETAN VOWEL SIGN U").Seq␤»

Note that the codepoint order should be normalized, so the `x` version should yield the same sequence as the `~` version (AA, AA, U, U).
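A minimal sketch that turns the two evaluations above into a single check, using only `.uninames` as in the log. The variable names are illustrative. U+0F75 has a canonical decomposition into U+0F71 (AA) followed by U+0F74 (U), which is why two names appear per character.

```raku
# U+0F75 decomposes canonically into U+0F71 (AA) followed by U+0F74 (U).
my $s = 0x0F75.chr;

# Codepoint names of the string built by repetition and by concatenation.
my @repeated = ($s x 2).uninames;
my @joined   = ($s ~ $s).uninames;

.say for @repeated;

# True only if both ways of building the string give the same codepoint order.
say @repeated eqv @joined;
```

If repetition normalizes the same way concatenation does, the final line should print True.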
On 2017-07-26 03:36:32, alex.jakimenko@gmail.com wrote:
> Code:
> say (“\c[COMBINING OVERLINE]a” x 2).chars
>
> Result:
> 4
>
>
> Code:
> say (“\c[COMBINING OVERLINE]a” ~ “\c[COMBINING OVERLINE]a”).chars
>
> Result:
> 3
>
>
>
> Both should produce the same result (3). What happens here is “a” on
> one side is not being squished into one grapheme with a combiner on
> another side.
>
> Please note that combiners are not the only thing that can cause this.
> Here is the same thing with ZWJ:
>
> Code:
> my $x = “\x[2695]\x[FE0F]a\x[1F468]\x[200D]”;
> say ($x ~ $x).chars;
> say ($x x 2).chars
>
>
> Result:
> 5
> 6
>
>
>
>
> I have a feeling that this is a known issue, and that there might be a
> ticket for this already. However, I couldn't find it.



