degenerates: Mo or Mn Unicode characters combine with punctuation #5902
From @samcv:
say "ୈ"; # U+0B48 ORIYA VOWEL SIGN AI
Discovered this while trying to add a test to roast to cover the
The most telling part of the bug is:
It seems these combining characters are combining with characters they should
If I try Q style quoting normally: Q<ୈ>
It seems this is also true for other Mn or Mo characters
From @samcv: It looks like according to the Unicode grapheme things, ‘degenerates’ do not
So we don't *have* to support this case, but the spec makes it very clear that These degenerate cases are also not tested for in any of the Unicode grapheme
From @samcv: Looks like JVM handles these degenerates nicely: JVM: say "<ୈ".chars #> 2 Moar:
From @samcv: On Wed, 28 Dec 2016 16:19:48 -0800, samantham@posteo.net wrote:
Looks like the JVM backend doesn't implement character counting except by codepoint. So it is just not aware of it except on the codepoint level.
From @samcv: Changing the subject to indicate that our current functionality is not technically incorrect. I am not sure whether I want to add an LTA tag to this, since I have not yet determined whether resolving this in any way is feasible or wanted from a technical standpoint. I am definitely going to leave this open and will add notes if new information comes up.
From @samcv: This bug has been open a while. I have not forgotten it; I had just not reached a final decision. After further thought I'm closing this as WONTFIX. It would needlessly complicate our grapheme concatenation, and in addition I believe it may break some of the grapheme concatenation tests.
@samcv - Status changed from 'new' to 'rejected'
Migrated from rt.perl.org#130384 (status was 'rejected')
Searchable as RT130384$