possible bug in perl "\u" string processing #15517
From @jmdh

The following bug report was sent to Debian. I don't believe it is specific to Debian, and have no expertise in the area, so I am forwarding without further comment. It was reported against Debian perl 5.22.2-3, for which perl -V output is available[1] and I was also able to reproduce it against our perl 5.24.0 build.

The following might be a bug in how Perl uppercases strings as in e.g.:

perlop(1) already documents, that in Unicode context, this can result

Anyway, when in perl I do e.g.:

Now IMO that's an error, \u says "titlecase (not uppercase!) next character

Cheers,

[0] Not sure if this is still the case in most recent versions, as there

[1] Summary of my perl5 (revision 5 version 22 subversion 2) configuration:

Characteristics of this binary (from libperl):
From @mauke

On 15.08.2016 at 23:14, Dominic Hargreaves (via RT) wrote:

This is not specific to \u. ucfirst("ß") also returns "Ss", which makes
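The distinction between the uppercase and titlecase mappings of ß can be seen outside Perl as well. The following is a minimal sketch in Python (not Perl), on the assumption that Python's `str.upper()` and `str.title()` apply the same Unicode full case mappings that Perl's `uc` and `ucfirst` draw on; the variable names are illustrative only.

```python
# Sketch only: Python stands in for Perl here, assuming both follow the
# Unicode full case mappings for U+00DF.
sharp_s = "\u00df"  # ß, LATIN SMALL LETTER SHARP S

upper = sharp_s.upper()  # full uppercase mapping
title = sharp_s.title()  # titlecase mapping of a one-character "word"

print(upper)  # SS
print(title)  # Ss
```

The two results differ only in the second letter, which is the behaviour the report describes for \u and ucfirst.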
The RT System itself - Status changed from 'new' to 'open'
From @cpansprout

On Mon Aug 15 14:14:57 2016, dom wrote:

I.e., titlecase, not capitalization.

But you did not ask for capitalization, but for titlecase.

I think it should result in Ss, because that is the titlecase version of ß. Karl Williamson should be able to confirm whether I am right.

-- Father Chrysostomos
From @cpansprout

On Mon Aug 15 14:27:49 2016, sprout wrote:

I was a bit sloppy with my wording there, because ‘capitalization’ in English usually means what geeks call ‘titlecase’. So: You asked for capitalization and got exactly that.

-- Father Chrysostomos
From @khwilliamson

On 08/15/2016 03:27 PM, Father Chrysostomos via RT wrote:

It is deliberate. Consider the name "titlecase". It means how a title

One would not have a word in a title that was "SSisch"

But I'm told that this situation would never come up in natural German.

The Unicode Standard has not changed the capitalization of ß with the
From zefram@fysh.org

Dominic Hargreaves wrote:

unicore/SpecialCasing.txt has

#
# The German es-zed is special--the normal mapping is to SS.

So it's deliberate, but a known funny case.

That one's even funnier. It's so rarely used that it doesn't count as

-zefram
From @khwilliamson

Rejecting; as Zefram pointed out, the behavior is explicitly what The Unicode Standard specifies.

@khwilliamson - Status changed from 'open' to 'rejected'
From eric.herman@booking.com

On 16-08-16 00:03, Karl Williamson wrote:

This may not come up in German, but it will come up at least sometimes in Dutch. For example, the IJsselmeer lake.

https://en.wikipedia.org/wiki/IJ_%28digraph%29#Capitalisation

It should be noted, however, that the Dutch U+0132 'IJ' and U+0133 'ij'

I have no good idea of how to correctly handle title case for words
From zefram@fysh.org

Eric Herman via perl5-porters wrote:

With different characters that have their own capitalisation rules.

Unicode can't handle that automatically. One would need a

-zefram
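To illustrate why the Dutch case falls outside the generic mapping, here is a rough sketch in Python (not Perl). `dutch_titlecase` is a hypothetical helper written for this example, not part of any library, and it assumes the only special rule is that a leading "ij" written as two separate letters is capitalised as a unit.

```python
def dutch_titlecase(word: str) -> str:
    """Hypothetical helper: titlecase one word, treating a leading 'ij' as a unit."""
    if word[:2].lower() == "ij":
        # Assumption: both letters of the digraph are capitalised together.
        return "IJ" + word[2:].lower()
    # Otherwise fall back to the generic Unicode titlecase of the first character.
    return word[:1].title() + word[1:].lower()

generic = "ijsselmeer".title()             # generic rule sees 'i' and 'j' separately
dutch = dutch_titlecase("ijsselmeer")      # locale-specific sketch
ligature = "\u0133".title()                # U+0133 has its own mapping to U+0132

print(generic)  # Ijsselmeer
print(dutch)    # IJsselmeer
```

The generic rule only capitalises the first letter, so the language-specific behaviour has to be supplied by the caller; the single-codepoint ligature U+0133, by contrast, is mapped by its own character-level rule.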
From @Tux

On Tue, 16 Aug 2016 15:24:39 +0100, Zefram <zefram@fysh.org> wrote:

Note that the Dutch law has banished the use of the IJ and ij ligatures

Does that help at all?
From @demerphq

On 16 August 2016 at 17:10, H.Merijn Brand <h.m.brand@xs4all.nl> wrote:

Not really. If they pass a law that means that Unicode can't do their

Yves
Migrated from rt.perl.org#128950 (status was 'rejected')
Searchable as RT128950$