Interaction of \U/\L and \u/\l escapes are undocumented #5467
From scs@superior.inland-sea.com

This is a bug report for perl from scs@di.org, When doing case manipulation with \L, \U, \l and \u, the result of use strict; prints hostNaMe is in lower case except for the N and M. with both perl5.6.1 and perl5.00503. In the second line printed, One might argue that these are functional (and indeed, they may print lc( "host" . uc( "n" ) . "aMe" ) . I believe that a state-wise interpretation is more reasonable, i.e., The usage I suggest is more consistent with perl's current treatment print "\Uhost\nnaMe\E\n"; does not print HOSTNAME IMHO the programmer who writes "\Lhost\unaMe\E" clearly wants

Flags:
Site configuration information for perl v5.6.1:
Configured by root at Tue Mar 26 11:46:11 GMT 2002.
Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration:
Locally applied patches:
@INC for perl v5.6.1:
Environment for perl v5.6.1:
From dland@landgren.net

This isn't limited to FreeBSD, it's general to all perls from at least (\l is "lower case next letter", \U is "uppercase until \E or EOS"). For instance:

"\Un\lext" eq "NeXT" # wrong, currently "NEXT"

because toke.c breaks this up as uc("n" . lcfirst("ext"))

And while this is fixable, I think it would be unwise for maint. For maint+blead, should we document that \U takes precedence over \l, Do we fix it for blead? If not, then the bug could be rejected.

David
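The two readings discussed above can be compared side by side. This is a minimal sketch, not code from the thread: the helper names `nested_form` and `statewise_form` are made up for illustration, and `statewise_form` hand-writes the "state-wise" result the reporter argues for rather than anything perl implements. The script prints what the running perl produces for the interpolated string without assuming the answer.

```perl
use strict;
use warnings;

# The nested-call form that, per the comment above, toke.c
# effectively builds for "\Un\lext".
sub nested_form { return uc( "n" . lcfirst("ext") ) }

# Hypothetical state-wise reading (illustration only): \U sets a mode,
# and \l overrides it for the single character that follows.
sub statewise_form { return uc("n") . lc("e") . uc("xt") }

my $interpolated = "\Un\lext";

printf "interpolated: %s\n", $interpolated;
printf "nested form:  %s\n", nested_form();
printf "state-wise:   %s\n", statewise_form();
printf "interpolated matches nested form: %s\n",
    $interpolated eq nested_form() ? "yes" : "no";
```

If the last line reports "yes", the running perl resolves the conflict the way this comment describes: the outer uc() is applied after the inner lcfirst(), so the effect of \l is lost.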
From @dcollinsn

It seems to me that we should at least document this behavior and test for it. Since changing it now could affect existing code, and evidently is not easy (since the tokenizer implements them as uc() and lcfirst(), it would take a significant overhaul to change this behavior), we should resolve the documentation ambiguity. This patch does so, and adds a test.
From @dcollinsn

0001-RT-9360-Document-interaction-of-U-L-u-l.patch

From f699759c922a0976bdd5295e9f7e7c58ceee7ffc Mon Sep 17 00:00:00 2001
From: Dan Collins <dcollinsn@gmail.com>
Date: Mon, 4 Jul 2016 21:25:30 -0400
Subject: [PATCH] [RT #9360] Document interaction of \U \L \u \l
---
pod/perlop.pod | 5 +++++
t/op/lc.t | 6 +++++-
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/pod/perlop.pod b/pod/perlop.pod
index 365c962..42ef2d7 100644
--- a/pod/perlop.pod
+++ b/pod/perlop.pod
@@ -1592,6 +1592,11 @@ C<\E> for each. For example:
say"This \Qquoting \ubusiness \Uhere isn't quite\E done yet,\E is it?";
This quoting\ Business\ HERE\ ISN\'T\ QUITE\ done\ yet\, is it?
+In the case of a conflict between C<\L>, C<\U>, C<\l>, and C<\u>, the
+outermost escape sequence will apply. For example, C<\L\Utest> is C<test>,
+and C<\Lt\uest> is C<test>. If you find this surprising, consider that
+the latter example is interpreted as C<lc('t' . ucfirst('est'))>.
+
If a S<C<use locale>> form that includes C<LC_CTYPE> is in effect (see
L<perllocale>), the case map used by C<\l>, C<\L>, C<\u>, and C<\U> is
taken from the current locale. If Unicode (for example, C<\N{}> or code
diff --git a/t/op/lc.t b/t/op/lc.t
index 2ce65ac..565c1cb 100644
--- a/t/op/lc.t
+++ b/t/op/lc.t
@@ -16,7 +16,7 @@ BEGIN {
use feature qw( fc );
-plan tests => 139 + 4 * 256;
+plan tests => 142 + 4 * 256;
is(lc(undef), "", "lc(undef) is ''");
is(lcfirst(undef), "", "lcfirst(undef) is ''");
@@ -341,6 +341,10 @@ SKIP: {
is($x, "A", "first { fc }");
}
+# RT #9360: \L and \U vs \l and \u
+is("\Utest", 'TEST', 'RT #9360: \L, \U, \l, \u');
+is("\Ute\Est", 'TEst', 'RT #9360: \L, \U, \l, \u');
+is("\Ut\lest", 'TEST', 'RT #9360: \L, \U, \l, \u');
my $utf8_locale = find_utf8_ctype_locale();
--
2.8.1
From @khwilliamson

On Mon Jul 04 18:31:20 2016, dcollinsn@gmail.com wrote:

Please read the thread beginning at

I'm unsure as to the right course.
Migrated from rt.perl.org#9360 (status was 'open')
Searchable as RT9360$