LC_NUMERIC documentation issues #10740
From @ntyni
This is a bug report for perl from Niko Tyni <ntyni@debian.org>,
Most of the confusion around LC_NUMERIC was fixed with commits
- the early parts of perllocale.pod still say printf() uses LC_NUMERIC
- according to my tests, format() hasn't used LC_NUMERIC unconditionally
Proposed patch attached. I think that this matches what format()
Triggered by http://bugs.debian.org/379329
Flags:
Site configuration information for perl 5.13.5:
Configured by niko at Tue Oct 19 21:45:01 EEST 2010.
Summary of my perl5 (revision 5 version 13 subversion 5) configuration:
Locally applied patches:
@INC for perl 5.13.5:
Environment for perl 5.13.5:
From @ntyni
0001-LC_NUMERIC-documentation-updates.patch
From 5023698ddfa5227fa5c5952b7ddd3b6a6009e591 Mon Sep 17 00:00:00 2001
From: Niko Tyni <ntyni@debian.org>
Date: Tue, 19 Oct 2010 21:55:14 +0300
Subject: [PATCH] LC_NUMERIC documentation updates
Most of the confusion around LC_NUMERIC was fixed with commits
7e4353e96785be675a69a6886d154405dbfdc124 and
2095dafae09cfface71d4202b3188926ea0ccc1c
but two errors remain:
- the early parts of perllocale.pod still say printf() uses LC_NUMERIC
with just 'use locale' when actually a POSIX::setlocale() call is
also needed
- format() hasn't used LC_NUMERIC unconditionally since 5.005_03
(commit 097ee67dff1c60f201bc09435bc6eaeeafcd8123).
---
pod/perlform.pod | 20 ++++++++------------
pod/perllocale.pod | 15 ++++++---------
2 files changed, 14 insertions(+), 21 deletions(-)
diff --git a/pod/perlform.pod b/pod/perlform.pod
index 3cfa1b7..df0f0a1 100644
--- a/pod/perlform.pod
+++ b/pod/perlform.pod
@@ -166,9 +166,9 @@ token on the first line. If an expression evaluates to a number with a
decimal part, and if the corresponding picture specifies that the decimal
part should appear in the output (that is, any picture except multiple "#"
characters B<without> an embedded "."), the character used for the decimal
-point is B<always> determined by the current LC_NUMERIC locale. This
-means that, if, for example, the run-time environment happens to specify a
-German locale, "," will be used instead of the default ".". See
+point is determined by the current LC_NUMERIC locale if C<use locale> is in
+effect. This means that, if, for example, the run-time environment happens
+to specify a German locale, "," will be used instead of the default ".". See
L<perllocale> and L<"WARNINGS"> for more information.
@@ -442,15 +442,11 @@ Lexical variables (declared with "my") are not visible within a
format unless the format is declared within the scope of the lexical
variable. (They weren't visible at all before version 5.001.)
-Formats are the only part of Perl that unconditionally use information
-from a program's locale; if a program's environment specifies an
-LC_NUMERIC locale, it is always used to specify the decimal point
-character in formatted output. Perl ignores all other aspects of locale
-handling unless the C<use locale> pragma is in effect. Formatted output
-cannot be controlled by C<use locale> because the pragma is tied to the
-block structure of the program, and, for historical reasons, formats
-exist outside that block structure. See L<perllocale> for further
-discussion of locale handling.
+If a program's environment specifies an LC_NUMERIC locale and C<use
+locale> is in effect when the format is declared, the locale is used
+to specify the decimal point character in formatted output. Formatted
+output cannot be controlled by C<use locale> at the time when write()
+is called. See L<perllocale> for further discussion of locale handling.
Within strings that are to be displayed in a fixed length text field,
each control character is substituted by a space. (But remember the
diff --git a/pod/perllocale.pod b/pod/perllocale.pod
index 0dbabe7..0bec423 100644
--- a/pod/perllocale.pod
+++ b/pod/perllocale.pod
@@ -115,8 +115,7 @@ ucfirst(), and lcfirst()) use C<LC_CTYPE>
=item *
-B<The formatting functions> (printf(), sprintf() and write()) use
-C<LC_NUMERIC>
+B<Format declarations> (format()) use C<LC_NUMERIC>
=item *
@@ -967,13 +966,11 @@ system's implementation of the locale system than by Perl.
=head2 write() and LC_NUMERIC
-Formats are the only part of Perl that unconditionally use information
-from a program's locale; if a program's environment specifies an
-LC_NUMERIC locale, it is always used to specify the decimal point
-character in formatted output. Formatted output cannot be controlled by
-C<use locale> because the pragma is tied to the block structure of the
-program, and, for historical reasons, formats exist outside that block
-structure.
+If a program's environment specifies an LC_NUMERIC locale and C<use
+locale> is in effect when the format is declared, the locale is used
+to specify the decimal point character in formatted output. Formatted
+output cannot be controlled by C<use locale> at the time when write()
+is called.
=head2 Freely available locale definitions
--
1.7.1
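The first point in the commit message (printf() and sprintf() need a POSIX::setlocale() call in addition to `use locale`) can be sketched as follows. This is a minimal illustration rather than part of the patch: `format_in_locale` is a hypothetical helper, `de_DE.UTF-8` is an assumed locale name that may not be installed, and the exact output depends on the Perl version and on the locales available on the system.

```perl
use strict;
use warnings;
use POSIX qw(locale_h);    # imports setlocale() and LC_NUMERIC

# Hypothetical helper (not from the patch): format a number with sprintf
# while LC_NUMERIC is switched to $locale, then restore the old setting.
sub format_in_locale {
    my ($locale, $number) = @_;
    use locale;                               # let sprintf consult LC_NUMERIC
    my $previous = setlocale(LC_NUMERIC);     # query the current setting
    defined(setlocale(LC_NUMERIC, $locale))
        or return;                            # locale not available
    my $text = sprintf "%g", $number;
    setlocale(LC_NUMERIC, $previous) if defined $previous;
    return $text;
}

my $in = 4.2;
# "de_DE.UTF-8" is an assumption; substitute any locale from `locale -a`.
for my $name ("C", "de_DE.UTF-8") {
    my $out = format_in_locale($name, $in);
    print defined $out ? "$name: $out\n" : "$name: not available\n";
}
```

Restoring the previous setting keeps the helper from leaking a changed LC_NUMERIC into the rest of the program.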
@cpansprout - Status changed from 'new' to 'rejected'
@cpansprout - Status changed from 'rejected' to 'new'
From @cpansprout
On Tue Oct 19 12:19:19 2010, ntyni@debian.org wrote:
I know very little about locales, but if you could provide some tests
The RT System itself - Status changed from 'new' to 'open'
From @ntyni
On Sun, Oct 24, 2010 at 01:48:14PM -0700, Father Chrysostomos via RT wrote:
Thanks for looking at this. New patches attached, now with proper tests. (The comments about constant folding bugs are because I'm about to perlbug
From @ntyni
0001-Refactor-LC_NUMERIC-test-out-of-t-run-fresh_perl.t.patch
From 8fe1e0991520cf973f8fd5d492e8d0750a681e93 Mon Sep 17 00:00:00 2001
From: Niko Tyni <ntyni@debian.org>
Date: Wed, 27 Oct 2010 09:49:38 +0300
Subject: [PATCH 1/2] Refactor LC_NUMERIC test out of t/run/fresh_perl.t
Neither lib/locale.t nor t/run/fresh_perl.t should be used for new tests,
so take the locale related tests and the setup code from fresh_perl.t
to make ground for more.
---
MANIFEST | 1 +
t/run/fresh_perl.t | 36 ------------------------------------
t/run/locale.t | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 51 insertions(+), 36 deletions(-)
create mode 100644 t/run/locale.t
diff --git a/MANIFEST b/MANIFEST
index a69f37a..2f83e7f 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -4801,6 +4801,7 @@ t/re/uniprops.t Test unicode \p{} regex constructs
t/run/cloexec.t Test close-on-exec.
t/run/exit.t Test perl's exit status.
t/run/fresh_perl.t Tests that require a fresh perl.
+t/run/locale.t Tests related to locale handling
t/run/noswitch.t Test aliasing ARGV for other switch tests
t/run/runenv.t Test if perl honors its environment variables.
t/run/script.t See if script invocation works
diff --git a/t/run/fresh_perl.t b/t/run/fresh_perl.t
index 3666f09..927d7f6 100644
--- a/t/run/fresh_perl.t
+++ b/t/run/fresh_perl.t
@@ -565,42 +565,6 @@ EOT
EXPECT
ok
########
-# This test is here instead of lib/locale.t because
-# the bug depends on in the internal state of the locale
-# settings and pragma/locale messes up that state pretty badly.
-# We need a "fresh run".
-BEGIN {
- eval { require POSIX };
- if ($@) {
- exit(0); # running minitest?
- }
-}
-use Config;
-my $have_setlocale = $Config{d_setlocale} eq 'define';
-$have_setlocale = 0 if $@;
-# Visual C's CRT goes silly on strings of the form "en_US.ISO8859-1"
-# and mingw32 uses said silly CRT
-$have_setlocale = 0 if (($^O eq 'MSWin32' || $^O eq 'NetWare') && $Config{cc} =~ /^(cl|gcc)/i);
-exit(0) unless $have_setlocale;
-my @locales;
-if (-x "/usr/bin/locale" && open(LOCALES, "/usr/bin/locale -a 2>/dev/null|")) {
- while(<LOCALES>) {
- chomp;
- push(@locales, $_);
- }
- close(LOCALES);
-}
-exit(0) unless @locales;
-for (@locales) {
- use POSIX qw(locale_h);
- use locale;
- setlocale(LC_NUMERIC, $_) or next;
- my $s = sprintf "%g %g", 3.1, 3.1;
- next if $s eq '3.1 3.1' || $s =~ /^(3.+1) \1$/;
- print "$_ $s\n";
-}
-EXPECT
-########
# [ID 20001202.002] and change #8066 added 'at -e line 1';
# reversed again as a result of [perl #17763]
die qr(x)
diff --git a/t/run/locale.t b/t/run/locale.t
new file mode 100644
index 0000000..9f9d32c
--- /dev/null
+++ b/t/run/locale.t
@@ -0,0 +1,50 @@
+#!./perl
+BEGIN {
+ chdir 't' if -d 't';
+ @INC = '../lib';
+ require './test.pl'; # for fresh_perl_is() etc
+}
+
+use strict;
+
+########
+# This test is here instead of lib/locale.t because
+# the bug depends on in the internal state of the locale
+# settings and pragma/locale messes up that state pretty badly.
+# We need a "fresh run".
+BEGIN {
+ eval { require POSIX };
+ if ($@) {
+ skip_all("could not load the POSIX module"); # running minitest?
+ }
+}
+use Config;
+my $have_setlocale = $Config{d_setlocale} eq 'define';
+$have_setlocale = 0 if $@;
+# Visual C's CRT goes silly on strings of the form "en_US.ISO8859-1"
+# and mingw32 uses said silly CRT
+$have_setlocale = 0 if (($^O eq 'MSWin32' || $^O eq 'NetWare') && $Config{cc} =~ /^(cl|gcc)/i);
+skip_all("no setlocale available") unless $have_setlocale;
+my @locales;
+if (-x "/usr/bin/locale" && open(LOCALES, "/usr/bin/locale -a 2>/dev/null|")) {
+ while(<LOCALES>) {
+ chomp;
+ push(@locales, $_);
+ }
+ close(LOCALES);
+}
+skip_all("no locales available") unless @locales;
+
+plan tests => &last;
+fresh_perl_is("for (qw(@locales)) {\n" . <<'EOF',
+ use POSIX qw(locale_h);
+ use locale;
+ setlocale(LC_NUMERIC, "$_") or next;
+ my $s = sprintf "%g %g", 3.1, 3.1;
+ next if $s eq '3.1 3.1' || $s =~ /^(3.+1) \1$/;
+ print "$_ $s\n";
+}
+EOF
+ "", {}, "no locales where LC_NUMERIC breaks");
+
+sub last { 1 }
--
1.7.2.3
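The "fresh run" idea in the test comments, where each snippet executes in a new perl process so that earlier locale changes cannot leak into it, can be approximated outside the core test harness. The sketch below is not how `fresh_perl_is()` from t/test.pl is implemented; `run_fresh` is a hypothetical stand-in that assumes a Unix-like system where list-form pipe opens work.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code in a brand-new perl process whose
# environment has LC_NUMERIC set to $locale, and return what it printed.
sub run_fresh {
    my ($locale, $code) = @_;
    local $ENV{LC_NUMERIC} = $locale;   # restored when the sub returns
    local $ENV{LC_ALL};                 # remember the old value, if any
    delete $ENV{LC_ALL};                # LC_ALL would override LC_NUMERIC
    open(my $pipe, '-|', $^X, '-e', $code)
        or die "cannot start $^X: $!";
    my $output = do { local $/; <$pipe> };   # slurp everything
    close($pipe);
    return $output;
}

# Each call starts from a clean locale state.
print run_fresh("C", 'printf "%g\n", 4.2');
```

Because the child inherits only the environment, any setlocale() call made earlier in the parent has no effect on it.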
From @ntyni
0002-LC_NUMERIC-documentation-updates-tests.patch
From 954257213d7277918ce61eac4edf7dbced233c62 Mon Sep 17 00:00:00 2001
From: Niko Tyni <ntyni@debian.org>
Date: Tue, 19 Oct 2010 21:55:14 +0300
Subject: [PATCH 2/2] LC_NUMERIC documentation updates + tests
Most of the confusion around LC_NUMERIC was fixed with commits
7e4353e96785be675a69a6886d154405dbfdc124 and
2095dafae09cfface71d4202b3188926ea0ccc1c
but two errors remain:
- the early parts of perllocale.pod still say printf() uses LC_NUMERIC
with just 'use locale' when actually a POSIX::setlocale() call is
also needed
- format() hasn't used LC_NUMERIC unconditionally since 5.005_03
(commit 097ee67dff1c60f201bc09435bc6eaeeafcd8123).
Update the documentation and test the claims in t/run/locale.t.
---
pod/perlform.pod | 20 ++++-------
pod/perllocale.pod | 15 +++-----
t/run/locale.t | 93 +++++++++++++++++++++++++++++++++++++++++++++++++---
3 files changed, 102 insertions(+), 26 deletions(-)
diff --git a/pod/perlform.pod b/pod/perlform.pod
index 3cfa1b7..df0f0a1 100644
--- a/pod/perlform.pod
+++ b/pod/perlform.pod
@@ -166,9 +166,9 @@ token on the first line. If an expression evaluates to a number with a
decimal part, and if the corresponding picture specifies that the decimal
part should appear in the output (that is, any picture except multiple "#"
characters B<without> an embedded "."), the character used for the decimal
-point is B<always> determined by the current LC_NUMERIC locale. This
-means that, if, for example, the run-time environment happens to specify a
-German locale, "," will be used instead of the default ".". See
+point is determined by the current LC_NUMERIC locale if C<use locale> is in
+effect. This means that, if, for example, the run-time environment happens
+to specify a German locale, "," will be used instead of the default ".". See
L<perllocale> and L<"WARNINGS"> for more information.
@@ -442,15 +442,11 @@ Lexical variables (declared with "my") are not visible within a
format unless the format is declared within the scope of the lexical
variable. (They weren't visible at all before version 5.001.)
-Formats are the only part of Perl that unconditionally use information
-from a program's locale; if a program's environment specifies an
-LC_NUMERIC locale, it is always used to specify the decimal point
-character in formatted output. Perl ignores all other aspects of locale
-handling unless the C<use locale> pragma is in effect. Formatted output
-cannot be controlled by C<use locale> because the pragma is tied to the
-block structure of the program, and, for historical reasons, formats
-exist outside that block structure. See L<perllocale> for further
-discussion of locale handling.
+If a program's environment specifies an LC_NUMERIC locale and C<use
+locale> is in effect when the format is declared, the locale is used
+to specify the decimal point character in formatted output. Formatted
+output cannot be controlled by C<use locale> at the time when write()
+is called. See L<perllocale> for further discussion of locale handling.
Within strings that are to be displayed in a fixed length text field,
each control character is substituted by a space. (But remember the
diff --git a/pod/perllocale.pod b/pod/perllocale.pod
index 0dbabe7..0bec423 100644
--- a/pod/perllocale.pod
+++ b/pod/perllocale.pod
@@ -115,8 +115,7 @@ ucfirst(), and lcfirst()) use C<LC_CTYPE>
=item *
-B<The formatting functions> (printf(), sprintf() and write()) use
-C<LC_NUMERIC>
+B<Format declarations> (format()) use C<LC_NUMERIC>
=item *
@@ -967,13 +966,11 @@ system's implementation of the locale system than by Perl.
=head2 write() and LC_NUMERIC
-Formats are the only part of Perl that unconditionally use information
-from a program's locale; if a program's environment specifies an
-LC_NUMERIC locale, it is always used to specify the decimal point
-character in formatted output. Formatted output cannot be controlled by
-C<use locale> because the pragma is tied to the block structure of the
-program, and, for historical reasons, formats exist outside that block
-structure.
+If a program's environment specifies an LC_NUMERIC locale and C<use
+locale> is in effect when the format is declared, the locale is used
+to specify the decimal point character in formatted output. Formatted
+output cannot be controlled by C<use locale> at the time when write()
+is called.
=head2 Freely available locale definitions
diff --git a/t/run/locale.t b/t/run/locale.t
index 9f9d32c..483123f 100644
--- a/t/run/locale.t
+++ b/t/run/locale.t
@@ -8,12 +8,12 @@ BEGIN {
use strict;
########
-# This test is here instead of lib/locale.t because
-# the bug depends on in the internal state of the locale
+# These tests are here instead of lib/locale.t because
+# some bugs depend on in the internal state of the locale
# settings and pragma/locale messes up that state pretty badly.
-# We need a "fresh run".
+# We need "fresh runs".
BEGIN {
- eval { require POSIX };
+ eval { require POSIX; POSIX->import("locale_h") };
if ($@) {
skip_all("could not load the POSIX module"); # running minitest?
}
@@ -47,4 +47,87 @@ fresh_perl_is("for (qw(@locales)) {\n" . <<'EOF',
EOF
"", {}, "no locales where LC_NUMERIC breaks");
-sub last { 1 }
+fresh_perl_is("for (qw(@locales)) {\n" . <<'EOF',
+ use POSIX qw(locale_h);
+ use locale;
+ my $in = 4.2;
+ my $s = sprintf "%g", $in; # avoid any constant folding bugs
+ next if $s eq "4.2";
+ print "$_ $s\n";
+}
+EOF
+ "", {}, "LC_NUMERIC without setlocale() has no effect in any locale");
+
+# try to find out a locale where LC_NUMERIC makes a difference
+my $original_locale = setlocale(LC_NUMERIC);
+
+my ($base, $different, $difference);
+for ("C", @locales) { # prefer C for the base if available
+ use locale;
+ setlocale(LC_NUMERIC, $_) or next;
+ my $in = 4.2; # avoid any constant folding bugs
+ if ((my $s = sprintf("%g", $in)) eq "4.2") {
+ $base ||= $_;
+ } else {
+ $different ||= $_;
+ $difference ||= $s;
+ }
+
+ last if $base && $different;
+}
+setlocale(LC_NUMERIC, $original_locale);
+
+SKIP: {
+ skip("no locale available where LC_NUMERIC makes a difference", &last - 2)
+ if !$different;
+ note("using the '$different' locale for LC_NUMERIC tests");
+ for ($different) {
+ local $ENV{LC_NUMERIC} = $_;
+ local $ENV{LC_ALL}; # so it never overrides LC_NUMERIC
+
+ fresh_perl_is(<<'EOF', "4.2", {},
+format STDOUT =
+@.#
+4.179
+.
+write;
+EOF
+ "format() does not look at LC_NUMERIC without 'use locale'");
+
+ {
+ fresh_perl_is(<<'EOF', $difference, {},
+use locale;
+format STDOUT =
+@.#
+4.179
+.
+write;
+EOF
+ "format() looks at LC_NUMERIC with 'use locale'");
+ }
+
+ {
+ fresh_perl_is(<<'EOF', "4.2", {},
+format STDOUT =
+@.#
+4.179
+.
+{ use locale; write; }
+EOF
+ "too late to look at the locale at write() time");
+ }
+
+ {
+ fresh_perl_is(<<'EOF', $difference, {},
+use locale; format STDOUT =
+@.#
+4.179
+.
+{ no locale; write; }
+EOF
+ "too late to ignore the locale at write() time");
+ }
+ }
+} # SKIP
+
+sub last { 6 }
--
1.7.2.3
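The claim that `use locale` matters where the format is declared, not where write() is called, can be explored with a small script. This is a sketch under assumptions, not part of the patch: the format names `PLAIN` and `LOCALIZED` and the helper `render` are made up for illustration, and the script should be run with LC_NUMERIC set in the environment to a locale whose decimal separator is not "." (for example `LC_NUMERIC=de_DE.UTF-8`, if that locale is installed). What it prints depends on the Perl version and the system's locales.

```perl
use strict;
use warnings;

our $value = 4.179;

# Declared outside any "use locale" scope.
format PLAIN =
@.#
$value
.

{
    use locale;
    # Declared while "use locale" is in effect.
format LOCALIZED =
@.#
$value
.
}

# Hypothetical helper: write() the named format to an in-memory
# filehandle and return the resulting text.
sub render {
    my ($format_name) = @_;
    open(my $fh, '>', \my $buffer)
        or die "cannot open in-memory handle: $!";
    my $old = select($fh);     # write() uses the selected handle
    $~ = $format_name;         # choose the format for that handle
    write;
    select($old);
    close($fh);
    return $buffer;
}

print "PLAIN:     ", render("PLAIN");
print "LOCALIZED: ", render("LOCALIZED");
```

Both write() calls happen outside the `use locale` block, so any difference between the two lines comes from where each format was declared.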
From @cpansprout
On Wed Oct 27 02:09:56 2010, ntyni@debian.org wrote:
@cpansprout - Status changed from 'open' to 'resolved'
Migrated from rt.perl.org#78452 (status was 'resolved')
Searchable as RT78452$