Report information
Id: 121317
Status: resolved
Priority: 0/
Queue: perl5

Owner: khw <khw [at] cpan.org>
Requestors: calle [at] init.se
Cc:
AdminCc:

Operating System: darwin
PatchStatus: (no value)
Severity: medium
Type: core
Perl Version: 5.19.9
Fixed In: (no value)



Subject: Gconvert() obeys LC_NUMERIC without "use locale" in 5.19.8 and 5.19.9
To: perlbug [...] perl.org
Date: Mon, 24 Feb 2014 15:39:38 +0100 (CET)
From: calle [...] init.se
This is a bug report for perl from calle@init.se,
generated with the help of perlbug 1.40 running under perl 5.19.9.

-----------------------------------------------------------------
[Please describe your issue here]

Commit bc8ec7cc020d0562094a551b280fd3f32bf5eb04 makes Gconvert() obey
the LC_NUMERIC environment variable even without "use locale" (or with
"no locale"). The behavior is the same on at least OSX 10.9, FreeBSD
10 and Debian 7. Gconvert() may not be listed in perlapi, but there is
at least one module on CPAN that uses it (which is why I found this),
and if this change is intentional it would be nice if it was at least
mentioned in perldelta.

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=core
    severity=medium
---
Site configuration information for perl 5.19.9:

Configured by called at Thu Feb 20 15:54:16 CET 2014.

Summary of my perl5 (revision 5 version 19 subversion 9) configuration:

  Platform:
    osname=darwin, osvers=13.0.0, archname=darwin-2level
    uname='darwin necronomicon-ii.local 13.0.0 darwin kernel version 13.0.0: thu sep 19 22:22:27 pdt 2013; root:xnu-2422.1.72~6release_x86_64 x86_64 '
    config_args='-de -Dprefix=/Users/called/perl5/perlbrew/perls/perl-5.19.9 -Dusedevel -Aeval:scriptdir=/Users/called/perl5/perlbrew/perls/perl-5.19.9/bin'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-common -DPERL_DARWIN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include',
    optimize='-O3',
    cppflags='-fno-common -DPERL_DARWIN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.2.1 Compatible Apple LLVM 5.0 (clang-500.2.79)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/5.0/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib /usr/lib
    libs=-lgdbm -ldbm -ldl -lm -lutil -lc
    perllibs=-ldl -lm -lutil -lc
    libc=, so=dylib, useshrplib=false, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags=' -bundle -undefined dynamic_lookup -L/usr/local/lib -fstack-protector'

---
@INC for perl 5.19.9:
    /Users/called/perl5/perlbrew/perls/perl-5.19.9/lib/site_perl/5.19.9/darwin-2level
    /Users/called/perl5/perlbrew/perls/perl-5.19.9/lib/site_perl/5.19.9
    /Users/called/perl5/perlbrew/perls/perl-5.19.9/lib/5.19.9/darwin-2level
    /Users/called/perl5/perlbrew/perls/perl-5.19.9/lib/5.19.9
    .

---
Environment for perl 5.19.9:
    DYLD_LIBRARY_PATH (unset)
    HOME=/Users/called
    LANG=sv_SE.UTF-8
    LANGUAGE (unset)
    LC_CTYPE=en_GB.UTF-8
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/Users/called/perl5/perlbrew/bin:/Users/called/perl5/perlbrew/perls/perl-5.19.9/bin:/Users/called/.gem/ruby/2.1.0/bin:/Users/called/.rubies/ruby-2.1.0/lib/ruby/gems/2.1.0/bin:/Users/called/.rubies/ruby-2.1.0/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/X11/bin:/Users/called/bin:/usr/local/sbin:/usr/local/opt/ruby/bin
    PERLBREW_BASHRC_VERSION=0.64
    PERLBREW_HOME=/Users/called/.perlbrew
    PERLBREW_MANPATH=/Users/called/perl5/perlbrew/perls/perl-5.19.9/man
    PERLBREW_PATH=/Users/called/perl5/perlbrew/bin:/Users/called/perl5/perlbrew/perls/perl-5.19.9/bin
    PERLBREW_PERL=perl-5.19.9
    PERLBREW_ROOT=/Users/called/perl5/perlbrew
    PERLBREW_VERSION=0.64
    PERL_BADLANG (unset)
    SHELL=/bin/bash
Subject: Re: [perl #121317] Gconvert() obeys LC_NUMERIC without "use locale" in 5.19.8 and 5.19.9
Date: Wed, 26 Feb 2014 20:38:46 -0700
To: perl5-porters [...] perl.org
From: Karl Williamson <public [...] khwilliamson.com>
On 02/24/2014 07:39 AM, (via RT) wrote:
> # New Ticket Created by > # Please include the string: [perl #121317] > # in the subject line of all future correspondence about this issue. > # <URL: https://rt.perl.org/Ticket/Display.html?id=121317 > > > > > This is a bug report for perl from calle@init.se, > generated with the help of perlbug 1.40 running under perl 5.19.9. > > > ----------------------------------------------------------------- > [Please describe your issue here] > > Commit bc8ec7cc020d0562094a551b280fd3f32bf5eb04 makes Gconvert() obey > the LC_NUMERIC environment variable even without "use locale" (or with > "no locale"). The behavior is the same on at least OSX 10.9, FreeBSD > 10 and Debian 7. Gconvert() may not be listed in perlapi, but there is > at least one module on CPAN that uses it (which is why I found this), > and if this change is intentional it would be nice if it was at least > mentioned in perldelta.
I'm thinking we should revert the code changes part of that commit.

The basic premise of this bug is wrong, but the underlying essence may indicate the need to change back. The commit did not change Gconvert in any way. None of the bottom-level interfaces to libc depend on being in the scope of 'use locale'. In other words, Gconvert only by chance obeyed 'use locale' in the cases that this ticket mentions. There were ways to get it to not obey 'use locale' before the commit.

That said, the commit changes the behavior. I don't believe we are under any obligation really to maintain behavior of undocumented features, and avoiding such changes has the deleterious effect of encouraging people to use things they shouldn't be. If someone wants to use an undocumented feature, they should at least indicate to p5p that they want this supported in the future, and see where that leads.

On the other hand, if we don't have to break existing code, then I don't think we should. Since that commit was made, I was forced to come up with a more general scheme for other reasons, and it turns out that after reverting it, the code still passes all the tests it added, and all others currently in the suite. So I'm thinking it's best to revert.
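To make the point about the libc layer concrete, here is a minimal C sketch. It assumes that `snprintf` with `"%g"` is an acceptable stand-in for Gconvert (which Configure maps to one of gconvert, gcvt, or sprintf); the helper names `format_g` and `try_comma_locale` and the candidate locale names are hypothetical and chosen only for illustration. The locales may not be installed on a given machine, so the helper reports failure instead of assuming success.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Format a double with "%g" under whatever LC_NUMERIC is active right now.
 * libc has no concept of Perl's "use locale"; only setlocale() matters. */
static void format_g(double value, char *buf, size_t len)
{
    snprintf(buf, len, "%g", value);
}

/* Try to switch LC_NUMERIC to a locale whose radix character is a comma.
 * Returns 1 on success, 0 (with LC_NUMERIC reset to "C") otherwise.
 * The locale names below are assumptions about what might be installed. */
static int try_comma_locale(void)
{
    static const char *const names[] = {
        "de_DE.UTF-8", "de_DE.utf8", "sv_SE.UTF-8"
    };
    size_t i;

    for (i = 0; i < sizeof names / sizeof names[0]; i++) {
        if (setlocale(LC_NUMERIC, names[i]) != NULL
            && strcmp(localeconv()->decimal_point, ",") == 0)
            return 1;
    }
    setlocale(LC_NUMERIC, "C");
    return 0;
}
```

Calling `format_g(1.5, ...)` before and after a successful `try_comma_locale()` yields a dot radix first and a comma radix afterwards, with no other state involved.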
RT-Send-CC: perl5-porters [...] perl.org
On Mon Feb 24 06:39:54 2014, calle@init.se wrote:
> > This is a bug report for perl from calle@init.se, > generated with the help of perlbug 1.40 running under perl 5.19.9. > > > ----------------------------------------------------------------- > [Please describe your issue here] > > Commit bc8ec7cc020d0562094a551b280fd3f32bf5eb04 makes Gconvert() obey > the LC_NUMERIC environment variable even without "use locale" (or with > "no locale"). The behavior is the same on at least OSX 10.9, FreeBSD > 10 and Debian 7. Gconvert() may not be listed in perlapi, but there is > at least one module on CPAN that uses it (which is why I found this), > and if this change is intentional it would be nice if it was at least > mentioned in perldelta. >
I'm wondering if you have experienced Gconvert prior to 5.19 being affected by "use locale". My reading of the code indicates not. It appears to me that it generally used the dot for a decimal point no matter what the locale and regardless of "use locale". The one place I know of where it could use a comma, say, stemmed from calling POSIX::strtod() before it, as strtod didn't properly clean up after itself; there 'use locale' is irrelevant. But if you know cases other than this, I would appreciate hearing about them.

Karl Williamson
From: Calle Dybedahl <calle [...] init.se>
To: perlbug-followup [...] perl.org
Date: Mon, 10 Mar 2014 09:18:42 +0100
Subject: Re: [perl #121317] Gconvert() obeys LC_NUMERIC without "use locale" in 5.19.8 and 5.19.9
On 7 mar 2014, at 21:05, Karl Williamson via RT <perlbug-followup@perl.org> wrote:
> I'm wondering if you have experienced GConvert prior to 5.19 being affected by "use locale". My reading of the code indicates not. It appears to me that it generally used the dot for a decimal point no matter what the locale and regardless of "use locale". The place I know of where it could use a comma, say, stemmed from using POSIX::strod() prior to it, as strtod didn't properly clean up after itself, and 'use locale' is irrelevant. But if you know cases other than this, I would appreciate hearing about them.
I haven’t seen that, no. But then I’ve never actually used locales in real-life code, since I’ve found them to contribute far more problems than value. My mentioning them in the problem report was more of a “this should definitely not happen without ‘use locale’” than a statement that it should happen with it. In retrospect, that may have been more confusing than useful. -- Calle Dybedahl calle@init.se -*- +46 703 - 970 612
From: Slaven Rezic <slaven [...] rezic.de>
CC: perlbug-followup [...] perl.org
To: Calle Dybedahl <calle [...] init.se>
Date: Thu, 13 Mar 2014 18:39:40 +0100
Subject: Re: [perl #121317] Gconvert() obeys LC_NUMERIC without "use locale" in 5.19.8 and 5.19.9
Calle Dybedahl <calle@init.se> writes:
> On 7 mar 2014, at 21:05, Karl Williamson via RT <perlbug-followup@perl.org> wrote: >
>> I'm wondering if you have experienced GConvert prior to 5.19 being >> affected by "use locale". My reading of the code indicates not. It >> appears to me that it generally used the dot for a decimal point no >> matter what the locale and regardless of "use locale". The place I >> know of where it could use a comma, say, stemmed from using >> POSIX::strod() prior to it, as strtod didn't properly clean up after >> itself, and 'use locale' is irrelevant. But if you know cases other >> than this, I would appreciate hearing about them.
> > I haven’t seen that, no. But then I’ve never actually used locales in > real-life code, since I’ve found them to contribute far more problems > than value. My mentioning them in the problem report was more of a > “this should definitely not happen without ‘use locale’” than a > statement that it should happen with it. In retrospect, that may have > been more confusing than useful.
I don't know if it's related, but there's a number of CPAN modules which fail now if a locale with "," for the decimal point is in effect. The list of failing modules:

* Number::Format
* JSON::XS
* String::Print
* Cpanel::JSON::XS
* SHARYANTO::Number::Util
* Tk (if you happen to not get the segfault happening on some architectures)

The test suites of these modules fail if LC_NUMERIC is set to de_DE.UTF-8 on a Linux system (Debian/squeeze) or on FreeBSD systems (9.2 or 10.0).

Regards,
    Slaven

--
Slaven Rezic - slaven <at> rezic <dot> de
BBBike - route planner for cyclists in Berlin
WWW version: http://www.bbbike.de
Perl/Tk version for Unix and Windows: http://bbbike.sourceforge.net
Date: Fri, 14 Mar 2014 08:34:04 +0100
To: perlbug-followup [...] perl.org
Subject: Re: [perl #121317] Gconvert() obeys LC_NUMERIC without "use locale" in 5.19.8 and 5.19.9
From: Calle Dybedahl <calle [...] init.se>
On 13 mar 2014, at 18:45, slaven@rezic.de via RT <perlbug-followup@perl.org> wrote:
> I don't know if it's related, but there's a number of CPAN modules which > fail now if a locale with "," for the decimal point is in effect. The > list of failing modules: > * Number::Format > * JSON::XS
Failing to install JSON::XS was how I noticed this in the first place, I just tried to narrow it down as far as I could before reporting it. -- Calle Dybedahl calle@init.se -*- +46 703 - 970 612
Date: Sun, 16 Mar 2014 09:25:13 +0000
To: Slaven Rezic <slaven [...] rezic.de>
Subject: Re: [perl #121317] Gconvert() obeys LC_NUMERIC without "use locale" in 5.19.8 and 5.19.9
CC: Calle Dybedahl <calle [...] init.se>, perlbug-followup [...] perl.org
From: Paul Johnson <paul [...] pjcj.net>
On Thu, Mar 13, 2014 at 06:39:40PM +0100, Slaven Rezic wrote:
> I don't know if it's related, but there's a number of CPAN modules which > fail now if a locale with "," for the decimal point is in effect. The > list of failing modules: > * Number::Format > * JSON::XS > * String::Print > * Cpanel::JSON::XS > * SHARYANTO::Number::Util > * Tk (if you happen to not get the segfault happening on some > architectures) > > The test suites of these modules fail if LC_NUMERIC is set to > de_DE.UTF-8 on a Linux system (Debian/squeeze) or on FreeBSD systems > (9.2 or 10.0).
Just for information, Devel::Cover can be added to this list: http://www.cpantesters.org/cpan/report/71ecad34-ac83-11e3-86a6-6f69e0bfc7aa -- Paul Johnson - paul@pjcj.net http://www.pjcj.net
From: Karl Williamson <public [...] khwilliamson.com>
To: Paul Johnson <paul [...] pjcj.net>, Slaven Rezic <slaven [...] rezic.de>
Date: Sun, 16 Mar 2014 08:54:37 -0600
Subject: Re: [perl #121317] Gconvert() obeys LC_NUMERIC without "use locale" in 5.19.8 and 5.19.9
CC: Calle Dybedahl <calle [...] init.se>, perlbug-followup [...] perl.org
On 03/16/2014 03:25 AM, Paul Johnson wrote:
> On Thu, Mar 13, 2014 at 06:39:40PM +0100, Slaven Rezic wrote: >
>> I don't know if it's related, but there's a number of CPAN modules which >> fail now if a locale with "," for the decimal point is in effect. The >> list of failing modules: >> * Number::Format >> * JSON::XS >> * String::Print >> * Cpanel::JSON::XS >> * SHARYANTO::Number::Util >> * Tk (if you happen to not get the segfault happening on some >> architectures) >> >> The test suites of these modules fail if LC_NUMERIC is set to >> de_DE.UTF-8 on a Linux system (Debian/squeeze) or on FreeBSD systems >> (9.2 or 10.0).
> > Just for information, Devel::Cover can be added to this list: > > http://www.cpantesters.org/cpan/report/71ecad34-ac83-11e3-86a6-6f69e0bfc7aa >
Thanks Slaven and Paul. This is most helpful. Is there any guess as to how complete these lists may be?
From: Paul Johnson <paul [...] pjcj.net>
CC: Slaven Rezic <slaven [...] rezic.de>, Calle Dybedahl <calle [...] init.se>, perlbug-followup [...] perl.org
Subject: Re: [perl #121317] Gconvert() obeys LC_NUMERIC without "use locale" in 5.19.8 and 5.19.9
To: Karl Williamson <public [...] khwilliamson.com>
Date: Mon, 17 Mar 2014 00:01:07 +0000
On Sun, Mar 16, 2014 at 08:54:37AM -0600, Karl Williamson wrote:
> On 03/16/2014 03:25 AM, Paul Johnson wrote:
> >On Thu, Mar 13, 2014 at 06:39:40PM +0100, Slaven Rezic wrote: > >
> >>I don't know if it's related, but there's a number of CPAN modules which > >>fail now if a locale with "," for the decimal point is in effect. The > >>list of failing modules: > >>* Number::Format > >>* JSON::XS > >>* String::Print > >>* Cpanel::JSON::XS > >>* SHARYANTO::Number::Util > >>* Tk (if you happen to not get the segfault happening on some > >> architectures) > >> > >>The test suites of these modules fail if LC_NUMERIC is set to > >>de_DE.UTF-8 on a Linux system (Debian/squeeze) or on FreeBSD systems > >>(9.2 or 10.0).
> > > >Just for information, Devel::Cover can be added to this list: > > > > http://www.cpantesters.org/cpan/report/71ecad34-ac83-11e3-86a6-6f69e0bfc7aa > >
> > > Thanks Slaven and Paul. This is most helpful. Is there any guess > as to how complete these lists may be?
I only know about Devel::Cover because of the cpantesters report from Slaven. Fortuitously, he was sat in the same room as me when I read the mail and he was able to diagnose the problem there and then. Perhaps Slaven has an idea of how much of CPAN has passed through his smokers since this problem started? -- Paul Johnson - paul@pjcj.net http://www.pjcj.net
Subject: Re: [perl #121317] Gconvert() obeys LC_NUMERIC without "use locale" in 5.19.8 and 5.19.9
CC: Slaven Rezic <slaven [...] rezic.de>, Calle Dybedahl <calle [...] init.se>, perlbug-followup [...] perl.org
Date: Mon, 17 Mar 2014 14:44:34 -0600
To: Paul Johnson <paul [...] pjcj.net>
From: Karl Williamson <public [...] khwilliamson.com>
On 03/16/2014 06:01 PM, Paul Johnson wrote:
> On Sun, Mar 16, 2014 at 08:54:37AM -0600, Karl Williamson wrote:
>> On 03/16/2014 03:25 AM, Paul Johnson wrote:
>>> On Thu, Mar 13, 2014 at 06:39:40PM +0100, Slaven Rezic wrote: >>>
>>>> I don't know if it's related, but there's a number of CPAN modules which >>>> fail now if a locale with "," for the decimal point is in effect. The >>>> list of failing modules: >>>> * Number::Format >>>> * JSON::XS >>>> * String::Print >>>> * Cpanel::JSON::XS >>>> * SHARYANTO::Number::Util >>>> * Tk (if you happen to not get the segfault happening on some >>>> architectures) >>>> >>>> The test suites of these modules fail if LC_NUMERIC is set to >>>> de_DE.UTF-8 on a Linux system (Debian/squeeze) or on FreeBSD systems >>>> (9.2 or 10.0).
>>> >>> Just for information, Devel::Cover can be added to this list: >>> >>> http://www.cpantesters.org/cpan/report/71ecad34-ac83-11e3-86a6-6f69e0bfc7aa >>>
>> >> >> Thanks Slaven and Paul. This is most helpful. Is there any guess >> as to how complete these lists may be?
> > I only know about Devel::Cover because of the cpantesters report from > Slaven. Fortuitously, he was sat in the same room as me when I read the > mail and he was able to diagnose the problem there and then. > > Perhaps Slaven has an idea of how much of CPAN has passed through his > smokers since this problem started? >
First, I'd like to thank Calle for submitting this ticket, and doing the initial leg work on it.

It is true that Gconvert is not listed in perlapi, so one might argue that no module can rely on it even existing in a future Perl release, much less on its behavior being unchangeable. However, Gconvert is described in Porting/Glossary, so I think this means it effectively is part of the API. That description says nothing of locale effects on it, though the man pages for the functions it wraps do.

I had been leaning towards reverting this commit, but after doing more research and experimentation, I've come to the conclusion that these modules were already broken, albeit much more rarely.

The premise of this ticket is incorrect. The commit did not change whether Gconvert() obeys 'use locale' or not. It has never obeyed 'use locale'. What changed in 5.19 is that the LC_NUMERIC category started to inherit from the environment variables that are in effect at start-up, like the documentation says it does, and like all the other locale categories. That it didn't inherit the environment caused pain for some people, who rightly filed a ticket, which 5.19 fixed.

I had thought that the only way to get Gconvert to use other than the C locale was to use POSIX::strtod(), as that function changed the locale to the underlying one unconditionally, and didn't change it back. But I was wrong, even though I had worked on this code recently. Besides strtod(), any call by any code anywhere to POSIX::setlocale() will set the underlying locale to the new locale, and thus causes Gconvert() to use that locale. (And BTW, strtod() has been fixed to restore the locale afterwards.)

In other words, code using Gconvert cannot expect that the decimal point is going to be a dot unless it has taken steps to ensure that. Code that makes that assumption and fails to take such steps is buggy, and the 5.19 changes merely expose these bugs.

A simple way to expose these bugs in earlier Perl releases is to add, somewhere in a program that uses one of these modules, the line

    BEGIN { POSIX::setlocale(LC_NUMERIC, "de_DE.utf8"); }

(after making sure POSIX:: is loaded). I tried this with 5.18.0 and a JSON::XS .t file, and sure enough, I get the same failures as with 5.19.9.

Now, it may be that JSON::XS is agnostic about the radix character: "if the caller wants it to be a comma, fine; if it wants it to be a dot, also fine." I don't know the semantics of this module enough to know what it should do. It may also be that the caller does not do anything with locale itself, and now in 5.19, it is unexpectedly experiencing the effects of the user's locale. But the setlocale() that creates this failure in all releases (including 5.18 and before) doesn't have to come directly from the caller; it could come from some other module down the dependency chain that gets loaded. Thus this bug is lurking even if we revert the blamed commit. This succinctly demonstrates the crux of the bug: JSON::XS shouldn't introduce failures because of locale changes outside its control, but it does.

One solution to this is to wrap Gconvert, as the core now does in its uses of it, so as to make sure that the radix character is what it should be. The problem is that this is XS code, and I'm not sure we know what the writer's intentions are. I *think* that this is too low level code to be making assumptions about that, but I'm open to suggestions to the contrary.

My current thinking is to make the wrapping macros part of the API, and tell XS writers they should use these when calling libc functions which are affected by LC_NUMERIC. There are macros that save and restore the current state, and set it in the meantime to one of the following, depending on the macro:

1) based on being in 'use locale' or not;
2) to "C";
3) to the current underlying one.

Also, we could make a pre-wrapped Gconvert that uses the C locale when called from outside 'use locale'. JSON::XS and other modules could convert to use this macro, depending on what they are trying to accomplish. It could be backported through PPPort.

I haven't looked at all the modules in the list; I know that several of them use Gconvert, but not all. The one that I've looked at that doesn't is Number::Format. It has a different set of problems, which I'll comment on in a different message.
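The save/set/restore idea can be sketched in plain C as follows. This is only an illustration of the pattern for the "set it to C" case, not the macros in perl.h, whose names and exact behaviour are not reproduced here; the function name `format_g_in_c_locale` is hypothetical, and `snprintf` with `"%g"` again stands in for Gconvert. Note that `setlocale` affects the whole process, so a sketch like this is not safe in the presence of other threads.

```c
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Format a double with a dot radix regardless of the caller's LC_NUMERIC:
 * remember the current locale, switch to "C", format, switch back.
 * Returns the snprintf result, or -1 if the current locale could not be
 * saved (in which case nothing is changed). */
static int format_g_in_c_locale(double value, char *buf, size_t len)
{
    const char *current = setlocale(LC_NUMERIC, NULL);
    char *saved;
    int written;

    if (current == NULL)
        return -1;

    /* Copy the name: the string returned by setlocale() may be
     * overwritten by the next setlocale() call. */
    saved = malloc(strlen(current) + 1);
    if (saved == NULL)
        return -1;
    strcpy(saved, current);

    setlocale(LC_NUMERIC, "C");
    written = snprintf(buf, len, "%g", value);
    setlocale(LC_NUMERIC, saved);   /* restore what the caller had */

    free(saved);
    return written;
}
```

A module that always wants a dot (JSON output, for example) would call something of this shape instead of the bare formatting routine; a module that wants the user's radix would do the mirror image and switch to the underlying locale for the duration of the call.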
Subject: Re: [perl #121317] Gconvert() obeys LC_NUMERIC without "use locale" in 5.19.8 and 5.19.9
CC: Slaven Rezic <slaven [...] rezic.de>, Calle Dybedahl <calle [...] init.se>, perlbug-followup [...] perl.org, "William R. Ward" [...] smtp.indra.com, wrw [...] cpan.org
To: Paul Johnson <paul [...] pjcj.net>
Date: Wed, 19 Mar 2014 14:19:29 -0600
From: Karl Williamson <public [...] khwilliamson.com>
On 03/17/2014 02:44 PM, Karl Williamson wrote:
> I haven't looked at all the modules in the list; I know that several of > them use Gconvert, but not all. The one that I've looked at that > doesn't, is Number::Format. It has a different set of problems, which > I'll comment on in a different message.
And here is that message. I'm repeating a bit of the context of the previous email, as I've added the maintainer of Number::Format to the cc list.

The tests in this module are buggy, and again it was exposed by the change in 5.19 to have LC_NUMERIC be set by the environment variables in effect at the time perl is started. As I said before, this change was in response to a bug report by people adversely affected by it previously not doing this, and it brings the actual behavior of LC_NUMERIC into line with its documented behavior, and how all the other locale categories behave.

The Number::Format tests do seem to think that in fact the locale is inherited from the environment, as they do attempt to set it to an expected value, by the following:

    setlocale(&LC_ALL, 'en_US');

However, this is omitted from one file, format_bytes.t, and the return code is not checked. On my machine, and I suspect many other modern Linux versions, there is no locale 'en_US', and the setlocale fails. Prior to 5.19.9, this didn't matter as LC_NUMERIC was set to the C locale, ignoring the outside environment; the changes to 5.19 caused LC_NUMERIC to be set to what the outside environment says it should be, thus causing the radix character to possibly be a comma in appropriate locales, thus exposing this bug, which is only in the test files.

If I change the 'en_US' to 'C', they pass (also adding the appropriate setlocale call to the one file that is missing it). 'C' is the only locale that is guaranteed to exist on all systems (that have locales).

In researching this, I found another bug, in locale.t. It explicitly sets the thousands separator to a dot, but leaves the monetary thousands separator unchanged, and then asks for a monetary format expecting a dot. On my machine that separator is a space in that locale. I haven't done the legwork to see if this is a bug in my machine's locales or not, but it seems likely that if one has to set the regular separator, one also has to set the monetary one.
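Both observations translate directly to the C level, where `setlocale` returns NULL for an unknown locale and `localeconv` reports the numeric and monetary separators as separate fields. The sketch below is an illustration under those facts only; `set_locale_or_c` and `grouping_separators_match` are hypothetical helper names, not part of Number::Format or perl.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Ask for a locale and check the result instead of assuming success.
 * If the requested locale is not installed, fall back to "C", which is
 * the one locale every conforming system provides. Returns the name
 * reported by setlocale() for whichever locale ended up active. */
static const char *set_locale_or_c(int category, const char *wanted)
{
    const char *got = setlocale(category, wanted);

    if (got == NULL)
        got = setlocale(category, "C");
    return got;
}

/* The numeric and monetary grouping separators are distinct lconv fields,
 * so a test that overrides one cannot assume anything about the other.
 * Returns 1 if they happen to be equal in the current locale. */
static int grouping_separators_match(void)
{
    const struct lconv *lc = localeconv();

    return strcmp(lc->thousands_sep, lc->mon_thousands_sep) == 0;
}
```

A test file that needs a dot radix can simply request "C" up front; one that needs a specific national locale should skip, not fail, when the request is refused.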
Subject: Re: [perl #121317] Gconvert() obeys LC_NUMERIC without "use locale" in 5.19.8 and 5.19.9
Date: Thu, 20 Mar 2014 09:21:50 -0400
To: perl5-porters [...] perl.org
From: Ricardo Signes <perl.p5p [...] rjbs.manxome.org>
* Karl Williamson <public@khwilliamson.com> [2014-03-19T16:19:29]
> On 03/17/2014 02:44 PM, Karl Williamson wrote:
> >I haven't looked at all the modules in the list; I know that several of > >them use Gconvert, but not all. The one that I've looked at that > >doesn't, is Number::Format. It has a different set of problems, which > >I'll comment on in a different message.
> > And here is that message.
Thanks for both of these enlightening messages. Your reasoning seems sound to me. -- rjbs
CC: perl5-porters [...] perl.org, "perlbug-followup [...] perl.org" <perlbug-followup [...] perl.org>
Subject: Re: Perl 5.20.0 Blockers, 2014-03-24 [perl #121317]
To: slaven [...] rezic.de, Ricardo Signes <perl.p5p [...] rjbs.manxome.org>
Date: Sun, 30 Mar 2014 22:51:11 -0600
From: Karl Williamson <public [...] khwilliamson.com>
On 03/30/2014 09:58 PM, Karl Williamson wrote:
> On 03/30/2014 11:38 AM, Slaven Rezic wrote:
>> I suspect that every CPAN module using strtod/sprintf indirectly through >> a shared library is broken.
> > I think you didn't understand my previous post on this > <53275EB2.4000809@khwilliamson.com>. These modules were already broken; > it's just that their breakage didn't surface very often prior to the > blamed patch. > > It's like the hash key order randomization change. Most modules that > "broke" as a result of the change were already broken. It's just that > their tests and typical usage didn't cause the hashes to grow enough to > cause an hsplit(), which, when it happens, causes the key order to > change, IIRC. The change, besides being necessary for security reasons, > did the maintainers a favor by exposing a problem that could > occasionally occur in the field and would be very hard to reproduce and > debug. > > In my post on this, I show how to easily get the same breakage symptoms > on earlier Perl releases as the blamed commit gives in 5.19. > > The blamed commit is not necessary for security, so we as a project > might decide that it's not worth fixing these bugs, and to permanently > revert the patch, documenting the issue. But that is very different > from the idea that this patch "broke" modules, and I believe it's > important to keep that distinction in mind when making whatever decision > gets made. > > "The truth shall set you free, but first it will make you miserable" > -- origin disputed, often (mis-)attributed to U.S. president James > Garfield, who BTW came up with an original proof of the > Pythagorean theorem > > > >
Having thought about this a little more, I have yet another idea:

1) Revert the commit for 5.20.

2) In 5.21, change POSIX::setlocale() so that it always leaves the LC_NUMERIC locale as "C", but sets an interpreter variable (which already is done BTW) to indicate what locale to use when doing LC_NUMERIC operations within the scope of "use locale". The core already wraps all such operations it performs (unless I've missed any) so that it uses the correct locale based on that flag, and "use locale" scope. Thus pure perl code is unaffected.

3) This would mean that all libc calls from XS would normally get a dot radix, and the modules that Slaven has given would be automatically fixed from those bugs I said existed in 5.18 and earlier.

4) There are undoubtedly XS modules that depend on the radix not being dot when a setlocale asking for such is executed. But judging from the responses here, these are far fewer than those that always want a dot. By doing the change very early in 5.21, we find out for sure. Anyway, such modules would have to save/set/restore LC_NUMERIC around their non-dot need. There are macros in perl.h that manage this for you. Some of these have been around and been used by the core and POSIX::XS since at least 1996, v5.003.

5) POSIX::strtod() has for a long time assumed that the LC_NUMERIC locale was possibly wrongly C, and used the macros to change it to what the interpreter variable says it should be (but there was a bug until 5.19 in which it failed to change it back). There are other POSIX:: functions that should do the same. Maybe only localeconv().

This would probably mean we wouldn't deprecate Gconvert, but as I suggested earlier in the [perl #121317] thread, we document those macros (for the first time).
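A rough C model of the scheme in points 2) and 3) may help. It is a sketch under stated assumptions, not perl's implementation: `wanted_numeric` stands in for the interpreter variable, `request_numeric_locale` for what POSIX::setlocale() would do for LC_NUMERIC, and `format_g_scoped` for a core operation whose `in_use_locale` argument models being inside the scope of "use locale". All three names are hypothetical.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the interpreter variable that remembers
 * which LC_NUMERIC locale the program asked for. */
static char wanted_numeric[64] = "C";

/* Model of the proposed setlocale behaviour: validate and record the
 * requested locale, but leave the process-wide LC_NUMERIC at "C".
 * Returns 1 if the request was accepted, 0 otherwise. */
static int request_numeric_locale(const char *name)
{
    if (strlen(name) >= sizeof wanted_numeric)
        return 0;
    if (setlocale(LC_NUMERIC, name) == NULL) {
        setlocale(LC_NUMERIC, "C");
        return 0;
    }
    strcpy(wanted_numeric, name);
    setlocale(LC_NUMERIC, "C");     /* libc callers keep seeing a dot */
    return 1;
}

/* Model of a wrapped operation: only when in_use_locale is nonzero is
 * the recorded locale switched in, and it is switched out again after. */
static void format_g_scoped(double value, int in_use_locale,
                            char *buf, size_t len)
{
    if (in_use_locale)
        setlocale(LC_NUMERIC, wanted_numeric);
    snprintf(buf, len, "%g", value);
    if (in_use_locale)
        setlocale(LC_NUMERIC, "C");
}
```

Under this model an XS module that calls a libc formatting routine directly sees the "C" locale unless it deliberately switches, which is the effect point 3) describes.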
RT-Send-CC: perl5-porters [...] perl.org
Changed by commit 52686f2a73483730c9ee6d16084c57a769f58495 -- Karl Williamson
Subject: Re: [perl #121317] Gconvert() obeys LC_NUMERIC without "use locale" in 5.19.8 and 5.19.9
From: Torsten Schoenfeld <kaffeetisch [...] gmx.de>
Date: Sun, 01 Jun 2014 16:48:32 +0200
To: perlbug-followup [...] perl.org, public [...] khwilliamson.com
According to 'git bisect', commit 52686f2a73483730c9ee6d16084c57a769f58495 broke perl's version parsing when Gtk3 is loaded:

    # perl -e'use Gtk3; BEGIN{ Gtk3::init (); } use 5.8.0;' && echo OK
    Invalid version format (non-numeric data) at -e line 1.

Indirectly, this causes failures in Gtk3's test suite.

A similar bug was reported in <https://rt.perl.org/Public/Bug/Display.html?id=120723> and fixed by commit bc8ec7cc020d0562094a551b280fd3f32bf5eb04, which the new commit 52686f2a73483730c9ee6d16084c57a769f58495 partly reverts.

Gtk3::init is a wrapper around the C function gtk_init which for this bug boils down to setlocale (LC_ALL, ""), see <https://git.gnome.org/browse/gtk+/tree/gtk/gtkmain.c#n622>.

However, in contrast to the situation prior to commit bc8ec7cc020d0562094a551b280fd3f32bf5eb04, I'm now unable to reproduce the problem with POSIX::setlocale alone:

    # perl -e'use POSIX qw/locale_h/; BEGIN{ setlocale (LC_ALL, ""); } use 5.8.0;' && echo OK
    OK

My locale environment is:

    LANG=en_US.UTF-8
    LANGUAGE=en_US:en
    LC_CTYPE="en_US.UTF-8"
    LC_NUMERIC=de_DE.UTF-8
    LC_TIME=de_DE.UTF-8
    LC_COLLATE="en_US.UTF-8"
    LC_MONETARY=de_DE.UTF-8
    LC_MESSAGES="en_US.UTF-8"
    LC_PAPER=de_DE.UTF-8
    LC_NAME=de_DE.UTF-8
    LC_ADDRESS=de_DE.UTF-8
    LC_TELEPHONE=de_DE.UTF-8
    LC_MEASUREMENT=de_DE.UTF-8
    LC_IDENTIFICATION=de_DE.UTF-8
    LC_ALL=
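The class of failure reported here can be reproduced at the libc level: once a C library has called setlocale and LC_NUMERIC names a comma-radix locale, `strtod` stops at the dot in a string such as "5.008". The sketch below only illustrates that mechanism; it is not perl's version-parsing code, and `parses_fully` is a hypothetical helper name.

```c
#include <locale.h>
#include <stdlib.h>

/* Parse text as a double under the current LC_NUMERIC and report
 * whether the whole string was consumed. Returns 1 if it was, 0 if
 * parsing stopped early or nothing could be parsed. The parsed value
 * (possibly of a prefix only) is stored in *out. */
static int parses_fully(const char *text, double *out)
{
    char *end;

    *out = strtod(text, &end);
    return end != text && *end == '\0';
}
```

In the "C" locale, "5.008" is consumed completely. In a locale whose radix is a comma, only "5" is consumed and ".008" is left over, which is the kind of leftover a strict parser reports as non-numeric data.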
Subject: Re: [perl #121317] Gconvert() obeys LC_NUMERIC without "use locale" in 5.19.8 and 5.19.9
From: Karl Williamson <public [...] khwilliamson.com>
Date: Mon, 02 Jun 2014 11:01:18 -0600
To: Torsten Schoenfeld <kaffeetisch [...] gmx.de>, perlbug-followup [...] perl.org
On 06/01/2014 08:48 AM, Torsten Schoenfeld wrote:
> According to 'git bisect', commit > 52686f2a73483730c9ee6d16084c57a769f58495 broke perl's version parsing > when Gtk3 is loaded: > > # perl -e'use Gtk3; BEGIN{ Gtk3::init (); } use 5.8.0;' && echo OK > Invalid version format (non-numeric data) at -e line 1. > > Indirectly, this causes failures in Gtk3's test suite. > > A similar bug was reported in > <https://rt.perl.org/Public/Bug/Display.html?id=120723> and fixed by > commit bc8ec7cc020d0562094a551b280fd3f32bf5eb04, which the new commit > 52686f2a73483730c9ee6d16084c57a769f58495 partly reverts. > > Gtk3::init is a wrapper around the C function gtk_init which for this > bug boils down to setlocale (LC_ALL, ""), see > <https://git.gnome.org/browse/gtk+/tree/gtk/gtkmain.c#n622>. > > However, in contrast to the situation prior to commit > bc8ec7cc020d0562094a551b280fd3f32bf5eb04, I'm now unable to reproduce > the problem with POSIX::setlocale alone: > > # perl -e'use POSIX qw/locale_h/; BEGIN{ setlocale (LC_ALL, ""); } use > 5.8.0;' && echo OK > OK > > My locale environment is: > > LANG=en_US.UTF-8 > LANGUAGE=en_US:en > LC_CTYPE="en_US.UTF-8" > LC_NUMERIC=de_DE.UTF-8 > LC_TIME=de_DE.UTF-8 > LC_COLLATE="en_US.UTF-8" > LC_MONETARY=de_DE.UTF-8 > LC_MESSAGES="en_US.UTF-8" > LC_PAPER=de_DE.UTF-8 > LC_NAME=de_DE.UTF-8 > LC_ADDRESS=de_DE.UTF-8 > LC_TELEPHONE=de_DE.UTF-8 > LC_MEASUREMENT=de_DE.UTF-8 > LC_IDENTIFICATION=de_DE.UTF-8 > LC_ALL= >
Please see the discussion of https://rt.perl.org/Ticket/Display.html?id=121930

I think that the branch at http://perl5.git.perl.org/perl.git/shortlog/refs/heads/smoke-me/khw-locale should fix this, and would appreciate it if you would try it out.

