Report information
Id: 112208
Status: resolved
Priority: 0/
Queue: perl5

Owner: Nobody
Requestors: doherty [at] cpan.org
dom <dom [at] earth.li>
Cc:
AdminCc:

Operating System: (no value)
PatchStatus: (no value)
Severity: low
Type: unknown
Perl Version: (no value)
Fixed In: 5.22.0




Subject: printing $! when open.pm sets utf8 default on filehandles yields garbage
Date: Mon, 02 Apr 2012 22:35:40 -0400
To: perlbug [...] perl.org
From: Mike Doherty <doherty [...] cs.dal.ca>
This is a bug report for perl from doherty@cpan.org,
generated with the help of perlbug 1.39 running under perl 5.14.2.

-----------------------------------------------------------------
[Please describe your issue here]

If your locale is set to something like ru_RU.UTF-8, then the following
program will output garbage:

    use strict;
    use open qw(:std :encoding(UTF-8);
    use IO::Socket;

    unless (IO::Socket::INET->new("localhost:1111")) {
        print $!, "\n";
    }
    __END__

Or another example: perl -CS -MErrno -le '$!=Errno::ETIMEDOUT; print $!'

This was originally reported as a bug against the utf8::all module,
which uses open.pm to set default PerlIO layers for the caller:
https://github.com/doherty/utf8-all/issues/9

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=library
    severity=medium
    module=open
---
Site configuration information for perl 5.14.2:

Configured by mike at Fri Sep 30 15:36:02 ADT 2011.

Summary of my perl5 (revision 5 version 14 subversion 2) configuration:

  Platform:
    osname=linux, osvers=2.6.32-34-generic, archname=x86_64-linux
    uname='linux charron 2.6.32-34-generic #77-ubuntu smp tue sep 13 19:39:17 utc 2011 x86_64 gnulinux '
    config_args='-de -Dprefix=/home/mike/perl5/perlbrew/perls/perl-5.14.2'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.4.3', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib/../lib /usr/lib/../lib /lib /usr/lib /usr/lib/x86_64-linux-gnu /lib64 /usr/lib64
    libs=-lnsl -ldl -lm -lcrypt -lutil -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.11.1.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.11.1'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'

Locally applied patches:

---
@INC for perl 5.14.2:
    /home/mike/perl5/perlbrew/perls/perl-5.14.2/lib/site_perl/5.14.2/x86_64-linux
    /home/mike/perl5/perlbrew/perls/perl-5.14.2/lib/site_perl/5.14.2
    /home/mike/perl5/perlbrew/perls/perl-5.14.2/lib/5.14.2/x86_64-linux
    /home/mike/perl5/perlbrew/perls/perl-5.14.2/lib/5.14.2
    .

---
Environment for perl 5.14.2:
    HOME=/home/mike
    LANG=en_CA.UTF-8
    LANGUAGE=en_CA:en
    LD_LIBRARY_PATH=/usr/lib/oracle/11.2/client64/lib
    LOGDIR (unset)
    PATH=/home/mike/.bin:/home/mike/perl5/perlbrew/bin:/home/mike/perl5/perlbrew/perls/perl-5.14.2/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/home/mike/Downloads/android-sdk-linux_x86/tools:/home/mike/Downloads/android-sdk-linux_x86/platform-tools:/usr/lib/oracle/11.2/client64/bin
    PERLBREW_BASHRC_VERSION=0.42
    PERLBREW_HOME=/home/mike/.perlbrew
    PERLBREW_MANPATH=/home/mike/perl5/perlbrew/perls/perl-5.14.2/man
    PERLBREW_PATH=/home/mike/perl5/perlbrew/bin:/home/mike/perl5/perlbrew/perls/perl-5.14.2/bin
    PERLBREW_PERL=perl-5.14.2
    PERLBREW_ROOT=/home/mike/perl5/perlbrew
    PERLBREW_VERSION=0.42
    PERL_BADLANG (unset)
    SHELL=/bin/bash


RT-Send-CC: perl5-porters [...] perl.org
On Mon Apr 02 19:36:07 2012, doherty@cpan.org wrote:
> use open qw(:std :encoding(UTF-8);

Looks like it's missing a closing parenthesis:

    use open qw(:std :encoding(UTF-8));
RT-Send-CC: perl5-porters [...] perl.org
On Mon Apr 02 19:36:07 2012, doherty@cpan.org wrote:
> This is a bug report for perl from doherty@cpan.org,
> generated with the help of perlbug 1.39 running under perl 5.14.2.
>
> -----------------------------------------------------------------
> [Please describe your issue here]
>
> If your locale is set to something like ru_RU.UTF-8, then the following
> program will output garbage:
>
>     use strict;
>     use open qw(:std :encoding(UTF-8);
>     use IO::Socket;
>
>     unless (IO::Socket::INET->new("localhost:1111")) {
>         print $!, "\n";
>     }
>     __END__
>
> Or another example: perl -CS -MErrno -le '$!=Errno::ETIMEDOUT; print $!'
>
> This was originally reported as a bug against the utf8::all module,
> which uses open.pm to set default PerlIO layers for the caller:
> https://github.com/doherty/utf8-all/issues/9

There has been some discussion about making syscalls work properly with
Unicode, under a pragma. This seems like something that should be taken
into account, too, so I’m linking this to the meta ticket (#105914).

--
Father Chrysostomos
RT-Send-CC: perl5-porters [...] perl.org
On Tue Apr 03 13:00:37 2012, sprout wrote:
> There has been some discussion about making syscalls work properly with
> Unicode, under a pragma. This seems like something that should be taken
> into account, too, so I’m linking this to the meta ticket (#105914).

The problem is not in any syscall but in locale handling. The problem is
that strerror returns something appropriate for the current locale, but
perl always assumes it is in Latin-1.

Leon
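
Leon's diagnosis can be illustrated without depending on any installed locale. The sketch below is not taken from the ticket: the byte string is a hard-coded stand-in for what strerror might return in a French UTF-8 locale, and through_utf8_layer is a hypothetical helper that approximates what a :utf8 output layer does to a string Perl treats as Latin-1.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Stand-in for the UTF-8 bytes strerror() might return in a French UTF-8
# locale (assumption: hard-coded here so the example needs no locale).
my $bytes = "Aucun fichier ou r\xC3\xA9pertoire de ce type";

# Hypothetical helper approximating a :utf8 output layer: every character
# of the string is encoded as UTF-8. A byte string is read as Latin-1, so
# each byte above 0x7F becomes a two-byte sequence.
sub through_utf8_layer {
    my ($string) = @_;
    return encode('UTF-8', $string);
}

# Undecoded bytes get encoded a second time: this is the mojibake.
my $garbled = through_utf8_layer($bytes);

# Decoding first gives Perl real characters, so the output bytes are right.
my $correct = through_utf8_layer(decode('UTF-8', $bytes));

printf "garbled: %d bytes, correct: %d bytes\n",
    length $garbled, length $correct;
```

The garbled result is two bytes longer than the correct one because the two bytes of the accented letter are each expanded into a two-byte sequence.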
RT-Send-CC: perl5-porters [...] perl.org
On Tue Apr 03 13:50:43 2012, LeonT wrote:
> On Tue Apr 03 13:00:37 2012, sprout wrote:
> > There has been some discussion about making syscalls work properly with
> > Unicode, under a pragma. This seems like something that should be taken
> > into account, too, so I’m linking this to the meta ticket (#105914).
>
> The problem is not in any syscall but in locale handling. The problem is
> that strerror returns something appropriate for the current locale, but
> perl always assumes it is in Latin-1.

I was just wondering whether it could be made part of the same pragma.
After all, it’s conceivably the same sort of thing: the byte sequence
coming from the OS is not Latin-1.

--
Father Chrysostomos
Subject: UTF8 error messages ($!, $@); use open qw{:utf8 :std}
Date: Mon, 01 Apr 2013 18:42:12 +0100
To: perlbug [...] perl.org
From: dom [...] earth.li
This is a bug report for perl from dom@earth.li,
generated with the help of perlbug 1.39 running under perl 5.17.10.

-----------------------------------------------------------------
[Please describe your issue here]

As reported in Debian's bugtracker at
<http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=409704>

Quoting from Joey on the bug report:

"
joey@kodama:~>LANG=fr_FR.UTF-8 perl -e 'use open qw{:utf8 :std};open(foo) || print STDERR "error: $!\n";'
error: Aucun fichier ou répertoire de ce type
                         ^^
This mojibake comes about because $! is a UTF-8 string in that locale, but it
is not decoded into perl's internal utf8 representation.

It's possible to work around the problem with the encoding pragma, but
not completely:

joey@kodama:~>LANG=fr_FR.UTF-8 perl -e 'use open qw{:utf8 :std}; use encoding 'utf8';open(foo) || print STDERR "error: $!\n";'
error: Aucun fichier ou répertoire de ce type

joey@kodama:~>LANG=fr_FR.UTF-8 perl -e 'use open qw{:utf8 :std}; use encoding 'utf8';open(foo) || print STDERR "error: ",$!,"\n";'
error: Aucun fichier ou répertoire de ce type

The first example works because the encoding pragma converts the string
to utf8 during concatenation, but the second example shows that this is
not a solution because concatenation can't be relied on for all output.

The only solution if you want to use open qw{:utf8 :std} in a program
seems to be manually using Encode::decode_utf8 on every instance of $!
and $@ in the program. Which is exactly the kind of error-prone busywork
that IO layers and perl's unicode model are supposed to avoid.
"

The test in question no longer works on my Debian system, because
the French error message no longer contains an accent, but the same
behaviour can be reproduced using eg ja_JP.UTF-8, and also on 5.17.10.

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=core
    severity=low
---
Site configuration information for perl 5.17.10:

Configured by dom at Sun Mar 31 23:53:26 BST 2013.

Summary of my perl5 (revision 5 version 17 subversion 10) configuration:

  Platform:
    osname=linux, osvers=3.2.0-4-686-pae, archname=i686-linux
    uname='linux callisto 3.2.0-4-686-pae #1 smp debian 3.2.39-2 i686 gnulinux '
    config_args='-de -Dprefix=/home/dom/perl5/perlbrew/perls/perl-5.17.10 -Dusedevel'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.7.2', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib/i386-linux-gnu /lib/../lib /usr/lib/i386-linux-gnu /usr/lib/../lib /lib /usr/lib
    libs=-lnsl -ldl -lm -lcrypt -lutil -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.13'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'

Locally applied patches:

---
@INC for perl 5.17.10:
    /home/dom/perl5/perlbrew/perls/perl-5.17.10/lib/site_perl/5.17.10/i686-linux
    /home/dom/perl5/perlbrew/perls/perl-5.17.10/lib/site_perl/5.17.10
    /home/dom/perl5/perlbrew/perls/perl-5.17.10/lib/5.17.10/i686-linux
    /home/dom/perl5/perlbrew/perls/perl-5.17.10/lib/5.17.10
    .

---
Environment for perl 5.17.10:
    HOME=/home/dom
    LANG=en_GB.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/dom/perl5/perlbrew/bin:/home/dom/perl5/perlbrew/perls/perl-5.17.10/bin:~/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
    PERLBREW_BASHRC_VERSION=0.43
    PERLBREW_HOME=/home/dom/.perlbrew
    PERLBREW_MANPATH=/home/dom/perl5/perlbrew/perls/perl-5.17.10/man
    PERLBREW_PATH=/home/dom/perl5/perlbrew/bin:/home/dom/perl5/perlbrew/perls/perl-5.17.10/bin
    PERLBREW_PERL=perl-5.17.10
    PERLBREW_ROOT=/home/dom/perl5/perlbrew
    PERLBREW_VERSION=0.43
    PERL_BADLANG (unset)
    SHELL=/bin/bash
RT-Send-CC: perl5-porters [...] perl.org
On Mon Apr 01 10:42:44 2013, dom wrote:
> As reported in Debian's bugtracker at
> <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=409704>
>
> Quoting from Joey on the bug report:
>
> "
> joey@kodama:~>LANG=fr_FR.UTF-8 perl -e 'use open qw{:utf8 :std};open(foo) || print STDERR "error: $!\n";'
> error: Aucun fichier ou répertoire de ce type
>                          ^^
> This mojibake comes about because $! is a UTF-8 string in that locale,
> but it is not decoded into perl's internal utf8 representation.
>
> It's possible to work around the problem with the encoding pragma, but
> not completely:
>
> joey@kodama:~>LANG=fr_FR.UTF-8 perl -e 'use open qw{:utf8 :std}; use encoding 'utf8';open(foo) || print STDERR "error: $!\n";'
> error: Aucun fichier ou répertoire de ce type
>
> joey@kodama:~>LANG=fr_FR.UTF-8 perl -e 'use open qw{:utf8 :std}; use encoding 'utf8';open(foo) || print STDERR "error: ",$!,"\n";'
> error: Aucun fichier ou répertoire de ce type
>
> The first example works because the encoding pragma converts the string
> to utf8 during concatenation, but the second example shows that this is
> not a solution because concatenation can't be relied on for all output.
>
> The only solution if you want to use open qw{:utf8 :std} in a program
> seems to be manually using Encode::decode_utf8 on every instance of $!
> and $@ in the program. Which is exactly the kind of error-prone busywork
> that IO layers and perl's unicode model are supposed to avoid.
> "
>
> The test in question no longer works on my Debian system, because
> the French error message no longer contains an accent, but the same
> behaviour can be reproduced using eg ja_JP.UTF-8, and also on 5.17.10.

This bug is a duplicate of #112208. The return value of strerror(3) is
not properly decoded in $!'s magic. I did write a proof-of-concept fix,
utf8::errno
(https://github.com/Leont/utf8-errno/blob/master/lib/utf8/errno.xs), but
a real solution would probably be different.

Leon
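
The manual workaround mentioned in the report (decoding every use of $! by hand) might be wrapped in a small helper like the following. This is a sketch, and it is neither the approach taken by utf8::errno nor the eventual core fix. decode_locale_message is a hypothetical name; the sketch assumes that langinfo(CODESET) reports the codeset in which the message was produced and that Encode recognizes that codeset name.

```perl
use strict;
use warnings;
use Encode qw(decode);
use I18N::Langinfo qw(langinfo CODESET);

# Hypothetical helper (not part of perl or of any CPAN module): turn the
# bytes of a locale-encoded message into a Perl character string.
sub decode_locale_message {
    my ($message, $codeset) = @_;

    # Leave the string alone if perl already flags it as characters.
    return $message if utf8::is_utf8($message);

    # Assumption: the message is encoded in the locale's codeset and Encode
    # knows that codeset name; decode() dies on an unknown encoding.
    $codeset //= langinfo(CODESET);
    return decode($codeset, $message);
}

# Intended use with an output handle that has a :utf8 layer:
#   open(my $fh, '<', $path)
#       or print STDERR "error: ", decode_locale_message("$!"), "\n";
```

Passing the codeset explicitly, as in decode_locale_message($bytes, 'UTF-8'), sidesteps the langinfo lookup when the encoding is already known.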
Subject: Re: [perl #117429] UTF8 error messages ($!, $@); use open qw{:utf8 :std}
Date: Mon, 1 Apr 2013 18:57:03 +0100
To: Leon Timmermans via RT <perlbug-followup [...] perl.org>
From: Dominic Hargreaves <dom [...] earth.li>
On Mon, Apr 01, 2013 at 10:54:48AM -0700, Leon Timmermans via RT wrote:
> This bug is a duplicate of #112208. The return value of strerror(3) is
> not properly decoded in $!'s magic. I did write a proof-of-concept fix
> utf8::errno
> (https://github.com/Leont/utf8-errno/blob/master/lib/utf8/errno.xs), but
> a real solution would probably be different.

Aha! Thanks, merged.

Dominic.

--
Dominic Hargreaves | http://www.larted.org.uk/~dom/
PGP key 5178E2A5 from the.earth.li (keyserver,web,email)
RT-Send-CC: perl5-porters [...] perl.org
Fixed by commit 1500bd919ffeae0f3252f8d1bb28b03b043d328e -- Karl Williamson
RT-Send-CC: doherty [...] cs.dal.ca, dom [...] earth.li, perl5-porters [...] perl.org
The fix for this had to be reverted for v5.20, because it caused problems for other modules. There is a new plan to fix this for v5.22, and I'm adding this ticket to the blockers for 5.21.1 -- Karl Williamson
RT-Send-CC: dom [...] earth.li, doherty [...] cs.dal.ca, perl5-porters [...] perl.org
This is now fixed again in blead via commit 2c6ee1a7a1ce7cff7755f9aa43a65b8278dd82a1 and its predecessor. -- Karl Williamson
Subject: Your ticket against Perl 5 has been resolved
Thanks for submitting this ticket.

The issue should be resolved with the release today of Perl v5.22.

If you find that the problem persists, feel free to reopen this ticket.

--
Karl Williamson for the Perl 5 porters team