open ':locale' does not work under locale with the modifier #9185

Closed
p5pRT opened this issue Jan 11, 2008 · 14 comments

@p5pRT

p5pRT commented Jan 11, 2008

Migrated from rt.perl.org#49646 (status was 'resolved')

Searchable as RT49646$

@p5pRT

p5pRT commented Jan 11, 2008

From kmashrab@uni-bremen.de

I have Perl v5.8.8. My locale is Uzbek (cyrillic), which is a standard libc
locale for Uzbek language written in cyrillic script. My LANG is
uz_UZ.UTF-8@cyrillic, which is perfectly OK.

When I use the Perl command "use open ':locale'" I get the following error
messages:

Cannot find encoding "UTF-8@cyrillic" at /usr/lib/perl5/5.8.7/open.pm line 125.
Cannot find encoding "UTF-8@cyrillic" at /usr/lib/perl5/5.8.7/open.pm line 133.

@p5pRT

p5pRT commented Jan 11, 2008

From kmashrab@uni-bremen.de

Created by kmashrab@uni-bremen.de

Perl Info

Flags:
    category=core
    severity=high

Site configuration information for perl v5.8.8:

Configured by Mandriva at Sat Dec 22 08:05:52 EST 2007.

Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
  Platform:
    osname=linux, osvers=2.6.12-12mdksmp, archname=i386-linux
    uname='linux n4.mandriva.com 2.6.12-12mdksmp #1 smp fri sep 9 17:43:23 cest 2005 i686 intel(r) xeon(tm) cpu 2.80ghz gnulinux '
    config_args='-des -Dinc_version_list=5.8.7 5.8.7/i386-linux 5.8.6 5.8.6/i386-linux 5.8.5 5.8.4 5.8.3 5.8.2 5.8.1 5.8.0 5.6.1 5.6.0 -Darchname=i386-linux -Dcc=gcc -Doptimize=-O2  -pipe -Wp,-D_FORTIFY_SOURCE=2 -fstack-protector --param=ssp-buffer-size=4  -fexceptions -fomit-frame-pointer -march=i586 -mtune=generic -fasynchronous-unwind-tables -Dprefix=/usr -Dvendorprefix=/usr -Dsiteprefix=/usr -Dsitebin=/usr/local/bin -Dsiteman1dir=/usr/local/share/man/man1 -Dsiteman3dir=/usr/local/share/man/man3 -Dman3ext=3pm -Dcf_by=Mandriva -Dmyhostname=localhost -Dperladmin=root@localhost -Dcf_email=root@localhost -Dd_dosuid -Ud_csh -Duseshrplib'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O2 -pipe -Wp,-D_FORTIFY_SOURCE=2 -fstack-protector --param=ssp-buffer-size=4 -fexceptions -fomit-frame-pointer -march=i586 -mtune=generic -fasynchronous-unwind-tables',
    cppflags='-fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='4.2.2 20071128 (prerelease) (4.2.2-2mdv2008.1)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lndbm -lgdbm -ldl -lm -lcrypt -lutil -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.6.1.so, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.6.1'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.8/i386-linux/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:
    Mandriva Linux patches


@INC for perl v5.8.8:
    /usr/lib/perl5/site_perl/5.8.8/i386-linux
    /usr/lib/perl5/site_perl/5.8.8
    /usr/lib/perl5/vendor_perl/5.8.8/i386-linux
    /usr/lib/perl5/vendor_perl/5.8.8
    /usr/lib/perl5/5.8.8/i386-linux
    /usr/lib/perl5/5.8.8
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl
    .


Environment for perl v5.8.8:
    HOME=/root
    LANG=uz_UZ.UTF-8@cyrillic
    LANGUAGE=uz@cyrillic
    LC_ADDRESS=en_US.UTF-8
    LC_COLLATE=uz_UZ.UTF-8@cyrillic
    LC_CTYPE=uz_UZ.UTF-8@cyrillic
    LC_IDENTIFICATION=en_US.UTF-8
    LC_MEASUREMENT=en_US.UTF-8
    LC_MESSAGES=uz_UZ.UTF-8@cyrillic
    LC_MONETARY=en_US.UTF-8
    LC_NAME=en_US.UTF-8
    LC_NUMERIC=en_US.UTF-8
    LC_PAPER=en_US.UTF-8
    LC_SOURCED=1
    LC_TELEPHONE=en_US.UTF-8
    LC_TIME=uz_UZ.UTF-8@cyrillic
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/sbin:/usr/sbin:/bin:/usr/bin:/usr/X11R6/bin:/usr/local/bin:/usr/local/sbin
    PERL_BADLANG (unset)
    SHELL=/bin/bash

@p5pRT

p5pRT commented Jan 11, 2008

From @rgs

On 11/01/2008, via RT Mashrab Kuvatov <perlbug-followup@perl.org> wrote:

# New Ticket Created by Mashrab Kuvatov
# Please include the string: [perl #49648]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=49648 >

This is a bug report for perl from kmashrab@uni-bremen.de,
generated with the help of perlbug 1.35 running under perl v5.8.8.


@p5pRT

p5pRT commented Jan 11, 2008

The RT System itself - Status changed from 'new' to 'open'

@p5pRT

p5pRT commented Jan 11, 2008

From @rgs

On 11/01/2008, via RT Mashrab Kuvatov <perlbug-followup@perl.org> wrote:

> I have Perl v5.8.8. My locale is Uzbek (cyrillic), which is a standard libc
> locale for Uzbek language written in cyrillic script. My LANG is
> uz_UZ.UTF-8@cyrillic, which is perfectly OK.
>
> When I use the Perl command "use open ':locale'" I get the following error
> messages:
>
> Cannot find encoding "UTF-8@cyrillic" at /usr/lib/perl5/5.8.7/open.pm line 125.
> Cannot find encoding "UTF-8@cyrillic" at /usr/lib/perl5/5.8.7/open.pm line 133.

Trimmed down to:

$ bleadperl -wle 'binmode STDIN,q(:encoding(UTF-8@cyrillic))'
Cannot find encoding "UTF-8@cyrillic" at -e line 1.

open.pm should probably ignore whatever is after the @. However I'd
like to find a document on what all those parts mean, exactly.

@p5pRT

p5pRT commented Jan 12, 2008

From kmashrab@uni-bremen.de

Look at the following document, "3.4 Locale environment variables", for the
format of locale env. variables.

http://www.linuxdocs.org/HOWTOs/Unicode-HOWTO-3.html

There are quite a few locales with the modifier. Look at the libc sources.

http://sourceware.org/cgi-bin/cvsweb.cgi/libc/localedata/locales/?cvsroot=glibc

Mashrab.
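
For illustration: a glibc-style locale name has the general shape language[_territory][.codeset][@modifier], so in uz_UZ.UTF-8@cyrillic the part after the @ is a modifier and not part of the encoding name. Below is a minimal Perl sketch of splitting such a name. The helper parse_locale_name is hypothetical and does not exist in open.pm or encoding.pm.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any core module): split a locale name
# of the form language[_territory][.codeset][@modifier] into its parts.
# Returns an empty list when the name does not have that shape.
sub parse_locale_name {
    my ($name) = @_;
    return unless defined $name
        && $name =~ /^([^_.\@]+)(?:_([^.\@]+))?(?:\.([^\@]+))?(?:\@(.+))?$/;
    return (
        language  => $1,
        territory => $2,    # undef when absent
        codeset   => $3,    # undef when absent
        modifier  => $4,    # undef when absent
    );
}

my %parts = parse_locale_name('uz_UZ.UTF-8@cyrillic');
for my $key (qw(language territory codeset modifier)) {
    printf "%-9s = %s\n", $key, defined $parts{$key} ? $parts{$key} : '(none)';
}
```

With the locale from this report, the codeset comes out as UTF-8 and the modifier as cyrillic, which is the separation the encoding lookup needs.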

@p5pRT

p5pRT commented Jan 14, 2008

From @rgs

On 11/01/2008, Mashrab Kuvatov <kmashrab@uni-bremen.de> wrote:

> Look at the following document, "3.4 Locale environment variables", for the
> format of locale env. variables.
>
> http://www.linuxdocs.org/HOWTOs/Unicode-HOWTO-3.html
>
> There are quite few locales with the modifier. Look at the libc sources.

OK. I found the bug, in encoding.pm, and fixed it with the appended patch.

However encoding.pm first tries I18N::Langinfo to figure out the
locale encoding. That should return a correct answer. What does the
following command print on your system?

perl -MI18N::Langinfo=langinfo,CODESET -le 'print langinfo(CODESET())'

And I18N::Langinfo is a direct wrapper around the glibc function nl_langinfo(3).
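
As a rough sketch, the same query can be written as a small script. It loads I18N::Langinfo inside an eval so that a platform lacking the module or nl_langinfo(3) yields undef instead of dying. The helper name locale_codeset is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: ask the C library for the current locale's codeset.
# Returns undef if I18N::Langinfo cannot be loaded or the call fails.
sub locale_codeset {
    my $codeset = eval {
        require I18N::Langinfo;
        I18N::Langinfo::langinfo( I18N::Langinfo::CODESET() );
    };
    return $codeset;
}

my $codeset = locale_codeset();
if ( defined $codeset && length $codeset ) {
    print "codeset: $codeset\n";
}
else {
    print "codeset could not be determined\n";
}
```

The value printed depends on the locale environment of the process, so no particular output is assumed here.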

The patch is:

Change 32977 by rgs@scipion on 2008/01/14 22:48:46

  When parsing LC_ALL or LANG to get the locale's encoding, ignore
  whatever is after the @, since that's a modifier, not an encoding.

Affected files ...

... //depot/perl/ext/Encode/encoding.pm#48 edit

Differences ...

==== //depot/perl/ext/Encode/encoding.pm#48 (text) ====

@@ -1,6 +1,6 @@
# $Id: encoding.pm,v 2.6 2007/04/22 14:56:12 dankogai Exp $
package encoding;
-our $VERSION = do { my @r = ( q$Revision: 2.6 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r };
+our $VERSION = '2.6_01';

use Encode;
use strict;
@@ -51,10 +51,10 @@
  no warnings 'uninitialized';

  if ( not $locale_encoding && in_locale() ) {
- if ( $ENV{LC_ALL} =~ /^([^.]+)\.([^.]+)$/ ) {
+ if ( $ENV{LC_ALL} =~ /^([^.]+)\.([^.@]+)(@.*)?$/ ) {
  ( $country_language, $locale_encoding ) = ( $1, $2 );
  }
- elsif ( $ENV{LANG} =~ /^([^.]+)\.([^.]+)$/ ) {
+ elsif ( $ENV{LANG} =~ /^([^.]+)\.([^.@]+)(@.*)?$/ ) {
  ( $country_language, $locale_encoding ) = ( $1, $2 );
  }

@p5pRT

p5pRT commented Jan 14, 2008

The RT System itself - Status changed from 'new' to 'open'

@p5pRT

p5pRT commented Jan 15, 2008

From kmashrab@uni-bremen.de

On Monday 14 January 2008 23:53, Rafael Garcia-Suarez wrote:

> What does the following command print on your system ?
> perl -MI18N::Langinfo=langinfo,CODESET -le 'print langinfo(CODESET())'

UTF-8

Mashrab.

@p5pRT

p5pRT commented Jan 15, 2008

From @rgs

On 15/01/2008, Mashrab Kuvatov <kmashrab@uni-bremen.de> wrote:

> I tried the patch proposed by Rafael Garcia-Suarez. It fixes the issue I
> reported.
>
> I think that in addition to Rafael's fix one has to fix the following logic in
> encoding.pm:
>
> if ( not $locale_encoding && in_locale() ) {
>
> I tried to print out $locale_encoding and in_locale() in LANG and LC_ALL set
> to German (de_DE.UTF-8) and Uzbek Cyrillic (uz_UZ.UTF-8@cyrillic). In both
> cases, $locale_encoding = UTF-8 and in_locale() = 0. So, it seems the above
> logic should be
>
> if ( ( not $locale_encoding ) && in_locale() ) {

Good catch! Yes, you're right, and I've fixed encoding.pm accordingly.
That was why the value returned by I18N::Langinfo was thrown
away and replaced by a bogus one.

Change 32980 by rgs@stcosmo on 2008/01/15 14:23:04

  Boolean priority bug, found by Mashrab Kuvatov:

  Subject: Re: [perl #49646] perlbug AutoReply: open ':locale' does not work under locale with the modifier
  From: Mashrab Kuvatov <kmashrab@uni-bremen.de>
  Date: Tue, 15 Jan 2008 15:17:42 +0100
  Message-Id: <200801151517.46296.kmashrab@uni-bremen.de>

Affected files ...

... //depot/perl/ext/Encode/encoding.pm#49 edit

Differences ...

==== //depot/perl/ext/Encode/encoding.pm#49 (text) ====

@@ -50,7 +50,7 @@

  no warnings 'uninitialized';

- if ( not $locale_encoding && in_locale() ) {
+ if ( (not $locale_encoding) && in_locale() ) {
  if ( $ENV{LC_ALL} =~ /^([^.]+)\.([^.@]+)(@.*)?$/ ) {
  ( $country_language, $locale_encoding ) = ( $1, $2 );
  }
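
A minimal sketch of the precedence problem, using the values from the experiment reported in this ticket ($locale_encoding set to UTF-8, in_locale() returning 0). Because "not" binds more loosely than "&&", the original condition parses as not( $locale_encoding && in_locale() ). The sub in_locale_stub is a stand-in for the example, not the real in_locale() from encoding.pm.

```perl
use strict;
use warnings;

my $locale_encoding = 'UTF-8';      # value already found via I18N::Langinfo
sub in_locale_stub { return 0 }     # stand-in returning the observed value 0

# Original condition: parsed as not( $locale_encoding && in_locale_stub() ).
my $buggy = ( not $locale_encoding && in_locale_stub() ) ? 1 : 0;

# Fixed condition: the negation applies to $locale_encoding only.
my $fixed = ( ( not $locale_encoding ) && in_locale_stub() ) ? 1 : 0;

print "buggy=$buggy fixed=$fixed\n";    # buggy=1 fixed=0
```

With the buggy form the branch is entered even though an encoding is already known, so the value gets overwritten by whatever is parsed out of LC_ALL or LANG.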

@p5pRT

p5pRT commented Jan 15, 2008

From kmashrab@uni-bremen.de

I tried the patch proposed by Rafael Garcia-Suarez. It fixes the issue I
reported.

I think that in addition to Rafael's fix one has to fix the following logic in
encoding.pm:

if ( not $locale_encoding && in_locale() ) {

I tried to print out $locale_encoding and in_locale() with LANG and LC_ALL set
to German (de_DE.UTF-8) and Uzbek Cyrillic (uz_UZ.UTF-8@cyrillic). In both
cases, $locale_encoding = UTF-8 and in_locale() = 0. So, it seems the above
logic should be

if ( ( not $locale_encoding ) && in_locale() ) {

Actually, I do not understand how in_locale() works. However, my experiment
shows that it returns 0 as TRUE.

Mashrab.

@p5pRT

p5pRT commented Jan 15, 2008

From @druud62

Mashrab Kuvatov wrote:

> - if ( not $locale_encoding && in_locale() ) {
> + if ( ( not $locale_encoding ) && in_locale() ) {

Well, that's what you get when you mix the higher && lower precedence
logical operators. Good one for Perl::Critic.

I would choose one of these:

  if ( !$locale_encoding && in_locale() ) {

  if ( not $locale_encoding and in_locale() ) {

--
Affijn, Ruud

"Gewoon is een tijger."

@p5pRT

p5pRT commented Jan 16, 2008

From @Tux

On Tue, 15 Jan 2008 20:43:29 +0100, "Dr.Ruud" <rvtol+news@isolution.nl> wrote:

> Mashrab Kuvatov wrote:
>
> > - if ( not $locale_encoding && in_locale() ) {
> > + if ( ( not $locale_encoding ) && in_locale() ) {
>
> Well, that's what you get when you mix the higher && lower precedence
> logical operators. Good one for Perl::Critic.
>
> I would choose one of these:
>
>   if ( !$locale_encoding && in_locale() ) {
>
>   if ( not $locale_encoding and in_locale() ) {

I would choose

  if (!$locale_encoding and in_locale ()) {

your last option does not increase legibility. The first is OK with me
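
As a quick check of the spellings discussed above, the following sketch evaluates each form over all four truth combinations. The three proposed forms agree with each other, while the original mixed form differs. The names @forms and truth_row are illustrative only.

```perl
use strict;
use warnings;

# Each entry: label, then a closure taking ($enc, $in_loc) and returning 0 or 1.
my @forms = (
    [ 'not E && L (original)' => sub { ( not $_[0] && $_[1] ) ? 1 : 0 } ],
    [ '!E && L'               => sub { ( !$_[0] && $_[1] )    ? 1 : 0 } ],
    [ 'not E and L'           => sub { ( not $_[0] and $_[1] ) ? 1 : 0 } ],
    [ '!E and L'              => sub { ( !$_[0] and $_[1] )   ? 1 : 0 } ],
);

# Evaluate one form for (E,L) = (0,0), (0,1), (1,0), (1,1).
sub truth_row {
    my ($code) = @_;
    return join '', map { $code->( @$_ ) } ( [0,0], [0,1], [1,0], [1,1] );
}

printf "%-22s %s\n", $_->[0], truth_row( $_->[1] ) for @forms;
# not E && L (original)  1110
# !E && L                0100
# not E and L            0100
# !E and L               0100
```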

--
H.Merijn Brand Amsterdam Perl Mongers (http://amsterdam.pm.org/)
using & porting perl 5.6.2, 5.8.x, 5.10.x on HP-UX 10.20, 11.00, 11.11,
& 11.23, SuSE 10.1 & 10.2, AIX 5.2, and Cygwin. http://qa.perl.org
http://mirrors.develooper.com/hpux/ http://www.test-smoke.org
  http://www.goldmark.org/jeff/stupid-disclaimers/

@p5pRT

p5pRT commented Apr 27, 2008

p5p@spam.wizbit.be - Status changed from 'open' to 'resolved'
