"use locale;" breaks \w on matching c-cedilla, o-diaeresis and u-diaeresis under tr_TR.utf8 and de_DE.utf8 locales #9410

Closed
p5pRT opened this issue Jul 11, 2008 · 13 comments

p5pRT commented Jul 11, 2008

Migrated from rt.perl.org#56820 (status was 'resolved')

Searchable as RT56820$

p5pRT commented Jul 11, 2008

From pva@gentoo.org

Created by pva@gentoo.org

On Linux (tried Gentoo and Debian), \w and [:alnum:] do not match non-ASCII
letters under use locale, although they should. Take a look at this output:

$ cat test-file
слово
строка с пробелами
string with spaces (not only with [:alnum:])
English;
hello_привет

$ perl -e 'use locale; open(IN, "< test-file"); while(<IN>) { print if /\w/; }'
string with spaces (not only with [:alnum:])
English;
hello_привет
Linux $

As you can see, only lines containing English letters are matched, and none
with only Russian letters. The locale is set to ru_RU.UTF-8, and I've also tried
calling setlocale(LC_ALL, "ru_RU.uft8"); explicitly inside the ebuild. That locale
exists (at least locale -a lists it):

$ locale -a
C
en_US
en_US.iso88591
en_US.utf8
POSIX
ru_RU
ru_RU.cp1251
ru_RU.koi8r
ru_RU.utf8

I've also tried the cp1251 locale: I converted test-file to cp1251 with iconv
and ran the following:

perl -e 'use locale;
  use POSIX qw(locale_h);
  setlocale(LC_ALL, "ru_RU.cp1251");
  open(IN, "< test-file.cp1251"); while(<IN>) { print if /\w/; }'

This did not match anything non-ASCII either. This bug is filed with perl-5.10.0
installed, but I've tried perl-5.8.8 as well and it does not work there either.

The strange thing is that on FreeBSD this works as it should.

Perl Info

Flags:
    category=core
    severity=high

Site configuration information for perl 5.10.0:

Configured by Gentoo at Fri Jul 11 09:17:18 MSD 2008.

Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.6.22-ovz005, archname=i686-linux
    uname='linux camobap 2.6.22-ovz005 #6 tue jan 29 12:22:24 msk 2008 i686 intel(r) pentium(r) m processor 1700mhz genuineintel gnulinux '
    config_args='-des -Darchname=i686-linux -Dcccdlflags=-fPIC -Dccdlflags=-rdynamic -Dcc=i686-pc-linux-gnu-gcc -Dprefix=/usr -Dvendorprefix=/usr -Dsiteprefix=/usr -Dlocincpth=  -Doptimize=-O2 -mtune=pentium-m -fomit-frame-pointer -mcpu=pentium-m -pipe -Duselargefiles -Dd_semctl_semun -Dscriptdir=/usr/bin -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dinstallman1dir=/usr/share/man/man1 -Dinstallman3dir=/usr/share/man/man3 -Dman1ext=1 -Dman3ext=3pm -Dinc_version_list= -Dinc_version_list= -Dlocincpth=/usr/src/linux/include -Dcf_by=Gentoo -Ud_csh -Dusenm -Ui_ndbm -Ui_gdbm -Di_db'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='i686-pc-linux-gnu-gcc', ccflags ='-fno-strict-aliasing -pipe -I/usr/src/linux/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2 -mtune=pentium-m -fomit-frame-pointer -mcpu=pentium-m -pipe',
    cppflags='-fno-strict-aliasing -pipe -I/usr/src/linux/include'
    ccversion='', gccversion='4.2.4 (Gentoo 4.2.4 p1.0)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='i686-pc-linux-gnu-gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lpthread -lnsl -ldb -ldl -lm -lcrypt -lutil -lc
    perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.6.1.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.6.1'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -mtune=pentium-m -fomit-frame-pointer -mcpu=pentium-m -pipe -L/usr/local/lib'

Locally applied patches:
    


@INC for perl 5.10.0:
    /usr/lib/perl5/5.10.0/i686-linux
    /usr/lib/perl5/5.10.0
    /usr/lib/perl5/site_perl/5.10.0/i686-linux
    /usr/lib/perl5/site_perl/5.10.0
    /usr/lib/perl5/site_perl/5.8.8
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl/5.10.0/i686-linux
    /usr/lib/perl5/vendor_perl/5.10.0
    /usr/lib/perl5/vendor_perl/5.8.8
    /usr/lib/perl5/vendor_perl
    .


Environment for perl 5.10.0:
    HOME=/home/peter
    LANG=ru_RU.UTF-8
    LANGUAGE (unset)
    LC_ADDRESS=ru_RU.UTF-8
    LC_ALL=ru_RU.utf8
    LC_COLLATE=ru_RU.UTF-8
    LC_CTYPE=ru_RU.UTF-8
    LC_IDENTIFICATION=ru_RU.UTF-8
    LC_MEASUREMENT=ru_RU.UTF-8
    LC_MESSAGES=ru_RU.UTF-8
    LC_MONETARY=ru_RU.UTF-8
    LC_NAME=ru_RU.UTF-8
    LC_NUMERIC=POSIX
    LC_PAPER=ru_RU.UTF-8
    LC_TELEPHONE=ru_RU.UTF-8
    LC_TIME=ru_RU.UTF-8
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/peter/bin:/home/peter/local/bin:/home/peter/local/sbin:/sbin:/usr/sbin:/usr/kde/3.5/bin:/usr/local/bin:/usr/bin:/bin:/opt/bin:/usr/i686-pc-linux-gnu/gcc-bin/4.2.4:/usr/qt/3/bin:/opt/vmware/server/bin:/opt/vmware/server/console/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash

Flags:
    category=core
    severity=high

Site configuration information for perl v5.8.8:

Configured by Gentoo at Mon May 19 20:39:42 MSD 2008.

Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
  Platform:
    osname=linux, osvers=2.6.24-gentoo-r4, archname=x86_64-linux
    uname='linux theor 2.6.24-gentoo-r4 #2 tue apr 29 19:22:40 msd 2008 x86_64 amd sempron(tm) processor 2600+ authenticamd gnulinux '
    config_args='-des -Darchname=x86_64-linux -Dcccdlflags=-fPIC -Dccdlflags=-rdynamic -Dcc=x86_64-pc-linux-gnu-gcc -Dprefix=/usr -Dvendorprefix=/usr -Dsiteprefix=/usr -Dlocincpth=  -Doptimize=-O2 -pipe -march=athlon64 -mtune=athlon64 -msse3 -fomit-frame-pointer -Duselargefiles -Dd_semctl_semun -Dscriptdir=/usr/bin -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dinstallman1dir=/usr/share/man/man1 -Dinstallman3dir=/usr/share/man/man3 -Dman1ext=1 -Dman3ext=3pm -Dinc_version_list=5.8.0 5.8.0/x86_64-linux 5.8.2 5.8.2/x86_64-linux 5.8.4 5.8.4/x86_64-linux 5.8.5 5.8.5/x86_64-linux 5.8.6 5.8.6/x86_64-linux 5.8.7 5.8.7/x86_64-linux  -Dcf_by=Gentoo -Ud_csh -Dusenm -Ui_ndbm -Ui_gdbm -Ui_db -Dusrinc=/usr/include/gentoo-multilib/amd64 -Dlibpth=/usr/local/lib64 /lib64 /usr/lib64'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=define use64bitall=define uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='x86_64-pc-linux-gnu-gcc', ccflags ='-fno-strict-aliasing -pipe -Wdeclaration-after-statement -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2 -pipe -march=athlon64 -mtune=athlon64 -msse3 -fomit-frame-pointer',
    cppflags='-fno-strict-aliasing -pipe -Wdeclaration-after-statement'
    ccversion='', gccversion='4.1.2 (Gentoo 4.1.2 p1.0.2)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='x86_64-pc-linux-gnu-gcc', ldflags =' -L/usr/local/lib64'
    libpth=/usr/local/lib64 /lib64 /usr/lib64
    libs=-lpthread -lnsl -lndbm -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc
    perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.6.1.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.6.1'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib64'

Locally applied patches:
    


@INC for perl v5.8.8:
    /etc/perl
    /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux
    /usr/lib64/perl5/vendor_perl/5.8.8
    /usr/lib64/perl5/vendor_perl
    /usr/lib64/perl5/site_perl/5.8.8/x86_64-linux
    /usr/lib64/perl5/site_perl/5.8.8
    /usr/lib64/perl5/site_perl
    /usr/lib64/perl5/5.8.8/x86_64-linux
    /usr/lib64/perl5/5.8.8
    /usr/local/lib/site_perl
    .


Environment for perl v5.8.8:
    HOME=/home/peter
    LANG=ru_RU.UTF-8
    LANGUAGE (unset)
    LC_ADDRESS=ru_RU.UTF-8
    LC_COLLATE=ru_RU.UTF-8
    LC_CTYPE=ru_RU.UTF-8
    LC_IDENTIFICATION=ru_RU.UTF-8
    LC_MEASUREMENT=ru_RU.UTF-8
    LC_MESSAGES=en_US
    LC_MONETARY=ru_RU.UTF-8
    LC_NAME=ru_RU.UTF-8
    LC_NUMERIC=POSIX
    LC_PAPER=ru_RU.UTF-8
    LC_TELEPHONE=ru_RU.UTF-8
    LC_TIME=ru_RU.UTF-8
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/peter/bin:/home/peter/local/bin:/home/peter/local/sbin:/sbin:/usr/sbin:/usr/local/bin:/usr/bin:/bin:/opt/bin:/usr/x86_64-pc-linux-gnu/gcc-bin/4.1.2:/opt/blackdown-jdk-1.4.2.03/bin:/opt/blackdown-jdk-1.4.2.03/jre/bin:/usr/kde/3.5/bin:/usr/qt/3/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash

p5pRT commented Jul 12, 2008

From @druud62

Peter wrote:

The locale is set to ru_RU.UTF-8, and I've also tried calling
setlocale(LC_ALL, "ru_RU.uft8"); explicitly inside the ebuild.

s/uft/utf/

--
Affijn, Ruud

"Gewoon is een tijger."

p5pRT commented Jul 12, 2008

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented Jul 26, 2008

From p5p@spam.wizbit.be

On Fri Jul 11 00:07:32 2008, pva wrote:

$ cat test-file
слово
строка с пробелами
string with spaces (not only with [:alnum:])
English;
hello_привет

Can you send the test-file as an attachment?

Kind regards,

Bram

p5pRT commented Apr 28, 2013

From @jmdh

Created by @jmdh

This is a bug report for perl from dom@earth.li,
generated with the help of perlbug 1.39 running under perl 5.17.12.

From <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=529305>:

----------------
Showcase:
(requires installing the tr_TR.utf8 and de_DE.utf8 locales via 'dpkg-reconfigure
locales' or installing the locales-all package)

#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(setlocale LC_ALL);
setlocale(LC_ALL, "tr_TR.utf8");
print "Locale is ", setlocale(LC_ALL), "\n";

use locale;
use utf8;
binmode STDOUT, ":utf8";

print "$_ is " . ( /\w/ ? "" : "not " ) . "a word character\n"
  for qw( ç ö ş ü ğ ı İ );

The output is

Locale is tr_TR.utf8
ç is not a word character
ö is not a word character
ş is a word character
ü is not a word character
ğ is a word character
ı is a word character
İ is a word character

Looking (with my uneducated eyes) in /usr/share/i18n/locales/tr_TR it seems
that at least c-cedilla (U+00E7 in lowercase and U+00C7 in uppercase) shall be
treated as an "alpha" character so the problem seems to be in perl's
interpretation.
----------------

This is reproducible with 8b3945e
(current blead) and has been the case since at least 5.8.8.

Perl Info

Flags:
    category=library
    severity=low
    module=locale

Site configuration information for perl 5.17.12:

Configured by dom at Sun Apr 28 17:39:32 BST 2013.

Summary of my perl5 (revision 5 version 17 subversion 12) configuration:
  Commit id: 8b3945e7b7b7ae6fd2369864ebe169bd9a91cf4e
  Platform:
    osname=linux, osvers=3.2.0-4-686-pae, archname=i686-linux-thread-multi-64int
    uname='linux callisto 3.2.0-4-686-pae #1 smp debian 3.2.41-2 i686 gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Dldflags=-Wl,-z,relro -Dlddlflags=-shared -Wl,-z,relro -Dcccdlflags=-fPIC -Duse64bitint -Uafs -Ud_csh -Ud_ualarm -Uusesfio -Uusenm -Ui_libutil -DDEBUGGING=-g -Doptimize=-O2 -Duseshrplib -des -Dusedevel'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2 -g',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -fno-strict-aliasing -pipe -I/usr/local/include'
    ccversion='', gccversion='4.7.2', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags ='-Wl,-z,relro -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib/i386-linux-gnu /lib/../lib /usr/lib/i386-linux-gnu /usr/lib/../lib /lib /usr/lib
    libs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.13'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/local/lib/perl5/5.17.12/i686-linux-thread-multi-64int/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -Wl,-z,relro -L/usr/local/lib -fstack-protector'

Locally applied patches:
    


@INC for perl 5.17.12:
    lib
    /usr/local/lib/perl5/site_perl/5.17.12/i686-linux-thread-multi-64int
    /usr/local/lib/perl5/site_perl/5.17.12
    /usr/local/lib/perl5/5.17.12/i686-linux-thread-multi-64int
    /usr/local/lib/perl5/5.17.12
    .


Environment for perl 5.17.12:
    HOME=/home/dom
    LANG=en_GB.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH=/home/dom/working/perl:
    LOGDIR (unset)
    PATH=~/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
    PERL_BADLANG (unset)
    SHELL=/bin/bash

p5pRT commented Apr 28, 2013

From @jmdh

On Sun Apr 28 10:22:47 2013, dom wrote:

Looking (with my uneducated eyes) in /usr/share/i18n/locales/tr_TR it seems
that at least c-cedilla (U+00E7 in lowercase and U+00C7 in uppercase) shall be
treated as an "alpha" character so the problem seems to be in perl's
interpretation.
----------------

This is reproducible with 8b3945e
(current blead) and has been the case since at least 5.8.8.

This might be the same as #56820, but I'm not sure.

p5pRT commented Apr 28, 2013

@jmdh - Status changed from 'new' to 'open'

p5pRT commented Apr 29, 2013

From @khwilliamson

On 04/28/2013 11:22 AM, Dominic Hargreaves (via RT) wrote:

# New Ticket Created by Dominic Hargreaves
# Please include the string: [perl #117787]
# in the subject line of all future correspondence about this issue.
# <URL: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=117787 >

I tracked this down, and it appears to me to be a bug in the C library
isalnum() function. The suppliers might argue that it is intentional,
but if so, it certainly isn't documented properly.

I'm surmising here. What I think is going on is that under a
UTF-8 locale, isalnum() (and its brethren) will only return true for
invariant characters. That is, only characters in the ASCII range.

To get whether a character above ASCII is an alnum, one must use
iswalnum() instead. There is no provision in Perl to do this. Attached
is a C program that demonstrates this on my old 10.10 Ubuntu system.
Under the de locale, isalnum() returns true only for the ASCII alnums,
but iswalnum() returns true for the whole range.

Perl assumes that isalnum() will work properly on any character whose
ordinal is 0-255. This turns out to be wrong. I don't see how the
suppliers of the C library could say that their implementation is
correct; yet they have made equally absurd claims in the past.
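
The reported output looks consistent with this: the three characters reported as non-word characters are exactly the ones whose code points fit in 0-255. A minimal C sketch that makes the split explicit follows; the table and the helper names `count_in_byte_range` and `print_ranges` are illustrative assumptions, not taken from Perl's source.

```c
#include <stddef.h>
#include <stdio.h>

/* Code points of the characters in the reported test, in order:
 * c-cedilla, o-diaeresis, s-cedilla, u-diaeresis, g-breve,
 * dotless i, capital I with dot above. */
const unsigned int test_chars[] = {
    0x00E7, 0x00F6, 0x015F, 0x00FC, 0x011F, 0x0131, 0x0130
};

/* Hypothetical helper: count how many code points fall in 0-255,
 * the range for which the comment says Perl relies on isalnum(). */
size_t count_in_byte_range(const unsigned int *cps, size_t n)
{
    size_t count = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (cps[i] <= 0xFF) {
            count++;
        }
    }
    return count;
}

/* Print each code point together with the range it falls in. */
void print_ranges(const unsigned int *cps, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        printf("U+%04X: %s\n", cps[i],
               cps[i] <= 0xFF ? "0-255" : "above 255");
    }
}
```

Called on the seven entries of `test_chars`, `count_in_byte_range` returns 3 (U+00E7, U+00F6 and U+00FC), matching the three "not a word character" lines in the report.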

It would probably be a lot of work for Perl to change to also use the C
wide character classification functions. But I will now take this
opportunity to revive my proposal from a year ago to treat locales whose
name ends in UTF-8 as UTF-8 for purposes of character classification
and collation:
http://markmail.org/message/q4vorzd2xcxbm43y

That would fix this bug as a side effect, and is quite easy to implement.
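
To illustrate what "name ends in UTF-8" could mean in practice, here is a minimal C sketch of a name-based check. The function `locale_name_is_utf8` is a hypothetical helper written for this illustration; it is not the code that was proposed or later committed to Perl, and it deliberately ignores names that carry a modifier after the codeset.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 if a locale name such as "de_DE.utf8"
 * or "ru_RU.UTF-8" ends in a UTF-8 codeset suffix, else 0.
 * The comparison ignores ASCII case and hyphens in the codeset part.
 * Names with a trailing modifier (for example "@euro") return 0. */
int locale_name_is_utf8(const char *name)
{
    const char *want = "utf8";
    const char *p;

    if (name == NULL) {
        return 0;
    }
    p = strrchr(name, '.');          /* codeset follows the last dot */
    if (p == NULL) {
        return 0;
    }
    p++;

    while (*p != '\0' && *want != '\0') {
        char c = *p;

        if (c == '-') {              /* skip the hyphen in "UTF-8" */
            p++;
            continue;
        }
        if (c >= 'A' && c <= 'Z') {  /* ASCII-only lowercase, so the   */
            c = (char)(c - 'A' + 'a'); /* result is locale-independent */
        }
        if (c != *want) {
            return 0;
        }
        p++;
        want++;
    }
    /* Match only if both the suffix and the pattern are exhausted. */
    return *want == '\0' && *p == '\0';
}
```

On POSIX systems, nl_langinfo(CODESET) reports the codeset of the current locale and may be a more robust signal than the name; the name-based form is shown here only because it is what the proposal describes.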

The objections to last year's proposal all seem to me to stem from
misunderstanding it, and from not wanting to encourage the use of a
broken paradigm, locales, by fixing them. I don't consider the latter
to be a valid objection.

p5pRT commented Apr 29, 2013

From @khwilliamson

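#include <stdio.h>
#include <ctype.h>
#include <locale.h>
#include <wctype.h>

int
main(int argc, char** argv)
{
    int i;

    (void) argc;
    (void) argv;

    /* Only run the comparison if the locale is installed. */
    if (setlocale(LC_ALL, "de_DE.utf8")) {
        printf("Locale is %s\n", setlocale(LC_ALL, NULL));

        /* Both functions return "nonzero" for true, not necessarily the
         * same value, so normalize with ! before comparing. */
        for (i = 0; i < 256; i++) {
            if (!iswpunct((wint_t) i) != !ispunct(i)) {
                printf("\\x%02X wpunct and punct differ\n", i);
            }
        }
    }

    return 0;
}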
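
The attachment above compares ispunct() with iswpunct(), while the earlier comment discusses isalnum() and iswalnum(). A minimal sketch of the same comparison for the alnum class is given below. The struct `alnum_counts` and the function `count_alnums` are names invented for this illustration, and the function changes the process-wide locale as a side effect.

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>
#include <wctype.h>

/* Counts of code points in 0..255 classified as alphanumeric. */
struct alnum_counts {
    int narrow;   /* code points for which isalnum() is true  */
    int wide;     /* code points for which iswalnum() is true */
};

/* Hypothetical helper: switch LC_ALL to `name` and count the
 * alphanumerics in 0..255 with both function families.
 * Returns 0 on success, -1 if the locale could not be set. */
int count_alnums(const char *name, struct alnum_counts *out)
{
    int i;

    if (setlocale(LC_ALL, name) == NULL) {
        return -1;
    }

    out->narrow = 0;
    out->wide = 0;
    for (i = 0; i < 256; i++) {
        if (isalnum(i)) {
            out->narrow++;
        }
        if (iswalnum((wint_t) i)) {
            out->wide++;
        }
    }
    return 0;
}
```

For the "C" locale the narrow count is 62, the ASCII letters and digits. According to the comment above, on the reporter's glibc system a UTF-8 locale such as de_DE.utf8 keeps the narrow count at the ASCII set while the wide count is larger; that is the thread's observation, not something this sketch guarantees on every platform.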

p5pRT commented May 5, 2013

From @khwilliamson

Also, starting in 5.16, there is a work-around available for this
problem, as described in perllocale under "Unicode and UTF-8". What you
do is write

use locale ':not_characters';

and use any of several I/O methods mentioned in that doc which convert
to/from the current locale
--
Karl Williamson

p5pRT commented Jun 19, 2013

From @khwilliamson

I merged #117787 into this one. I'm confident they have the same cause,
which is outside Perl's control unless we work around the libc bug.
For followers of #56820, here is a workaround:

On Sat May 04 19:51:08 2013, khw wrote:

Also, starting in 5.16, there is a work-around available for this
problem, as described in perllocale under "Unicode and UTF-8". What you
do is write

use locale ':not_characters';

and use any of several I/O methods mentioned in that doc which convert
to/from the current locale

--
Karl Williamson

p5pRT commented Jan 28, 2014

From @khwilliamson

Fixed by commit
31f05a3
--
Karl Williamson

p5pRT commented Jan 28, 2014

@khwilliamson - Status changed from 'open' to 'resolved'
