bug in File::Find with UTF filenames #12657

Open
p5pRT opened this issue Dec 19, 2012 · 4 comments
Labels
distro-Linux type-library Unicode and System Calls Bad interactions of syscalls and UTF-8

Comments

@p5pRT

p5pRT commented Dec 19, 2012

Migrated from rt.perl.org#116134 (status was 'open')

Searchable as RT116134$

@p5pRT

p5pRT commented Dec 19, 2012

From victor@vsespb.ru

Created by victor@vsespb.ru

File::Find does not handle UTF-8 filenames correctly when the
directories to search are passed as character strings (with the UTF-8
flag set) AND the names of the files found are in UTF-8.
A possible workaround is to add a preprocess callback that decodes each name:

preprocess => sub {
    # Requires "use Encode;". The third argument, 1, is Encode::FB_CROAK (die on malformed input).
    map { decode("UTF-8", $_, 1) } @_;
}

An example of this workaround in real code:
https://github.com/vsespb/mt-aws-glacier/blob/4c483a46c282824f91802785b6620e2989c07ee4/Journal.pm#L114
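
A self-contained sketch of the reproduction and the workaround, assuming a Linux filesystem that stores names as raw UTF-8 bytes (as in the environment reported below). The directory and file names, the `found_names` helper, and the use of `Encode::LEAVE_SRC` are illustrative assumptions and are not taken from the original report.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use File::Find;
use File::Temp qw(tempdir);

# Names are written with \x{...} escapes so the script does not depend
# on the encoding of its own source file.
my $dir_chars  = "\x{442}\x{435}\x{441}\x{442}";          # Cyrillic "test"
my $file_chars = "\x{442}\x{435}\x{43A}\x{441}\x{442}";   # Cyrillic "text"

# Build the fixture using byte strings, which is what the OS stores.
my $tmp        = tempdir(CLEANUP => 1);
my $root_bytes = "$tmp/" . encode("UTF-8", $dir_chars);
mkdir $root_bytes or die "mkdir failed: $!";
open my $fh, '>', "$root_bytes/" . encode("UTF-8", $file_chars)
    or die "open failed: $!";
close $fh;

# The search root is passed as a character string (UTF-8 flag set).
my $root     = decode("UTF-8", $root_bytes);
my $expected = "$root/$file_chars";

# Hypothetical helper: run find() and collect $File::Find::name,
# optionally decoding directory entries in a preprocess callback.
sub found_names {
    my ($top, $decode_names) = @_;
    my @names;
    my %opts = (wanted => sub { push @names, $File::Find::name });
    if ($decode_names) {
        $opts{preprocess} = sub {
            # LEAVE_SRC keeps decode() from modifying the input in place.
            map { decode("UTF-8", $_, Encode::FB_CROAK() | Encode::LEAVE_SRC()) } @_;
        };
    }
    find(\%opts, $top);
    return @names;
}

my @plain   = found_names($root, 0);
my @decoded = found_names($root, 1);

my $plain_ok   = grep { $_ eq $expected } @plain;
my $decoded_ok = grep { $_ eq $expected } @decoded;
print "without preprocess: ", ($plain_ok   ? "match" : "no match"), "\n";
print "with preprocess:    ", ($decoded_ok ? "match" : "no match"), "\n";
```

On an affected perl, the run without the callback is expected to report no match, because `$File::Find::name` is built by joining the character-string root with the raw bytes returned by `readdir`.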

Perl Info

Flags:
    category=library
    severity=medium
    module=File::Find

Site configuration information for perl 5.10.1:

Configured by Debian Project at Tue Nov 27 00:14:30 UTC 2012.

Summary of my perl5 (revision 5 version 10 subversion 1) configuration:

  Platform:
    osname=linux, osvers=2.6.42-23-generic,
archname=x86_64-linux-gnu-thread-multi
    uname='linux komainu 2.6.42-23-generic #36-ubuntu smp tue apr 10
20:39:51 utc 2012 x86_64 gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN
-Dcccdlflags=-fPIC -Darchname=x86_64-linux-gnu -Dprefix=/usr
-Dprivlib=/usr/share/perl/5.10 -Darchlib=/usr/lib/perl/5.10
-Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5
-Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local
-Dsitelib=/usr/local/share/perl/5.10.1
-Dsitearch=/usr/local/lib/perl/5.10.1 -Dman1dir=/usr/share/man/man1
-Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1
-Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl
-Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Ud_ualarm -Uusesfio
-Uusenm -DDEBUGGING=-g -Doptimize=-O2 -Duseshrplib
-Dlibperl=libperl.so.5.10.1 -Dd_dosuid -des'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN
-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2 -g',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing
-pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.4.3', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib /lib64 /usr/lib64
    libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
    perllibs=-ldl -lm -lpthread -lc -lcrypt
    libc=/lib/libc-2.11.1.so, so=so, useshrplib=true, libperl=libperl.so.5.10.1
    gnulibc_version='2.11.1'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -g -L/usr/local/lib
-fstack-protector'

Locally applied patches:



@INC for perl 5.10.1:
    /etc/perl
    /usr/local/lib/perl/5.10.1
    /usr/local/share/perl/5.10.1
    /usr/lib/perl5
    /usr/share/perl5
    /usr/lib/perl/5.10
    /usr/share/perl/5.10
    /usr/local/lib/site_perl
    .


Environment for perl 5.10.1:
    HOME=/home/vse
    LANG=ru_RU.utf8
    LANGUAGE=en_US:en
    LC_MESSAGES=en_US.UTF-8
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/vse/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/home/vse/.rvm/bin:/home/vse/.rvm/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash

@p5pRT

p5pRT commented Dec 19, 2012

victor@vsespb.ru - Status changed from 'new' to 'open'

@p5pRT

p5pRT commented Dec 19, 2012

From victor@vsespb.ru

This looks like a duplicate of
https://rt-archive.perl.org/perl5/Public/Bug/Display.html?id=75000

However, it is a different issue: it cannot be fixed by UTF-8 decoding
the filenames that File::Find returns. The only fix is to preprocess
(decode) the filenames before they are concatenated with the root dir.
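
A minimal sketch of why decoding has to happen before the concatenation, using plain strings only (no filesystem access). The variable names and sample names are assumptions made for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Root directory as a character string (UTF-8 flag set).
my $dir_chars  = "\x{442}\x{435}\x{441}\x{442}";
# File name as raw UTF-8 bytes, the form readdir() returns on Linux.
my $name_bytes = encode("UTF-8", "\x{442}\x{435}\x{43A}\x{441}\x{442}");

# The byte sequence the filesystem actually holds for the full path.
my $on_disk = encode("UTF-8", $dir_chars) . "/" . $name_bytes;

# Joining characters with bytes treats each byte as a Latin-1 character,
# so the file-name part is double-encoded when written back out as UTF-8.
my $broken = "$dir_chars/$name_bytes";

# Decoding first keeps both parts in the same (character) form.
my $fixed = "$dir_chars/" . decode("UTF-8", $name_bytes);

printf "broken: %d characters, matches on-disk bytes: %s\n",
    length($broken), (encode("UTF-8", $broken) eq $on_disk ? "yes" : "no");
printf "fixed:  %d characters, matches on-disk bytes: %s\n",
    length($fixed), (encode("UTF-8", $fixed) eq $on_disk ? "yes" : "no");
# broken: 15 characters, matches on-disk bytes: no
# fixed:  10 characters, matches on-disk bytes: yes
```

Decoding the joined string afterwards does not help, because the root part of `$broken` is already made of characters rather than bytes.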

On Wed Dec 19 02:56:18 2012, vsespb wrote:

This is a bug report for perl from victor@vsespb.ru,
generated with the help of perlbug 1.39 running under perl 5.10.1.


