open() is not UTF-8-clean #9674
From zefram@fysh.org
Created by zefram@fysh.org
$ perl -lwe '$a="\x{e3}"; utf8::downgrade($a); open(my $x, ">", "x$a"); utf8::upgrade($a); open(my $y, ">", "y$a"); opendir(my $d, "."); while(defined($_ = readdir($d))) { print unpack("H*", $_) unless /\A[ -~]*\z/ }'
Apparently open() is using, for the filename, the octet sequence used
Perl Info
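The premise of the report, that one logical string can be stored as two different internal octet sequences, can be checked without touching the filesystem. The following is a minimal sketch that assumes only core Perl (the utf8:: functions and the bytes module); the variable names are illustrative and do not come from the report.

```perl
use strict;
use warnings;
use bytes ();    # load bytes::length without enabling the bytes pragma

# The same one-character string, stored two ways internally.
my $down = "\x{e3}";
utf8::downgrade($down);    # one octet per character: e3
my $up = "\x{e3}";
utf8::upgrade($up);        # UTF-8 internally: c3 a3

# At the Perl level the two strings compare equal.
my $same = ($down eq $up) ? 1 : 0;

# Their internal storage differs in length.
my $down_len = bytes::length($down);
my $up_len   = bytes::length($up);

print "equal as strings: $same\n";
print "internal octets: downgraded=$down_len upgraded=$up_len\n";
```

If open() hands the internal buffer to the operating system unchanged, as the report suggests, these two equal strings would name two different files.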
From lists@der-pepe.de
On Fri, Mar 06, 2009 at 03:03:14AM -0800, Zefram wrote:
It's kind of documented in perlunicode, section "When Unicode Does Not
Regards,
The RT System itself - Status changed from 'new' to 'open'
From zefram@fysh.org
Christoph Bussenius wrote:
It seems to be documented from the point of view of perl 5.6, where
But in this case, in fact, I *am* supplying you with a byte string,
Not at all. Differing standards about encodings to be used with filenames
-zefram
From tchrist@perl.com
For a real fun time (NOT!), try this on various operating systems
Some things I understand, others are very mysterious, like
Use of uninitialized value $count in printf at mkfiles line 77.
Yet it clearly prints the correct file number. So odd.
Make sure to run this on both a non utf8 aware f/s and also
Remarkable!
--tom
From lists@der-pepe.de
On Thu, Mar 12, 2009 at 03:03:01AM -0600, Tom Christiansen wrote:
I think I can answer this part of your mail: The file number is being printed correctly because of line 60,
emit($name, $i);
However the warning stems from line 73,
emit($name);
which happens after select(FH), which is why the missing number cannot
Regards,
From victor@vsespb.ru
This code
perl -lwe '$a="\x{e3}"; utf8::downgrade($a); open(my $x, ">", "x$a");
actually sends different octets to open().
perl -MDevel::Peek -lwe '$a="\x{e3}"; utf8::downgrade($a); print
SV = PV(0x1b45c68) at 0x1b694e8
SV = PV(0x1b45b58) at 0x1b730a0
On Fri Mar 06 03:03:13 2009, zefram@fysh.org wrote:
gnu/bin:/home/zefram/pub/common/bin:/usr/bin:/usr/X11R6/bin:/bin:/usr/local/bin:/usr/games
From victor@vsespb.ru
I did not check, but it looks to me that this script is creating filenames in different "Normalization" forms (NFC/NFD).
On Thu Mar 12 02:03:51 2009, tom christiansen wrote:
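The script under discussion is not shown in the thread, so the following only illustrates what the NFC/NFD distinction means for filenames. It is a minimal sketch using the core Unicode::Normalize module; the example character and the variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFD);

# U+00E9 (e with acute accent), normalized both ways.
my $char = "\x{e9}";
my $nfc  = NFC($char);    # stays one code point, U+00E9
my $nfd  = NFD($char);    # becomes "e" followed by U+0301

# Encode copies to UTF-8 octets, as a name might be stored on a
# filesystem that keeps filenames as raw UTF-8 bytes.
my ($nfc_octets, $nfd_octets) = ($nfc, $nfd);
utf8::encode($nfc_octets);
utf8::encode($nfd_octets);

my $nfc_hex = unpack("H*", $nfc_octets);
my $nfd_hex = unpack("H*", $nfd_octets);
print "NFC: $nfc_hex\n";    # c3a9
print "NFD: $nfd_hex\n";    # 65cc81
```

The two forms render identically but differ as octets, so a filesystem that compares names byte by byte would treat them as two separate files.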
…ile::Path. (See: #956233, #956723) This provides relief from runtime errors in Lintian, but does not solve the bugs; it merely makes Lintian usable again. The offending packages sphinx and supysonic no longer abort with runtime errors. Due to a bug in Perl, strings must be "downgraded" before system calls such as stat or open. Downgrading is the proper fix [1][2], but it should happen in Perl itself; we simply do so here as triage. [1] Perl/perl5#10550 [2] Perl/perl5#9674 More comprehensive fixes for both bugs are in the works.
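The kind of triage described above can be sketched as follows. This is not Lintian's actual code: downgraded_copy is a hypothetical helper, and the sketch assumes that returning undef is an acceptable way to signal a name that cannot be downgraded.

```perl
use strict;
use warnings;

# Hypothetical helper, not taken from Lintian: return a downgraded copy of
# a filename, or undef if it contains characters above U+00FF.
sub downgraded_copy {
    my ($name) = @_;
    my $copy = $name;
    # The second argument (FAIL_OK) makes utf8::downgrade return false
    # instead of dying when the string cannot be downgraded.
    return utf8::downgrade($copy, 1) ? $copy : undef;
}

my $name = "x\x{e3}";
utf8::upgrade($name);                       # simulate an upgraded string
my $safe = downgraded_copy($name);          # one octet per character
my $wide = downgraded_copy("x\x{263a}");    # undef: cannot be downgraded

print "downgraded ok\n" if defined $safe && !utf8::is_utf8($safe);
print "wide name rejected\n" unless defined $wide;
```

A caller would pass the returned copy to open or stat and decide separately how to handle the undef case, for example by encoding the name explicitly.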
Migrated from rt.perl.org#63674 (status was 'open')
Searchable as RT63674$