Unicode problems with format facility #7756

Closed
p5pRT opened this issue Jan 18, 2005 · 5 comments

p5pRT commented Jan 18, 2005

Migrated from rt.perl.org#33832 (status was 'resolved')

Searchable as RT33832$

p5pRT commented Jan 18, 2005

From vectro@yahoo.com

Created by vectro@yahoo.com

Using perl's format facility in combination with Unicode results in
unexpected output.

Example program:

--- CUT HERE ---
#!/usr/bin/perl -wC

my @chr = (0x44, 0x76, 0x6f, 0x159, 0xe1, 0x6b,
  0x20, 0x3d, 0x20, 0x5fb7, 0x6c83, 0x590f, 0x514b);

my $string = join('', map { chr($_) } @chr);

print "$string\n";

format STDOUT =
^<<<<<<<<<<<<<<<<<<<<
$string
^>>>>>>>>>>>>>>>>>>>> ~
$string
.

write();
--- CUT HERE ---

We expect two (identical) lines of output, but actually get a third line
which duplicates some of the output from the second line. This happens
even if only ISO-8859-1 characters are used; the composer's first name,
Antonín (which contains LATIN SMALL LETTER I WITH ACUTE), also breaks
output.

Presumably the source of this problem is that the format facility
assumes that the number of bytes is the same as the number of
characters, instead of calling the (smarter) length function.
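
To illustrate the byte/character mismatch suspected above, here is a minimal sketch that builds the same string as the example program and compares its character count with the size of its UTF-8 encoding. It only demonstrates the discrepancy itself; it does not show what the format code does internally, and the variable names $chars, $copy and $bytes are introduced here for illustration.

```perl
use strict;
use warnings;

# Same code points as the example program above.
my @chr = (0x44, 0x76, 0x6f, 0x159, 0xe1, 0x6b,
           0x20, 0x3d, 0x20, 0x5fb7, 0x6c83, 0x590f, 0x514b);
my $string = join('', map { chr($_) } @chr);

# length() on a character string counts characters.
my $chars = length($string);

# Encode a copy to UTF-8 in place; length() then counts bytes.
my $copy = $string;
utf8::encode($copy);
my $bytes = length($copy);

printf "characters: %d, UTF-8 bytes: %d\n", $chars, $bytes;
# prints "characters: 13, UTF-8 bytes: 23"
```

Any code that sizes a field using the byte count would treat this 13-character string as if it were 23 columns wide, which is longer than the 21-column fields in the format.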

Perl Info

Flags:
    category=core
    severity=medium

Site configuration information for perl v5.8.4:

Configured by Debian Project at Sun Dec 12 09:48:10 EST 2004.

Summary of my perl5 (revision 5 version 8 subversion 4) configuration:
  Platform:
    osname=linux, osvers=2.4.27-ti1211, archname=i386-linux-thread-multi
    uname='linux kosh 2.4.27-ti1211 #1 sun sep 19 18:17:45 est 2004 i686 gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i386-linux -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8 -Darchlib=/usr/lib/perl/5.8 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.4 -Dsitearch=/usr/local/lib/perl/5.8.4 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.4 -Dd_dosuid -des'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -I/usr/local/include'
    ccversion='', gccversion='3.3.5 (Debian 1:3.3.5-3)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
    perllibs=-ldl -lm -lpthread -lc -lcrypt
    libc=/lib/libc-2.3.2.so, so=so, useshrplib=true, libperl=libperl.so.5.8.4
    gnulibc_version='2.3.2'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:



@INC for perl v5.8.4:
    /etc/perl
    /usr/local/lib/perl/5.8.4
    /usr/local/share/perl/5.8.4
    /usr/lib/perl5
    /usr/share/perl5
    /usr/lib/perl/5.8
    /usr/share/perl/5.8
    /usr/local/lib/site_perl
    .


Environment for perl v5.8.4:
    HOME=/home/vectro
    LANG=en_US.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/usr/games
    PERL_BADLANG (unset)
    SHELL=/bin/bash




		

p5pRT commented Jan 27, 2013

From @khwilliamson

I verified that this behavior is still present in v5.17.9.

--
Karl Williamson

p5pRT commented Jan 27, 2013

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented May 19, 2014

From @khwilliamson

It turns out that this was fixed by the following commit:

commit 9b4bdfd
Author: David Mitchell <davem@iabyn.com>
Date: Thu Nov 7 12:17:26 2013 +0000

  fix chop formats with non PV vars

  [perl #119847], [perl #119849], [perl #119851]

I'll add a test for this in 5.21.
--
Karl Williamson

p5pRT commented May 19, 2014

@khwilliamson - Status changed from 'open' to 'resolved'
