PerlIO layer utf-16-le overwrites byte in filehandle backed by scalar #16453

Open
p5pRT opened this issue Mar 8, 2018 · 3 comments
p5pRT commented Mar 8, 2018

Migrated from rt.perl.org#132949 (status was 'open')

Searchable as RT132949$

p5pRT commented Mar 8, 2018

From @nrdvana

This is a bug report for perl from mike@nrdvana.net,
generated with the help of perlbug 1.40 running under perl 5.26.1.


I've run into a bug, seemingly in every perl version (at least 5.12
through 5.26), where the following sequence:
  1) create a file handle on top of a scalar, for reading
  2) set a wide char unicode layer (16-bit, 32-bit) on the file handle
  3) read from the file handle
causes the byte at the current position to be set to zero.
The scalar should never be modified on a read-only file handle.

Example:
https://gist.github.com/nrdvana/fe01eeda2e325d825ca811267bd349ff

use strict;
use warnings;
use autodie;
sub hexdump { join ' ', map sprintf("%02X", ord $_), split //, shift }
my ($in_fh, $input, $buf1);

print "utf-16-le\n\n";

$input= "\xFF\xFE\x11\x22\x33\x44\x55\x66";
open($in_fh, "<", \$input);
print "after open input=".hexdump($input)."\n";
binmode($in_fh, ":encoding(utf-16-le)");
print "after binmode input=".hexdump($input)."\n";
read($in_fh, $buf1, 1);
print "after read input=".hexdump($input)
  ." buf1= ".hexdump($buf1)."\n";

print "\nutf-16-be\n\n";

$input= "\xFE\xFF\x11\x22\x33\x44\x55\x66";
open($in_fh, "<", \$input);
print "after open input=".hexdump($input)."\n";
binmode($in_fh, ":encoding(utf-16-be)");
print "after binmode input=".hexdump($input)."\n";
read($in_fh, $buf1, 1);
print "after read input=".hexdump($input)
  ." buf1= ".hexdump($buf1)."\n";

print "\nutf-32-le\n\n";

$input= "\xFF\xFE\x00\x00\x33\x44\x00\x00";
open($in_fh, "<", \$input);
print "after open input=".hexdump($input)."\n";
binmode($in_fh, ":encoding(utf-32-le)");
print "after binmode input=".hexdump($input)."\n";
read($in_fh, $buf1, 1);
print "after read input=".hexdump($input)
  ." buf1= ".hexdump($buf1)."\n";

Output --------------------------

utf-16-le

after open input=FF FE 11 22 33 44 55 66
after binmode input=FF FE 11 22 33 44 55 66
after read input=00 FE 11 22 33 44 55 66 buf1= FEFF

utf-16-be

after open input=FE FF 11 22 33 44 55 66
after binmode input=FE FF 11 22 33 44 55 66
after read input=00 FF 11 22 33 44 55 66 buf1= FEFF

utf-32-le

after open input=FF FE 00 00 33 44 00 00
after binmode input=FF FE 00 00 33 44 00 00
after read input=00 FE 00 00 33 44 00 00 buf1= FEFF
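
The reproduction above can be condensed into a small check that reports whether a given perl is affected. This is only a minimal sketch: `read_modifies_scalar` is a hypothetical helper name, not part of the original report, and it compares hex dumps taken before and after the read so the comparison does not rely on a second copy of the scalar.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the report): returns 1 when reading one
# character through the given :encoding layer changes the backing scalar,
# and 0 when the scalar is left intact.
sub read_modifies_scalar {
    my ($encoding, $bytes) = @_;

    # Snapshot the bytes as a hex string before touching the handle.
    my $hex_before = unpack 'H*', $bytes;

    open(my $fh, '<', \$bytes) or die "open failed: $!";
    binmode($fh, ":encoding($encoding)") or die "binmode failed: $!";
    read($fh, my $char, 1);

    # Take the second snapshot while the handle is still open,
    # as the original script does.
    my $hex_after = unpack 'H*', $bytes;
    close($fh);

    return $hex_after ne $hex_before ? 1 : 0;
}

# Same inputs as the three cases in the report.
for my $case (
    [ 'utf-16-le', "\xFF\xFE\x11\x22\x33\x44\x55\x66" ],
    [ 'utf-16-be', "\xFE\xFF\x11\x22\x33\x44\x55\x66" ],
    [ 'utf-32-le', "\xFF\xFE\x00\x00\x33\x44\x00\x00" ],
) {
    my ($encoding, $bytes) = @$case;
    printf "%-10s %s\n", $encoding,
        read_modifies_scalar($encoding, $bytes)
            ? 'scalar modified'
            : 'scalar intact';
}
```

On a perl that shows the behaviour reported here, each case would be expected to print "scalar modified".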



Flags:
  category=core
  severity=medium


Site configuration information for perl 5.26.1:

Configured by builduser at Fri Jan 5 02:49:35 UTC 2018.

Summary of my perl5 (revision 5 version 26 subversion 1) configuration:
  Platform:
  osname=linux
  osvers=4.14.11-1-arch
  archname=x86_64-linux-thread-multi
  uname='linux felix 4.14.11-1-arch #1 smp preempt wed jan 3 07:02:42
utc 2018 x86_64 gnulinux '
  config_args='-des -Dusethreads -Duseshrplib
-Doptimize=-march=x86-64 -mtune=generic -O2 -pipe
-fstack-protector-strong -fno-plt -Dprefix=/usr -Dvendorprefix=/usr
-Dprivlib=/usr/share/perl5/core_perl
-Darchlib=/usr/lib/perl5/5.26/core_perl
-Dsitelib=/usr/share/perl5/site_perl
-Dsitearch=/usr/lib/perl5/5.26/site_perl
-Dvendorlib=/usr/share/perl5/vendor_perl
-Dvendorarch=/usr/lib/perl5/5.26/vendor_perl
-Dscriptdir=/usr/bin/core_perl -Dsitescript=/usr/bin/site_perl
-Dvendorscript=/usr/bin/vendor_perl -Dinc_version_list=none
-Dman1ext=1perl -Dman3ext=3perl -Dcccdlflags='-fPIC' -Dlddlflags=-shared
-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now
-Dldflags=-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now'
  hint=recommended
  useposix=true
  d_sigaction=define
  useithreads=define
  usemultiplicity=define
  use64bitint=define
  use64bitall=define
  uselongdouble=undef
  usemymalloc=n
  default_inc_excludes_dot=define
  bincompat5005=undef
  Compiler:
  cc='cc'
  ccflags ='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing
-pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2'
  optimize='-march=x86-64 -mtune=generic -O2 -pipe
-fstack-protector-strong -fno-plt'
  cppflags='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing
-pipe -fstack-protector-strong -I/usr/local/include'
  ccversion=''
  gccversion='7.2.1 20171224'
  gccosandvers=''
  intsize=4
  longsize=8
  ptrsize=8
  doublesize=8
  byteorder=12345678
  doublekind=3
  d_longlong=define
  longlongsize=8
  d_longdbl=define
  longdblsize=16
  longdblkind=3
  ivtype='long'
  ivsize=8
  nvtype='double'
  nvsize=8
  Off_t='off_t'
  lseeksize=8
  alignbytes=8
  prototype=define
  Linker and Libraries:
  ld='cc'
  ldflags ='-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now
-fstack-protector-strong -L/usr/local/lib'
  libpth=/usr/local/lib
/usr/lib/gcc/x86_64-pc-linux-gnu/7.2.1/include-fixed /usr/lib
/lib/../lib /usr/lib/../lib /lib /lib64 /usr/lib64
  libs=-lpthread -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc
-lgdbm_compat
  perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
  libc=libc-2.26.so
  so=so
  useshrplib=true
  libperl=libperl.so
  gnulibc_version='2.26'
  Dynamic Linking:
  dlsrc=dl_dlopen.xs
  dlext=so
  d_dlsymun=undef
  ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.26/core_perl/CORE'
  cccdlflags='-fPIC'
  lddlflags='-shared
-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -L/usr/local/lib
-fstack-protector-strong'


@INC for perl 5.26.1:
  /usr/lib/perl5/5.26/site_perl
  /usr/share/perl5/site_perl
  /usr/lib/perl5/5.26/vendor_perl
  /usr/share/perl5/vendor_perl
  /usr/lib/perl5/5.26/core_perl
  /usr/share/perl5/core_perl

(pardon for trimming the environment, but it had things I didn't want to
publish)

p5pRT commented Mar 12, 2018

From @tonycoz

On Wed, 07 Mar 2018 23:16:28 -0800, mike@nrdvana.net wrote:

> I've run into a bug seemingly in every perl version (at least 5.12
> through 5.26) where
> 1) create a file handle on top of a scalar, for reading
> 2) set a wide char unicode layer (16-bit, 32-bit) on the file handle
> 3) read from the file handle
> causes the byte at the current position to be set to zero.
> The scalar should never be modified on a read-only file handle.
>
> [example script and output snipped; identical to the original report]

This is a duplicate of #132833, which was fixed in fed9fe5.

I don't see this commit in the 5.26 votes file.

Tony
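
For perls that lack the fix, one possible way to sidestep the problem is to skip the `:encoding` layer and decode the in-memory bytes directly with the core Encode module. The sketch below assumes the whole input is already held in a scalar; the variable names are illustrative and do not come from the thread.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $input      = "\xFF\xFE\x11\x22\x33\x44\x55\x66";
my $hex_before = unpack 'H*', $input;   # snapshot for later comparison

# Decode the bytes directly instead of reading them through a PerlIO layer.
# No CHECK argument is passed, so decode() uses its default behaviour.
my $text = decode('UTF-16LE', $input);

# The explicit-endian encodings do not strip a byte order mark, so it
# shows up as the first decoded character, as it does in the report.
my $first = substr($text, 0, 1);
printf "first char=U+%04X input=%s\n", ord($first), unpack('H*', $input);
```

This trades streaming reads for a single in-memory decode, so it only fits cases where the data is small enough to hold as a string.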

p5pRT commented Mar 12, 2018

The RT System itself - Status changed from 'new' to 'open'
