Odd behavior when string filehandles and scalar assignment collide #10812

Closed
p5pRT opened this issue Nov 9, 2010 · 6 comments

Comments

p5pRT commented Nov 9, 2010

Migrated from rt.perl.org#78980 (status was 'resolved')

Searchable as RT78980$

p5pRT commented Nov 9, 2010

From @briandfoy

Created by @briandfoy

I don't know if this is a bug, but it is certainly an odd situation
that wasted a day of my time debugging a filehandle to a string
problem.

In a test file, I had opened a filehandle to a string and written some
text to it. I then changed the scalar's value to clear it out (I
thought), then reused the string filehandle thinking I'd get only the
latest output in the scalar. That's not what happens, and I understand,
I think, why that's not what happens.

Here's a short example script. Although I've written this for 5.010
and later, Perl 5.8 has the same problem:

  #!perl
  use 5.010;
  use strict;
  use warnings;

  open my $string_fh, '>', \my $string;
  print $string_fh "Buster likes liver treats";
  show_string( $string );

  $string = '';
  show_string( $string );

  print $string_fh "Mimi";
  show_string( $string );

  sub show_string {
      state $n = 0;

      printf "%d: string is [%s] length [%d]\n\thex [%s]\n",
          $n++, $_[0], length $_[0],
          join ":", map { sprintf '%02X', ord } split //, $_[0]
          ;
  }

Here's the output. (0) and (1) look fine, but it looks like perl is
reusing the memory for the scalar, which has left over data from a
previous state. It took me a long time to realize that there was a
null byte at the beginning of the string for (2).

  0: string is [Buster likes liver treats] length [25]
      hex [42:75:73:74:65:72:20:6C:69:6B:65:73:20:6C:69:76:65:72:20:74:72:65:61:74:73]
  1: string is [] length [0]
      hex []
  2: string is [uster likes liver treatsMimi] length [29]
      hex [00:75:73:74:65:72:20:6C:69:6B:65:73:20:6C:69:76:65:72:20:74:72:65:61:74:73:4D:69:6D:69]

Turning off buffering did not change anything. Curiously, truncate(),
the filehandley way to do this, doesn't work on a filehandle to a
string and sets $! to 'Bad file descriptor'.

seek()ing in the string works to overwrite some of the
data:

  #!perl
  use 5.010;
  use strict;
  use warnings;

  use Fcntl qw(:seek);

  open my $string_fh, '>', \my $string;
  print $string_fh "Buster likes liver treats";
  show_string( $string );

  seek $string_fh, 5, SEEK_SET;
  $string = '';
  show_string( $string );

  print $string_fh "Mimi";
  show_string( $string );

  sub show_string {
      state $n = 0;

      printf "%d: string is [%s] length [%d]\n\thex [%s]\n",
          $n++, $_[0], length $_[0],
          join ":", map { sprintf '%02X', ord } split //, $_[0]
          ;
  }

This gives similar output. There's a null byte before the 'uste' in (2):

  0: string is [Buster likes liver treats] length [25]
      hex [42:75:73:74:65:72:20:6C:69:6B:65:73:20:6C:69:76:65:72:20:74:72:65:61:74:73]
  1: string is [] length [0]
      hex []
  2: string is [usteMimi] length [9]
      hex [00:75:73:74:65:4D:69:6D:69]

In this case, I expected five null bytes then 'Mimi', like I would get
in the string if I seek()ed to 5 then print()ed to the string:

  #!perl
  use 5.010;
  use strict;
  use warnings;

  use Fcntl qw(:seek);

  open my $string_fh, '>', \my $string;
  show_string( $string );

  seek $string_fh, 5, SEEK_SET;

  print $string_fh "Mimi";
  show_string( $string );

  sub show_string {
      state $n = 0;

      printf "%d: string is [%s] length [%d]\n\thex [%s]\n",
          $n++, $_[0], length $_[0],
          join ":", map { sprintf '%02X', ord } split //, $_[0]
          ;
  }

This gives reasonable output in (1), once I realize what is happening:

  0: string is [] length [0]
      hex []
  1: string is [Mimi] length [9]
      hex [00:00:00:00:00:4D:69:6D:69]

The fix, of course, is to not affect the string through filehandle
and normal string operations at the same time. However, I can imagine
situations where you don't know that the scalar you have is actually
hooked up to a filehandle. If perl knows this, perhaps it can at least
issue a warning.
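
One way to follow that advice is sketched below: rather than assigning to the scalar while the handle is open, close the handle and reopen it with '>', which empties the scalar and starts the handle at offset 0 again. This is a minimal sketch and not part of the original report; the variable names are only illustrative.

```perl
use strict;
use warnings;

my $string = '';

# First round of output goes into the scalar through the handle.
open my $string_fh, '>', \$string or die "Cannot open in-memory handle: $!";
print {$string_fh} "Buster likes liver treats";
close $string_fh;

# Reopening with '>' truncates the scalar and resets the handle's position,
# so the scalar and the handle never disagree about where the data ends.
open $string_fh, '>', \$string or die "Cannot reopen in-memory handle: $!";
print {$string_fh} "Mimi";
close $string_fh;

print "string is [$string]\n";
```

With this approach the scalar is only ever changed through the handle, so there is no stale position to trip over.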

Perl Info

Flags:
    category=core
    severity=low

Site configuration information for perl 5.13.4:

Configured by brian at Sun Aug 29 22:13:45 EDT 2010.

Summary of my perl5 (revision 5 version 13 subversion 4) configuration:

  Platform:
    osname=darwin, osvers=9.8.0, archname=darwin-2level
    uname='darwin mimibean.local 9.8.0 darwin kernel version 9.8.0:
wed jul 15 16:55:01 pdt 2009; root:xnu-1228.15.4~1release_i386 i386
i386 '
    config_args='-des -Dprefix=/usr/local/perls/perl-5.13.4 -Dusedevel'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-common -DPERL_DARWIN -no-cpp-precomp
-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include',
    optimize='-O3',
    cppflags='-no-cpp-precomp -fno-common -DPERL_DARWIN
-no-cpp-precomp -fno-strict-aliasing -pipe -fstack-protector
-I/usr/local/include'
    ccversion='', gccversion='4.0.1 (Apple Inc. build 5490)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags ='
-fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib
    libs=-ldbm -ldl -lm -lutil -lc
    perllibs=-ldl -lm -lutil -lc
    libc=/usr/lib/libc.dylib, so=dylib, useshrplib=false, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags=' -bundle -undefined dynamic_lookup
-L/usr/local/lib -fstack-protector'

Locally applied patches:



@INC for perl 5.13.4:
    /usr/local/perls/perl-5.13.4/lib/site_perl/5.13.4/darwin-2level
    /usr/local/perls/perl-5.13.4/lib/site_perl/5.13.4
    /usr/local/perls/perl-5.13.4/lib/5.13.4/darwin-2level
    /usr/local/perls/perl-5.13.4/lib/5.13.4
    .


Environment for perl 5.13.4:
    DYLD_LIBRARY_PATH (unset)
    HOME=/Users/brian
    LANG=en_US
    LANGUAGE (unset)
    LC_ALL=C
    LC_COLLATE=en_US.UTF-8
    LC_CTYPE=en_US.UTF-8
    LC_MESSAGES=en_US.UTF-8
    LC_MONETARY=en_US.UTF-8
    LC_NUMERIC=en_US.UTF-8
    LC_TIME=en_US.UTF-8
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/Users/brian/bin:/usr/local/bin:/opt/local/bin:/Users/brian/TPR/scripts:/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/mysql/bin:/usr/X11R6/bin:/usr/local/teTeX/bin/powerpc-apple-darwin-current:/usr/local/pgsql/bin:/usr/local/gcj/bin:/Library/Frameworks/Python.framework/Versions/Current/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash

p5pRT commented Jan 7, 2012

From @jkeenan

On Tue Nov 09 15:14:01 2010, comdog wrote:

Cc: brian.d.foy@gmail.com
Subject: Odd behavior when string filehandles and string operations collide
Message-Id: <5.13.4_1213_1289341116@mimibean.local>
Reply-To: brian.d.foy@gmail.com
To: perlbug@perl.org

This is a bug report for perl from brian.d.foy@​gmail.com,
generated with the help of perlbug 1.39 running under perl 5.13.4.

I don't know if this is a bug, but it is certainly an odd situation
that wasted a day of my time debugging a filehandle to a string
problem.

In a test file, I had opened a filehandle to a string and written some
text to it. I then changed the scalar's value to clear it out (I
thought), then reused the string filehandle thinking I'd get only the
latest output in the scalar. That's not what happens, and I
understand,
I think, why that's not what happens.

Here's a short example script. Although I've written this for 5.010
and later, Perl 5.8 has the same problem:

  #!perl
  use 5.010;
  use strict;
  use warnings;

  open my $string_fh, '>', \my $string;
  print $string_fh "Buster likes liver treats";
  show_string( $string );

  $string = '';
  show_string( $string );

  print $string_fh "Mimi";
  show_string( $string );

  sub show_string {
      state $n = 0;

      printf "%d: string is [%s] length [%d]\n\thex [%s]\n",
          $n++, $_[0], length $_[0],
          join ":", map { sprintf '%02X', ord } split //, $_[0]
          ;
  }

Here's the output. (0) and (1) look fine, but it looks like perl is
reusing the memory for the scalar, which has left over data from a
previous state. It took me a long time to realize that there was a
null byte at the beginning of the string for (2).

  0: string is [Buster likes liver treats] length [25]
      hex [42:75:73:74:65:72:20:6C:69:6B:65:73:20:6C:69:76:65:72:20:74:72:65:61:74:73]
  1: string is [] length [0]
      hex []
  2: string is [uster likes liver treatsMimi] length [29]
      hex [00:75:73:74:65:72:20:6C:69:6B:65:73:20:6C:69:76:65:72:20:74:72:65:61:74:73:4D:69:6D:69]

[snip]

Confirmed. Do you know why that null byte is overwriting the 'B'?

p5pRT commented Jan 7, 2012

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented Jan 7, 2012

From @Leont

On Sat, Jan 7, 2012 at 3:57 AM, James E Keenan via RT
<perlbug-followup@perl.org> wrote:

Confirmed.  Do you know why that null byte is overwriting the 'B'?

When $string is overwritten, the nullbyte is written but the buffer is
kept at its existing size. :scalar ignores the new length and appends
to the end, setting the length to what it thinks it should be.

Try undefining $scalar or make it an array if you want some really
weird results. The bottom line is that :scalar makes the assumption
that no one else will change $string, and that is not a reasonable
assumption. I wouldn't be surprised if there was a way to make this
segfault.

Leon
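
The point about the handle keeping its own idea of the position can be sketched as follows. This is a hedged illustration rather than anything from the thread: it assumes that tell() and seek() work on in-memory handles as in the report's own examples, and the variable names are made up. It records the handle's position after the scalar has been emptied, then rewinds the handle explicitly before writing again.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);

open my $fh, '>', \my $string or die "Cannot open in-memory handle: $!";
print {$fh} "Buster likes liver treats";

# Assigning to the scalar changes the scalar only; the handle is not told.
$string = '';
my $pos_after_assign = tell $fh;    # the handle's own offset, not length($string)

# Rewind the handle so its position matches the now-empty scalar.
seek $fh, 0, SEEK_SET or die "Cannot seek: $!";
print {$fh} "Mimi";
close $fh;

printf "handle position after assignment: %d\n", $pos_after_assign;
printf "string is [%s] length [%d]\n", $string, length $string;
```

Keeping the handle's offset and the scalar's length in step by hand like this is fragile, which is why avoiding direct assignment to the scalar remains the safer habit.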

p5pRT commented Jan 20, 2012

From @cpansprout

On Sat Jan 07 05:30:12 2012, LeonT wrote:

On Sat, Jan 7, 2012 at 3:57 AM, James E Keenan via RT
<perlbug-followup@perl.org> wrote:

Confirmed. Do you know why that null byte is overwriting the 'B'?

When $string is overwritten, the nullbyte is written but the buffer is
kept at its existing size. :scalar ignores the new length and appends
to the end, setting the length to what it thinks it should be.

Try undefining $scalar or make it an array if you want some really
weird results. The bottom line is that :scalar makes the assumption
that no one else will change $string, and that is not a reasonable
assumption. I wouldn't be surprised if there was a way to make this
segfault.

Setting the scalar to a number caused a segfault before I fixed it in
commit c5a04db.

The bug in this ticket was fixed as a side-effect of fixing #92706 in
commit b659727.

The funny thing is that your comment came along just two days after
those fixes.

--

Father Chrysostomos

p5pRT commented Jan 20, 2012

@cpansprout - Status changed from 'open' to 'resolved'
