Odd behavior of regex match against in-memory file #11874

Closed
p5pRT opened this issue Jan 17, 2012 · 21 comments

Comments


p5pRT commented Jan 17, 2012

Migrated from rt.perl.org#108398 (status was 'resolved')

Searchable as RT108398$


p5pRT commented Jan 17, 2012

From jwdevel@gmail.com

This is a bug report for perl from jwdevel@​gmail.com,
generated with the help of perlbug 1.39 running under perl 5.14.1.


When I run the following code, I get 'no' printed out, which is not
what I expect.

  my $memory_file;
  my $fh;
  open ($fh, '>', \$memory_file);
  print $fh "abc";
  if( $memory_file =~ m/^.*$/ )
  { print "yes\n" }
  else
  { print "no\n" }

This issue is discussed in more detail online​:

  http://stackoverflow.com/questions/8649916/perl-writing-to-a-memory-file-plays-tricks-with-pattern-matching/8650519
  http://perlmonks.org/?node_id=945256

Some builds of perl 5.14.1 do not have this issue, others do (see
discussion at PerlMonks link)



Flags​:
  category=core
  severity=low


Site configuration information for perl 5.14.1​:

Configured by jrw at Wed Sep 14 07​:30​:00 PDT 2011.

Summary of my perl5 (revision 5 version 14 subversion 1) configuration​:

  Platform​:
  osname=freebsd, osvers=7.2-release, archname=amd64-freebsd-thread-multi
  uname='freebsd lumpy.fake_fake.net 7.2-release freebsd 7.2-release
#0: fri may 1 07:18:07 utc 2009
root@driscoll.cse.buffalo.edu:usrobjusrsrcsysgeneric amd64 '
  config_args='-sde -Dprefix=/usr/local
-Darchlib=/usr/local/lib/perl5/5.14.1/mach -Dprivlib=/usr/local/lib/perl5/5.14.1 -Dman3dir=/usr/local/lib/perl5/5.14.1/perl/man/man3
-Dman1dir=/usr/local/man/man1 -Dsitearch=/usr/local/lib/perl5/site_perl/5.14.1/mach -Dsitelib=/usr/local/lib/perl5/site_perl/5.14.1
-Dscriptdir=/usr/local/bin -Dsiteman3dir=/usr/local/lib/perl5/5.14.1/man/man3
-Dsiteman1dir=/usr/local/man/man1 -Ui_malloc -Ui_iconv
-Uinstallusrbinperl -Dcc=cc -Duseshrplib
-Dinc_version_list=none -Dccflags=-DAPPLLIB_EXP="/usr/local/lib/perl5/5.14.1/BSDPAN"
-Doptimize=-O2 -fno-strict-aliasing -pipe -Ui_gdbm -Dusethreads=y
-Dusemymalloc=n -Duse64bitint'
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=define, usemultiplicity=define
  useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
  use64bitint=define, use64bitall=define, uselongdouble=undef
  usemymalloc=n, bincompat5005=undef
  Compiler​:
  cc='cc', ccflags ='-DAPPLLIB_EXP="/usr/local/lib/perl5/5.14.1/BSDPAN"
-DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -fno-strict-aliasing -pipe
-fstack-protector -I/usr/local/include',
  optimize='-O2 -fno-strict-aliasing -pipe',
  cppflags='-DAPPLLIB_EXP="/usr/local/lib/perl5/5.14.1/BSDPAN"
-DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -fno-strict-aliasing -pipe
-fstack-protector -I/usr/local/include'
  ccversion='', gccversion='4.2.1 20070719 [FreeBSD]', gccosandvers=''
  intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
  ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
  alignbytes=8, prototype=define
  Linker and Libraries​:
  ld='cc', ldflags ='-pthread -Wl,-E -fstack-protector -L/usr/local/lib'
  libpth=/usr/lib /usr/local/lib
  libs=-lgdbm -lm -lcrypt -lutil
  perllibs=-lm -lcrypt -lutil
  libc=, so=so, useshrplib=true, libperl=libperl.so
  gnulibc_version=''
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef,
ccdlflags='-Wl,-R/usr/local/lib/perl5/5.14.1/mach/CORE'
  cccdlflags='-DPIC -fPIC', lddlflags='-shared -L/usr/local/lib
-fstack-protector'

Locally applied patches​:


@​INC for perl 5.14.1​:
  /usr/local/lib/perl5/5.14.1/BSDPAN
  /usr/local/lib/perl5/site_perl/5.14.1/mach
  /usr/local/lib/perl5/site_perl/5.14.1
  /usr/local/lib/perl5/5.14.1/mach
  /usr/local/lib/perl5/5.14.1
  .


Environment for perl 5.14.1​:
  HOME=/home/jrw
  LANG=en_US.UTF-8
  LANGUAGE (unset)
  LD_LIBRARY_PATH=/usr/local/diablo-jdk1.6.0/jre/lib/amd64/server:/usr/local/diablo-jdk1.6.0/jre/lib/amd64:/usr/local/diablo-jdk1.6.0/jre/../lib/amd64
  LOGDIR (unset)
  PATH=/home/jrw/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin
  PERL_BADLANG (unset)
  SHELL=/usr/local/bin/bash


p5pRT commented Jan 17, 2012

From @jkeenan

On Mon Jan 16 17​:09​:30 2012, jwdevel@​gmail.com wrote​:

This is a bug report for perl from jwdevel@​gmail.com,
generated with the help of perlbug 1.39 running under perl 5.14.1.

-----------------------------------------------------------------
When I run the following code, I get 'no' printed out, which is not
what I expect.

    my $memory_file;
    my $fh;
    open ($fh, '>', \$memory_file);
    print $fh "abc";
    if( $memory_file =~ m/^.*$/ )
    { print "yes\n" }
    else
    { print "no\n" }

This issue is discussed in more detail online:

    http://stackoverflow.com/questions/8649916/perl-writing-to-a-memory-file-plays-tricks-with-pattern-matching/8650519
    http://perlmonks.org/?node_id=945256

Some builds of perl 5.14.1 do not have this issue, others do (see
discussion at PerlMonks link)

My results​:

Darwin/PPC, Perl 5.14.2​: yes
Linux/i386, Perl 5.14.2​: no

Since Darwin/PPC is bigendian (per my Parrot %PConfig) while Linux/i386
is not, could this be an endian-ness issue?

Thank you very much.
Jim Keenan


p5pRT commented Jan 17, 2012

The RT System itself - Status changed from 'new' to 'open'


p5pRT commented Jan 17, 2012

From @cpansprout

On Mon Jan 16 18​:16​:49 2012, jkeenan wrote​:

On Mon Jan 16 17​:09​:30 2012, jwdevel@​gmail.com wrote​:

This is a bug report for perl from jwdevel@​gmail.com,
generated with the help of perlbug 1.39 running under perl 5.14.1.

-----------------------------------------------------------------
When I run the following code, I get 'no' printed out, which is not
what I expect.

    my $memory_file;
    my $fh;
    open ($fh, '>', \$memory_file);
    print $fh "abc";
    if( $memory_file =~ m/^.*$/ )
    { print "yes\n" }
    else
    { print "no\n" }

This issue is discussed in more detail online:

    http://stackoverflow.com/questions/8649916/perl-writing-to-a-memory-file-plays-tricks-with-pattern-matching/8650519
    http://perlmonks.org/?node_id=945256

Some builds of perl 5.14.1 do not have this issue, others do (see
discussion at PerlMonks link)

My results​:

Darwin/PPC, Perl 5.14.2​: yes
Linux/i386, Perl 5.14.2​: no

Since Darwin/PPC is bigendian (per my Parrot %PConfig) while Linux/i386
is not, could this be an endian-ness issue?

I get yes on little-endian darwin.

--

Father Chrysostomos


p5pRT commented Jan 17, 2012

From @Tux

On Mon, 16 Jan 2012 20​:25​:10 -0800, "Father Chrysostomos via RT"
<perlbug-followup@​perl.org> wrote​:

My results​:

Darwin/PPC, Perl 5.14.2​: yes
Linux/i386, Perl 5.14.2​: no

Since Darwin/PPC is bigendian (per my Parrot %PConfig) while Linux/i386
is not, could this be an endian-ness issue?

I get yes on little-endian darwin.

It is not *just* the endianness​:

--8<---
use strict;
use warnings;
use Config;
my $memory_file;
open my $fh, ">", \$memory_file;
print $fh "abc";
print "perl-$Config{version} on $Config{osname} $Config{osvers} $Config{archname} ($Config{cc}) : ",
  $memory_file =~ m/^.*$/ ? "yes\n" : "no\n";
-->8---

perl-5.12.2 on hpux 11.31 IA64.ARCHREV_0-LP64-ld (cc) : no
perl-5.14.1 on hpux 11.31 IA64.ARCHREV_0 (gcc) : no
perl-5.14.1 on hpux 11.31 IA64.ARCHREV_0-LP64-ld (gcc) : no
perl-5.10.1 on hpux 11.23 IA64.ARCHREV_0-LP64 (cc) : no
perl-5.14.2 on hpux 11.23 IA64.ARCHREV_0-LP64-ld (cc) : no
perl-5.14.1 on hpux 11.23 IA64.ARCHREV_0 (gcc) : no
perl-5.14.1 on hpux 11.23 IA64.ARCHREV_0-LP64-ld (gcc) : no
perl-5.10.1 on hpux 11.11 PA-RISC2.0-LP64 (cc) : no
perl-5.14.1 on hpux 11.11 PA-RISC2.0-LP64 (gcc) : no
perl-5.14.1 on hpux 11.11 PA-RISC2.0-LP64 (gcc64) : no
perl-5.10.1 on hpux 11.00 PA-RISC2.0 (cc) : yes
perl-5.14.1 on hpux 11.00 PA-RISC2.0 (gcc) : yes
perl-5.14.1 on hpux 11.00 PA-RISC2.0-LP64 (gcc64) : no
perl-5.8.8 on hpux 10.20 PA-RISC2.0 (cc) : yes
perl-5.14.1 on hpux 10.20 PA-RISC2.0 (gcc) : yes
perl-5.10.1 on aix 5.3.0.0 aix-64all (xlc -q64) : yes
perl-5.14.1 on linux 2.6.37.6-0.5 x86_64-linux-ld (cc) : yes
perl-5.14.1 on linux 2.6.37.6-31 i686-linux-64int-ld (cc) : no
perl-5.12.3 on linux 2.6.36 i586-linux-thread-multi (cc) : no

And all perls on one box. Note the 5.13.1 result

perl-5.6.0 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : yes
perl-5.6.1 on linux 2.6.31.12-0.2-default i686-linux-64int-perlio (cc) : yes
perl-5.6.1 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld-perlio (cc) : yes
perl-5.6.2 on linux 2.6.31.12-0.2-default i686-linux-64int-perlio (cc) : yes
perl-5.6.2 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld-perlio (cc) : yes
perl-5.8.0 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.8.0 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.8.1 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.8.1 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.8.2 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.8.2 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.8.3 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.8.3 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.8.4 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.8.4 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.8.5 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.8.5 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.8.6 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.8.6 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.8.7 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.8.7 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.8.8 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.8.8 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.8.9 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.8.9 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.10.0 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.10.0 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.10.1 on linux 2.6.37.6-31-desktop i686-linux-64int (cc) : no
perl-5.10.1 on linux 2.6.37.6-31-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.11.0 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.11.0 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.11.1 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.11.1 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.11.2 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.11.2 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.11.3 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.11.3 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.11.4 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.11.4 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.11.5 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.11.5 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.12.0 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.12.0 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.12.1 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.12.1 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.12.2 on linux 2.6.34.4-0.1-desktop i686-linux-64int (cc) : no
perl-5.12.2 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.12.3 on linux 2.6.34.7-0.7-desktop i686-linux-64int (cc) : no
perl-5.12.3 on linux 2.6.34.7-0.7-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.12.3 on linux 2.6.36 i586-linux-thread-multi (cc) : no
perl-5.12.4 on linux 2.6.37.6-32-desktop i686-linux-64int (cc) : no
perl-5.12.4 on linux 2.6.37.6-32-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.13.0 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.13.0 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.13.1 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.13.1 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : yes
perl-5.13.2 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.13.2 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.13.3 on linux 2.6.31.12-0.2-default i686-linux-64int (cc) : no
perl-5.13.3 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.13.4 on linux 2.6.31.12-0.2-desktop i686-linux-64int (cc) : no
perl-5.13.4 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.13.5 on linux 2.6.34.7-0.2-desktop i686-linux-64int (cc) : no
perl-5.13.5 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.13.6 on linux 2.6.34.7-0.4-desktop i686-linux-64int (cc) : no
perl-5.13.6 on linux 2.6.34.7-0.4-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.13.7 on linux 2.6.34.7-0.5-desktop i686-linux-64int (cc) : no
perl-5.13.7 on linux 2.6.34.7-0.5-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.13.8 on linux 2.6.34.7-0.5-desktop i686-linux-64int (cc) : no
perl-5.13.8 on linux 2.6.34.7-0.5-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.13.9 on linux 2.6.34.7-0.7-desktop i686-linux-64int (cc) : no
perl-5.13.9 on linux 2.6.34.7-0.7-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.13.10 on linux 2.6.34.7-0.7-desktop i686-linux-64int (cc) : no
perl-5.13.10 on linux 2.6.34.7-0.7-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.13.11 on linux 2.6.34.7-0.7-desktop i686-linux-64int (cc) : no
perl-5.13.11 on linux 2.6.34.7-0.7-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.14.0 on linux 2.6.37.6-0.5-desktop i686-linux-64int (cc) : no
perl-5.14.0 on linux 2.6.37.6-0.5-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.14.1 on linux 2.6.37.6-31-desktop i686-linux-64int (cc) : no
perl-5.14.1 on linux 2.6.37.6-31-desktop i686-linux-64int-ld (cc) : no
perl-5.14.1 on linux 2.6.37.6-31-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.14.2 on linux 2.6.37.6-0.7-desktop i686-linux-64int (cc) : no
perl-5.14.2 on linux 2.6.37.6-0.7-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.15.0 on linux 2.6.37.6-33-desktop i686-linux-64int (cc) : no
perl-5.15.0 on linux 2.6.37.6-33-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.15.1 on linux 2.6.37.6-42-desktop i686-linux-64int (cc) : no
perl-5.15.1 on linux 2.6.37.6-42-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.15.2 on linux 2.6.37.6-0.7-desktop i686-linux-64int (cc) : no
perl-5.15.2 on linux 2.6.37.6-0.7-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.15.3 on linux 2.6.37.6-0.7-desktop i686-linux-64int (cc) : no
perl-5.15.3 on linux 2.6.37.6-0.7-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.15.4 on linux 2.6.37.6-0.7-desktop i686-linux-64int (cc) : no
perl-5.15.4 on linux 2.6.37.6-0.7-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.15.5 on linux 2.6.37.6-0.9-desktop i686-linux-64int (cc) : no
perl-5.15.5 on linux 2.6.37.6-0.9-desktop i686-linux-thread-multi-64int-ld (cc) : no
perl-5.15.6 on linux 2.6.37.6-0.9-desktop i686-linux-64int (cc) : no
perl-5.15.6 on linux 2.6.37.6-0.9-desktop i686-linux-thread-multi-64int-ld (cc) : no

--
H.Merijn Brand http​://tux.nl Perl Monger http​://amsterdam.pm.org/
using 5.00307 through 5.14 and porting perl5.15.x on HP-UX 10.20, 11.00,
11.11, 11.23 and 11.31, OpenSuSE 10.1, 11.0 .. 11.4 and AIX 5.2 and 5.3.
http​://mirrors.develooper.com/hpux/ http​://www.test-smoke.org/
http​://qa.perl.org http​://www.goldmark.org/jeff/stupid-disclaimers/


p5pRT commented Jan 17, 2012

From PeterCMartini@GMail.com

On Mon, Jan 16, 2012 at 9​:16 PM, James E Keenan via RT
<perlbug-followup@​perl.org> wrote​:

On Mon Jan 16 17​:09​:30 2012, jwdevel@​gmail.com wrote​:

This is a bug report for perl from jwdevel@​gmail.com,
generated with the help of perlbug 1.39 running under perl 5.14.1.

-----------------------------------------------------------------
When I run the following code, I get 'no' printed out, which is not
what I expect.

        my $memory_file;
        my $fh;
        open ($fh, '>', \$memory_file);
        print $fh "abc";
        if( $memory_file =~ m/^.*$/ )
        { print "yes\n" }
        else
        { print "no\n" }

This issue is discussed in more detail online​:

        http://stackoverflow.com/questions/8649916/perl-writing-to-a-memory-file-plays-tricks-with-pattern-matching/8650519
        http://perlmonks.org/?node_id=945256

Some builds of perl 5.14.1 do not have this issue, others do (see
discussion at PerlMonks link)

My results​:

Darwin/PPC, Perl 5.14.2​: yes
Linux/i386, Perl 5.14.2​: no

Since Darwin/PPC is bigendian (per my Parrot %PConfig) while Linux/i386
is not, could this be an endian-ness issue?

Curious​: while "abc" gets me a "no", "abcd" gets me a yes!
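
One way to probe that length dependence is to repeat the report's test for a range of write sizes. The following is a minimal sketch, not code from this thread: the helper name matches_after_write and the 1..8 range are illustrative choices. On an affected build the answers may differ from one length to the next; a build without the problem should answer yes for every length.

```perl
use strict;
use warnings;

# Hypothetical helper: write $len bytes to a fresh in-memory file, then
# apply the pattern from the original report to the backing scalar while
# the handle is still open, as the report does.
sub matches_after_write {
    my ($len) = @_;
    my $memory_file;
    open my $fh, '>', \$memory_file or die "open failed: $!";
    print {$fh} 'a' x $len;
    return $memory_file =~ m/^.*$/ ? 'yes' : 'no';
}

# Collect one answer per length and print them in order.
my %result = map { $_ => matches_after_write($_) } 1 .. 8;
printf "%d byte(s): %s\n", $_, $result{$_} for sort { $a <=> $b } keys %result;
```

Comparing the printed table across builds shows whether the outcome tracks the string length on a given machine.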


p5pRT commented Jan 17, 2012

From Eirik-Berg.Hanssen@allverden.no

A clue​: The regex engine seems to be advancing pos way beyond the end of
the string, reading garbage​:

sidhekin@​bluebird[10​:08​:59]~$ perl -w

my $memory_file;
my $fh;
open ($fh, '>', \$memory_file);
print $fh "abc";

if ( $memory_file =~ m#^(?:.(?{ print pos, " $memory_file $`<<$&>>$'--$/"}))*$# ) {
  print "yes\n";
}
else {
  print "no\n";
}

__END__
1 abc <<a>>bc--
2 abc <<ab>>c--
3 abc <<abc>>--
Use of uninitialized value $' in concatenation (.) or string at (re_eval 1)
line 1.
4 abc <<ab>>--
Use of uninitialized value $' in concatenation (.) or string at (re_eval 1)
line 1.
5 abc <<ab�>--
Use of uninitialized value $' in concatenation (.) or string at (re_eval 1)
line 1.
6 abc <<ab�>--
Use of uninitialized value $' in concatenation (.) or string at (re_eval 1)
line 1.
7 abc <<ab�>--
yes
sidhekin@​bluebird[10​:12​:12]~$

Eirik


p5pRT commented Jan 17, 2012

From @iabyn

On Tue, Jan 17, 2012 at 10​:14​:19AM +0100, Eirik Berg Hanssen wrote​:

A clue​: The regex engine seems to be advancing pos way beyond the end of
the string, reading garbage​:

Looks like the regex engine isn't coping with a string that isn't
null-terminated​:

my $memory_file;
my $fh;
open ($fh, '>', \$memory_file);
print $fh "abc";

use Devel::Peek;
Dump "abc";
Dump $memory_file;

__END__
SV = PV(0x1477e48) at 0x1499dd0
  REFCNT = 1
  FLAGS = (PADTMP,POK,READONLY,pPOK)
  PV = 0x14a9c68 "abc"\0
  CUR = 3
  LEN = 16
SV = PV(0x1477d48) at 0x1499ce0
  REFCNT = 2
  FLAGS = (PADMY,POK,pPOK)
  PV = 0x14ac528 "abc" <=== spot the lack of \0
  CUR = 3
  LEN = 16

--
This email is confidential, and now that you have read it you are legally
obliged to shoot yourself. Or shoot a lawyer, if you prefer. If you have
received this email in error, place it in its original wrapping and return
for a full refund. By opening this email, you accept that Elvis lives.


p5pRT commented Jan 17, 2012

From @nwc10

On Tue, Jan 17, 2012 at 12​:23​:35PM +0000, Dave Mitchell wrote​:

On Tue, Jan 17, 2012 at 10​:14​:19AM +0100, Eirik Berg Hanssen wrote​:

A clue​: The regex engine seems to be advancing pos way beyond the end of
the string, reading garbage​:

Looks like the regex engine isn't coping with a string that isn't
null-terminated​:

That's a shame. It would be nice if the regex engine didn't do anything
that reads the byte beyond the string (even though technically scalars
without the trailing NUL aren't well-formed), as doing so prevents one from
mmap()-ing entire files into scalars reliably.

(Still, arguably ill-formed, but it's unlikely that such scalars are going
anywhere near syscalls, whereas they are likely to be fed to the regex
engine.)

Nicholas Clark


p5pRT commented Jan 17, 2012

From @Leont

On Tue, Jan 17, 2012 at 10​:14 AM, Eirik Berg Hanssen
<Eirik-Berg.Hanssen@​allverden.no> wrote​:

A clue​: The regex engine seems to be advancing pos way beyond the end of the
string, reading garbage​:

Indeed. This is most clearly shown in #73542, which actually causes a
segfault when it looks past the end of a memory mapped string.

Leon


p5pRT commented Jan 17, 2012

From 2bfjdsla52kztwejndzdstsxl9athp@gmail.com

Patching PerlIO​::scalar to supply a terminating NUL is easy,
but how do you write tests for it?

/Bo Lindbergh


p5pRT commented Jan 17, 2012


p5pRT commented Jan 17, 2012

From @cpansprout

On Tue Jan 17 11​:20​:13 2012, 2bfjdsla52kztwejndzdstsxl9athp@​gmail.com wrote​:

Patching PerlIO​::scalar to supply a terminating NUL is easy,
but how do you write tests for it?

/Bo Lindbergh

XS​::APItest, perhaps?

--

Father Chrysostomos


p5pRT commented Jan 17, 2012

From @khwilliamson

On 01/17/2012 05​:36 AM, Nicholas Clark wrote​:

On Tue, Jan 17, 2012 at 12​:23​:35PM +0000, Dave Mitchell wrote​:

On Tue, Jan 17, 2012 at 10​:14​:19AM +0100, Eirik Berg Hanssen wrote​:

A clue​: The regex engine seems to be advancing pos way beyond the end of
the string, reading garbage​:

Looks like the regex engine isn't coping with a string that isn't
null-terminated​:

That's a shame. It would be nice if the regex engine didn't do anything
that reads the byte beyond the string (even though technically scalars
without the trailing NUL aren't well-formed), as doing so prevents one from
mmap()-ing entire files into scalars reliably.

(Still, arguably ill-formed, but it's unlikely that such scalars are going
anywhere near syscalls, whereas they are likely to be fed to the regex
engine.)

Nicholas Clark

A quick and dirty fix to the regex engine would be to redefine the
UCHARAT macro in regexec.c to check if the pointer parameter to it ==
PL_regeol, and if so return '\0' instead of doing the read. If the
macro gets called on something other than the input stream, no harm is
done because it uses an exact check which would fail; but doing the
exact check would not catch reads of more than one byte beyond the end.
  And I don't know how much this would slow things down or how many
things it would miss, but UCHARAT is a ubiquitous access method in regexec.c

BTW running valgrind on the sample program shows
==5994== Conditional jump or move depends on uninitialised value(s)
==5994== at 0x4844338​: S_regmatch (re_exec.c​:3216)


p5pRT commented Jan 18, 2012

From @nwc10

On Tue, Jan 17, 2012 at 12​:50​:25PM -0800, Father Chrysostomos via RT wrote​:

On Tue Jan 17 11​:20​:13 2012, 2bfjdsla52kztwejndzdstsxl9athp@​gmail.com wrote​:

Patching PerlIO​::scalar to supply a terminating NUL is easy,
but how do you write tests for it?

/Bo Lindbergh

XS​::APItest, perhaps?

It *might* be easier to parse the output of Devel​::Peek.
(But it's a bit of a game involving capturing STDERR)

As for Dave's earlier example, without his annotation​:

$ perl
my $memory_file;
my $fh;
open ($fh, '>', \$memory_file);
print $fh "abc";

use Devel::Peek;
Dump "abc";
Dump $memory_file;

__END__
SV = PV(0x100801ea8) at 0x1008131c0
  REFCNT = 1
  FLAGS = (POK,READONLY,pPOK)
  PV = 0x100202190 "abc"\0
  CUR = 3
  LEN = 16
SV = PV(0x100801da8) at 0x1008130d0
  REFCNT = 2
  FLAGS = (PADMY,POK,pPOK)
  PV = 0x100213490 "abc"\0
  CUR = 3
  LEN = 16

it should match qr/^ *PV = ".*"\\0$/
(ie not qr/^ *PV = ".*"$/)

I don't think that there's a way to do this with B.

Nicholas Clark
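
The Devel::Peek approach described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the test that was eventually committed: the helper names dump_to_string and dump_shows_trailing_nul are made up here, the sketch assumes Dump writes to the process's stderr so the handle is temporarily redirected to an anonymous temporary file, and the pattern allows for the 0x... address that Dump prints between "PV =" and the quoted string.

```perl
use strict;
use warnings;
use Devel::Peek ();

# Hypothetical helper: return what Devel::Peek::Dump prints for the scalar
# behind $ref. STDERR is pointed at an anonymous temporary file for the
# duration of the call and restored afterwards.
sub dump_to_string {
    my ($ref) = @_;
    open my $tmp, '+>', undef       or die "Cannot create temp file: $!";
    open my $saved, '>&', \*STDERR  or die "Cannot dup STDERR: $!";
    open STDERR, '>&', $tmp         or die "Cannot redirect STDERR: $!";
    Devel::Peek::Dump($$ref);
    open STDERR, '>&', $saved       or die "Cannot restore STDERR: $!";
    close $saved;
    seek $tmp, 0, 0;
    local $/;                       # slurp the whole dump
    return scalar readline $tmp;
}

# Hypothetical helper: true if the PV line of the dump ends in \0 after
# the closing quote, i.e. Dump saw a NUL right after the string data.
sub dump_shows_trailing_nul {
    my ($ref) = @_;
    return dump_to_string($ref) =~ m/^\s*PV = 0x[0-9a-fA-F]+ ".*"\\0$/m ? 1 : 0;
}

my $plain = "abc";
print dump_shows_trailing_nul(\$plain) ? "NUL present\n" : "NUL missing\n";
```

Passing a reference keeps the helper looking at the caller's scalar rather than a copy, which matters here because copying a string allocates a fresh buffer.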


p5pRT commented Jan 18, 2012

From @ikegami

On Wed, Jan 18, 2012 at 7​:51 AM, Nicholas Clark <nick@​ccl4.org> wrote​:

I don't think that there's a way to do this with B.

use feature 'say';
use B qw( svref_2object );

# True if the byte just past the string data (at offset CUR) is NUL.
sub has_trailing_nul(\$) {
  my ($ref) = @_;
  my $sv = svref_2object($ref);
  return undef if !$sv->isa('B::PV');

  my $cur = $sv->CUR;
  my $len = $sv->LEN;
  return 0 if $cur >= $len;

  # Get the address of the string buffer, then read one byte at buffer+CUR.
  my $pv_addr = unpack 'J', pack 'P', $$ref;
  my $trailing = unpack 'P', pack 'J', $pv_addr+$cur;
  return $trailing eq "\0";
}

my $x = "abc";
say has_trailing_nul($x) ? 1 : 0;

- Eric


p5pRT commented Jan 18, 2012

From @ikegami

Using the code from my last message,

my $x = "abc";
say has_trailing_nul($x) ? 1 : 0; # 1

open (my $fh, '>', \my $memory_file) or die $!;
print $fh "abc";

say has_trailing_nul($memory_file) ? 1 : 0; # 0

- Eric


p5pRT commented Jan 20, 2012

From @cpansprout

On Wed Jan 18 13​:27​:59 2012, ikegami@​adaelis.com wrote​:

On Wed, Jan 18, 2012 at 7​:51 AM, Nicholas Clark <nick@​ccl4.org> wrote​:

I don't think that there's a way to do this with B.

Well, no *documented* way.

use B qw( svref_2object );

sub has_trailing_nul(\$) {
my ($ref) = @_;
my $sv = svref_2object($ref);
return undef if !$sv->isa('B::PV');

my $cur = $sv->CUR;
my $len = $sv->LEN;

Interesting. I searched for those in B’s docs, but didn’t find them,
and assumed they didn’t exist.

I’ve documented them in commit 5c14042.

--

Father Chrysostomos


p5pRT commented Jan 20, 2012

From @cpansprout

On Tue Jan 17 11​:20​:13 2012, 2bfjdsla52kztwejndzdstsxl9athp@​gmail.com wrote​:

Patching PerlIO​::scalar to supply a terminating NUL is easy,
but how do you write tests for it?

/Bo Lindbergh

Thank you. Applied as 8af8844.

--

Father Chrysostomos


p5pRT commented Jan 20, 2012

@cpansprout - Status changed from 'open' to 'resolved'

@p5pRT p5pRT closed this as completed Jan 20, 2012

p5pRT commented Jan 20, 2012

From @cpansprout

On Wed Jan 18 13​:27​:59 2012, ikegami@​adaelis.com wrote​:

On Wed, Jan 18, 2012 at 7​:51 AM, Nicholas Clark <nick@​ccl4.org> wrote​:

I don't think that there's a way to do this with B.

use B qw( svref_2object );

sub has_trailing_nul(\$) {
my ($ref) = @_;
my $sv = svref_2object($ref);
return undef if !$sv->isa('B::PV');

my $cur = $sv->CUR;
my $len = $sv->LEN;
return 0 if $cur >= $len;

my $pv_addr = unpack 'J', pack 'P', $$ref;
my $trailing = unpack 'P', pack 'J', $pv_addr+$cur;
return $trailing eq "\0";
}

my $x = "abc";
say has_trailing_nul($x) ? 1 : 0;

Thank you. I’ve added your function to ext/PerlIO-scalar/t/scalar.t
with commit 84da560, and added some tests that use it, with commit
66ad6b0.

--

Father Chrysostomos
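
For readers who want to try the check themselves, a test built on the helper quoted above might look roughly like this. It is a sketch only and does not reproduce commits 84da560 or 66ad6b0; the test descriptions are made up, and the 'J' pack format assumes that a perl integer is the same size as a pointer on the build being tested.

```perl
use strict;
use warnings;
use B qw( svref_2object );
use Test::More;

# Helper quoted earlier in this thread: true if the byte just past the
# string data (at offset CUR) is NUL.
sub has_trailing_nul(\$) {
    my ($ref) = @_;
    my $sv = svref_2object($ref);
    return undef if !$sv->isa('B::PV');

    my $cur = $sv->CUR;
    my $len = $sv->LEN;
    return 0 if $cur >= $len;

    # Assumes pointer size == IV size, so 'J' can hold the address.
    my $pv_addr  = unpack 'J', pack 'P', $$ref;
    my $trailing = unpack 'P', pack 'J', $pv_addr + $cur;
    return $trailing eq "\0";
}

open my $fh, '>', \my $memory_file or die "open failed: $!";
print {$fh} "abc";

is( $memory_file, "abc", 'in-memory file holds the written data' );
ok( has_trailing_nul($memory_file), 'buffer is NUL-terminated after a write' );
like( $memory_file, qr/^.*$/, 'pattern from the original report matches' );

done_testing();
```

On a perl without the fix, the second and third checks are the ones expected to fail, matching the behavior reported at the top of this ticket.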
