Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PATCH] e213661 no warnings 'safesyscalls', fatal nul checks #12868

Closed
p5pRT opened this issue Mar 21, 2013 · 59 comments
Closed

[PATCH] e213661 no warnings 'safesyscalls', fatal nul checks #12868

p5pRT opened this issue Mar 21, 2013 · 59 comments

Comments

@p5pRT
Copy link

p5pRT commented Mar 21, 2013

Migrated from rt.perl.org#117265 (status was 'resolved')

Searchable as RT117265$

@p5pRT
Copy link
Author

p5pRT commented Mar 21, 2013

From @rurban

This is a bug report for perl from rurban@​cpanel.net,
generated with the help of perlbug 1.39 running under perl 5.17.8.

From ccbb86ed06799dbc844f023ab6967338296bdee1 Mon Sep 17 00​:00​:00 2001
From​: Reini Urban <rurban@​x-ray.at>
Date​: Mon, 18 Mar 2013 13​:05​:51 +0100
Subject​: [PATCH 2/2] ExtUtils-Manifest-1.62_01​: fix safesyscalls, no double
\0 for open()
MIME-Version​: 1.0
Content-Type​: multipart/mixed; boundary="------------1.7.10.4"

This is a multi-part message in MIME format.
--------------1.7.10.4
Content-Type​: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding​: 8bit

cp_if_diff used an old-style open(F,"< $from\0") to force an ending \0,
which is not needed with 3arg open, and leads to safesyscalls errors.


dist/ExtUtils-Manifest/lib/ExtUtils/Manifest.pm | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--------------1.7.10.4
Content-Type​: text/x-patch; name="0002-ExtUtils-Manifest-1.62_01-fix-safesyscalls-no-double.patch"
Content-Transfer-Encoding​: 8bit
Content-Disposition​: attachment; filename="0002-ExtUtils-Manifest-1.62_01-fix-safesyscalls-no-double.patch"

Inline Patch
diff --git a/dist/ExtUtils-Manifest/lib/ExtUtils/Manifest.pm b/dist/ExtUtils-Manifest/lib/ExtUtils/Manifest.pm
index 5bcd6d6..82b4010 100644
--- a/dist/ExtUtils-Manifest/lib/ExtUtils/Manifest.pm
+++ b/dist/ExtUtils-Manifest/lib/ExtUtils/Manifest.pm
@@ -13,7 +13,7 @@ use vars qw($VERSION @ISA @EXPORT_OK
           $Is_MacOS $Is_VMS $Is_VMS_mode $Is_VMS_lc $Is_VMS_nodot
           $Debug $Verbose $Quiet $MANIFEST $DEFAULT_MSKIP);
 
-$VERSION = '1.63';
+$VERSION = '1.63_01';
 @ISA=('Exporter');
 @EXPORT_OK = qw(mkmanifest
                 manicheck  filecheck  fullcheck  skipcheck
@@ -567,8 +567,8 @@ sub cp_if_diff {
     }
     my($diff) = 0;
     local(*F,*T);
-    open(F,"< $from\0") or die "Can't read $from: $!\n";
-    if (open(T,"< $to\0")) {
+    open(F,"<",$from) or die "Can't read $from: $!\n";
+    if (open(T,"<",$to)) {
         local $_;
 	while (<F>) { $diff++,last if $_ ne <T>; }
 	$diff++ unless eof(T);

--------------1.7.10.4--


---
Flags:   category=core   severity=high

This perlbug was built using Perl 5.17.8 - Fri Feb 1 11​:00​:49 CST 2013
It is being executed now by Perl 5.17.8 - Wed Jan 9 17​:52​:45 CST 2013.

Site configuration information for perl 5.17.8​:

Configured by rurban at Wed Jan 9 17​:52​:45 CST 2013.

Summary of my perl5 (revision 5 version 17 subversion 8) configuration​:
  Commit id​: 1e9a14d0d069d64df71ad32c7174ede653a57801
  Platform​:
  osname=linux, osvers=3.2.0-4-amd64, archname=x86_64-linux-thread-multi-debug
  uname='linux reini 3.2.0-4-amd64 #1 smp debian 3.2.32-1 x86_64 gnulinux '
  config_args='-de -Dusedevel -Uversiononly -Dinstallman1dir=none -Dinstallman3dir=none -Dinstallsiteman1dir=none -Dinstallsiteman3dir=none -DEBUGGING -Doptimize=-g3 -Duseithreads -Accflags='-msse4.2' -Accflags='-march=corei7' -Dcf_email='rurban@​cpanel.net' -Dperladmin='rurban@​cpanel.net' -Duseshrplib'
  hint=recommended, useposix=true, d_sigaction=define
  useithreads=define, usemultiplicity=define
  useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
  use64bitint=define, use64bitall=define, uselongdouble=undef
  usemymalloc=n, bincompat5005=undef
  Compiler​:
  cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -msse4.2 -march=corei7 -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
  optimize='-g3',
  cppflags='-D_REENTRANT -D_GNU_SOURCE -msse4.2 -march=corei7 -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
  ccversion='', gccversion='4.7.2', gccosandvers=''
  intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
  ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
  alignbytes=8, prototype=define
  Linker and Libraries​:
  ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
  libpth=/usr/local/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib /usr/lib
  libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc -lgdbm_compat
  perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
  libc=, so=so, useshrplib=true, libperl=libperl.so
  gnulibc_version='2.13'
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/local/lib/perl5/5.17.8/x86_64-linux-thread-multi-debug/CORE'
  cccdlflags='-fPIC', lddlflags='-shared -g3 -L/usr/local/lib -fstack-protector'

Locally applied patches​:
 


@​INC for perl 5.17.8​:
  /usr/local/lib/perl5/site_perl/5.17.8/x86_64-linux-thread-multi-debug
  /usr/local/lib/perl5/site_perl/5.17.8
  /usr/local/lib/perl5/5.17.8/x86_64-linux-thread-multi-debug
  /usr/local/lib/perl5/5.17.8
  /usr/local/lib/perl5/site_perl/5.17.7
  /usr/local/lib/perl5/site_perl/5.17.6
  /usr/local/lib/perl5/site_perl/5.17.5
  /usr/local/lib/perl5/site_perl/5.17.4
  /usr/local/lib/perl5/site_perl/5.17.3
  /usr/local/lib/perl5/site_perl/5.17.2
  /usr/local/lib/perl5/site_perl/5.17.1
  /usr/local/lib/perl5/site_perl/5.17.0
  /usr/local/lib/perl5/site_perl/5.17
  /usr/local/lib/perl5/site_perl/5.16.2
  /usr/local/lib/perl5/site_perl/5.16.1
  /usr/local/lib/perl5/site_perl/5.16.0
  /usr/local/lib/perl5/site_perl/5.15.9
  /usr/local/lib/perl5/site_perl/5.15.8
  /usr/local/lib/perl5/site_perl/5.15.7
  /usr/local/lib/perl5/site_perl/5.15.6
  /usr/local/lib/perl5/site_perl/5.15.5
  /usr/local/lib/perl5/site_perl/5.15.4
  /usr/local/lib/perl5/site_perl/5.14.3
  /usr/local/lib/perl5/site_perl/5.14.2
  /usr/local/lib/perl5/site_perl/5.14.1
  /usr/local/lib/perl5/site_perl/5.12.4
  /usr/local/lib/perl5/site_perl/5.10.1
  /usr/local/lib/perl5/site_perl/5.8.9
  /usr/local/lib/perl5/site_perl/5.8.8
  /usr/local/lib/perl5/site_perl/5.8.7
  /usr/local/lib/perl5/site_perl/5.8.6
  /usr/local/lib/perl5/site_perl/5.8.5
  /usr/local/lib/perl5/site_perl/5.8.4
  /usr/local/lib/perl5/site_perl/5.8.3
  /usr/local/lib/perl5/site_perl/5.8.2
  /usr/local/lib/perl5/site_perl/5.8.1
  /usr/local/lib/perl5/site_perl/5.6.2
  /usr/local/lib/perl5/site_perl
  .


Environment for perl 5.17.8​:
  HOME=/home/rurban
  LANG=en_US.UTF-8
  LANGUAGE (unset)
  LD_LIBRARY_PATH (unset)
  LOGDIR (unset)
  PATH=/home/rurban/bin​:/usr/local/bin​:/usr/bin​:/bin​:/usr/local/games​:/usr/games
  PERL_BADLANG (unset)
  SHELL=/bin/bash

@p5pRT
Copy link
Author

p5pRT commented Mar 21, 2013

From @rurban

part 1 was missing from perlbug (mult. -f not supported)

@p5pRT
Copy link
Author

p5pRT commented Mar 21, 2013

From @rurban

0001-no-warnings-safesyscalls-fatal-0-checks.patch
From e213661d72731f72caf3ce470119228bf5a870be Mon Sep 17 00:00:00 2001
From: Reini Urban <rurban@x-ray.at>
Date: Tue, 12 Mar 2013 19:25:26 +0100
Subject: [PATCH 1/2] no warnings 'safesyscalls', fatal \0 checks
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------1.7.10.4"

This is a multi-part message in MIME format.
--------------1.7.10.4
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


Add the fatal warnings category safesyscalls.
Disallow binary pathnames and arguments to other syscalls, strings
with embedded \0, which are ignored in the syscall but kept in
perl. Allow an ending \0 though, as several modules add a \0 to
such strings without adjusting the length.
The hidden payloads in these invalid string args may cause unnoticed
security problems, as they are ignored by the syscalls but kept around
in perl PVs.
---
 doio.c                |    9 ++++++++-
 embed.fnc             |    4 ++--
 ext/File-Glob/Glob.xs |    1 +
 lib/warnings.pm       |   15 +++++++++------
 perl.h                |   17 +++++++++++++++++
 perlio.c              |    4 ++++
 pod/perldiag.pod      |    9 +++++++++
 regen/warnings.pl     |    6 +++---
 t/io/open.t           |   23 ++++++++++++++++++++++-
 util.c                |    1 +
 warnings.h            |    1 +
 11 files changed, 77 insertions(+), 13 deletions(-)


--------------1.7.10.4
Content-Type: text/x-patch; name="0001-no-warnings-safesyscalls-fatal-0-checks.patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment; filename="0001-no-warnings-safesyscalls-fatal-0-checks.patch"

diff --git a/doio.c b/doio.c
index 4e8d48a..4d7c00a 100644
--- a/doio.c
+++ b/doio.c
@@ -216,6 +216,7 @@ Perl_do_openn(pTHX_ GV *gv, const char *oname, I32 len, int as_raw,
 		goto say_false;
 	    }
 #endif /* USE_STDIO */
+	    CHECK_PATHNAME(*svp);
 	    name = (SvOK(*svp) || SvGMAGICAL(*svp)) ?
 			savesvpv (*svp) : savepvs ("");
 	    SAVEFREEPV(name);
@@ -1638,6 +1639,7 @@ Perl_apply(pTHX_ I32 type, SV **mark, SV **sp)
 		else {
 		    const char *name = SvPV_nomg_const_nolen(*mark);
 		    APPLY_TAINT_PROPER();
+		    CHECK_PATHNAME(*mark);
 		    if (PerlLIO_chmod(name, val))
 			tot--;
 		}
@@ -1672,6 +1674,7 @@ Perl_apply(pTHX_ I32 type, SV **mark, SV **sp)
 		else {
 		    const char *name = SvPV_nomg_const_nolen(*mark);
 		    APPLY_TAINT_PROPER();
+		    CHECK_PATHNAME(*mark);
 		    if (PerlLIO_chown(name, val, val2))
 			tot--;
 		}
@@ -1690,6 +1693,7 @@ nothing in the core.
 	APPLY_TAINT_PROPER();
 	if (mark == sp)
 	    break;
+	CHECK_SYSCALL(*mark);
 	s = SvPVx_const(*++mark, len);
 	if (*s == '-' && isALPHA(s[1]))
 	{
@@ -1702,7 +1706,7 @@ nothing in the core.
 		s += 3;
                 len -= 3;
             }
-           if ((val = whichsig_pvn(s, len)) < 0)
+	    if ((val = whichsig_pvn(s, len)) < 0)
                Perl_croak(aTHX_ "Unrecognized signal name \"%"SVf"\"", SVfARG(*mark));
 	}
 	else
@@ -1773,6 +1777,7 @@ nothing in the core.
 	while (++mark <= sp) {
 	    s = SvPV_nolen_const(*mark);
 	    APPLY_TAINT_PROPER();
+	    CHECK_PATHNAME(*mark);
 	    if (PerlProc_geteuid() || PL_unsafe) {
 		if (UNLINK(s))
 		    tot--;
@@ -1851,6 +1856,7 @@ nothing in the core.
 		else {
 		    const char * const name = SvPV_nomg_const_nolen(*mark);
 		    APPLY_TAINT_PROPER();
+		    CHECK_PATHNAME(*mark);
 #ifdef HAS_FUTIMES
 		    if (utimes(name, (struct timeval *)utbufp))
 #else
@@ -2343,6 +2349,7 @@ Perl_start_glob (pTHX_ SV *tmpglob, IO *io)
 
     PERL_ARGS_ASSERT_START_GLOB;
 
+    CHECK_SYSCALL(tmpglob);
     ENTER;
     SAVEFREESV(tmpcmd);
 #ifdef VMS /* expand the wildcards right here, rather than opening a pipe, */
diff --git a/embed.fnc b/embed.fnc
index 2f5e089..8c981f2 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -2253,7 +2253,7 @@ s	|void	|printbuf	|NN const char *const fmt|NN const char *const s
 #endif
 
 #if defined(PERL_IN_UNIVERSAL_C)
-s	|bool|isa_lookup	|NN HV *stash|NN const char * const name \
+s	|bool	|isa_lookup	|NN HV *stash|NN const char * const name \
                                         |STRLEN len|U32 flags
 #endif
 
@@ -2266,7 +2266,7 @@ s	|char*	|stdize_locale	|NN char* locs
 #if defined(PERL_IN_UTIL_C)
 s	|const COP*|closest_cop	|NN const COP *cop|NULLOK const OP *o
 s	|SV*	|mess_alloc
-s	|SV *|with_queued_errors|NN SV *ex
+s	|SV *	|with_queued_errors|NN SV *ex
 s	|bool	|invoke_exception_hook|NULLOK SV *ex|bool warn
 #if defined(PERL_MEM_LOG) && !defined(PERL_MEM_LOG_NOIMPL)
 sn	|void	|mem_log_common	|enum mem_log_type mlt|const UV n|const UV typesize \
diff --git a/ext/File-Glob/Glob.xs b/ext/File-Glob/Glob.xs
index df5530a..2cd762f 100644
--- a/ext/File-Glob/Glob.xs
+++ b/ext/File-Glob/Glob.xs
@@ -135,6 +135,7 @@ csh_glob(pTHX_ AV *entries, SV *patsv)
 	     pat = "", len = 0, is_utf8 = 0;
 	else pat = SvPV_nomg(patsv,len), is_utf8 = !!SvUTF8(patsv);
 	patend = pat + len;
+	CHECK_SYSCALL(patsv);
 
 	/* extract patterns */
 	s = pat-1;
diff --git a/lib/warnings.pm b/lib/warnings.pm
index c0c2cc9..4ddd83a 100644
--- a/lib/warnings.pm
+++ b/lib/warnings.pm
@@ -232,10 +232,11 @@ our %Offsets = (
     'experimental::lexical_subs'=> 104,
     'experimental::lexical_topic'=> 106,
     'experimental::regex_sets'=> 108,
+    'safesyscalls'	=> 110,
   );
 
 our %Bits = (
-    'all'		=> "\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x15", # [0..54]
+    'all'		=> "\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55", # [0..55]
     'ambiguous'		=> "\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00", # [29]
     'bareword'		=> "\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00", # [30]
     'closed'		=> "\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [6]
@@ -276,8 +277,9 @@ our %Bits = (
     'redefine'		=> "\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [19]
     'regexp'		=> "\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00", # [20]
     'reserved'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00", # [37]
+    'safesyscalls'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x40", # [55]
     'semicolon'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00", # [38]
-    'severe'		=> "\x00\x00\x00\x00\x00\x54\x05\x00\x00\x00\x00\x00\x00\x00", # [21..25]
+    'severe'		=> "\x00\x00\x00\x00\x00\x54\x05\x00\x00\x00\x00\x00\x00\x40", # [21..25,55]
     'signal'		=> "\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00", # [26]
     'substr'		=> "\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00", # [27]
     'surrogate'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00", # [50]
@@ -293,7 +295,7 @@ our %Bits = (
   );
 
 our %DeadBits = (
-    'all'		=> "\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\x2a", # [0..54]
+    'all'		=> "\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa", # [0..55]
     'ambiguous'		=> "\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00", # [29]
     'bareword'		=> "\x00\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00\x00\x00\x00", # [30]
     'closed'		=> "\x00\x20\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [6]
@@ -334,8 +336,9 @@ our %DeadBits = (
     'redefine'		=> "\x00\x00\x00\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [19]
     'regexp'		=> "\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00", # [20]
     'reserved'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00", # [37]
+    'safesyscalls'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80", # [55]
     'semicolon'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00\x00", # [38]
-    'severe'		=> "\x00\x00\x00\x00\x00\xa8\x0a\x00\x00\x00\x00\x00\x00\x00", # [21..25]
+    'severe'		=> "\x00\x00\x00\x00\x00\xa8\x0a\x00\x00\x00\x00\x00\x00\x80", # [21..25,55]
     'signal'		=> "\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00\x00\x00\x00\x00", # [26]
     'substr'		=> "\x00\x00\x00\x00\x00\x00\x80\x00\x00\x00\x00\x00\x00\x00", # [27]
     'surrogate'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x20\x00", # [50]
@@ -351,8 +354,8 @@ our %DeadBits = (
   );
 
 $NONE     = "\0\0\0\0\0\0\0\0\0\0\0\0\0\0";
-$DEFAULT  = "\x10\x01\x00\x00\x00\x50\x04\x00\x00\x00\x00\x00\x00\x15", # [2,52..54,4,22,23,25]
-$LAST_BIT = 110 ;
+$DEFAULT  = "\x10\x01\x00\x00\x00\x50\x04\x00\x00\x00\x00\x00\x00\x55", # [2,52..54,4,22,23,25,55]
+$LAST_BIT = 112 ;
 $BYTES    = 14 ;
 
 $All = "" ; vec($All, $Offsets{'all'}, 2) = 3 ;
diff --git a/perl.h b/perl.h
index 5639f1c..3c9544a 100644
--- a/perl.h
+++ b/perl.h
@@ -5814,6 +5814,23 @@ extern void moncontrol(int);
 #  define do_aexec(really, mark,sp)	do_aexec5(really, mark, sp, 0, 0)
 #endif
 
+/* die on embedded \0 characters in pathnames, but allow one ending \0 */
+#define CHECK_PATHNAME(pv) CHECK_SAFESYSCALLS("pathname",pv)
+/* die on embedded \0 characters in args for various syscalls.
+   user and group names, glob. */
+#define CHECK_SYSCALL(pv)  CHECK_SAFESYSCALLS("syscall",pv)
+#define CHECK_SAFESYSCALLS(what,pv)                                     \
+    if (SvPOK(pv)) {                                                    \
+        char *i;                                                        \
+        char *p = SvPVX(pv);                                            \
+        if ( (i = strchr(p, 0))&&(i-p)&&((size_t)(i-p)<SvCUR(pv)-1) ) { \
+            SETERRNO(EINVAL, LIB_INVARG);                               \
+            if (ckWARN(WARN_SAFESYSCALLS))                              \
+                Perl_croak(aTHX_ "Invalid \\0 character in %s: %s\\0%s",\
+                           what,p,++i);                                 \
+        }                                                               \
+    }                                                                   \
+
 #if defined(OEMVS)
 #define NO_ENV_ARRAY_IN_MAIN
 #endif
diff --git a/perlio.c b/perlio.c
index 097bc49..a4ada58 100644
--- a/perlio.c
+++ b/perlio.c
@@ -310,6 +310,7 @@ PerlIO_openn(pTHX_ const char *layers, const char *mode, int fd,
 	    return PerlIO_tmpfile();
 	else {
 	    const char *name = SvPV_nolen_const(*args);
+	    CHECK_PATHNAME(*args);
 	    if (*mode == IoTYPE_NUMERIC) {
 		fd = PerlLIO_open3(name, imode, perm);
 		if (fd >= 0)
@@ -2713,6 +2714,7 @@ PerlIOUnix_open(pTHX_ PerlIO_funcs *self, PerlIO_list_t *layers,
 	}
 	if (imode != -1) {
 	    const char *path = SvPV_nolen_const(*args);
+	    CHECK_PATHNAME(*args);
 	    fd = PerlLIO_open3(path, imode, perm);
 	}
     }
@@ -3027,6 +3029,7 @@ PerlIOStdio_open(pTHX_ PerlIO_funcs *self, PerlIO_list_t *layers,
 	const char * const path = SvPV_nolen_const(*args);
 	PerlIOStdio * const s = PerlIOSelf(f, PerlIOStdio);
 	FILE *stdio;
+	CHECK_PATHNAME(*args);
 	PerlIOUnix_refcnt_dec(fileno(s->stdio));
 	stdio = PerlSIO_freopen(path, (mode = PerlIOStdio_mode(mode, tmode)),
 			    s->stdio);
@@ -3039,6 +3042,7 @@ PerlIOStdio_open(pTHX_ PerlIO_funcs *self, PerlIO_list_t *layers,
     else {
 	if (narg > 0) {
 	    const char * const path = SvPV_nolen_const(*args);
+	    CHECK_PATHNAME(*args);
 	    if (*mode == IoTYPE_NUMERIC) {
 		mode++;
 		fd = PerlLIO_open3(path, imode, perm);
diff --git a/pod/perldiag.pod b/pod/perldiag.pod
index 8d29e2d..8f547c5 100644
--- a/pod/perldiag.pod
+++ b/pod/perldiag.pod
@@ -2468,6 +2468,15 @@ indicated one isn't.  See L<charnames/CUSTOM ALIASES>.
 the C<:alias> option to C<use charnames> and the specified character in
 the indicated name isn't valid.  See L<charnames/CUSTOM ALIASES>.
 
+=item Invalid \0 character in %s: %s\0%s
+
+(F) Embedded \0 in pathnames or other syscall arguments create now a
+fatal warning.  These were ignored by the binary unsafe syscalls, but
+may lead to security problems.
+
+If you know what you are doing you can turn off this warning by
+C<no warnings 'safesyscalls';>.
+
 =item Invalid conversion in %s: "%s"
 
 (W printf) Perl does not understand the given format conversion.  See
diff --git a/regen/warnings.pl b/regen/warnings.pl
index 94a9843..12b0590 100644
--- a/regen/warnings.pl
+++ b/regen/warnings.pl
@@ -16,7 +16,7 @@
 #
 # This script is normally invoked from regen.pl.
 
-$VERSION = '1.02_03';
+$VERSION = '1.02_04';
 
 BEGIN {
     require 'regen/regen_lib.pl';
@@ -56,7 +56,8 @@ my $tree = {
 	 			'internal'	=> [ 5.008, DEFAULT_OFF],
          			'debugging'	=> [ 5.008, DEFAULT_ON],
          			'malloc'	=> [ 5.008, DEFAULT_ON],
-	 		   }],
+			        'safesyscalls'  => [ 5.017, DEFAULT_ON],
+			   }],
         'deprecated'	=> [ 5.008, DEFAULT_ON],
        	'void'		=> [ 5.008, DEFAULT_OFF],
        	'recursion'	=> [ 5.008, DEFAULT_OFF],
@@ -92,7 +93,6 @@ my $tree = {
                                 'experimental::lexical_topic' =>
                                     [ 5.017, DEFAULT_ON ],
                         }],
-
        	 #'default'	=> [ 5.008, DEFAULT_ON ],
   	}],
 } ;
diff --git a/t/io/open.t b/t/io/open.t
index e06fc8e..91f1003 100644
--- a/t/io/open.t
+++ b/t/io/open.t
@@ -10,7 +10,7 @@ $|  = 1;
 use warnings;
 use Config;
 
-plan tests => 121;
+plan tests => 126;
 
 my $Perl = which_perl();
 
@@ -386,3 +386,24 @@ SKIP: {
     ok( eval { $fh->autoflush(1); 1 }, '$fh->autoflush(1) lives' );
     ok( $INC{'IO/File.pm'}, "IO::File now loaded" );
 }
+
+my $fn = "tmp\0.invalid";
+eval { open(I, $fn); };
+like("$@", qr/^Invalid \\0 character in pathname: tmp/,
+  "die on open with embedded nul in pathnames since 5.18");
+eval { unlink($fn); };
+like("$@", qr/^Invalid \\0 character in pathname: tmp/,
+  "also on unlink");
+eval { chmod(0644, $fn); };
+like("$@", qr/^Invalid \\0 character in pathname: tmp/,
+  "also on chmod");
+eval { glob($fn); };
+like("$@", qr/^Invalid \\0 character in syscall: tmp/,
+  "also on glob");
+
+{
+  no warnings 'safesyscalls';
+  eval { open(I, $fn); };
+  is($@, '',
+  "ignore die on embedded nul with no warnings safesyscalls");
+}
diff --git a/util.c b/util.c
index 2c745bf..fb6cbf5 100644
--- a/util.c
+++ b/util.c
@@ -2010,6 +2010,7 @@ Perl_unlnk(pTHX_ const char *f)	/* unlink all versions of a file */
 
     PERL_ARGS_ASSERT_UNLNK;
 
+    CHECK_PATHNAME(f);
     while (PerlLIO_unlink(f) >= 0)
 	retries++;
     return retries ? 0 : -1;
diff --git a/warnings.h b/warnings.h
index d0bf710..1f3f8b0 100644
--- a/warnings.h
+++ b/warnings.h
@@ -93,6 +93,7 @@
 #define WARN_EXPERIMENTAL__LEXICAL_SUBS 52
 #define WARN_EXPERIMENTAL__LEXICAL_TOPIC 53
 #define WARN_EXPERIMENTAL__REGEX_SETS 54
+#define WARN_SAFESYSCALLS	 55
 
 #define WARNsize		14
 #define WARN_ALLstring		"\125\125\125\125\125\125\125\125\125\125\125\125\125\125"

--------------1.7.10.4--


@p5pRT
Copy link
Author

p5pRT commented Mar 21, 2013

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Mar 23, 2013

From @iabyn

On Thu, Mar 21, 2013 at 12​:37​:27PM -0700, Reini Urban via RT wrote​:

Add the fatal warnings category safesyscalls.
Disallow binary pathnames and arguments to other syscalls, strings
with embedded \0, which are ignored in the syscall but kept in
perl. Allow an ending \0 though, as several modules add a \0 to
such strings without adjusting the length.
The hidden payloads in these invalid string args may cause unnoticed
security problems, as they are ignored by the syscalls but kept around
in perl PVs.

Thanks for this.

Your description of this this change is a bit unclear; in particular, I'm
assuming that "Disallow binary pathnames" only refers to disallowing \0,
rather than some more general non-ASCII prohibition???

From what I understand of this patch, it makes the following changes​:

First in the absence of 'use warnings' (or in the presence of
'no warnings "safesyscalls"), it makes system calls like

  open my $fh, ">", "foo\0bar"

return false and sets $! to 'no such file' or some such appropriate
error.

Second, in the presence of 'use warnings', it instead causes the open to
croak (captureable with eval as usual).

Is this a correct assessment?

Before discussing the detailed implementation on the patch, I think we
need a rough concensus on what semantics we desire.

My own personal opinion is that I like the first part​: causing system
calls to mandatorally (and unchangeably) return failure in the presence of
\0.

I'm not very keen on the second part. For a start, it doesn't add that
much over the first part​: the file won't have been opened under any
circumstances and writes to it will still fail, even if the return code
isn't checked. It's just that suddenly the whole program dies, just because
the code contains 'use warnings'. This seems a big escalation, If people
want that behaviour, they can always use autodie.

--
The crew of the Enterprise encounter an alien life form which is
surprisingly neither humanoid nor made from pure energy.
  -- Things That Never Happen in "Star Trek" #22

@p5pRT
Copy link
Author

p5pRT commented Mar 23, 2013

From @rurban

On Mar 23, 2013, at 8​:54 AM, Dave Mitchell via RT <perlbug-followup@​perl.org> wrote​:

On Thu, Mar 21, 2013 at 12​:37​:27PM -0700, Reini Urban via RT wrote​:

Add the fatal warnings category safesyscalls.
Disallow binary pathnames and arguments to other syscalls, strings
with embedded \0, which are ignored in the syscall but kept in
perl. Allow an ending \0 though, as several modules add a \0 to
such strings without adjusting the length.
The hidden payloads in these invalid string args may cause unnoticed
security problems, as they are ignored by the syscalls but kept around
in perl PVs.

Thanks for this.

Your description of this this change is a bit unclear; in particular, I'm
assuming that "Disallow binary pathnames" only refers to disallowing \0,
rather than some more general non-ASCII prohibition???

From what I understand of this patch, it makes the following changes​:

First in the absence of 'use warnings' (or in the presence of
'no warnings "safesyscalls"), it makes system calls like

open my $fh, ">", "foo\0bar"

return false and sets $! to 'no such file' or some such appropriate
error.

No, it actually follows the old behavior​: silently ignore and go on.

Second, in the presence of 'use warnings', it instead causes the open to
croak (captureable with eval as usual).

Is this a correct assessment?

Yes, the die part is correct.

Before discussing the detailed implementation on the patch, I think we
need a rough concensus on what semantics we desire.

My own personal opinion is that I like the first part​: causing system
calls to mandatorally (and unchangeably) return failure in the presence of
\0.

I'm not very keen on the second part. For a start, it doesn't add that
much over the first part​: the file won't have been opened under any
circumstances and writes to it will still fail, even if the return code
isn't checked. It's just that suddenly the whole program dies, just because
the code contains 'use warnings'. This seems a big escalation, If people
want that behaviour, they can always use autodie.

There are four possible reactions​: ignore, silently strip, warn and strip, or die.

Without warnings or with no warnings 'syscalls' perl behaves as before​: ignore.

With warnings enabled I went for die, because such a string argument is
almost always a breakin attempt.
The second option would be a warning with strip of the string (reset the SvCUR).
But since the typical use case for this is harmful and an aggressive breakin
attempt I went for die.

I explicitly allow typical harmless programmer errors adding another \0
at the end.

@p5pRT
Copy link
Author

p5pRT commented Mar 23, 2013

From @mauke

Oops, I accidentally sent this to Reini only when I meant to send it to
the list. Sorry about that.

-------- Original Message --------
Subject​: Re​: [perl #117265] [PATCH] e213661 no warnings 'safesyscalls',
fatal nul checks
Date​: Sat, 23 Mar 2013 16​:05​:04 +0100
From​: Lukas Mai <plokinom@​gmail.com>
To​: Reini Urban <reini@​cpanel.net>

On 23.03.2013 15​:49, Reini Urban wrote​:

On Thu, Mar 21, 2013 at 12​:37​:27PM -0700, Reini Urban via RT wrote​:

Add the fatal warnings category safesyscalls.
Disallow binary pathnames and arguments to other syscalls, strings
with embedded \0, which are ignored in the syscall but kept in
perl. Allow an ending \0 though, as several modules add a \0 to
such strings without adjusting the length.
The hidden payloads in these invalid string args may cause unnoticed
security problems, as they are ignored by the syscalls but kept around
in perl PVs.

There are four possible reactions​: ignore, silently strip, warn and strip, or die.

I don't understand what you mean by "strip". Something like​:

  my $foo = "XYZ\0ABC";
  unlink $foo;
  # $foo eq "XYZ"

?

Without warnings or with no warnings 'syscalls' perl behaves as before​: ignore.

With warnings enabled I went for die, because such a string argument is
almost always a breakin attempt.
The second option would be a warning with strip of the string (reset the SvCUR).
But since the typical use case for this is harmful and an aggressive breakin
attempt I went for die.

So it works unlike every other warning category? AFAIK all other
warnings are simply ... well, warnings. They don't cause fatal errors
(unless you explicitly make them fatal).

I explicitly allow typical harmless programmer errors adding another \0
at the end.

My favorite semantics would be to not treat \0 specially, regardless of
warnings. That means failing with ENOENT if you pass such a string to
open, unlink, stat, ... etc, because as far as Perl is concerned, there
is a \0 in that string, and as far as the file system is concerned, no
such file exists. Similarly, glob("*\0") should return ().

The only problem I see with this is that it might break older scripts
that use a 2-arg open call of the form open(F, "< $foo\0") but we could
add a special exception for this case (only). That is​: In a call to
2-arg open where the filename $X contains exactly one \0 and it is the
last character, treat it as substr($X, 0, -1) instead.

--
Lukas Mai <plokinom@​gmail.com>

@p5pRT
Copy link
Author

p5pRT commented Mar 23, 2013

From @iabyn

On Sat, Mar 23, 2013 at 09​:49​:12AM -0500, Reini Urban wrote​:

No, it actually follows the old behavior​: silently ignore and go on.

Ah I was confused by this​:

+#define CHECK_SAFESYSCALLS(what,pv) \
+ if (SvPOK(pv)) { \
+ char *i; \
+ char *p = SvPVX(pv); \
+ if ( (i = strchr(p, 0))&&(i-p)&&((size_t)(i-p)<SvCUR(pv)-1) ) { \
+ SETERRNO(EINVAL, LIB_INVARG); \
+ if (ckWARN(WARN_SAFESYSCALLS)) \
+ Perl_croak(aTHX_ "Invalid \\0 character in %s​: %s\\0%s",\
+ what,p,++i); \

where you SETERRNO even when not warning. What purpose does that serve?

--
You're only as old as you look.

@p5pRT
Copy link
Author

p5pRT commented Mar 23, 2013

From tchrist@perl.com

Second, in the presence of 'use warnings', it instead causes the open to
croak (captureable with eval as usual).

Is this a correct assessment?

Yes, the die part is correct.

Before discussing the detailed implementation on the patch, I think we
need a rough concensus on what semantics we desire.

My own personal opinion is that I like the first part​: causing system
calls to mandatorally (and unchangeably) return failure in the presence of
\0.

I'm not very keen on the second part. For a start, it doesn't add that
much over the first part​: the file won't have been opened under any
circumstances and writes to it will still fail, even if the return code
isn't checked. It's just that suddenly the whole program dies, just because
the code contains 'use warnings'. This seems a big escalation, If people
want that behaviour, they can always use autodie.

There are four possible reactions​: ignore, silently strip, warn and stri<SNIP>

Without warnings or with no warnings 'syscalls' perl behaves as before​: <SNIP>

With warnings enabled I went for die, because such a string argument is
almost always a breakin attempt.

Hold on. You cannot have a fatal warning without asking for one.

A use warnings pragma must never raise an exception simply because you
asked for a warning. That's what "use warnings FATAL => blah" is for.

--tom

@p5pRT
Copy link
Author

p5pRT commented Mar 24, 2013

From @rjbs

* Lukas Mai <plokinom@​gmail.com> [2013-03-23T11​:34​:47]

I explicitly allow typical harmless programmer errors adding another \0
at the end.

My favorite semantics would be to not treat \0 specially, regardless of
warnings. That means failing with ENOENT if you pass such a string to
open, unlink, stat, ... etc, because as far as Perl is concerned, there
is a \0 in that string, and as far as the file system is concerned, no
such file exists. Similarly, glob("*\0") should return ().

This is also my feeling on the topic.

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented Mar 24, 2013

From @rurban

On Sat, Mar 23, 2013 at 7​:11 PM, Ricardo Signes
<perl.p5p@​rjbs.manxome.org> wrote​:

* Lukas Mai <plokinom@​gmail.com> [2013-03-23T11​:34​:47]

I explicitly allow typical harmless programmer errors adding another \0
at the end.

My favorite semantics would be to not treat \0 specially, regardless of
warnings. That means failing with ENOENT if you pass such a string to
open, unlink, stat, ... etc, because as far as Perl is concerned, there
is a \0 in that string, and as far as the file system is concerned, no
such file exists. Similarly, glob("*\0") should return ().

tomc​:
Hold on. You cannot have a fatal warning without asking for one.

A use warnings pragma must never raise an exception simply because you
asked for a warning. That's what "use warnings FATAL => blah" is for.

You are right, my mistake. I was fooled by the existing severe
DEFAULT_ON categories.

I'll ditch the severe warnings category and go to ENOENT + return undef,
with optional warnings if enabled.

I'm also not sure if I missed the most important part​: nul in pp_require args.
I forgot a testcase for that.
--
Reini Urban
http​://cpanel.net/ http​://www.perl-compiler.org/

@p5pRT
Copy link
Author

p5pRT commented Mar 25, 2013

From @rurban

On 03/23/2013 10​:35 AM, Lukas Mai via RT wrote​:

Oops, I accidentally sent this to Reini only when I meant to send it to
the list. Sorry about that.

-------- Original Message --------
Subject​: Re​: [perl #117265] [PATCH] e213661 no warnings 'safesyscalls',
fatal nul checks
Date​: Sat, 23 Mar 2013 16​:05​:04 +0100
From​: Lukas Mai <plokinom@​gmail.com>
To​: Reini Urban <reini@​cpanel.net>

On 23.03.2013 15​:49, Reini Urban wrote​:

On Thu, Mar 21, 2013 at 12​:37​:27PM -0700, Reini Urban via RT wrote​:

Add the fatal warnings category safesyscalls.
Disallow binary pathnames and arguments to other syscalls, strings
with embedded \0, which are ignored in the syscall but kept in
perl. Allow an ending \0 though, as several modules add a \0 to
such strings without adjusting the length.
The hidden payloads in these invalid string args may cause unnoticed
security problems, as they are ignored by the syscalls but kept around
in perl PVs.

There are four possible reactions​: ignore, silently strip, warn and strip, or die.

I don't understand what you mean by "strip". Something like​:

my $foo = "XYZ\\0ABC";
unlink $foo;
\# $foo eq "XYZ"

?

The syscall behind unlink sees only "XYZ", and deletes XYZ if it exists.
So from the perl POV it's like a "silent strip" without actually
stripping it.

...

I explicitly allow typical harmless programmer errors adding another \0
at the end.

My favorite semantics would be to not treat \0 specially, regardless of
warnings. That means failing with ENOENT if you pass such a string to
open, unlink, stat, ... etc, because as far as Perl is concerned, there
is a \0 in that string, and as far as the file system is concerned, no
such file exists. Similarly, glob("*\0") should return ().

Agreed.

The only problem I see with this is that it might break older scripts
that use a 2-arg open call of the form open(F, "< $foo\0") but we could
add a special exception for this case (only). That is​: In a call to
2-arg open where the filename $X contains exactly one \0 and it is the
last character, treat it as substr($X, 0, -1) instead.

That's why I said​: "I explicitly allow typical harmless programmer
errors adding another \0 at the end." I do exactly that, and for this
case only. There were too many failures because of this habit. I only
fixed ExtUtils-Manifest.

--
Reini

Working towards a true Modern Perl.
Slim, functional, unbloated, compile-time optimizable

@p5pRT
Copy link
Author

p5pRT commented Mar 25, 2013

From @demerphq

On 25 March 2013 15​:12, Reini Urban <reini@​cpanel.net> wrote​:

On 03/23/2013 10​:35 AM, Lukas Mai via RT wrote​:

Oops, I accidentally sent this to Reini only when I meant to send it to
the list. Sorry about that.

-------- Original Message --------
Subject​: Re​: [perl #117265] [PATCH] e213661 no warnings 'safesyscalls',
fatal nul checks
Date​: Sat, 23 Mar 2013 16​:05​:04 +0100
From​: Lukas Mai <plokinom@​gmail.com>
To​: Reini Urban <reini@​cpanel.net>

On 23.03.2013 15​:49, Reini Urban wrote​:

On Thu, Mar 21, 2013 at 12​:37​:27PM -0700, Reini Urban via RT wrote​:

Add the fatal warnings category safesyscalls.
Disallow binary pathnames and arguments to other syscalls, strings
with embedded \0, which are ignored in the syscall but kept in
perl. Allow an ending \0 though, as several modules add a \0 to
such strings without adjusting the length.
The hidden payloads in these invalid string args may cause unnoticed
security problems, as they are ignored by the syscalls but kept around
in perl PVs.

There are four possible reactions​: ignore, silently strip, warn and
strip, or die.

I don't understand what you mean by "strip". Something like​:

my $foo = "XYZ\\0ABC";
unlink $foo;
\# $foo eq "XYZ"

?

The syscall behind unlink sees only "XYZ", and deletes XYZ if it exists. So
from the perl POV it's like a "silent strip" without actually stripping it.

...

I explicitly allow typical harmless programmer errors adding another \0
at the end.

My favorite semantics would be to not treat \0 specially, regardless of
warnings. That means failing with ENOENT if you pass such a string to
open, unlink, stat, ... etc, because as far as Perl is concerned, there
is a \0 in that string, and as far as the file system is concerned, no
such file exists. Similarly, glob("*\0") should return ().

Agreed.

The only problem I see with this is that it might break older scripts
that use a 2-arg open call of the form open(F, "< $foo\0") but we could
add a special exception for this case (only). That is​: In a call to
2-arg open where the filename $X contains exactly one \0 and it is the
last character, treat it as substr($X, 0, -1) instead.

That's why I said​: "I explicitly allow typical harmless programmer errors
adding another \0 at the end." I do exactly that, and for this case only.
There were too many failures because of this habit. I only fixed
ExtUtils-Manifest.

Reini, on Win32 filenames are stored internally as UTF-16. What affect
does your patch proposal have on opening files with widecharacters in
them? (Widecharacters as you know could easily contain nuls).

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Mar 25, 2013

From @Leont

On Mon, Mar 25, 2013 at 3​:14 PM, demerphq <demerphq@​gmail.com> wrote​:

Reini, on Win32 filenames are stored internally as UTF-16. What affect
does your patch proposal have on opening files with widecharacters in
them? (Widecharacters as you know could easily contain nuls).

Perl uses legacy interfaces on Windows (that is, it accesses the
filesystem using 8-bit interfaces, the encoding is system-defined but
tends to be latin-1, which saves our ass most of the time).

I'd consider this our number one Windows bug (because it screws up
badly when trying to open files with Unicode in their names), but
fixing this will be non-trivial, and few people have the appropriate
Windows knowledge anyway.

Leon

@p5pRT
Copy link
Author

p5pRT commented Mar 25, 2013

From @demerphq

On 25 March 2013 15​:28, Leon Timmermans <fawaka@​gmail.com> wrote​:

On Mon, Mar 25, 2013 at 3​:14 PM, demerphq <demerphq@​gmail.com> wrote​:

Reini, on Win32 filenames are stored internally as UTF-16. What affect
does your patch proposal have on opening files with widecharacters in
them? (Widecharacters as you know could easily contain nuls).

Perl uses legacy interfaces on Windows (that is, it accesses the
filesystem using 8-bit interfaces, the encoding is system-defined but
tends to be latin-1, which saves our ass most of the time).

I'd consider this our number one Windows bug (because it screws up
badly when trying to open files with Unicode in their names), but
fixing this will be non-trivial, and few people have the appropriate
Windows knowledge anyway.

Im just worried that Reinis null surpression might break something
there. Sounds like you think we needn't worry on that point, as such,
I wont. :-)

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Mar 25, 2013

From @Leont

On Mon, Mar 25, 2013 at 3​:34 PM, demerphq <demerphq@​gmail.com> wrote​:

Im just worried that Reinis null surpression might break something
there. Sounds like you think we needn't worry on that point, as such,
I wont. :-)

Well, the second question is "do all non-unix operating systems deal
with null in the same way". I'm reasonably sure Windows does, but have
no idea about for example VMS.

Leon

@p5pRT
Copy link
Author

p5pRT commented Mar 25, 2013

From @steve-m-hay

Leon Timmermans wrote on 2013-03-25​:

On Mon, Mar 25, 2013 at 3​:14 PM, demerphq <demerphq@​gmail.com> wrote​:

Reini, on Win32 filenames are stored internally as UTF-16. What affect
does your patch proposal have on opening files with widecharacters in
them? (Widecharacters as you know could easily contain nuls).

Perl uses legacy interfaces on Windows (that is, it accesses the
filesystem using 8-bit interfaces, the encoding is system-defined but
tends to be latin-1, which saves our ass most of the time).

I'd consider this our number one Windows bug (because it screws up
badly when trying to open files with Unicode in their names), but
fixing this will be non-trivial, and few people have the appropriate
Windows knowledge anyway.

It is possible to open Unicode filenames on Windows using Win32API​::File (see below). I assume this will at least still work?

use Encode qw(encode);
use Fcntl;
use Win32API​::File qw(CreateFileW OsFHandleOpenFd :Misc :GENERIC_);

my $filename = "\x{65e5}.txt\0";
my $utf16le_filename = encode('UTF-16LE', $filename, 1);

my $h = CreateFileW($utf16le_filename, GENERIC_WRITE, 0, [], CREATE_NEW, 0, []);
  die "[write] CreateFileW​: $^E\n" unless $h > 0;
my $fd = OsFHandleOpenFd($h, O_WRONLY);
  die "[write] OsFHandleOpenFd​: $^E\n" unless $fd > 0;
open my $fh, ">&=$fd";
  die "[write] open​: $!\n" unless $fh;
print $fh "Hello, world.\n";
close $fh;

@p5pRT
Copy link
Author

p5pRT commented Mar 25, 2013

From @rurban

On 03/25/2013 09​:35 AM, yves orton via RT wrote​:

On 25 March 2013 15​:28, Leon Timmermans <fawaka@​gmail.com> wrote​:

On Mon, Mar 25, 2013 at 3​:14 PM, demerphq <demerphq@​gmail.com> wrote​:

Reini, on Win32 filenames are stored internally as UTF-16. What affect
does your patch proposal have on opening files with widecharacters in
them? (Widecharacters as you know could easily contain nuls).

Perl uses legacy interfaces on Windows (that is, it accesses the
filesystem using 8-bit interfaces, the encoding is system-defined but
tends to be latin-1, which saves our ass most of the time).

I'd consider this our number one Windows bug (because it screws up
badly when trying to open files with Unicode in their names), but
fixing this will be non-trivial, and few people have the appropriate
Windows knowledge anyway.

Im just worried that Reinis null surpression might break something
there. Sounds like you think we needn't worry on that point, as such,
I wont. :-)

I'll test it.
win32ce for sure uses the wide api, and seems to use wide perl strings,
which would break then, right. For win32 plain I guess It's some global
settings which controls if the W or A version is used, and I see no
translation from perl utf8 strings to the wide strings in win32/win32io,
They are used asis.

And I'm not sure if -C allows using wide chars (seems to set only
utf-8), or if wide locales are even valid.
I only know about nul-safe encodings, and -C was used to set wide until
5.8.1 on windows. The -C wide API was disabled since then.

--
Reini

Working towards a true Modern Perl.
Slim, functional, unbloated, compile-time optimizable

@p5pRT
Copy link
Author

p5pRT commented Mar 25, 2013

From @demerphq

On 25 March 2013 15​:41, Steve Hay <Steve.Hay@​verosoftware.com> wrote​:

Leon Timmermans wrote on 2013-03-25​:

On Mon, Mar 25, 2013 at 3​:14 PM, demerphq <demerphq@​gmail.com> wrote​:

Reini, on Win32 filenames are stored internally as UTF-16. What affect
does your patch proposal have on opening files with widecharacters in
them? (Widecharacters as you know could easily contain nuls).

Perl uses legacy interfaces on Windows (that is, it accesses the
filesystem using 8-bit interfaces, the encoding is system-defined but
tends to be latin-1, which saves our ass most of the time).

I'd consider this our number one Windows bug (because it screws up
badly when trying to open files with Unicode in their names), but
fixing this will be non-trivial, and few people have the appropriate
Windows knowledge anyway.

It is possible to open Unicode filenames on Windows using Win32API​::File (see below). I assume this will at least still work?

use Encode qw(encode);
use Fcntl;
use Win32API​::File qw(CreateFileW OsFHandleOpenFd :Misc :GENERIC_);

my $filename = "\x{65e5}.txt\0";
my $utf16le_filename = encode('UTF-16LE', $filename, 1);

my $h = CreateFileW($utf16le_filename, GENERIC_WRITE, 0, [], CREATE_NEW, 0, []);
die "[write] CreateFileW​: $^E\n" unless $h > 0;
my $fd = OsFHandleOpenFd($h, O_WRONLY);
die "[write] OsFHandleOpenFd​: $^E\n" unless $fd > 0;
open my $fh, ">&=$fd";
die "[write] open​: $!\n" unless $fh;
print $fh "Hello, world.\n";
close $fh;

Is there an open ticket about this bug? (IOW, that Perl cant open
widechar filenames without jumping through the proverbial hoop?)

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Mar 25, 2013

From @steve-m-hay

demerphq wrote on 2013-03-25​:

On 25 March 2013 15​:41, Steve Hay <Steve.Hay@​verosoftware.com> wrote​:

Leon Timmermans wrote on 2013-03-25​:

On Mon, Mar 25, 2013 at 3​:14 PM, demerphq <demerphq@​gmail.com>
wrote​:

Reini, on Win32 filenames are stored internally as UTF-16. What
affect does your patch proposal have on opening files with
widecharacters in them? (Widecharacters as you know could easily
contain nuls).

Perl uses legacy interfaces on Windows (that is, it accesses the
filesystem using 8-bit interfaces, the encoding is system-defined
but
tends to be latin-1, which saves our ass most of the time).

I'd consider this our number one Windows bug (because it screws up
badly when trying to open files with Unicode in their names), but
fixing this will be non-trivial, and few people have the
appropriate Windows knowledge anyway.

It is possible to open Unicode filenames on Windows using
Win32API​::File (see below). I assume this will at least still work?

use Encode qw(encode);
use Fcntl;
use Win32API​::File qw(CreateFileW OsFHandleOpenFd :Misc :GENERIC_);

my $filename = "\x{65e5}.txt\0";
my $utf16le_filename = encode('UTF-16LE', $filename, 1);

my $h = CreateFileW($utf16le_filename, GENERIC_WRITE, 0, [],
CREATE_NEW, 0, []);
die "[write] CreateFileW​: $^E\n" unless $h > 0; my $fd =
OsFHandleOpenFd($h, O_WRONLY); die "[write] OsFHandleOpenFd​:
$^E\n" unless $fd > 0; open my $fh, ">&=$fd"; die "[write]
open​: $!\n" unless $fh; print $fh "Hello,
world.\n"; close $fh;

Is there an open ticket about this bug? (IOW, that Perl cant open
widechar filenames without jumping through the proverbial hoop?)

Not sure whether there is an RT ticket for this specific issue without
having a look, but it's all tied up with the "Unicode in Filename" and
"Virtualize operating system access" items in Porting/todo.pod...

@p5pRT
Copy link
Author

p5pRT commented Mar 26, 2013

From @rurban

Attached is the revised patch

Check for the nul char in pathnames and string arguments to
syscalls, return undef and set errno to ENOENT.
Added to the default severe warnings category syscalls.
 
Strings with embedded \0 chars were prev. ignored in the syscall but
kept in perl. The hidden payloads in these invalid string args may cause
unnoticed security problems, as they are hard to detect, ignored by
the syscalls but kept around in perl PVs.
Allow an ending \0 though, as several modules add a \0 to
such strings without adjusting the length.
Ignored on WinCE since this uses the wide char API.

--
Reini Urban

@p5pRT
Copy link
Author

p5pRT commented Mar 26, 2013

From @rurban

0001-safesyscalls-check-embedded-nul-in-syscall-args.patch
From 376abbcc174dc5e9262a4c6a6a2c7862b611fbfd Mon Sep 17 00:00:00 2001
From: Reini Urban <rurban@x-ray.at>
Date: Tue, 12 Mar 2013 19:25:26 +0100
Subject: [PATCH] safesyscalls: check embedded nul in syscall args

Check for the nul char in pathnames and string arguments to
syscalls, return undef and set errno to ENOENT.
Added to the default severe warnings category syscalls.

Strings with embedded \0 chars were prev. ignored in the syscall but
kept in perl. The hidden payloads in these invalid string args may cause
unnoticed security problems, as they are hard to detect, ignored by
the syscalls but kept around in perl PVs.
Allow an ending \0 though, as several modules add a \0 to
such strings without adjusting the length.
Ignored on WinCE since this uses the wide char API.
---
 doio.c                  |    9 ++++++++-
 embed.fnc               |    4 ++--
 ext/File-Glob/Glob.xs   |    3 ++-
 lib/warnings.pm         |   15 +++++++++------
 perl.h                  |   23 +++++++++++++++++++++++
 perlio.c                |    4 ++++
 pod/perldiag.pod        |    9 +++++++++
 pp_ctl.c                |    5 +++++
 regen/warnings.pl       |    6 +++---
 t/io/open.t             |   27 ++++++++++++++++++++++++++-
 t/lib/warnings/6default |    5 +++++
 t/op/require_errors.t   |   18 +++++++++++++++++-
 t/porting/diag.t        |    1 +
 warnings.h              |    1 +
 14 files changed, 115 insertions(+), 15 deletions(-)

diff --git a/doio.c b/doio.c
index 4e8d48a..d6c1f81 100644
--- a/doio.c
+++ b/doio.c
@@ -216,6 +216,7 @@ Perl_do_openn(pTHX_ GV *gv, const char *oname, I32 len, int as_raw,
 		goto say_false;
 	    }
 #endif /* USE_STDIO */
+	    CHECK_PATHNAME(*svp, FALSE);
 	    name = (SvOK(*svp) || SvGMAGICAL(*svp)) ?
 			savesvpv (*svp) : savepvs ("");
 	    SAVEFREEPV(name);
@@ -1638,6 +1639,7 @@ Perl_apply(pTHX_ I32 type, SV **mark, SV **sp)
 		else {
 		    const char *name = SvPV_nomg_const_nolen(*mark);
 		    APPLY_TAINT_PROPER();
+		    CHECK_PATHNAME(*mark, --tot);
 		    if (PerlLIO_chmod(name, val))
 			tot--;
 		}
@@ -1672,6 +1674,7 @@ Perl_apply(pTHX_ I32 type, SV **mark, SV **sp)
 		else {
 		    const char *name = SvPV_nomg_const_nolen(*mark);
 		    APPLY_TAINT_PROPER();
+		    CHECK_PATHNAME(*mark, --tot);
 		    if (PerlLIO_chown(name, val, val2))
 			tot--;
 		}
@@ -1690,6 +1693,7 @@ nothing in the core.
 	APPLY_TAINT_PROPER();
 	if (mark == sp)
 	    break;
+	CHECK_SYSCALL(*mark, --tot);
 	s = SvPVx_const(*++mark, len);
 	if (*s == '-' && isALPHA(s[1]))
 	{
@@ -1702,7 +1706,7 @@ nothing in the core.
 		s += 3;
                 len -= 3;
             }
-           if ((val = whichsig_pvn(s, len)) < 0)
+	    if ((val = whichsig_pvn(s, len)) < 0)
                Perl_croak(aTHX_ "Unrecognized signal name \"%"SVf"\"", SVfARG(*mark));
 	}
 	else
@@ -1773,6 +1777,7 @@ nothing in the core.
 	while (++mark <= sp) {
 	    s = SvPV_nolen_const(*mark);
 	    APPLY_TAINT_PROPER();
+	    CHECK_PATHNAME(*mark, --tot);
 	    if (PerlProc_geteuid() || PL_unsafe) {
 		if (UNLINK(s))
 		    tot--;
@@ -1851,6 +1856,7 @@ nothing in the core.
 		else {
 		    const char * const name = SvPV_nomg_const_nolen(*mark);
 		    APPLY_TAINT_PROPER();
+		    CHECK_PATHNAME(*mark, --tot);
 #ifdef HAS_FUTIMES
 		    if (utimes(name, (struct timeval *)utbufp))
 #else
@@ -2343,6 +2349,7 @@ Perl_start_glob (pTHX_ SV *tmpglob, IO *io)
 
     PERL_ARGS_ASSERT_START_GLOB;
 
+    CHECK_SYSCALL(tmpglob, NULL);
     ENTER;
     SAVEFREESV(tmpcmd);
 #ifdef VMS /* expand the wildcards right here, rather than opening a pipe, */
diff --git a/embed.fnc b/embed.fnc
index 2f5e089..8c981f2 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -2253,7 +2253,7 @@ s	|void	|printbuf	|NN const char *const fmt|NN const char *const s
 #endif
 
 #if defined(PERL_IN_UNIVERSAL_C)
-s	|bool|isa_lookup	|NN HV *stash|NN const char * const name \
+s	|bool	|isa_lookup	|NN HV *stash|NN const char * const name \
                                         |STRLEN len|U32 flags
 #endif
 
@@ -2266,7 +2266,7 @@ s	|char*	|stdize_locale	|NN char* locs
 #if defined(PERL_IN_UTIL_C)
 s	|const COP*|closest_cop	|NN const COP *cop|NULLOK const OP *o
 s	|SV*	|mess_alloc
-s	|SV *|with_queued_errors|NN SV *ex
+s	|SV *	|with_queued_errors|NN SV *ex
 s	|bool	|invoke_exception_hook|NULLOK SV *ex|bool warn
 #if defined(PERL_MEM_LOG) && !defined(PERL_MEM_LOG_NOIMPL)
 sn	|void	|mem_log_common	|enum mem_log_type mlt|const UV n|const UV typesize \
diff --git a/ext/File-Glob/Glob.xs b/ext/File-Glob/Glob.xs
index df5530a..aa2b186 100644
--- a/ext/File-Glob/Glob.xs
+++ b/ext/File-Glob/Glob.xs
@@ -227,7 +227,8 @@ csh_glob(pTHX_ AV *entries, SV *patsv)
 
 	assert(SvTYPE(entries) != SVt_PVAV);
 	sv_upgrade((SV *)entries, SVt_PVAV);
-	
+	CHECK_SYSCALL(patsv, TRUE);
+
 	if (patav) {
 	    I32 items = AvFILLp(patav) + 1;
 	    SV **svp = AvARRAY(patav);
diff --git a/lib/warnings.pm b/lib/warnings.pm
index c0c2cc9..3d9ac00 100644
--- a/lib/warnings.pm
+++ b/lib/warnings.pm
@@ -232,10 +232,11 @@ our %Offsets = (
     'experimental::lexical_subs'=> 104,
     'experimental::lexical_topic'=> 106,
     'experimental::regex_sets'=> 108,
+    'syscalls'		=> 110,
   );
 
 our %Bits = (
-    'all'		=> "\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x15", # [0..54]
+    'all'		=> "\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55", # [0..55]
     'ambiguous'		=> "\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00", # [29]
     'bareword'		=> "\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00", # [30]
     'closed'		=> "\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [6]
@@ -277,11 +278,12 @@ our %Bits = (
     'regexp'		=> "\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00", # [20]
     'reserved'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00", # [37]
     'semicolon'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00", # [38]
-    'severe'		=> "\x00\x00\x00\x00\x00\x54\x05\x00\x00\x00\x00\x00\x00\x00", # [21..25]
+    'severe'		=> "\x00\x00\x00\x00\x00\x54\x05\x00\x00\x00\x00\x00\x00\x40", # [21..25,55]
     'signal'		=> "\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00", # [26]
     'substr'		=> "\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00", # [27]
     'surrogate'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00", # [50]
     'syntax'		=> "\x00\x00\x00\x00\x00\x00\x00\x55\x55\x15\x00\x40\x00\x00", # [28..38,47]
+    'syscalls'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x40", # [55]
     'taint'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00", # [39]
     'threads'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00", # [40]
     'uninitialized'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00", # [41]
@@ -293,7 +295,7 @@ our %Bits = (
   );
 
 our %DeadBits = (
-    'all'		=> "\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\x2a", # [0..54]
+    'all'		=> "\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa", # [0..55]
     'ambiguous'		=> "\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00", # [29]
     'bareword'		=> "\x00\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00\x00\x00\x00", # [30]
     'closed'		=> "\x00\x20\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [6]
@@ -335,11 +337,12 @@ our %DeadBits = (
     'regexp'		=> "\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00", # [20]
     'reserved'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00", # [37]
     'semicolon'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00\x00", # [38]
-    'severe'		=> "\x00\x00\x00\x00\x00\xa8\x0a\x00\x00\x00\x00\x00\x00\x00", # [21..25]
+    'severe'		=> "\x00\x00\x00\x00\x00\xa8\x0a\x00\x00\x00\x00\x00\x00\x80", # [21..25,55]
     'signal'		=> "\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00\x00\x00\x00\x00", # [26]
     'substr'		=> "\x00\x00\x00\x00\x00\x00\x80\x00\x00\x00\x00\x00\x00\x00", # [27]
     'surrogate'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x20\x00", # [50]
     'syntax'		=> "\x00\x00\x00\x00\x00\x00\x00\xaa\xaa\x2a\x00\x80\x00\x00", # [28..38,47]
+    'syscalls'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80", # [55]
     'taint'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80\x00\x00\x00\x00", # [39]
     'threads'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00", # [40]
     'uninitialized'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00", # [41]
@@ -351,8 +354,8 @@ our %DeadBits = (
   );
 
 $NONE     = "\0\0\0\0\0\0\0\0\0\0\0\0\0\0";
-$DEFAULT  = "\x10\x01\x00\x00\x00\x50\x04\x00\x00\x00\x00\x00\x00\x15", # [2,52..54,4,22,23,25]
-$LAST_BIT = 110 ;
+$DEFAULT  = "\x10\x01\x00\x00\x00\x50\x04\x00\x00\x00\x00\x00\x00\x55", # [2,52..54,4,22,23,25,55]
+$LAST_BIT = 112 ;
 $BYTES    = 14 ;
 
 $All = "" ; vec($All, $Offsets{'all'}, 2) = 3 ;
diff --git a/perl.h b/perl.h
index 5639f1c..2093c48 100644
--- a/perl.h
+++ b/perl.h
@@ -5814,6 +5814,29 @@ extern void moncontrol(int);
 #  define do_aexec(really, mark,sp)	do_aexec5(really, mark, sp, 0, 0)
 #endif
 
+/* check embedded \0 characters in pathnames passed to syscalls,
+   but allow one ending \0 */
+#define CHECK_PATHNAME(pv,r) CHECK_SAFESYSCALLS("pathname",pv,r)
+/* check embedded \0 characters in args for various other syscalls.
+   user and group names, glob. */
+#define CHECK_SYSCALL(pv,r)  CHECK_SAFESYSCALLS("syscall",pv,r)
+/* if (CHECK_SYSCALL(pv, return NULL)); */
+#if defined(WIN32) && (defined(UNICODE) || defined(UNDER_CE))
+#define CHECK_SAFESYSCALLS(what,pv,reaction)
+/* TODO: wchar API, so far only used on wince */
+#else
+#define CHECK_SAFESYSCALLS(what,pv,reaction)                            \
+    if (SvPOK(pv)) {                                                    \
+      char *i;                                                          \
+      char *p = SvPVX(pv);                                              \
+      if ( (i = strchr(p, 0))&&(i-p)&&((size_t)(i-p)<SvCUR(pv)-1) ) {   \
+        SETERRNO(ENOENT, LIB_INVARG);                                   \
+        if (ckWARN_d(WARN_SYSCALLS)) {                                  \
+            ck_warner_d(packWARN(WARN_SYSCALLS),                        \
+                        "Invalid \\0 character in %s: %s\\0%s", what,p,++i); \
+        } return reaction; }}
+#endif
+
 #if defined(OEMVS)
 #define NO_ENV_ARRAY_IN_MAIN
 #endif
diff --git a/perlio.c b/perlio.c
index 097bc49..77e6db3 100644
--- a/perlio.c
+++ b/perlio.c
@@ -310,6 +310,7 @@ PerlIO_openn(pTHX_ const char *layers, const char *mode, int fd,
 	    return PerlIO_tmpfile();
 	else {
 	    const char *name = SvPV_nolen_const(*args);
+	    CHECK_PATHNAME(*args, NULL);
 	    if (*mode == IoTYPE_NUMERIC) {
 		fd = PerlLIO_open3(name, imode, perm);
 		if (fd >= 0)
@@ -2713,6 +2714,7 @@ PerlIOUnix_open(pTHX_ PerlIO_funcs *self, PerlIO_list_t *layers,
 	}
 	if (imode != -1) {
 	    const char *path = SvPV_nolen_const(*args);
+	    CHECK_PATHNAME(*args, NULL);
 	    fd = PerlLIO_open3(path, imode, perm);
 	}
     }
@@ -3027,6 +3029,7 @@ PerlIOStdio_open(pTHX_ PerlIO_funcs *self, PerlIO_list_t *layers,
 	const char * const path = SvPV_nolen_const(*args);
 	PerlIOStdio * const s = PerlIOSelf(f, PerlIOStdio);
 	FILE *stdio;
+	CHECK_PATHNAME(*args, NULL);
 	PerlIOUnix_refcnt_dec(fileno(s->stdio));
 	stdio = PerlSIO_freopen(path, (mode = PerlIOStdio_mode(mode, tmode)),
 			    s->stdio);
@@ -3039,6 +3042,7 @@ PerlIOStdio_open(pTHX_ PerlIO_funcs *self, PerlIO_list_t *layers,
     else {
 	if (narg > 0) {
 	    const char * const path = SvPV_nolen_const(*args);
+	    CHECK_PATHNAME(*args, NULL);
 	    if (*mode == IoTYPE_NUMERIC) {
 		mode++;
 		fd = PerlLIO_open3(path, imode, perm);
diff --git a/pod/perldiag.pod b/pod/perldiag.pod
index 8d29e2d..c448208 100644
--- a/pod/perldiag.pod
+++ b/pod/perldiag.pod
@@ -2468,6 +2468,15 @@ indicated one isn't.  See L<charnames/CUSTOM ALIASES>.
 the C<:alias> option to C<use charnames> and the specified character in
 the indicated name isn't valid.  See L<charnames/CUSTOM ALIASES>.
 
+=item Invalid \0 character in %s: %s\0%s
+
+(S) Embedded \0 characters in pathnames or other syscall arguments
+create a warning since 5.18. The parts after the \0 were formerly
+ignored by syscalls.
+
+If you know what you are doing you can turn off this warning by
+C<no warnings 'syscalls';>.
+
 =item Invalid conversion in %s: "%s"
 
 (W printf) Perl does not understand the given format conversion.  See
diff --git a/pp_ctl.c b/pp_ctl.c
index c9d833f..8f4cfbd 100644
--- a/pp_ctl.c
+++ b/pp_ctl.c
@@ -3546,6 +3546,7 @@ S_check_type_and_open(pTHX_ SV *name)
 
     PERL_ARGS_ASSERT_CHECK_TYPE_AND_OPEN;
 
+    CHECK_PATHNAME(name, NULL);
     if (st_rc < 0 || S_ISDIR(st.st_mode) || S_ISBLK(st.st_mode)) {
 	return NULL;
     }
@@ -3671,6 +3672,10 @@ PP(pp_require)
     name = SvPV_const(sv, len);
     if (!(name && len > 0 && *name))
 	DIE(aTHX_ "Null filename used");
+    CHECK_PATHNAME(sv, Perl_die(aTHX_ "Can't locate %s:   %s",
+      pv_escape(newSVpvs_flags("",SVs_TEMP),SvPVX(sv),SvCUR(sv),SvCUR(sv)*2,NULL,
+                SvUTF8(sv)?PERL_PV_ESCAPE_UNI:0),
+      Strerror(ENOENT)));
     TAINT_PROPER("require");
 
 
diff --git a/regen/warnings.pl b/regen/warnings.pl
index 94a9843..d8343a9 100644
--- a/regen/warnings.pl
+++ b/regen/warnings.pl
@@ -16,7 +16,7 @@
 #
 # This script is normally invoked from regen.pl.
 
-$VERSION = '1.02_03';
+$VERSION = '1.02_04';
 
 BEGIN {
     require 'regen/regen_lib.pl';
@@ -56,7 +56,8 @@ my $tree = {
 	 			'internal'	=> [ 5.008, DEFAULT_OFF],
          			'debugging'	=> [ 5.008, DEFAULT_ON],
          			'malloc'	=> [ 5.008, DEFAULT_ON],
-	 		   }],
+			        'syscalls'      => [ 5.017, DEFAULT_ON],
+			   }],
         'deprecated'	=> [ 5.008, DEFAULT_ON],
        	'void'		=> [ 5.008, DEFAULT_OFF],
        	'recursion'	=> [ 5.008, DEFAULT_OFF],
@@ -92,7 +93,6 @@ my $tree = {
                                 'experimental::lexical_topic' =>
                                     [ 5.017, DEFAULT_ON ],
                         }],
-
        	 #'default'	=> [ 5.008, DEFAULT_ON ],
   	}],
 } ;
diff --git a/t/io/open.t b/t/io/open.t
index e06fc8e..d7eaab7 100644
--- a/t/io/open.t
+++ b/t/io/open.t
@@ -10,7 +10,7 @@ $|  = 1;
 use warnings;
 use Config;
 
-plan tests => 121;
+plan tests => 131;
 
 my $Perl = which_perl();
 
@@ -386,3 +386,28 @@ SKIP: {
     ok( eval { $fh->autoflush(1); 1 }, '$fh->autoflush(1) lives' );
     ok( $INC{'IO/File.pm'}, "IO::File now loaded" );
 }
+
+# [perl #117265] check for embedded nul in pathnames, allow ending \0 though
+{
+  my $WARN;
+  local $SIG{__WARN__} = sub { $WARN = shift };
+  my $fn = "tmp\0.invalid";
+  is(open(I, $fn), undef, "open with nul in pathnames since 5.18 [perl #117265]");
+  like($WARN, qr/^Invalid \\0 character in pathname: tmp/,
+       "warn on embedded nul"); $WARN = '';
+  is (unlink($fn), 0);
+  like($WARN, qr/^Invalid \\0 character in pathname: tmp/,
+       "also on unlink"); $WARN = '';
+  is(chmod(0644, $fn), 0);
+  like($WARN, qr/^Invalid \\0 character in pathname: tmp/,
+       "also on chmod"); $WARN = '';
+  is (glob($fn), ());
+  like($WARN, qr/^Invalid \\0 character in syscall: tmp/,
+       "also on glob"); $WARN = '';
+  {
+    no warnings 'syscalls';
+    $WARN = '';
+    is(open(I, $fn), undef, "open with nul with no warnings syscalls");
+    is($WARN, '', "ignore warning on embedded nul with no warnings syscalls");
+  }
+}
diff --git a/t/lib/warnings/6default b/t/lib/warnings/6default
index a8aafee..a501961 100644
--- a/t/lib/warnings/6default
+++ b/t/lib/warnings/6default
@@ -119,3 +119,8 @@ no warnings;
 }
 EXPECT
 
+########
+# Check if syscalls warn by default
+open(my $i, "foo\0bar");
+EXPECT
+Invalid \0 character in pathname: foo\0bar at - line 2.
diff --git a/t/op/require_errors.t b/t/op/require_errors.t
index e323948..7c102cd 100644
--- a/t/op/require_errors.t
+++ b/t/op/require_errors.t
@@ -7,7 +7,7 @@ BEGIN {
     require './test.pl';
 }
 
-plan(tests => 11);
+plan(tests => 16);
 
 my $nonfile = tempfile();
 
@@ -111,3 +111,19 @@ SKIP: {
 # I can't see how to test the EMFILE case
 # I can't see how to test the case of not displaying @INC in the message.
 # (and does that only happen on VMS?)
+
+# fail and print the full filename
+eval { no warnings 'syscalls'; require "strict.pm\0invalid"; };
+like $@, qr/^Can't locate strict\.pm\\0invalid: /, 'require nul check [perl #117265]';
+eval { no warnings 'syscalls'; do "strict.pm\0invalid"; };
+like $@, qr/^Can't locate strict\.pm\\0invalid: /, 'do nul check';
+{
+  my $WARN;
+  local $SIG{__WARN__} = sub { $WARN = shift };
+  eval { require "strict.pm\0invalid"; };
+  like $WARN, qr{^Invalid \\0 character in pathname: strict\.pm\\0invalid at }, 'nul warning';
+  like $@, qr{^Can't locate strict\.pm\\0invalid: }, 'nul error';
+}
+eval "require strict\0::invalid;";
+like $@, qr/^syntax error at \(eval \d+\) line 1/, 'parse error with \0 in barewords module names';
+
diff --git a/t/porting/diag.t b/t/porting/diag.t
index 2473e65..c8a3db2 100644
--- a/t/porting/diag.t
+++ b/t/porting/diag.t
@@ -476,6 +476,7 @@ Can't coerce readonly %s to string in %s
 Can't find string terminator %c%s%c anywhere before EOF
 Can't fix broken locale name "%s"
 Can't get short module name from a handle
+Can't locate %s:   %s
 Can't locate object method "%s" via package "%s" (perhaps you forgot to load "%s"?)
 Can't pipe "%s": %s
 Can't spawn: %s
diff --git a/warnings.h b/warnings.h
index d0bf710..cff4ed3 100644
--- a/warnings.h
+++ b/warnings.h
@@ -93,6 +93,7 @@
 #define WARN_EXPERIMENTAL__LEXICAL_SUBS 52
 #define WARN_EXPERIMENTAL__LEXICAL_TOPIC 53
 #define WARN_EXPERIMENTAL__REGEX_SETS 54
+#define WARN_SYSCALLS		 55
 
 #define WARNsize		14
 #define WARN_ALLstring		"\125\125\125\125\125\125\125\125\125\125\125\125\125\125"
-- 
1.7.10.4

@p5pRT
Copy link
Author

p5pRT commented Jun 27, 2013

From @tonycoz

On Tue Mar 26 14​:29​:41 2013, rurban wrote​:

Attached is the revised patch

Check for the nul char in pathnames and string arguments to
syscalls, return undef and set errno to ENOENT.
Added to the default severe warnings category syscalls.

Strings with embedded \0 chars were prev. ignored in the syscall but
kept in perl. The hidden payloads in these invalid string args may cause
unnoticed security problems, as they are hard to detect, ignored by
the syscalls but kept around in perl PVs.
Allow an ending \0 though, as several modules add a \0 to
such strings without adjusting the length.
Ignored on WinCE since this uses the wide char API.

Hi Reini,

I've had a look over this patch, but I can see it's treating a failure
due to \0 differently to failure due to an actual missing file, eg​:

tony@​mars​:.../git/perl$ ./perl -Wle 'print unlink "def", "abc\0def", "ghi"'
Invalid \0 character in pathname​: abc\0def at -e line 1.
1
tony@​mars​:.../git/perl$ ./perl -Wle 'print unlink "def", "abc", "ghi"'
0

I've attached a version of the patch updated to work with blead.

Tony

@p5pRT
Copy link
Author

p5pRT commented Jun 27, 2013

From @tonycoz

0001-safesyscalls-check-embedded-nul-in-syscall-args.patch
From ae9a37770ca59a6c76cc784bada78650c329bde1 Mon Sep 17 00:00:00 2001
From: Reini Urban <rurban@x-ray.at>
Date: Tue, 12 Mar 2013 19:25:26 +0100
Subject: [PATCH] safesyscalls: check embedded nul in syscall args

Check for the nul char in pathnames and string arguments to
syscalls, return undef and set errno to ENOENT.
Added to the default severe warnings category syscalls.

Strings with embedded \0 chars were prev. ignored in the syscall but
kept in perl. The hidden payloads in these invalid string args may cause
unnoticed security problems, as they are hard to detect, ignored by
the syscalls but kept around in perl PVs.
Allow an ending \0 though, as several modules add a \0 to
such strings without adjusting the length.
Ignored on WinCE since this uses the wide char API.
---
 doio.c                  |    9 +-
 embed.fnc               |    4 +-
 ext/File-Glob/Glob.pm   |    2 +-
 ext/File-Glob/Glob.xs   |    3 +-
 lib/warnings.pm         |  237 ++++++++++++++++++++++++-----------------------
 perl.h                  |   23 +++++
 perlio.c                |    4 +
 pod/perldiag.pod        |    9 ++
 pod/perllexwarn.pod     |    2 +
 pp_ctl.c                |    5 +
 regen/warnings.pl       |    7 +-
 t/io/open.t             |   27 +++++-
 t/lib/warnings/6default |    5 +
 t/op/caller.t           |    4 +-
 t/op/require_errors.t   |   18 +++-
 t/porting/diag.t        |    1 +
 warnings.h              |    7 +-
 17 files changed, 235 insertions(+), 132 deletions(-)

diff --git a/doio.c b/doio.c
index b24a5b4..7b631c2 100644
--- a/doio.c
+++ b/doio.c
@@ -216,6 +216,7 @@ Perl_do_openn(pTHX_ GV *gv, const char *oname, I32 len, int as_raw,
 		goto say_false;
 	    }
 #endif /* USE_STDIO */
+	    CHECK_PATHNAME(*svp, FALSE);
 	    name = (SvOK(*svp) || SvGMAGICAL(*svp)) ?
 			savesvpv (*svp) : savepvs ("");
 	    SAVEFREEPV(name);
@@ -1660,6 +1661,7 @@ Perl_apply(pTHX_ I32 type, SV **mark, SV **sp)
 		else {
 		    const char *name = SvPV_nomg_const_nolen(*mark);
 		    APPLY_TAINT_PROPER();
+		    CHECK_PATHNAME(*mark, --tot);
 		    if (PerlLIO_chmod(name, val))
 			tot--;
 		}
@@ -1694,6 +1696,7 @@ Perl_apply(pTHX_ I32 type, SV **mark, SV **sp)
 		else {
 		    const char *name = SvPV_nomg_const_nolen(*mark);
 		    APPLY_TAINT_PROPER();
+		    CHECK_PATHNAME(*mark, --tot);
 		    if (PerlLIO_chown(name, val, val2))
 			tot--;
 		}
@@ -1712,6 +1715,7 @@ nothing in the core.
 	APPLY_TAINT_PROPER();
 	if (mark == sp)
 	    break;
+	CHECK_SYSCALL(*mark, --tot);
 	s = SvPVx_const(*++mark, len);
 	if (*s == '-' && isALPHA(s[1]))
 	{
@@ -1724,7 +1728,7 @@ nothing in the core.
 		s += 3;
                 len -= 3;
             }
-           if ((val = whichsig_pvn(s, len)) < 0)
+	    if ((val = whichsig_pvn(s, len)) < 0)
                Perl_croak(aTHX_ "Unrecognized signal name \"%"SVf"\"", SVfARG(*mark));
 	}
 	else
@@ -1795,6 +1799,7 @@ nothing in the core.
 	while (++mark <= sp) {
 	    s = SvPV_nolen_const(*mark);
 	    APPLY_TAINT_PROPER();
+	    CHECK_PATHNAME(*mark, --tot);
 	    if (PerlProc_geteuid() || PL_unsafe) {
 		if (UNLINK(s))
 		    tot--;
@@ -1873,6 +1878,7 @@ nothing in the core.
 		else {
 		    const char * const name = SvPV_nomg_const_nolen(*mark);
 		    APPLY_TAINT_PROPER();
+		    CHECK_PATHNAME(*mark, --tot);
 #ifdef HAS_FUTIMES
 		    if (utimes(name, (struct timeval *)utbufp))
 #else
@@ -2365,6 +2371,7 @@ Perl_start_glob (pTHX_ SV *tmpglob, IO *io)
 
     PERL_ARGS_ASSERT_START_GLOB;
 
+    CHECK_SYSCALL(tmpglob, NULL);
     ENTER;
     SAVEFREESV(tmpcmd);
 #ifdef VMS /* expand the wildcards right here, rather than opening a pipe, */
diff --git a/embed.fnc b/embed.fnc
index a906eae..070df24 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -2257,7 +2257,7 @@ s	|void	|printbuf	|NN const char *const fmt|NN const char *const s
 #endif
 
 #if defined(PERL_IN_UNIVERSAL_C)
-s	|bool|isa_lookup	|NN HV *stash|NN const char * const name \
+s	|bool	|isa_lookup	|NN HV *stash|NN const char * const name \
                                         |STRLEN len|U32 flags
 #endif
 
@@ -2270,7 +2270,7 @@ s	|char*	|stdize_locale	|NN char* locs
 #if defined(PERL_IN_UTIL_C)
 s	|const COP*|closest_cop	|NN const COP *cop|NULLOK const OP *o
 s	|SV*	|mess_alloc
-s	|SV *|with_queued_errors|NN SV *ex
+s	|SV *	|with_queued_errors|NN SV *ex
 s	|bool	|invoke_exception_hook|NULLOK SV *ex|bool warn
 #if defined(PERL_MEM_LOG) && !defined(PERL_MEM_LOG_NOIMPL)
 sn	|void	|mem_log_common	|enum mem_log_type mlt|const UV n|const UV typesize \
diff --git a/ext/File-Glob/Glob.pm b/ext/File-Glob/Glob.pm
index 379d7f0..3001679 100644
--- a/ext/File-Glob/Glob.pm
+++ b/ext/File-Glob/Glob.pm
@@ -37,7 +37,7 @@ pop @{$EXPORT_TAGS{bsd_glob}}; # no "glob"
 
 @EXPORT_OK   = (@{$EXPORT_TAGS{'glob'}}, 'csh_glob');
 
-$VERSION = '1.20';
+$VERSION = '1.21';
 
 sub import {
     require Exporter;
diff --git a/ext/File-Glob/Glob.xs b/ext/File-Glob/Glob.xs
index df5530a..aa2b186 100644
--- a/ext/File-Glob/Glob.xs
+++ b/ext/File-Glob/Glob.xs
@@ -227,7 +227,8 @@ csh_glob(pTHX_ AV *entries, SV *patsv)
 
 	assert(SvTYPE(entries) != SVt_PVAV);
 	sv_upgrade((SV *)entries, SVt_PVAV);
-	
+	CHECK_SYSCALL(patsv, TRUE);
+
 	if (patav) {
 	    I32 items = AvFILLp(patav) + 1;
 	    SV **svp = AvARRAY(patav);
diff --git a/lib/warnings.pm b/lib/warnings.pm
index 7d988cb..ad32517 100644
--- a/lib/warnings.pm
+++ b/lib/warnings.pm
@@ -5,7 +5,7 @@
 
 package warnings;
 
-our $VERSION = '1.18';
+our $VERSION = '1.19';
 
 # Verify that we're called correctly so that warnings will work.
 # see also strict.pm.
@@ -233,130 +233,133 @@ our %Offsets = (
     'experimental::lexical_topic'=> 106,
     'experimental::regex_sets'=> 108,
     'experimental::smartmatch'=> 110,
+    'syscalls'		=> 112,
   );
 
 our %Bits = (
-    'all'		=> "\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55", # [0..55]
-    'ambiguous'		=> "\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00", # [29]
-    'bareword'		=> "\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00", # [30]
-    'closed'		=> "\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [6]
-    'closure'		=> "\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [1]
-    'debugging'		=> "\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00", # [22]
-    'deprecated'	=> "\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [2]
-    'digit'		=> "\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00", # [31]
-    'exec'		=> "\x00\x40\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [7]
-    'exiting'		=> "\x40\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [3]
-    'experimental'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x40\x55", # [51..55]
-    'experimental::lexical_subs'=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01", # [52]
-    'experimental::lexical_topic'=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x04", # [53]
-    'experimental::regex_sets'=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10", # [54]
-    'experimental::smartmatch'=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x40", # [55]
-    'glob'		=> "\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [4]
-    'illegalproto'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00", # [47]
-    'imprecision'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00", # [46]
-    'inplace'		=> "\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\x00", # [23]
-    'internal'		=> "\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00", # [24]
-    'io'		=> "\x00\x54\x55\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [5..11]
-    'layer'		=> "\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [8]
-    'malloc'		=> "\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00", # [25]
-    'misc'		=> "\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [12]
-    'newline'		=> "\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [9]
-    'non_unicode'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00", # [48]
-    'nonchar'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00", # [49]
-    'numeric'		=> "\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [13]
-    'once'		=> "\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [14]
-    'overflow'		=> "\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [15]
-    'pack'		=> "\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [16]
-    'parenthesis'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00", # [32]
-    'pipe'		=> "\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [10]
-    'portable'		=> "\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [17]
-    'precedence'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00", # [33]
-    'printf'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00", # [34]
-    'prototype'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00", # [35]
-    'qw'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00", # [36]
-    'recursion'		=> "\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [18]
-    'redefine'		=> "\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [19]
-    'regexp'		=> "\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00", # [20]
-    'reserved'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00", # [37]
-    'semicolon'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00", # [38]
-    'severe'		=> "\x00\x00\x00\x00\x00\x54\x05\x00\x00\x00\x00\x00\x00\x00", # [21..25]
-    'signal'		=> "\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00", # [26]
-    'substr'		=> "\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00", # [27]
-    'surrogate'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00", # [50]
-    'syntax'		=> "\x00\x00\x00\x00\x00\x00\x00\x55\x55\x15\x00\x40\x00\x00", # [28..38,47]
-    'taint'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00", # [39]
-    'threads'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00", # [40]
-    'uninitialized'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00", # [41]
-    'unopened'		=> "\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [11]
-    'unpack'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00", # [42]
-    'untie'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00", # [43]
-    'utf8'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x15\x00", # [44,48..50]
-    'void'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00", # [45]
+    'all'		=> "\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x55\x01", # [0..56]
+    'ambiguous'		=> "\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00", # [29]
+    'bareword'		=> "\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00", # [30]
+    'closed'		=> "\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [6]
+    'closure'		=> "\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [1]
+    'debugging'		=> "\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [22]
+    'deprecated'	=> "\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [2]
+    'digit'		=> "\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00", # [31]
+    'exec'		=> "\x00\x40\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [7]
+    'exiting'		=> "\x40\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [3]
+    'experimental'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x40\x55\x00", # [51..55]
+    'experimental::lexical_subs'=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00", # [52]
+    'experimental::lexical_topic'=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00", # [53]
+    'experimental::regex_sets'=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00", # [54]
+    'experimental::smartmatch'=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x40\x00", # [55]
+    'glob'		=> "\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [4]
+    'illegalproto'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00", # [47]
+    'imprecision'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00", # [46]
+    'inplace'		=> "\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [23]
+    'internal'		=> "\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00", # [24]
+    'io'		=> "\x00\x54\x55\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [5..11]
+    'layer'		=> "\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [8]
+    'malloc'		=> "\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x00", # [25]
+    'misc'		=> "\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [12]
+    'newline'		=> "\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [9]
+    'non_unicode'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00", # [48]
+    'nonchar'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00", # [49]
+    'numeric'		=> "\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [13]
+    'once'		=> "\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [14]
+    'overflow'		=> "\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [15]
+    'pack'		=> "\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [16]
+    'parenthesis'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00", # [32]
+    'pipe'		=> "\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [10]
+    'portable'		=> "\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [17]
+    'precedence'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00", # [33]
+    'printf'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00", # [34]
+    'prototype'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00", # [35]
+    'qw'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00", # [36]
+    'recursion'		=> "\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [18]
+    'redefine'		=> "\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [19]
+    'regexp'		=> "\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [20]
+    'reserved'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00", # [37]
+    'semicolon'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00", # [38]
+    'severe'		=> "\x00\x00\x00\x00\x00\x54\x05\x00\x00\x00\x00\x00\x00\x00\x01", # [21..25,56]
+    'signal'		=> "\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00", # [26]
+    'substr'		=> "\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\x00", # [27]
+    'surrogate'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00", # [50]
+    'syntax'		=> "\x00\x00\x00\x00\x00\x00\x00\x55\x55\x15\x00\x40\x00\x00\x00", # [28..38,47]
+    'syscalls'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01", # [56]
+    'taint'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00", # [39]
+    'threads'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00", # [40]
+    'uninitialized'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00", # [41]
+    'unopened'		=> "\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [11]
+    'unpack'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00", # [42]
+    'untie'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00", # [43]
+    'utf8'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x15\x00\x00", # [44,48..50]
+    'void'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00", # [45]
   );
 
 our %DeadBits = (
-    'all'		=> "\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa", # [0..55]
-    'ambiguous'		=> "\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00", # [29]
-    'bareword'		=> "\x00\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00\x00\x00\x00", # [30]
-    'closed'		=> "\x00\x20\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [6]
-    'closure'		=> "\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [1]
-    'debugging'		=> "\x00\x00\x00\x00\x00\x20\x00\x00\x00\x00\x00\x00\x00\x00", # [22]
-    'deprecated'	=> "\x20\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [2]
-    'digit'		=> "\x00\x00\x00\x00\x00\x00\x00\x80\x00\x00\x00\x00\x00\x00", # [31]
-    'exec'		=> "\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [7]
-    'exiting'		=> "\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [3]
-    'experimental'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80\xaa", # [51..55]
-    'experimental::lexical_subs'=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02", # [52]
-    'experimental::lexical_topic'=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08", # [53]
-    'experimental::regex_sets'=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x20", # [54]
-    'experimental::smartmatch'=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80", # [55]
-    'glob'		=> "\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [4]
-    'illegalproto'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80\x00\x00", # [47]
-    'imprecision'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x20\x00\x00", # [46]
-    'inplace'		=> "\x00\x00\x00\x00\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00", # [23]
-    'internal'		=> "\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00", # [24]
-    'io'		=> "\x00\xa8\xaa\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [5..11]
-    'layer'		=> "\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [8]
-    'malloc'		=> "\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00", # [25]
-    'misc'		=> "\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [12]
-    'newline'		=> "\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [9]
-    'non_unicode'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00", # [48]
-    'nonchar'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00", # [49]
-    'numeric'		=> "\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [13]
-    'once'		=> "\x00\x00\x00\x20\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [14]
-    'overflow'		=> "\x00\x00\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [15]
-    'pack'		=> "\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [16]
-    'parenthesis'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00", # [32]
-    'pipe'		=> "\x00\x00\x20\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [10]
-    'portable'		=> "\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [17]
-    'precedence'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00", # [33]
-    'printf'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00\x00\x00", # [34]
-    'prototype'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x80\x00\x00\x00\x00\x00", # [35]
-    'qw'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00", # [36]
-    'recursion'		=> "\x00\x00\x00\x00\x20\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [18]
-    'redefine'		=> "\x00\x00\x00\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [19]
-    'regexp'		=> "\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00", # [20]
-    'reserved'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00", # [37]
-    'semicolon'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00\x00", # [38]
-    'severe'		=> "\x00\x00\x00\x00\x00\xa8\x0a\x00\x00\x00\x00\x00\x00\x00", # [21..25]
-    'signal'		=> "\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00\x00\x00\x00\x00", # [26]
-    'substr'		=> "\x00\x00\x00\x00\x00\x00\x80\x00\x00\x00\x00\x00\x00\x00", # [27]
-    'surrogate'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x20\x00", # [50]
-    'syntax'		=> "\x00\x00\x00\x00\x00\x00\x00\xaa\xaa\x2a\x00\x80\x00\x00", # [28..38,47]
-    'taint'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80\x00\x00\x00\x00", # [39]
-    'threads'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00", # [40]
-    'uninitialized'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00", # [41]
-    'unopened'		=> "\x00\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [11]
-    'unpack'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00", # [42]
-    'untie'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80\x00\x00\x00", # [43]
-    'utf8'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x2a\x00", # [44,48..50]
-    'void'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00", # [45]
+    'all'		=> "\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa\x02", # [0..56]
+    'ambiguous'		=> "\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00", # [29]
+    'bareword'		=> "\x00\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00\x00\x00\x00\x00", # [30]
+    'closed'		=> "\x00\x20\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [6]
+    'closure'		=> "\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [1]
+    'debugging'		=> "\x00\x00\x00\x00\x00\x20\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [22]
+    'deprecated'	=> "\x20\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [2]
+    'digit'		=> "\x00\x00\x00\x00\x00\x00\x00\x80\x00\x00\x00\x00\x00\x00\x00", # [31]
+    'exec'		=> "\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [7]
+    'exiting'		=> "\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [3]
+    'experimental'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80\xaa\x00", # [51..55]
+    'experimental::lexical_subs'=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00", # [52]
+    'experimental::lexical_topic'=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00", # [53]
+    'experimental::regex_sets'=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x20\x00", # [54]
+    'experimental::smartmatch'=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80\x00", # [55]
+    'glob'		=> "\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [4]
+    'illegalproto'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80\x00\x00\x00", # [47]
+    'imprecision'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00", # [46]
+    'inplace'		=> "\x00\x00\x00\x00\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [23]
+    'internal'		=> "\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00", # [24]
+    'io'		=> "\x00\xa8\xaa\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [5..11]
+    'layer'		=> "\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [8]
+    'malloc'		=> "\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x00", # [25]
+    'misc'		=> "\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [12]
+    'newline'		=> "\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [9]
+    'non_unicode'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00", # [48]
+    'nonchar'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00", # [49]
+    'numeric'		=> "\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [13]
+    'once'		=> "\x00\x00\x00\x20\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [14]
+    'overflow'		=> "\x00\x00\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [15]
+    'pack'		=> "\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [16]
+    'parenthesis'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00", # [32]
+    'pipe'		=> "\x00\x00\x20\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [10]
+    'portable'		=> "\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [17]
+    'precedence'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00", # [33]
+    'printf'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00\x00\x00\x00", # [34]
+    'prototype'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x80\x00\x00\x00\x00\x00\x00", # [35]
+    'qw'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00", # [36]
+    'recursion'		=> "\x00\x00\x00\x00\x20\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [18]
+    'redefine'		=> "\x00\x00\x00\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [19]
+    'regexp'		=> "\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [20]
+    'reserved'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00", # [37]
+    'semicolon'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00\x00\x00", # [38]
+    'severe'		=> "\x00\x00\x00\x00\x00\xa8\x0a\x00\x00\x00\x00\x00\x00\x00\x02", # [21..25,56]
+    'signal'		=> "\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00\x00\x00\x00\x00\x00", # [26]
+    'substr'		=> "\x00\x00\x00\x00\x00\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00", # [27]
+    'surrogate'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x20\x00\x00", # [50]
+    'syntax'		=> "\x00\x00\x00\x00\x00\x00\x00\xaa\xaa\x2a\x00\x80\x00\x00\x00", # [28..38,47]
+    'syscalls'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02", # [56]
+    'taint'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80\x00\x00\x00\x00\x00", # [39]
+    'threads'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00", # [40]
+    'uninitialized'	=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00", # [41]
+    'unopened'		=> "\x00\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", # [11]
+    'unpack'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x20\x00\x00\x00\x00", # [42]
+    'untie'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80\x00\x00\x00\x00", # [43]
+    'utf8'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x2a\x00\x00", # [44,48..50]
+    'void'		=> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00", # [45]
   );
 
-$NONE     = "\0\0\0\0\0\0\0\0\0\0\0\0\0\0";
-$DEFAULT  = "\x10\x01\x00\x00\x00\x50\x04\x00\x00\x00\x00\x00\x00\x55", # [2,52..55,4,22,23,25]
-$LAST_BIT = 112 ;
-$BYTES    = 14 ;
+$NONE     = "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0";
+$DEFAULT  = "\x10\x01\x00\x00\x00\x50\x04\x00\x00\x00\x00\x00\x00\x55\x01", # [2,52..55,4,22,23,25,56]
+$LAST_BIT = 114 ;
+$BYTES    = 15 ;
 
 $All = "" ; vec($All, $Offsets{'all'}, 2) = 3 ;
 
diff --git a/perl.h b/perl.h
index 613fd3c..fca7b22 100644
--- a/perl.h
+++ b/perl.h
@@ -5650,6 +5650,29 @@ extern void moncontrol(int);
 #  define do_aexec(really, mark,sp)	do_aexec5(really, mark, sp, 0, 0)
 #endif
 
+/* check embedded \0 characters in pathnames passed to syscalls,
+   but allow one ending \0 */
+#define CHECK_PATHNAME(pv,r) CHECK_SAFESYSCALLS("pathname",pv,r)
+/* check embedded \0 characters in args for various other syscalls.
+   user and group names, glob. */
+#define CHECK_SYSCALL(pv,r)  CHECK_SAFESYSCALLS("syscall",pv,r)
+/* if (CHECK_SYSCALL(pv, return NULL)); */
+#if defined(WIN32) && (defined(UNICODE) || defined(UNDER_CE))
+#define CHECK_SAFESYSCALLS(what,pv,reaction)
+/* TODO: wchar API, so far only used on wince */
+#else
+#define CHECK_SAFESYSCALLS(what,pv,reaction)                            \
+    if (SvPOK(pv)) {                                                    \
+      char *i;                                                          \
+      char *p = SvPVX(pv);                                              \
+      if ( (i = strchr(p, 0))&&(i-p)&&((size_t)(i-p)<SvCUR(pv)-1) ) {   \
+        SETERRNO(ENOENT, LIB_INVARG);                                   \
+        if (ckWARN_d(WARN_SYSCALLS)) {                                  \
+            ck_warner_d(packWARN(WARN_SYSCALLS),                        \
+                        "Invalid \\0 character in %s: %s\\0%s", what,p,++i); \
+        } return reaction; }}
+#endif
+
 #if defined(OEMVS)
 #define NO_ENV_ARRAY_IN_MAIN
 #endif
diff --git a/perlio.c b/perlio.c
index 2e5a77d..01fb764 100644
--- a/perlio.c
+++ b/perlio.c
@@ -310,6 +310,7 @@ PerlIO_openn(pTHX_ const char *layers, const char *mode, int fd,
 	    return PerlIO_tmpfile();
 	else {
 	    const char *name = SvPV_nolen_const(*args);
+	    CHECK_PATHNAME(*args, NULL);
 	    if (*mode == IoTYPE_NUMERIC) {
 		fd = PerlLIO_open3(name, imode, perm);
 		if (fd >= 0)
@@ -2719,6 +2720,7 @@ PerlIOUnix_open(pTHX_ PerlIO_funcs *self, PerlIO_list_t *layers,
 	}
 	if (imode != -1) {
 	    const char *path = SvPV_nolen_const(*args);
+	    CHECK_PATHNAME(*args, NULL);
 	    fd = PerlLIO_open3(path, imode, perm);
 	}
     }
@@ -3033,6 +3035,7 @@ PerlIOStdio_open(pTHX_ PerlIO_funcs *self, PerlIO_list_t *layers,
 	const char * const path = SvPV_nolen_const(*args);
 	PerlIOStdio * const s = PerlIOSelf(f, PerlIOStdio);
 	FILE *stdio;
+	CHECK_PATHNAME(*args, NULL);
 	PerlIOUnix_refcnt_dec(fileno(s->stdio));
 	stdio = PerlSIO_freopen(path, (mode = PerlIOStdio_mode(mode, tmode)),
 			    s->stdio);
@@ -3045,6 +3048,7 @@ PerlIOStdio_open(pTHX_ PerlIO_funcs *self, PerlIO_list_t *layers,
     else {
 	if (narg > 0) {
 	    const char * const path = SvPV_nolen_const(*args);
+	    CHECK_PATHNAME(*args, NULL);
 	    if (*mode == IoTYPE_NUMERIC) {
 		mode++;
 		fd = PerlLIO_open3(path, imode, perm);
diff --git a/pod/perldiag.pod b/pod/perldiag.pod
index 80fafaf..60d436f 100644
--- a/pod/perldiag.pod
+++ b/pod/perldiag.pod
@@ -2496,6 +2496,15 @@ the indicated name isn't valid.  See L<charnames/CUSTOM ALIASES>.
 (F) Only certain characters are valid for character names.  The
 indicated one isn't.  See L<charnames/CUSTOM ALIASES>.
 
+=item Invalid \0 character in %s: %s\0%s
+
+(S) Embedded \0 characters in pathnames or other syscall arguments
+create a warning since 5.18. The parts after the \0 were formerly
+ignored by syscalls.
+
+If you know what you are doing you can turn off this warning by
+C<no warnings 'syscalls';>.
+
 =item Invalid conversion in %s: "%s"
 
 (W printf) Perl does not understand the given format conversion.  See
diff --git a/pod/perllexwarn.pod b/pod/perllexwarn.pod
index b193e3c..97d77f7 100644
--- a/pod/perllexwarn.pod
+++ b/pod/perllexwarn.pod
@@ -281,6 +281,8 @@ will be lost.
          |                 +- internal
          |                 |
          |                 +- malloc
+         |                 |
+         |                 +- syscalls
          |
          +- signal
          |
diff --git a/pp_ctl.c b/pp_ctl.c
index f68336a..488fcf3 100644
--- a/pp_ctl.c
+++ b/pp_ctl.c
@@ -3590,6 +3590,7 @@ S_check_type_and_open(pTHX_ SV *name)
 
     PERL_ARGS_ASSERT_CHECK_TYPE_AND_OPEN;
 
+    CHECK_PATHNAME(name, NULL);
     if (st_rc < 0 || S_ISDIR(st.st_mode) || S_ISBLK(st.st_mode)) {
 	return NULL;
     }
@@ -3742,6 +3743,10 @@ PP(pp_require)
     name = SvPV_const(sv, len);
     if (!(name && len > 0 && *name))
 	DIE(aTHX_ "Null filename used");
+    CHECK_PATHNAME(sv, Perl_die(aTHX_ "Can't locate %s:   %s",
+      pv_escape(newSVpvs_flags("",SVs_TEMP),SvPVX(sv),SvCUR(sv),SvCUR(sv)*2,NULL,
+                SvUTF8(sv)?PERL_PV_ESCAPE_UNI:0),
+      Strerror(ENOENT)));
     TAINT_PROPER("require");
 
     path_searchable = path_is_searchable(name);
diff --git a/regen/warnings.pl b/regen/warnings.pl
index 72d9a0b..3265712 100644
--- a/regen/warnings.pl
+++ b/regen/warnings.pl
@@ -19,7 +19,7 @@
 #
 # This script is normally invoked from regen.pl.
 
-$VERSION = '1.02_03';
+$VERSION = '1.02_05';
 
 BEGIN {
     require 'regen/regen_lib.pl';
@@ -59,7 +59,8 @@ my $tree = {
 	 			'internal'	=> [ 5.008, DEFAULT_OFF],
          			'debugging'	=> [ 5.008, DEFAULT_ON],
          			'malloc'	=> [ 5.008, DEFAULT_ON],
-	 		   }],
+			        'syscalls'      => [ 5.017, DEFAULT_ON],
+			   }],
         'deprecated'	=> [ 5.008, DEFAULT_ON],
        	'void'		=> [ 5.008, DEFAULT_OFF],
        	'recursion'	=> [ 5.008, DEFAULT_OFF],
@@ -465,7 +466,7 @@ close_and_rename($lexwarn);
 __END__
 package warnings;
 
-our $VERSION = '1.18';
+our $VERSION = '1.19';
 
 # Verify that we're called correctly so that warnings will work.
 # see also strict.pm.
diff --git a/t/io/open.t b/t/io/open.t
index ef56dda..94601d0 100644
--- a/t/io/open.t
+++ b/t/io/open.t
@@ -10,7 +10,7 @@ $|  = 1;
 use warnings;
 use Config;
 
-plan tests => 122;
+plan tests => 132;
 
 my $Perl = which_perl();
 
@@ -391,3 +391,28 @@ sub _117941 { package _117941; open my $a, "TEST" }
 delete $::{"_117941::"};
 _117941();
 pass("no crash when open autovivifies glob in freed package");
+
+# [perl #117265] check for embedded nul in pathnames, allow ending \0 though
+{
+  my $WARN;
+  local $SIG{__WARN__} = sub { $WARN = shift };
+  my $fn = "tmp\0.invalid";
+  is(open(I, $fn), undef, "open with nul in pathnames since 5.18 [perl #117265]");
+  like($WARN, qr/^Invalid \\0 character in pathname: tmp/,
+       "warn on embedded nul"); $WARN = '';
+  is (unlink($fn), 0);
+  like($WARN, qr/^Invalid \\0 character in pathname: tmp/,
+       "also on unlink"); $WARN = '';
+  is(chmod(0644, $fn), 0);
+  like($WARN, qr/^Invalid \\0 character in pathname: tmp/,
+       "also on chmod"); $WARN = '';
+  is (glob($fn), ());
+  like($WARN, qr/^Invalid \\0 character in syscall: tmp/,
+       "also on glob"); $WARN = '';
+  {
+    no warnings 'syscalls';
+    $WARN = '';
+    is(open(I, $fn), undef, "open with nul with no warnings syscalls");
+    is($WARN, '', "ignore warning on embedded nul with no warnings syscalls");
+  }
+}
diff --git a/t/lib/warnings/6default b/t/lib/warnings/6default
index a8aafee..a501961 100644
--- a/t/lib/warnings/6default
+++ b/t/lib/warnings/6default
@@ -119,3 +119,8 @@ no warnings;
 }
 EXPECT
 
+########
+# Check if syscalls warn by default
+open(my $i, "foo\0bar");
+EXPECT
+Invalid \0 character in pathname: foo\0bar at - line 2.
diff --git a/t/op/caller.t b/t/op/caller.t
index c37a6ed..09728d3 100644
--- a/t/op/caller.t
+++ b/t/op/caller.t
@@ -111,8 +111,8 @@ sub testwarn {
 
     # The repetition number must be set to the value of $BYTES in
     # lib/warnings.pm
-    BEGIN { check_bits( ${^WARNING_BITS}, "\0" x 14, 'all bits off via "no warnings"' ) }
-    testwarn("\0" x 14, 'no bits');
+    BEGIN { check_bits( ${^WARNING_BITS}, "\0" x 15, 'all bits off via "no warnings"' ) }
+    testwarn("\0" x 15, 'no bits');
 
     use warnings;
     BEGIN { check_bits( ${^WARNING_BITS}, $default,
diff --git a/t/op/require_errors.t b/t/op/require_errors.t
index e323948..7c102cd 100644
--- a/t/op/require_errors.t
+++ b/t/op/require_errors.t
@@ -7,7 +7,7 @@ BEGIN {
     require './test.pl';
 }
 
-plan(tests => 11);
+plan(tests => 16);
 
 my $nonfile = tempfile();
 
@@ -111,3 +111,19 @@ SKIP: {
 # I can't see how to test the EMFILE case
 # I can't see how to test the case of not displaying @INC in the message.
 # (and does that only happen on VMS?)
+
+# fail and print the full filename
+eval { no warnings 'syscalls'; require "strict.pm\0invalid"; };
+like $@, qr/^Can't locate strict\.pm\\0invalid: /, 'require nul check [perl #117265]';
+eval { no warnings 'syscalls'; do "strict.pm\0invalid"; };
+like $@, qr/^Can't locate strict\.pm\\0invalid: /, 'do nul check';
+{
+  my $WARN;
+  local $SIG{__WARN__} = sub { $WARN = shift };
+  eval { require "strict.pm\0invalid"; };
+  like $WARN, qr{^Invalid \\0 character in pathname: strict\.pm\\0invalid at }, 'nul warning';
+  like $@, qr{^Can't locate strict\.pm\\0invalid: }, 'nul error';
+}
+eval "require strict\0::invalid;";
+like $@, qr/^syntax error at \(eval \d+\) line 1/, 'parse error with \0 in barewords module names';
+
diff --git a/t/porting/diag.t b/t/porting/diag.t
index 729abaf..17aca0f 100644
--- a/t/porting/diag.t
+++ b/t/porting/diag.t
@@ -461,6 +461,7 @@ Cannot apply "%s" in non-PerlIO perl
 Can't find string terminator %c%s%c anywhere before EOF
 Can't fix broken locale name "%s"
 Can't get short module name from a handle
+Can't locate %s:   %s
 Can't locate object method "%s" via package "%s" (perhaps you forgot to load "%s"?)
 Can't pipe "%s": %s
 Can't spawn: %s
diff --git a/warnings.h b/warnings.h
index 5c40d5c..531876f 100644
--- a/warnings.h
+++ b/warnings.h
@@ -94,10 +94,11 @@
 #define WARN_EXPERIMENTAL__LEXICAL_TOPIC 53
 #define WARN_EXPERIMENTAL__REGEX_SETS 54
 #define WARN_EXPERIMENTAL__SMARTMATCH 55
+#define WARN_SYSCALLS		 56
 
-#define WARNsize		14
-#define WARN_ALLstring		"\125\125\125\125\125\125\125\125\125\125\125\125\125\125"
-#define WARN_NONEstring		"\0\0\0\0\0\0\0\0\0\0\0\0\0\0"
+#define WARNsize		15
+#define WARN_ALLstring		"\125\125\125\125\125\125\125\125\125\125\125\125\125\125\125"
+#define WARN_NONEstring		"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"
 
 #define isLEXWARN_on 	(PL_curcop->cop_warnings != pWARN_STD)
 #define isLEXWARN_off	(PL_curcop->cop_warnings == pWARN_STD)
-- 
1.7.10.4

@p5pRT
Copy link
Author

p5pRT commented Jun 27, 2013

From @rurban

On 06/27/2013 01​:45 AM, Tony Cook via RT wrote​:

On Tue Mar 26 14​:29​:41 2013, rurban wrote​:

Attached is the revised patch

Check for the nul char in pathnames and string arguments to
syscalls, return undef and set errno to ENOENT.
Added to the default severe warnings category syscalls.

Strings with embedded \0 chars were prev. ignored in the syscall but
kept in perl. The hidden payloads in these invalid string args may cause
unnoticed security problems, as they are hard to detect, ignored by
the syscalls but kept around in perl PVs.
Allow an ending \0 though, as several modules add a \0 to
such strings without adjusting the length.
Ignored on WinCE since this uses the wide char API.

Hi Reini,

I've had a look over this patch, but I can see it's treating a failure
due to \0 differently to failure due to an actual missing file, eg​:

tony@​mars​:.../git/perl$ ./perl -Wle 'print unlink "def", "abc\0def", "ghi"'
Invalid \0 character in pathname​: abc\0def at -e line 1.
1
tony@​mars​:.../git/perl$ ./perl -Wle 'print unlink "def", "abc", "ghi"'
0

I've attached a version of the patch updated to work with blead.

Great, Thanks for the improvements.
I've looked over it, LGTM

Can someone please apply it now.

--
Reini

Working towards a true Modern Perl.
Slim, functional, unbloated, compile-time optimizable

@p5pRT
Copy link
Author

p5pRT commented Jun 27, 2013

From @Leont

On Thu, Jun 27, 2013 at 3​:41 PM, Reini Urban <reini@​cpanel.net> wrote​:
4>

I've attached a version of the patch updated to work with blead.

Great, Thanks for the improvements.
I've looked over it, LGTM

Can someone please apply it now.

I think having macros called CHECK_<whatever> that return from the
function is unnecessarily obfuscating the code.

I also think we shouldn't turn this warning on by default without some
discussion about that.

Leon

@p5pRT
Copy link
Author

p5pRT commented Jun 27, 2013

From @rurban

On 06/27/2013 10​:52 AM, Leon Timmermans via RT wrote​:

On Thu, Jun 27, 2013 at 3​:41 PM, Reini Urban <reini@​cpanel.net> wrote​:
4>

I've attached a version of the patch updated to work with blead.

Great, Thanks for the improvements.
I've looked over it, LGTM

Can someone please apply it now.

I think having macros called CHECK_<whatever> that return from the
function is unnecessarily obfuscating the code.

Have you got a better idea?

  if (CHECK_SAFESYSCALL(pv)) return NULL;
  if (CHECK_SAFEPATHNAME(pv)) return undef;

were the alternatives.

I also think we shouldn't turn this warning on by default without some
discussion about that.

We return with undef from syscalling functions with a wrong name,
set $^E to ENOENT

I think turning on this warning by default makes sense​:

It only warns on wrong user data, not on code errors.
It's surprising, I want to see that attack on the console and especially
in log files, but have no strong opinion on it.

With require I die immediately.

~ grep DEFAULT_ON regen/warnings.pl

  'inplace' => [ 5.008, DEFAULT_ON],
  'debugging' => [ 5.008, DEFAULT_ON],
  'malloc' => [ 5.008, DEFAULT_ON],
  'syscalls' => [ 5.017, DEFAULT_ON],
  'deprecated' => [ 5.008, DEFAULT_ON],
  'glob' => [ 5.008, DEFAULT_ON],
  'experimental​::lexical_subs' => [ 5.017, DEFAULT_ON ],
  'experimental​::regex_sets' => [ 5.017, DEFAULT_ON ],
  'experimental​::lexical_topic' => [ 5.017, DEFAULT_ON ],
  'experimental​::smartmatch' => [ 5.017, DEFAULT_ON ],

--
Reini

@p5pRT
Copy link
Author

p5pRT commented Jun 28, 2013

From @tonycoz

Hi Reini,

On Thu Jun 27 06​:42​:14 2013, reini@​cpanel.net wrote​:

On 06/27/2013 01​:45 AM, Tony Cook via RT wrote​:

I've had a look over this patch, but I can see it's treating a
failure
due to \0 differently to failure due to an actual missing file, eg​:

tony@​mars​:.../git/perl$ ./perl -Wle 'print unlink "def", "abc\0def",
"ghi"'
Invalid \0 character in pathname​: abc\0def at -e line 1.
1
tony@​mars​:.../git/perl$ ./perl -Wle 'print unlink "def", "abc",
"ghi"'
0

I've attached a version of the patch updated to work with blead.

Great, Thanks for the improvements.
I've looked over it, LGTM

Can someone please apply it now.

I only updated it to merge cleanly with blead, I didn't fix the
problem[1] I described above caused by the macro simply returning.

That needs to be fixed.

I also missed updating the version number for DEFAULT_ON in warnings.pl.

Tony

[1] and related problems in other ops, that was just the simplest
example to demonstrate.

@p5pRT
Copy link
Author

p5pRT commented Jun 28, 2013

From @cpansprout

On Sat Mar 23 17​:12​:20 2013, perl.p5p@​rjbs.manxome.org wrote​:

* Lukas Mai <plokinom@​gmail.com> [2013-03-23T11​:34​:47]

I explicitly allow typical harmless programmer errors adding another \0
at the end.

My favorite semantics would be to not treat \0 specially, regardless of
warnings. That means failing with ENOENT if you pass such a string to
open, unlink, stat, ... etc, because as far as Perl is concerned, there
is a \0 in that string, and as far as the file system is concerned, no
such file exists. Similarly, glob("*\0") should return ().

This is also my feeling on the topic.

And mine, too. Just to be clear, in case we misunderstand each other,
that means no warnings, just a failed system call, which will be
detected by the existing error checking code. (You *do* check the
return values of your system calls, right? :-)

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Jul 6, 2013

From @rjbs

* Father Chrysostomos via RT <perlbug-followup@​perl.org> [2013-06-28T02​:56​:27]

And mine, too. Just to be clear, in case we misunderstand each other,
that means no warnings, just a failed system call, which will be
detected by the existing error checking code. (You *do* check the
return values of your system calls, right? :-)

Exactamo.

Here is the only thing that gives me a little pause​: I check those return
values and say something like​:

  open ... or die "couldn't open <$filename>​: $!";

And if $filename is "foo.txt\0" then the error message is confusing...
...but I don't think it would be in practice, because trailing NULs are just
ignored, rather than cause for a ENOENT. They're used in 2-arg open to
clear up ambiguity with files ending in a pipe, and we can keep allowing and
ignoring them. In any other case, the error message will contain non-nuls
after the nuls, and we're good. (In the unlikely event of $realname . $nul .
$zerowidth, well, the programmer will have to debug.)

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented Aug 26, 2013

@tonycoz - Status changed from 'open' to 'resolved'

@p5pRT
Copy link
Author

p5pRT commented Aug 28, 2013

From @rurban

ATT​: RANT below....

On Thu Jul 18 22​:33​:53 2013, tonyc wrote​:

On Thu Jul 18 10​:59​:07 2013, rurban wrote​:

On Tue, Jul 16, 2013 at 9​:12 PM, Tony Cook via RT
<perlbug-followup@​perl.org> wrote​:

On Sat Jul 06 16​:10​:49 2013, perl.p5p@​rjbs.manxome.org wrote​:

* Father Chrysostomos via RT <perlbug-followup@​perl.org> [2013-06-
28T02​:56​:27]

And mine, too. Just to be clear, in case we misunderstand each
other,
that means no warnings, just a failed system call, which will be
detected by the existing error checking code. (You *do* check the
return values of your system calls, right? :-)

Exactamo.

Here is the only thing that gives me a little pause​: I check those
return
values and say something like​:

open ... or die "couldn't open <$filename>​: $!";

And if $filename is "foo.txt\0" then the error message is confusing...
...but I don't think it would be in practice, because trailing NULs
are just
ignored, rather than cause for a ENOENT. They're used in 2-arg open
to
clear up ambiguity with files ending in a pipe, and we can keep
allowing and
ignoring them. In any other case, the error message will contain non-
nuls
after the nuls, and we're good. (In the unlikely event of $realname .
$nul .
$zerowidth, well, the programmer will have to debug.)

The warning would help in diagnosing this.

I'm considering the following changes before applying this​:

- modify the CHECK_SAFESYSCALLS/CHECK_PATHNAME macros to only check the
path name and optionally warn, rather than hiding the control flow for
the function return (this may need a helper function)

If you prefer the longer veresion, ok. will do

The short version is simply incorrect in a few places.

- make the warning off by default, but can be enabled explicitly with
use warnings 'safesyscalls';

Will not do.
The warning will only trigger on malicious user-input.

A simple example where you might have non-malicious input with filenames
with embedded NULs​:

The source data has been written as a fixed-sized records with a field
terminated by NULs.

The filenames have been extracted directly with substr() and haven't had
their terminating trash removed.

Now I'm not saying this is good code, but it's a possible case.

In my entire career I never saw those usecases.
Looks like a bad excuse to me.
When you wrongly strip a filename from fixed-size records or mess up substr()
you deserve a default warning when loading a module. use Strict behaves the same.

There's no other usecase, than passing random binary strings to
syscalls by accident.
Production code often does not use -w, but must warn on harmful
intrusion attempts.
I know that you don't care much about security, but I cannot let this
slip through.

You'd be wrong - I do care about security, and disallowing system calls
on the filenames satisfies the needs of security.

Disagree​: Shutting up the attempt is okay for basic security,
caring about security would require reporting the incident.

Security requires more than just "oh no! this filename is Eeeevil!" You
need to validate inputs like filenames as they enter your system - not
when they're passed to a system call.

Right. Always blame the user.

If such a warning does fire as the result of a security failure you
probably have much worse problems elsewhere in your system.

To repeat why I said p5p doesn't care much about security​:
p5p repeatedly ignored to check and warn evil file and module names
via use strict 'refs' given to require, and still insists that this is
a pure user problem. Bad input => your fault, not ours.

Now finally after 15 years of p5p module laxness require dies on
invalid filenames, but there is still no warning, so that production
systems need to turn on warnings​::register hashes to detect intrusion
attempts. For the typical evil "require $bareword;" usecase.

And you still blame the user, not yourself.

On the other hand this simple non-security, only dummy windows/mac
user related problem
  perl -e "use Strict;"
dies with "Incorrect use of pragma 'strict' at -e line 1".

So p5p cares more to bitch at dummies (complaint discussed 1998, fatality finally added
2007) but allows hackers to abuse \0payload symbols and packagenames without warnings,
because symbols are now almost full strings and binary safe and *everything is just fine*

BTW​: The module loader modules which do exist on CPAN also do not check
evil filenames and modulenames, they simply pass the job to require​:
Module​::Load, Class​::Loader (very evil), Class​::Autouse, Class​::DI, ...
Only Class​::Factory strips \W

p5p does not help, CPAN does not help, perldoc perlsec stays silent.
It's still the poor stupid users fault.
At least 5.20 closes now the require hole, maybe 5.22 the symbol strictness.
--
Reini Urban

@p5pRT
Copy link
Author

p5pRT commented Aug 28, 2013

From @b2gills

On Wed, Aug 28, 2013 at 4​:07 PM, Reini Urban via RT
<perlbug-followup@​perl.org> wrote​:

ATT​: RANT below....

On Thu Jul 18 22​:33​:53 2013, tonyc wrote​:

On Thu Jul 18 10​:59​:07 2013, rurban wrote​:

On Tue, Jul 16, 2013 at 9​:12 PM, Tony Cook via RT
<perlbug-followup@​perl.org> wrote​:

On Sat Jul 06 16​:10​:49 2013, perl.p5p@​rjbs.manxome.org wrote​:

* Father Chrysostomos via RT <perlbug-followup@​perl.org> [2013-06-
28T02​:56​:27]

And mine, too. Just to be clear, in case we misunderstand each
other,
that means no warnings, just a failed system call, which will be
detected by the existing error checking code. (You *do* check the
return values of your system calls, right? :-)

Exactamo.

Here is the only thing that gives me a little pause​: I check those
return
values and say something like​:

open ... or die "couldn't open <$filename>​: $!";

And if $filename is "foo.txt\0" then the error message is confusing...
...but I don't think it would be in practice, because trailing NULs
are just
ignored, rather than cause for a ENOENT. They're used in 2-arg open
to
clear up ambiguity with files ending in a pipe, and we can keep
allowing and
ignoring them. In any other case, the error message will contain non-
nuls
after the nuls, and we're good. (In the unlikely event of $realname .
$nul .
$zerowidth, well, the programmer will have to debug.)

The warning would help in diagnosing this.

I'm considering the following changes before applying this​:

- modify the CHECK_SAFESYSCALLS/CHECK_PATHNAME macros to only check the
path name and optionally warn, rather than hiding the control flow for
the function return (this may need a helper function)

If you prefer the longer veresion, ok. will do

The short version is simply incorrect in a few places.

- make the warning off by default, but can be enabled explicitly with
use warnings 'safesyscalls';

Will not do.
The warning will only trigger on malicious user-input.

A simple example where you might have non-malicious input with filenames
with embedded NULs​:

The source data has been written as a fixed-sized records with a field
terminated by NULs.

The filenames have been extracted directly with substr() and haven't had
their terminating trash removed.

Now I'm not saying this is good code, but it's a possible case.

In my entire career I never saw those usecases.
Looks like a bad excuse to me.
When you wrongly strip a filename from fixed-size records or mess up substr()
you deserve a default warning when loading a module. use Strict behaves the same.

There's no other usecase, than passing random binary strings to
syscalls by accident.
Production code often does not use -w, but must warn on harmful
intrusion attempts.
I know that you don't care much about security, but I cannot let this
slip through.

You'd be wrong - I do care about security, and disallowing system calls
on the filenames satisfies the needs of security.

Disagree​: Shutting up the attempt is okay for basic security,
caring about security would require reporting the incident.

Security requires more than just "oh no! this filename is Eeeevil!" You
need to validate inputs like filenames as they enter your system - not
when they're passed to a system call.

Right. Always blame the user.

There can not be a possible security problem at the point of calling
the system's
open functions. Full Stop.

The only possible problem we could possibly have, is not appending
a null to the string. Having extras isn't really a problem. That is
because AFAIK the system open function always requires a null
terminator. If it were a problem, you could cause it with two variables
which together use a continuous chunk of memory.

There could still be some problems with a Perl program which exhibits this
behaviour, but it will be at some other point in the program.
Which would be better detected with some external tool.

I do agree that it should warn though (optionally). Only because the program
is apparently asking for a resource other than the one it is
going to get.

If such a warning does fire as the result of a security failure you
probably have much worse problems elsewhere in your system.

To repeat why I said p5p doesn't care much about security​:
p5p repeatedly ignored to check and warn evil file and module names
via use strict 'refs' given to require, and still insists that this is
a pure user problem. Bad input => your fault, not ours.

This would have a slightly higher risk, if `require` did anything
other than just load a file and run it. Actually if it ever does do anything
extra, that is a bug and should be reported.

I would like to point out that sending these warnings out on STDERR
could invoke a bug in a program which was reading STDERR.
So doing as you asked could actually create a security hole.

@p5pRT
Copy link
Author

p5pRT commented Aug 29, 2013

From @cpansprout

On Sun Aug 25 21​:39​:01 2013, tonyc wrote​:

On Thu Aug 01 18​:30​:43 2013, tonyc wrote​:

On Tue Jul 16 19​:12​:43 2013, tonyc wrote​:

I'm considering the following changes before applying this​:

- modify the CHECK_SAFESYSCALLS/CHECK_PATHNAME macros to only
check the
path name and optionally warn, rather than hiding the control flow for
the function return (this may need a helper function)

- make the warning off by default, but can be enabled explicitly with
use warnings 'safesyscalls';

Pushed as smoke-me/tonyc/syscalls, with variations from the above.

The warning isn't enabled by default, but is enabled by "use warnings"
since that's equivalent to C< use warnings 'all' >

Applied as c8028aa.

I see several problems with that commit​:

PERL_STATIC_INLINE bool
S_is_safe_syscall(pTHX_ SV *pv, const char *what, const char *op_name) {
  /* While the Windows CE API provides only UCS-16 (or UTF-16) APIs
  * perl itself uses xce*() functions which accept 8-bit strings.
  */

  PERL_ARGS_ASSERT_IS_SAFE_SYSCALL;

  if (SvPOK(pv) && SvCUR(pv) >= 1) {

What if the argument is not SvPOK? What if it stringifies to something
containing a null? Then we have an inconsistency between plain strings
and overloaded objects. Since all the callers stringify, and since
stringifying twice is not good (the caller’s buffer could be
invalidated), shouldn’t the caller be passing char* and length?

In Glob.xs, the pattern is parsed, and only after the parsing is the
whole thing thrown away based on whether the *pre-parsed* pattern
contains a null. So <a \0 c> doesn’t return even a and c. And the
parsing is all for nought, wasting CPU cycles.

If a glob pattern is, say, "a{\0,}b*" it should still return all files
beginning with "ab", but now it returns nothing. In fact, in those
cases where the file system is not even accessed, there should be no
check at all. C<pe{a,}rl> gives me ("pearl","perl") without even
accessing the file system. So should "a{\0,0}b", etc. (<...> is not
just for file globbing. Some of us use it for permutations.)

Having the check in csh_glob instead of doglob makes it apply only to
csh_glob and not to bsd_glob.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Sep 3, 2013

From @tonycoz

On Wed Aug 28 23​:10​:28 2013, sprout wrote​:

On Sun Aug 25 21​:39​:01 2013, tonyc wrote​:

On Thu Aug 01 18​:30​:43 2013, tonyc wrote​:

On Tue Jul 16 19​:12​:43 2013, tonyc wrote​:

I'm considering the following changes before applying this​:

- modify the CHECK_SAFESYSCALLS/CHECK_PATHNAME macros to only
check the
path name and optionally warn, rather than hiding the control
flow for
the function return (this may need a helper function)

- make the warning off by default, but can be enabled explicitly
with
use warnings 'safesyscalls';

Pushed as smoke-me/tonyc/syscalls, with variations from the above.

The warning isn't enabled by default, but is enabled by "use warnings"
since that's equivalent to C< use warnings 'all' >

Applied as c8028aa.

I see several problems with that commit​:

PERL_STATIC_INLINE bool
S_is_safe_syscall(pTHX_ SV *pv, const char *what, const char *op_name) {
/* While the Windows CE API provides only UCS-16 (or UTF-16) APIs
* perl itself uses xce*() functions which accept 8-bit strings.
*/

PERL\_ARGS\_ASSERT\_IS\_SAFE\_SYSCALL;

if \(SvPOK\(pv\) && SvCUR\(pv\) >= 1\) \{

What if the argument is not SvPOK? What if it stringifies to something
containing a null? Then we have an inconsistency between plain strings
and overloaded objects. Since all the callers stringify, and since
stringifying twice is not good (the caller’s buffer could be
invalidated), shouldn’t the caller be passing char* and length?

I've pushed a fix for this to tonyc/more-safesyscalls.

I misunderstood how "" overloading worked.

In Glob.xs, the pattern is parsed, and only after the parsing is the
whole thing thrown away based on whether the *pre-parsed* pattern
contains a null. So <a \0 c> doesn’t return even a and c. And the
parsing is all for nought, wasting CPU cycles.

The change above now does the \0 check before the pattern check.

If a glob pattern is, say, "a{\0,}b*" it should still return all files
beginning with "ab", but now it returns nothing.

It returned nothing before the change.

In fact, in those
cases where the file system is not even accessed, there should be no
check at all. C<pe{a,}rl> gives me ("pearl","perl") without even
accessing the file system. So should "a{\0,0}b", etc. (<...> is not
just for file globbing. Some of us use it for permutations.)

While that might be useful, neither csh_glob nor bsd_glob handle NUL in
the pattern in the way you describe.

Both stop at the NUL​:

$ ~/perl/5.18.1-thr/bin/perl -MFile​::Glob=csh_glob -le 'print
csh_glob("a{,\0}b")'
a
$ ~/perl/5.18.1-thr/bin/perl -MFile​::Glob=bsd_glob -le 'print
bsd_glob("a{,\0}b")'
a

I think fixing that is beyond the scope of this ticket.

Having the check in csh_glob instead of doglob makes it apply only to
csh_glob and not to bsd_glob.

The branch above fixes that.

Tony

@p5pRT
Copy link
Author

p5pRT commented Sep 3, 2013

@tonycoz - Status changed from 'resolved' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Sep 3, 2013

From @cpansprout

On Mon Sep 02 23​:44​:13 2013, tonyc wrote​:

On Wed Aug 28 23​:10​:28 2013, sprout wrote​:

Having the check in csh_glob instead of doglob makes it apply only to
csh_glob and not to bsd_glob.

The branch above fixes that.

Thank you.

I still see a problem with the commit entitled ‘[perl #117265] do most
glob \0 checks in the same place’. The code to handle the argument is
now in iterate(). File​::Glob intentionally ignores its arguments
altogether when called from the same callsite multiple times in scalar
context. After the first call, it simply returns the next cached file name.

Your patch changes that if the argument contains a null.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Sep 4, 2013

From @tonycoz

On Tue Sep 03 06​:04​:14 2013, sprout wrote​:

On Mon Sep 02 23​:44​:13 2013, tonyc wrote​:

On Wed Aug 28 23​:10​:28 2013, sprout wrote​:

Having the check in csh_glob instead of doglob makes it apply only to
csh_glob and not to bsd_glob.

The branch above fixes that.

Thank you.

I still see a problem with the commit entitled ‘[perl #117265] do most
glob \0 checks in the same place’. The code to handle the argument is
now in iterate(). File​::Glob intentionally ignores its arguments
altogether when called from the same callsite multiple times in scalar
context. After the first call, it simply returns the next cached file
name.

Your patch changes that if the argument contains a null.

I've updated the branch to fix that.

Tony

@p5pRT
Copy link
Author

p5pRT commented Sep 4, 2013

From @cpansprout

On Tue Sep 03 17​:06​:22 2013, tonyc wrote​:

On Tue Sep 03 06​:04​:14 2013, sprout wrote​:

On Mon Sep 02 23​:44​:13 2013, tonyc wrote​:

On Wed Aug 28 23​:10​:28 2013, sprout wrote​:

Having the check in csh_glob instead of doglob makes it apply
only to
csh_glob and not to bsd_glob.

The branch above fixes that.

Thank you.

I still see a problem with the commit entitled ‘[perl #117265] do most
glob \0 checks in the same place’. The code to handle the argument is
now in iterate(). File​::Glob intentionally ignores its arguments
altogether when called from the same callsite multiple times in scalar
context. After the first call, it simply returns the next cached file
name.

Your patch changes that if the argument contains a null.

I've updated the branch to fix that.

I see no more problems with it.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Sep 4, 2013

From @craigberry

On Tue, Sep 3, 2013 at 7​:56 PM, Father Chrysostomos via RT
<perlbug-followup@​perl.org> wrote​:

I see no more problems with it.

I see one more. I haven't actually tried the new branch yet, but I
don't think it will make a difference to the bit that's bugging me,
which is that the warnings are different when using external glob
versus using File​::Glob to implement the glob operator.

When using File​::Glob as unixy Perl does, we see only one warning​:

$ ./perl -Ilib -we 'glob(qq/foo\0bar/);'
Invalid \0 character in pattern for glob​: foo\0bar at -e line 1.

But when using "external" glob (which just means "something besides
File​::Glob", precise definition varying by platform) as miniperl does,
we see two warnings​:

$ ./miniperl -Ilib -we 'glob(qq/foo\0bar/);'
Invalid \0 character in pattern for glob​: foo\0bar at -e line 1.
glob failed (can't start child​: No such file or directory) at -e line 1.

There are several tests (t/io/open, t/op/require_errors, and
lib/warnings) that depend on only seeing the first warning. If the
second warning also appears, the tests fail. Which is a problem for
me because on VMS, PERL_EXTERNAL_GLOB is always defined, so these
tests have failed ever since the new safety checks came into blead.

Not sure what the right thing to do is. Could be fiddling with the
tests to accept the warning they are looking for even if others are
spotted along the way, or could be tinkering with warning or return
paths in pp_hot.c​:Perl_do_readline().

This doesn't preclude merging the new branch into blead as the tests
are already failing in blead.

@p5pRT
Copy link
Author

p5pRT commented Sep 4, 2013

From @cpansprout

On Tue Sep 03 19​:57​:43 2013, craig.a.berry@​gmail.com wrote​:

On Tue, Sep 3, 2013 at 7​:56 PM, Father Chrysostomos via RT
<perlbug-followup@​perl.org> wrote​:

I see no more problems with it.

I see one more. I haven't actually tried the new branch yet, but I
don't think it will make a difference to the bit that's bugging me,
which is that the warnings are different when using external glob
versus using File​::Glob to implement the glob operator.

When using File​::Glob as unixy Perl does, we see only one warning​:

$ ./perl -Ilib -we 'glob(qq/foo\0bar/);'
Invalid \0 character in pattern for glob​: foo\0bar at -e line 1.

But when using "external" glob (which just means "something besides
File​::Glob", precise definition varying by platform) as miniperl does,
we see two warnings​:

$ ./miniperl -Ilib -we 'glob(qq/foo\0bar/);'
Invalid \0 character in pattern for glob​: foo\0bar at -e line 1.
glob failed (can't start child​: No such file or directory) at -e line 1.

There are several tests (t/io/open, t/op/require_errors, and
lib/warnings) that depend on only seeing the first warning. If the
second warning also appears, the tests fail. Which is a problem for
me because on VMS, PERL_EXTERNAL_GLOB is always defined, so these
tests have failed ever since the new safety checks came into blead.

miniperl is affected on all platforms.

$ ./miniperl -we 'glob "\0a"'
Invalid \0 character in pattern for glob​: \0a at -e line 1.
glob failed (can't start child​: No such file or directory) at -e line 1.

Not sure what the right thing to do is.

start_glob (in doio.c) is not part of the API. We can change it to
signal to do_readline (in pp_hot.c) that it shouldn’t produce the second
warning.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Sep 4, 2013

From @ikegami

On Fri, Jul 19, 2013 at 1​:33 AM, Tony Cook via RT <perlbug-followup@​perl.org

wrote​:

A simple example where you might have non-malicious input with filenames
with embedded NULs​:

The source data has been written as a fixed-sized records with a field
terminated by NULs.

Non-issue. You should be using unpack 'Z100' instead of unpack 'a100'.

Now I'm not saying this is good code, but it's a possible case.

In fact, I'd call that buggy code. It expects Perl to work with a file
other than the one it named.

I don't know why this is just getting a warning?! Why is NUL treated
differently than chr(0x110000)? They're both characters outside the
character set allowed for file names, so why are some proposing that they
have different behaviours? If this is about dwimming, why wouldn't we start
by handling the far more common error of accidental trailing newlines?

@p5pRT
Copy link
Author

p5pRT commented Sep 4, 2013

From @cpansprout

On Tue Sep 03 20​:43​:52 2013, ikegami@​adaelis.com wrote​:

I don't know why this is just getting a warning?! Why is NUL treated
differently than chr(0x110000)? They're both characters outside the
character set allowed for file names, so why are some proposing that they
have different behaviours?

I agree there shouldn’t be a warning. I thought Ricardo thought so, too.

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Sep 5, 2013

From @tonycoz

On Tue Sep 03 20​:09​:31 2013, sprout wrote​:

On Tue Sep 03 19​:57​:43 2013, craig.a.berry@​gmail.com wrote​:

On Tue, Sep 3, 2013 at 7​:56 PM, Father Chrysostomos via RT
<perlbug-followup@​perl.org> wrote​:

I see no more problems with it.

I see one more. I haven't actually tried the new branch yet, but I
don't think it will make a difference to the bit that's bugging me,
which is that the warnings are different when using external glob
versus using File​::Glob to implement the glob operator.

When using File​::Glob as unixy Perl does, we see only one warning​:

$ ./perl -Ilib -we 'glob(qq/foo\0bar/);'
Invalid \0 character in pattern for glob​: foo\0bar at -e line 1.

But when using "external" glob (which just means "something besides
File​::Glob", precise definition varying by platform) as miniperl does,
we see two warnings​:

$ ./miniperl -Ilib -we 'glob(qq/foo\0bar/);'
Invalid \0 character in pattern for glob​: foo\0bar at -e line 1.
glob failed (can't start child​: No such file or directory) at -e line 1.

There are several tests (t/io/open, t/op/require_errors, and
lib/warnings) that depend on only seeing the first warning. If the
second warning also appears, the tests fail. Which is a problem for
me because on VMS, PERL_EXTERNAL_GLOB is always defined, so these
tests have failed ever since the new safety checks came into blead.

miniperl is affected on all platforms.

$ ./miniperl -we 'glob "\0a"'
Invalid \0 character in pattern for glob​: \0a at -e line 1.
glob failed (can't start child​: No such file or directory) at -e line 1.

Not sure what the right thing to do is.

start_glob (in doio.c) is not part of the API. We can change it to
signal to do_readline (in pp_hot.c) that it shouldn’t produce the second
warning.

I moved the warning closer to where the failure occurs instead.

Tony

@p5pRT
Copy link
Author

p5pRT commented Sep 5, 2013

From @tonycoz

On Wed Sep 04 06​:14​:40 2013, sprout wrote​:

On Tue Sep 03 20​:43​:52 2013, ikegami@​adaelis.com wrote​:

I don't know why this is just getting a warning?! Why is NUL treated
differently than chr(0x110000)? They're both characters outside the
character set allowed for file names, so why are some proposing that
they
have different behaviours?

I agree there shouldn’t be a warning. I thought Ricardo thought so, too.

I discussed it in IRC with Ricardo, he only objects to default-on or
fatal-if-on-by-default warnings.

Of course, you'll only ever see this warning if you try to use a string
with a NUL inside it as a filename or glob pattern.

Tony

@p5pRT
Copy link
Author

p5pRT commented Sep 9, 2013

From @tonycoz

On Wed Sep 04 21​:31​:10 2013, tonyc wrote​:

I moved the warning closer to where the failure occurs instead.

Pushed the various changes to blead as merge commit
9d32676, containing
788436d,
41188aa,
ace0afd and
de7dabb.

Tony

@p5pRT
Copy link
Author

p5pRT commented Sep 9, 2013

@tonycoz - Status changed from 'open' to 'resolved'

@p5pRT
Copy link
Author

p5pRT commented Sep 14, 2013

From @craigberry

On Mon, Sep 9, 2013 at 12​:39 AM, Tony Cook via RT
<perlbug-followup@​perl.org> wrote​:

On Wed Sep 04 21​:31​:10 2013, tonyc wrote​:

I moved the warning closer to where the failure occurs instead.

Pushed the various changes to blead as merge commit
9d32676, containing
788436d,
41188aa,
ace0afd and
de7dabb.

tonyc++ for persisting with this (not so) little project through all
its twists and turns. I'm afraid I have one more twist, and apologies
for the very late and piecemeal nature of my comments here.

t/io/open and t/lib/warnings now pass on VMS, but t/op/require_errors
doesn't (I had assumed it was the same problem but it isn't). Here's
what it boils down to​:

$ perl -we "@​INC = qq/lib\0invalid/; require 'foo';"
Can't locate foo in @​INC (@​INC contains​: libinvalid) at -e line 1.

where you can see the warning is missing, a warning that should look like​:

$ ./perl -Ilib -we '@​INC = qq/lib\0invalid/; require q/foo/;'
Invalid \0 character in pathname for require​: lib\0invalid/foo at -e line 1.
Can't locate foo in @​INC (@​INC contains​: libinvalid) at -e line 1.

The reason we don't get the warning is that (on VMS only) the
directory we're going look in gets converted to Unix format, which
involves an interface using C strings, so the part of the directory
beyond the NUL has already been discarded before we check for the NUL
in the full pathname. The relevant bit of pp_require is currently
here​:

http​://perl5.git.perl.org/perl.git/blob/HEAD​:/pp_ctl.c#l3972

I believe the __SYMBIAN32__ code path just under the VMS path will
have a similar problem because it uses sv_catpvf.

One possible fix would be to check the directory itself before
appending the name to it, like the following, where the "continue"
iterates to the next directory in @​INC​:

---- pp_ctl.c;-0 2013-09-10 10​:19​:04 -0500
+++ pp_ctl.c 2013-09-14 16​:04​:59 -0500
@​@​ -3969,6 +3969,8 @​@​ PP(pp_require)
  dirlen = 0;
  }

+ if (!IS_SAFE_SYSCALL(dir, dirlen, "@​INC entry", "require"))
+ continue;
#ifdef VMS
  if (((unixdirbuf = SvPVX(sv_2mortal(newSVpv("", VMS_MAXRSS-1)))) == NULL)
  || ((unixdir = tounixpath(dir, unixdirbuf)) == NULL))
[end]

That works​:

$ perl -we "@​INC = qq/lib\0invalid/; require 'foo';"
Invalid \0 character in @​INC entry for require​: lib\0invalid at -e line 1.
Can't locate foo in @​INC (@​INC contains​: libinvalid) at -e line 1.
%LIB-F-INVARG, invalid argument(s)

That will need an adjustment to require_errors.t and more testing than
I've done so far, but does anyone object to this approach? Or have a
better name than "@​INC entry" for what to call it? Maybe "@​INC
element"?

Oh, and I've been meaning to ask, why don't we put these checks in
S_incpush ad friends in perl.c so the bad directories never get into
@​INC in the first place?

@p5pRT
Copy link
Author

p5pRT commented Sep 15, 2013

@tonycoz - Status changed from 'resolved' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Sep 15, 2013

From @cpansprout

On Sat Sep 14 15​:45​:17 2013, craig.a.berry@​gmail.com wrote​:

Oh, and I've been meaning to ask, why don't we put these checks in
S_incpush ad friends in perl.c so the bad directories never get into
@​INC in the first place?

@​INC can be modified at run time, so how would that help?

--

Father Chrysostomos

@p5pRT
Copy link
Author

p5pRT commented Sep 15, 2013

From @craigberry

On Sun, Sep 15, 2013 at 1​:26 AM, Father Chrysostomos via RT
<perlbug-followup@​perl.org> wrote​:

On Sat Sep 14 15​:45​:17 2013, craig.a.berry@​gmail.com wrote​:

Oh, and I've been meaning to ask, why don't we put these checks in
S_incpush ad friends in perl.c so the bad directories never get into
@​INC in the first place?

@​INC can be modified at run time, so how would that help?

Sorry, I guess I was thinking of %ENV where run-time modifications do
special things.

@p5pRT
Copy link
Author

p5pRT commented Sep 16, 2013

From @tonycoz

On Sat Sep 14 15​:45​:17 2013, craig.a.berry@​gmail.com wrote​:

On Mon, Sep 9, 2013 at 12​:39 AM, Tony Cook via RT
<perlbug-followup@​perl.org> wrote​:

On Wed Sep 04 21​:31​:10 2013, tonyc wrote​:

I moved the warning closer to where the failure occurs instead.

Pushed the various changes to blead as merge commit
9d32676, containing
788436d,
41188aa,
ace0afd and
de7dabb.

tonyc++ for persisting with this (not so) little project through all
its twists and turns. I'm afraid I have one more twist, and apologies
for the very late and piecemeal nature of my comments here.

t/io/open and t/lib/warnings now pass on VMS, but t/op/require_errors
doesn't (I had assumed it was the same problem but it isn't). Here's
what it boils down to​:

$ perl -we "@​INC = qq/lib\0invalid/; require 'foo';"
Can't locate foo in @​INC (@​INC contains​: libinvalid) at -e line 1.

where you can see the warning is missing, a warning that should look
like​:

$ ./perl -Ilib -we '@​INC = qq/lib\0invalid/; require q/foo/;'
Invalid \0 character in pathname for require​: lib\0invalid/foo at -e
line 1.
Can't locate foo in @​INC (@​INC contains​: libinvalid) at -e line 1.

The reason we don't get the warning is that (on VMS only) the
directory we're going look in gets converted to Unix format, which
involves an interface using C strings, so the part of the directory
beyond the NUL has already been discarded before we check for the NUL
in the full pathname. The relevant bit of pp_require is currently
here​:

http​://perl5.git.perl.org/perl.git/blob/HEAD​:/pp_ctl.c#l3972

I believe the __SYMBIAN32__ code path just under the VMS path will
have a similar problem because it uses sv_catpvf.

One possible fix would be to check the directory itself before
appending the name to it, like the following, where the "continue"
iterates to the next directory in @​INC​:

---- pp_ctl.c;-0 2013-09-10 10​:19​:04 -0500
+++ pp_ctl.c 2013-09-14 16​:04​:59 -0500
@​@​ -3969,6 +3969,8 @​@​ PP(pp_require)
dirlen = 0;
}

+ if (!IS_SAFE_SYSCALL(dir, dirlen, "@​INC entry", "require"))
+ continue;
#ifdef VMS
if (((unixdirbuf = SvPVX(sv_2mortal(newSVpv("", VMS_MAXRSS-1))))
== NULL)
|| ((unixdir = tounixpath(dir, unixdirbuf)) == NULL))
[end]

That works​:

$ perl -we "@​INC = qq/lib\0invalid/; require 'foo';"
Invalid \0 character in @​INC entry for require​: lib\0invalid at -e
line 1.
Can't locate foo in @​INC (@​INC contains​: libinvalid) at -e line 1.
%LIB-F-INVARG, invalid argument(s)

That will need an adjustment to require_errors.t and more testing than
I've done so far, but does anyone object to this approach? Or have a
better name than "@​INC entry" for what to call it? Maybe "@​INC
element"?

That looks sensible.

Maybe the "library search path", but that may be more confusing than not
- or the "library search path (@​INC)".

Tony

@p5pRT
Copy link
Author

p5pRT commented Sep 17, 2013

From @craigberry

On Sun, Sep 15, 2013 at 10​:50 PM, Tony Cook via RT
<perlbug-followup@​perl.org> wrote​:

On Sat Sep 14 15​:45​:17 2013, craig.a.berry@​gmail.com wrote​:

The reason we don't get the warning is that (on VMS only) the
directory we're going look in gets converted to Unix format, which
involves an interface using C strings, so the part of the directory
beyond the NUL has already been discarded before we check for the NUL
in the full pathname.
<snip>
One possible fix would be to check the directory itself before
appending the name to it, like the following, where the "continue"
iterates to the next directory in @​INC​:

---- pp_ctl.c;-0 2013-09-10 10​:19​:04 -0500
+++ pp_ctl.c 2013-09-14 16​:04​:59 -0500
@​@​ -3969,6 +3969,8 @​@​ PP(pp_require)
dirlen = 0;
}

+ if (!IS_SAFE_SYSCALL(dir, dirlen, "@​INC entry", "require"))
+ continue;
#ifdef VMS
if (((unixdirbuf = SvPVX(sv_2mortal(newSVpv("", VMS_MAXRSS-1))))
== NULL)
|| ((unixdir = tounixpath(dir, unixdirbuf)) == NULL))
[end]

That works​:

$ perl -we "@​INC = qq/lib\0invalid/; require 'foo';"
Invalid \0 character in @​INC entry for require​: lib\0invalid at -e
line 1.
Can't locate foo in @​INC (@​INC contains​: libinvalid) at -e line 1.
%LIB-F-INVARG, invalid argument(s)

That will need an adjustment to require_errors.t and more testing than
I've done so far, but does anyone object to this approach? Or have a
better name than "@​INC entry" for what to call it? Maybe "@​INC
element"?

That looks sensible.

Thanks. I've pushed it as ddc65b6.

Maybe the "library search path", but that may be more confusing than not
- or the "library search path (@​INC)".

I didn't see any other warnings that felt the need to elaborate on
what @​INC is so I left it as is.

@p5pRT
Copy link
Author

p5pRT commented Sep 18, 2013

@tonycoz - Status changed from 'open' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant