SelfLoader/fork() gotcha #8101

Open
p5pRT opened this issue Sep 9, 2005 · 7 comments

Comments

p5pRT commented Sep 9, 2005

Migrated from rt.perl.org#37119 (status was 'open')

Searchable as RT37119$

p5pRT commented Sep 9, 2005

From @nwc10

Created by @nwc10

This is definitely platform dependent - I can reproduce this on x86 Linux,
but not x86 FreeBSD or OS X.

fork and __DATA__ don't mix. I don't think that we document this anywhere.
I don't know if we should, given the amount of documentation, but it is a
subtle gotcha, and it had never occurred to me.

The basic problem is with modules that lazily read from the DATA file
handle, particularly if they read from another package's DATA file handle
lazily, and on demand. DATA is implemented by the Perl 5 compiler leaving
the program's file handle open if it encounters the __DATA__ token. All is
fine and dandy, until you fork. At which point both processes now have a
(buffered) DATA file handle pointing to the same kernel file descriptor.
When one reads from DATA, the other's DATA handle moves. Underneath it.
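The shared offset can be seen without DATA or SelfLoader at all. The following is a minimal sketch, assuming a POSIX fork(); the temporary file and the 40-byte read are arbitrary choices made for illustration. It uses sysread/sysseek so that PerlIO buffering does not blur the picture.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_CUR);
use File::Temp qw(tempfile);
use POSIX ();

# Create a 100-byte scratch file (illustrative data only).
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "0123456789" x 10;
close $tmp or die "close: $!";

open my $fh, '<', $path or die "open: $!";

# Kernel-level offset before the fork ("0 but true" at the start of the file).
my $before = sysseek($fh, 0, SEEK_CUR);

my $pid = fork();
die "fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: read 40 bytes through the inherited descriptor.
    defined sysread($fh, my $buf, 40) or POSIX::_exit(1);
    POSIX::_exit(0);    # skip END blocks and handle cleanup in the child
}

waitpid($pid, 0);

# Parent: it never read anything itself, yet its offset has moved.
my $after = sysseek($fh, 0, SEEK_CUR);

printf "parent offset before fork: %d, after child read: %d\n",
    $before, $after;
```

Where parent and child share the open file description, the parent reports offset 0 before the fork and 40 afterwards, even though only the child read from the handle.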

SelfLoader breaks.

$ cat Demo.pm
#!perl -w

package Demo;
sub import {};
use SelfLoader;
@ISA = 'SelfLoader';

1;
__DATA__
sub hash {
  print "pig-pen: $_[0]\n";
}

$ cat demo.pl
#!perl -w
use strict;

use Demo;

my $pid = fork();
sleep 2 if $pid;
Demo::hash ($pid);

$ perl demo.pl
pig-pen: 0
Undefined subroutine Demo::hash at demo.pl line 8

I'm not sure if/how we can fix SelfLoader. I'm not sure where we should
document this gotcha with DATA.

Nicholas Clark

Perl Info

Flags:
    category=core
    severity=low

Site configuration information for perl v5.8.5:

Configured by root at Mon Oct 18 17:51:35 BST 2004.

Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
  Platform:
    osname=linux, osvers=2.4.21-4.elsmp, archname=i686-linux
    uname='linux switch.work.fotango.com 2.4.21-4.elsmp #1 smp fri oct 3 17:52:56 edt 2003 i686 i686 i386 gnulinux '
    config_args='-Dprefix=/usr/local/perl-5.8.5 -Uinstallusrbinperl -des'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -pipe -I/usr/include/gdbm'
    ccversion='', gccversion='3.2.3 20030502 (Red Hat Linux 3.2.3-20)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.3.2.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.3.2'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:
    


@INC for perl v5.8.5:
    /usr/local/perl-5.8.5/lib/5.8.5/i686-linux
    /usr/local/perl-5.8.5/lib/5.8.5
    /usr/local/perl-5.8.5/lib/site_perl/5.8.5/i686-linux
    /usr/local/perl-5.8.5/lib/site_perl/5.8.5
    /usr/local/perl-5.8.5/lib/site_perl
    .


Environment for perl v5.8.5:
    HOME=/home/nick
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/nick/bin:/usr/kerberos/bin:/usr/lib/ccache/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/usr/local/sbin:/sbin:/usr/sbin
    PERL_BADLANG (unset)
    SHELL=/bin/bash

p5pRT commented Oct 2, 2005

From nospam-abuse@ilyaz.org

I do not think this effect is documented anywhere:

  perl -wle 'open F, q(<), shift or die; defined fork or die;
  open O, q(>x).$$; print O $_ while <F>' .article

Two files x$$ are created (one per process). One of them is
empty. (Both on Solaris and OS/2; but this may be system-dependent.)

Obviously, forked processes share the same position in the file. One of
the culprits is the DATA file handle (see bug [perl #37119]): SelfLoader
will work in only one of the processes (unless it has already read the
<DATA> section).
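The "unless it already read the <DATA> section" case suggests a workaround that needs no change to perl itself: make SelfLoader read and cache everything before any fork() happens. The sketch below is an assumption-laden illustration, not a fix for the underlying bug. It writes a throwaway Demo.pm into a temporary directory, and that module calls SelfLoader's load_stubs() while it is being loaded; the module contents, the temporary directory and the exit-status check are all made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use POSIX ();

# Write an illustrative module that preloads its stubs at load time.
my $dir = tempdir(CLEANUP => 1);
open my $out, '>', "$dir/Demo.pm" or die "write Demo.pm: $!";
print {$out} <<'END_MODULE';
package Demo;
use SelfLoader;
SelfLoader->load_stubs();   # read and cache the __DATA__ section right now
1;
__DATA__
sub hash {
  return "pig-pen: $_[0]";
}
END_MODULE
close $out or die "close: $!";

unshift @INC, $dir;
require Demo;               # load_stubs() runs here, before any fork()

my $pid = fork();
die "fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: the call is served from SelfLoader's in-memory cache.
    my $ok = eval { Demo::hash('child'); 1 };
    POSIX::_exit($ok ? 0 : 1);
}

waitpid($pid, 0);
my $child_ok = ($? == 0);

# Parent: call after the child has finished, the order that failed before.
my $parent_result = eval { Demo::hash('parent') };

print "child ok: ", ($child_ok ? "yes" : "no"), "\n";
print "parent:   ", (defined $parent_result ? $parent_result : "failed: $@"), "\n";
```

This only helps when the module author (or the caller) can force the load before forking; code that forks before the first selfloaded call is still exposed.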

  [Sounds familiar; see
  http://groups.google.com/group/comp.lang.perl.modules/msg/ffafaf280e1423d7
  ]

It should be considered a bug in implementation of <DATA> handle. It
must tell() initially and after each read, and seek() back before each
read if any fork() was performed (drat, there is still a race
condition; maybe it should dup() first...).

Yours,
Ilya

P.S. How to reproduce: e.g., start

  perl -dwe0

in an XTerm (with ReadLine::Perl), then type

  fork

Now type something to the command line in one terminal and press
BackSpace key. After this BackSpace won't work in other terminal.

p5pRT commented Oct 2, 2005

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented Oct 2, 2005

From nospam-abuse@ilyaz.org

On Sat, Oct 01, 2005 at 02:33:19PM -0700, Ilya Zakharevich wrote:

> I do not think this effect is documented anywhere:
>
>   perl -wle 'open F, q(<), shift or die; defined fork or die;
>   open O, q(>x).$$; print O $_ while <F>' .article
>
> Two files x$$ are created (one per process). One of them is
> empty. (Both on Solaris and OS/2; but this may be system-dependent.)
>
> Obviously, forked processes share the same position in the file. One of
> the culprits is the DATA file handle (see bug [perl #37119]): SelfLoader
> will work in only one of the processes (unless it has already read the
> <DATA> section).

Below I made a sample implementation of protection against this
misfeature of fork() (and \*DATA). It is done on the level of
SelfLoader; however, the "correct" fix should happen on the level of
\*DATA. Everybody: do you have any idea how to do something similar
on the level of \*DATA?

(The particular case of SelfLoader is simpler since data is read in
one chunk, thus fork() can't happen between two read()s.)

Thanks,
Ilya

Inline Patch
--- ./lib/SelfLoader.pm-pre	Wed Aug 13 23:37:40 2003
+++ ./lib/SelfLoader.pm	Sat Oct  1 15:45:44 2005
@@ -51,13 +51,15 @@ sub load_stubs { shift->_load_stubs((cal
 sub _load_stubs {
     # $endlines is used by Devel::SelfStubber to capture lines after __END__
     my($self, $callpack, $endlines) = @_;
-    my $fh = \*{"${callpack}::DATA"};
+    my $ofh = \*{"${callpack}::DATA"};
     my $currpack = $callpack;
     my($line,$name,@lines, @stubs, $protoype);
 
     print STDERR "SelfLoader::load_stubs($callpack)\n" if $DEBUG;
     croak("$callpack doesn't contain an __DATA__ token")
-        unless fileno($fh);
+        unless fileno($ofh);
+    open my $fh, '<&', $ofh or croak "reopen: $!";
+    close $ofh;				# Protect: fork() shares the pointer
     $Cache{"${currpack}::<DATA"} = 1;   # indicate package is cached
 
     local($/) = "\n";

p5pRT commented Oct 4, 2005

From @rgs

Ilya Zakharevich wrote:

> Below I made a sample implementation of protection against this
> misfeature of fork() (and \*DATA). It is done on the level of
> SelfLoader; however, the "correct" fix should happen on the level of
> \*DATA. Everybody: do you have any idea how to do something similar
> on the level of \*DATA?
>
> (The particular case of SelfLoader is simpler since data is read in
> one chunk, thus fork() can't happen between two read()s.)

The problem with your patch is that the DATA filehandle can't be used
anymore after loading stubs, because you just closed it. And thus the
19th test of lib/SelfLoader.t fails.

But, IMHO fixing the fork() bug is more important than being able to
reuse DATA from a selfloaded module. Anyway, a perl-interpreter level fix
would be better. But how to do this? Catch calls to fork() and dup
*DATA when they happen? This won't fix the situation where perl is
embedded in another process (say, an httpd server) which forks.

> --- ./lib/SelfLoader.pm-pre	Wed Aug 13 23:37:40 2003
> +++ ./lib/SelfLoader.pm	Sat Oct  1 15:45:44 2005
> @@ -51,13 +51,15 @@ sub load_stubs { shift->_load_stubs((cal
>  sub _load_stubs {
>      # $endlines is used by Devel::SelfStubber to capture lines after __END__
>      my($self, $callpack, $endlines) = @_;
> -    my $fh = \*{"${callpack}::DATA"};
> +    my $ofh = \*{"${callpack}::DATA"};
>      my $currpack = $callpack;
>      my($line,$name,@lines, @stubs, $protoype);
>
>      print STDERR "SelfLoader::load_stubs($callpack)\n" if $DEBUG;
>      croak("$callpack doesn't contain an __DATA__ token")
> -        unless fileno($fh);
> +        unless fileno($ofh);
> +    open my $fh, '<&', $ofh or croak "reopen: $!";
> +    close $ofh;                         # Protect: fork() shares the pointer
>      $Cache{"${currpack}::<DATA"} = 1;   # indicate package is cached
>
>      local($/) = "\n";

p5pRT commented Oct 5, 2005

From nospam-abuse@ilyaz.org

On Tue, Oct 04, 2005 at 04:28:55PM +0200, Rafael Garcia-Suarez wrote:

> > (The particular case of SelfLoader is simpler since data is read in
> > one chunk, thus fork() can't happen between two read()s.)
>
> The problem with your patch is that the DATA filehandle can't be used
> anymore after loading stubs, because you just closed it. And thus the
> 19th test of lib/SelfLoader.t fails.
>
> But, IMHO fixing the fork() bug is more important than being able to
> reuse DATA from a selfloaded module. Anyway a perl-interpreter level fix
> would be better. But how to do this? Catch calls to fork() and dup
> *DATA when they happen? This won't fix the situation where a perl is
> embedded in another process (say, an httpd server) which forks.

Make a special input layer :forkable (useful not only for DATA, but in
most situations of reading from a "normal" file - I consider this part
of the semantics of fork() completely broken). It saves the pid and the
file position (as reported by tell()) after each read. If on the next
read the pid has changed, you dup the file descriptor to itself
(probably, one needs 2 steps) and seek() to the preceding position.
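One way to read the idea above is sketched here in plain Perl rather than as a PerlIO layer. Everything in it is an assumption made for illustration: the `ForkSafeReader` package and its `next_line` method are hypothetical, it needs to know the file's path, and it re-opens the file by name in the child instead of dup()ing the descriptor, because a dup()ed descriptor still shares the file position. It seeks to its own saved position before every read, which throws away the benefit of buffering and does not close the race window mentioned earlier in the thread.

```perl
use strict;
use warnings;

package ForkSafeReader;    # hypothetical helper, not part of perl or SelfLoader

sub new {
    my ($class, $path) = @_;
    open my $fh, '<', $path or die "open $path: $!";
    return bless { path => $path, fh => $fh, pid => $$, pos => 0 }, $class;
}

sub next_line {
    my ($self) = @_;
    if ($self->{pid} != $$) {
        # The pid changed, so we are in a forked child: take a private
        # open file description by re-opening the file by name.
        open my $fh, '<', $self->{path} or die "reopen $self->{path}: $!";
        @$self{qw(fh pid)} = ($fh, $$);
    }
    my $fh = $self->{fh};
    # Go back to our own saved position before every read.
    seek($fh, $self->{pos}, 0) or die "seek: $!";
    my $line = <$fh>;
    $self->{pos} = tell($fh);
    return $line;
}

package main;

use File::Temp qw(tempfile);
use POSIX ();

# Illustrative input: four short lines in a scratch file.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "line $_\n" for 1 .. 4;
close $tmp or die "close: $!";

my $reader = ForkSafeReader->new($path);
my $first  = $reader->next_line;          # read one line before forking

pipe(my $r, my $w) or die "pipe: $!";
my $pid = fork();
die "fork: $!" unless defined $pid;

if ($pid == 0) {
    close $r;
    my @child;
    while (defined(my $l = $reader->next_line)) { push @child, $l }
    print {$w} @child;                    # report what the child saw
    close $w;
    POSIX::_exit(0);
}

close $w;
my @from_child = <$r>;
waitpid($pid, 0);

my @from_parent;
while (defined(my $l = $reader->next_line)) { push @from_parent, $l }

printf "child saw %d more lines, parent saw %d more lines\n",
    scalar @from_child, scalar @from_parent;
```

In this run both processes should see the same three remaining lines. For DATA itself the path and the starting offset after the __DATA__ token would have to come from somewhere, which this sketch does not attempt.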

Can somebody see problems with this?

Thanks,
Ilya

p5pRT commented Apr 29, 2010

From user42@zip.com.au

Created by foo@bar.com

Nosing around SelfLoader I wondered whether it was safe on a fork()
and/or threads. The docs could helpfully say whether it is or not.

I suspect the answer is something like not safe in general, or not until
you load_stubs() -- but that you'd have to be fairly unlucky to get
precisely concurrent load_stubs() reading the __DATA__ and hence making
a mess.

(I saw the DATA handle dup()-ing code, but I don't think it helps:
a dup()ed descriptor refers to the same file table entry, so the file
position is still shared by parent and child, whether the child is a
fork or a thread.)
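The point about dup() can be checked directly. This is a minimal sketch, assuming a POSIX system; the scratch file and the offset 25 are arbitrary. It compares a handle made with `'<&'` (a dup()) against one made by opening the same file again by name.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_CUR SEEK_SET);
use File::Temp qw(tempfile);

# Illustrative 100-byte scratch file.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "x" x 100;
close $tmp or die "close: $!";

open my $orig,  '<',  $path or die "open: $!";
open my $dup,   '<&', $orig or die "dup: $!";      # dup(): same file table entry
open my $fresh, '<',  $path or die "reopen: $!";   # separate file table entry

# Move only the original handle.
sysseek($orig, 25, SEEK_SET) or die "sysseek: $!";

# Ask the other two handles where they are now.
my $dup_pos   = sysseek($dup,   0, SEEK_CUR);
my $fresh_pos = sysseek($fresh, 0, SEEK_CUR);

printf "dup()ed handle is at %d, re-opened handle is at %d\n",
    $dup_pos, $fresh_pos;
```

The dup()ed handle follows the original to offset 25, while the handle opened by name stays at 0, which is why re-opening by path isolates the position and dup() alone does not.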

Perl Info

Flags:
    category=library
    severity=low
    module=SelfLoader

Site configuration information for perl 5.10.1:

Configured by Debian Project at Sun Apr 11 22:31:36 UTC 2010.

Summary of my perl5 (revision 5 version 10 subversion 1) configuration:
   
  Platform:
    osname=linux, osvers=2.6.32.11-dsa-ia32, archname=i486-linux-gnu-thread-multi
    uname='linux murphy 2.6.32.11-dsa-ia32 #1 smp fri apr 2 10:32:00 cest 2010 i686 gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i486-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.10 -Darchlib=/usr/lib/perl/5.10 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.10.1 -Dsitearch=/usr/local/lib/perl/5.10.1 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Ud_ualarm -Uusesfio -Uusenm -DDEBUGGING=-g -Doptimize=-O2 -Duseshrplib -Dlibperl=libperl.so.5.10.1 -Dd_dosuid -des'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2 -g',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.4.3', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib /usr/lib64
    libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
    perllibs=-ldl -lm -lpthread -lc -lcrypt
    libc=/lib/libc-2.10.2.so, so=so, useshrplib=true, libperl=libperl.so.5.10.1
    gnulibc_version='2.10.2'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -g -L/usr/local/lib -fstack-protector'

Locally applied patches:
    DEBPKG:debian/arm_thread_stress_timeout - http://bugs.debian.org/501970 Raise the timeout of ext/threads/shared/t/stress.t to accommodate slower build hosts
    DEBPKG:debian/cpan_config_path - Set location of CPAN::Config to /etc/perl as /usr may not be writable.
    DEBPKG:debian/cpan_definstalldirs - Provide a sensible INSTALLDIRS default for modules installed from CPAN.
    DEBPKG:debian/db_file_ver - http://bugs.debian.org/340047 Remove overly restrictive DB_File version check.
    DEBPKG:debian/doc_info - Replace generic man(1) instructions with Debian-specific information.
    DEBPKG:debian/enc2xs_inc - http://bugs.debian.org/290336 Tweak enc2xs to follow symlinks and ignore missing @INC directories.
    DEBPKG:debian/errno_ver - http://bugs.debian.org/343351 Remove Errno version check due to upgrade problems with long-running processes.
    DEBPKG:debian/extutils_hacks - Various debian-specific ExtUtils changes
    DEBPKG:debian/fakeroot - Postpone LD_LIBRARY_PATH evaluation to the binary targets.
    DEBPKG:debian/instmodsh_doc - Debian policy doesn't install .packlist files for core or vendor.
    DEBPKG:debian/ld_run_path - Remove standard libs from LD_RUN_PATH as per Debian policy.
    DEBPKG:debian/libnet_config_path - Set location of libnet.cfg to /etc/perl/Net as /usr may not be writable.
    DEBPKG:debian/m68k_thread_stress - http://bugs.debian.org/495826 Disable some threads tests on m68k for now due to missing TLS.
    DEBPKG:debian/mod_paths - Tweak @INC ordering for Debian
    DEBPKG:debian/module_build_man_extensions - http://bugs.debian.org/479460 Adjust Module::Build manual page extensions for the Debian Perl policy
    DEBPKG:debian/perl_synopsis - http://bugs.debian.org/278323 Rearrange perl.pod
    DEBPKG:debian/prune_libs - http://bugs.debian.org/128355 Prune the list of libraries wanted to what we actually need.
    DEBPKG:debian/use_gdbm - Explicitly link against -lgdbm_compat in ODBM_File/NDBM_File. 
    DEBPKG:fixes/assorted_docs - http://bugs.debian.org/443733 [384f06a] Math::BigInt::CalcEmu documentation grammar fix
    DEBPKG:fixes/net_smtp_docs - http://bugs.debian.org/100195 [rt.cpan.org #36038] Document the Net::SMTP 'Port' option
    DEBPKG:fixes/processPL - http://bugs.debian.org/357264 [rt.cpan.org #17224] Always use PERLRUNINST when building perl modules.
    DEBPKG:debian/perlivp - http://bugs.debian.org/510895 Make perlivp skip include directories in /usr/local
    DEBPKG:fixes/pod2man-index-backslash - http://bugs.debian.org/521256 Escape backslashes in .IX entries
    DEBPKG:debian/disable-zlib-bundling - Disable zlib bundling in Compress::Raw::Zlib
    DEBPKG:fixes/kfreebsd_cppsymbols - http://bugs.debian.org/533098 [3b910a0] Add gcc predefined macros to $Config{cppsymbols} on GNU/kFreeBSD.
    DEBPKG:debian/cpanplus_definstalldirs - http://bugs.debian.org/533707 Configure CPANPLUS to use the site directories by default.
    DEBPKG:debian/cpanplus_config_path - Save local versions of CPANPLUS::Config::System into /etc/perl.
    DEBPKG:fixes/kfreebsd-filecopy-pipes - http://bugs.debian.org/537555 [16f708c] Fix File::Copy::copy with pipes on GNU/kFreeBSD
    DEBPKG:fixes/anon-tmpfile-dir - http://bugs.debian.org/528544 [perl #66452] Honor TMPDIR when open()ing an anonymous temporary file
    DEBPKG:fixes/abstract-sockets - http://bugs.debian.org/329291 [89904c0] Add support for Abstract namespace sockets.
    DEBPKG:fixes/hurd_cppsymbols - http://bugs.debian.org/544307 [eeb92b7] Add gcc predefined macros to $Config{cppsymbols} on GNU/Hurd.
    DEBPKG:fixes/autodie-flock - http://bugs.debian.org/543731 Allow for flock returning EAGAIN instead of EWOULDBLOCK on linux/parisc
    DEBPKG:fixes/archive-tar-instance-error - http://bugs.debian.org/539355 [rt.cpan.org #48879] Separate Archive::Tar instance error strings from each other
    DEBPKG:fixes/positive-gpos - http://bugs.debian.org/545234 [perl #69056] [c584a96] Fix \\G crash on first match
    DEBPKG:debian/devel-ppport-ia64-optim - http://bugs.debian.org/548943 Work around an ICE on ia64
    DEBPKG:fixes/trie-logic-match - http://bugs.debian.org/552291 [perl #69973] [0abd0d7] Fix a DoS in Unicode processing [CVE-2009-3626]
    DEBPKG:fixes/hppa-thread-eagain - http://bugs.debian.org/554218 make the threads-shared test suite more robust, fixing failures on hppa
    DEBPKG:fixes/crash-on-undefined-destroy - http://bugs.debian.org/564074 [perl #71952] [1f15e67] Fix a NULL pointer dereference when looking for a DESTROY method
    DEBPKG:fixes/tainted-errno - http://bugs.debian.org/574129 [perl #61976] [be1cf43] fix an errno stringification bug in taint mode
    DEBPKG:patchlevel - http://bugs.debian.org/567489 List packaged patches for 5.10.1-12 in patchlevel.h
