fork during parsing exhausts parsing file #9811

Open
p5pRT opened this issue Aug 2, 2009 · 6 comments

Comments


p5pRT commented Aug 2, 2009

Migrated from rt.perl.org#68118 (status was 'open')

Searchable as RT68118$


p5pRT commented Aug 2, 2009

From perlbug@plan9.de

Created by perlbug@plan9.de

This program prints "here" only once, when one would naively expect it to
print it twice:

  BEGIN { fork }
  warn "here\n";

I guess this is because the parser exhausts the file, so the next run will
hit EOF immediately, as this is in the middle of the parse.

I think this should either be fixed, or the parse file handle be made
accessible (perl programs can't really be responsible for implementation
details they have no access to - if all the open parser fh's would be
accessible it could be made the responsibility of the perl program to get
it right).
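
A minimal sketch of the mechanism guessed at above, using an ordinary file handle in place of the parser's hidden one. The temporary file and the variable names are hypothetical, and the sketch assumes a Unix-like system with the default PerlIO layers. After `fork`, both processes share one descriptor offset, so whichever process reads first consumes the rest of the file and the other one hits EOF:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use POSIX ();

# Hypothetical stand-in for a script file that is being read line by line.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "line $_\n" for 1 .. 3;
close $tmp or die "close: $!";

open my $in, '<', $path or die "open: $!";
my $first = <$in>;    # consume "line 1\n" before forking

my $pid = fork();
die "fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: read whatever is left and report the line count as exit status.
    # POSIX::_exit skips END blocks, so the child leaves the temp file alone.
    my @rest = <$in>;
    POSIX::_exit(scalar @rest);
}

# Parent: let the child go first, then try to read from the same handle.
waitpid($pid, 0);
my $child_lines  = $? >> 8;
my @rest         = <$in>;
my $parent_lines = scalar @rest;

print "child read $child_lines line(s), parent read $parent_lines line(s)\n";
```

The parent is made to wait only so that the outcome does not depend on scheduling; in the `BEGIN { fork }` case either process may win the race.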

Perl Info

Flags:
    category=core
    severity=low

Site configuration information for perl 5.10.0:

Configured by Marc Lehmann at Sat Feb 21 02:30:27 CET 2009.

Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.6.24-etchnhalf.1-amd64, archname=amd64-linux
    uname='linux cerebro 2.6.24-etchnhalf.1-amd64 #1 smp mon jul 21 10:36:02 utc 2008 x86_64 gnulinux '
    config_args='-Duselargefiles -Dxxxxuse64bitint -Uuse64bitall -Dusemymalloc=n -Dcc=gcc -Dccflags=-ggdb -gdwarf-2 -g3 -Dcppflags=-DPERL_ARENA_SIZE=16368 -D_GNU_SOURCE -I/opt/include -Doptimize=-O6 -msse2 -funroll-loops -fno-strict-aliasing -Dcccdlflags=-fPIC -Dldflags=-L/opt/perl/lib -L/opt/lib -Dlibs=-ldl -lm -lcrypt -Darchname=amd64-linux -Dprefix=/opt/perl -Dprivlib=/opt/perl/lib/perl5 -Darchlib=/opt/perl/lib/perl5 -Dvendorprefix=/opt/perl -Dvendorlib=/opt/perl/lib/perl5 -Dvendorarch=/opt/perl/lib/perl5 -Dsiteprefix=/opt/perl -Dsitelib=/opt/perl/lib/perl5 -Dsitearch=/opt/perl/lib/perl5 -Dsitebin=/opt/perl/bin -Dman1dir=/opt/perl/man/man1 -Dman3dir=/opt/perl/man/man3 -Dsiteman1dir=/opt/perl/man/man1 -Dsiteman3dir=/opt/perl/man/man3 -Dman1ext=1 -Dman3ext=3 -Dpager=/usr/bin/less -Uafs -Uusesfio -Uusenm -Uuseshrplib -Dd_dosuid -Dusethreads=undef -Duse5005threads=undef -Duseithreads=undef -Dusemultiplicity=undef -Demail=perl-binary@plan9.de -Dcf_email=perl-binary@plan9.de -Dcf_by=Marc Lehmann -Dlocincpth=/opt/perl/include /opt/include -Dmyhostname=localhost -Dmultiarch=undef -Dbin=/opt/perl/bin -Dxxxusedevel -DxxxDEBUGGING -Dxxxuse_debugging_perl -Dxxxuse_debugmalloc -des'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-ggdb -gdwarf-2 -g3 -fno-strict-aliasing -pipe -I/opt/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O6 -msse2 -funroll-loops -fno-strict-aliasing',
    cppflags='-DPERL_ARENA_SIZE=16368 -D_GNU_SOURCE -I/opt/include -ggdb -gdwarf-2 -g3 -fno-strict-aliasing -pipe -I/opt/include'
    ccversion='', gccversion='4.3.2', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags ='-L/opt/perl/lib -L/opt/lib -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib /lib64 /usr/lib64
    libs=-ldl -lm -lcrypt
    perllibs=-ldl -lm -lcrypt
    libc=/lib/libc-2.7.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.7'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O6 -msse2 -funroll-loops -fno-strict-aliasing -L/opt/perl/lib -L/opt/lib -L/usr/local/lib'

Locally applied patches:
    http://public.activestate.com/cgi-bin/perlbrowse/p/34209
    http://public.activestate.com/cgi-bin/perlbrowse/p/34507
    http://www.gossamer-threads.com/lists/perl/porters/232549
    embed.fnc:Perl_vcroak NULLOK


@INC for perl 5.10.0:
    /root/src/sex
    /opt/perl/lib/perl5
    /opt/perl/lib/perl5
    /opt/perl/lib/perl5
    /opt/perl/lib/perl5
    /opt/perl/lib/perl5
    .


Environment for perl 5.10.0:
    HOME=/root
    LANG (unset)
    LANGUAGE (unset)
    LC_CTYPE=en_US.UTF-8
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/root/s2:/root/s:/opt/bin:/opt/sbin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/X11/bin:/usr/games:/usr/local/bin:/usr/local/sbin:/root/pserv:.
    PERL5LIB=/root/src/sex
    PERL5_CPANPLUS_CONFIG=/root/.cpanplus/config
    PERLDB_OPTS=ornaments=0
    PERL_ANYEVENT_DBI_TESTS=1
    PERL_ANYEVENT_EDNS0=1
    PERL_ANYEVENT_NET_TESTS=1
    PERL_ANYEVENT_PROTOCOLS=ipv4,ipv6
    PERL_ANYEVENT_STRICT=1
    PERL_BADLANG (unset)
    PERL_UNICODE=0
    SHELL=/bin/bash


p5pRT commented Aug 6, 2009

From @rgs

2009/8/2 perlbug@plan9.de (via RT) <perlbug-followup@perl.org>:

> This program prints "here" only once, when one would naively expect it to
> print it twice:
>
>   BEGIN { fork }
>   warn "here\n";
>
> I guess this is because the parser exhausts the file, so the next run will
> hit EOF immediately, as this is in the middle of the parse.
>
> I think this should either be fixed, or the parse file handle be made
> accessible (perl programs can't really be responsible for implementation
> details they have no access to - if all the open parser fh's would be
> accessible it could be made the responsibility of the perl program to get
> it right).

Note that a simple workaround to this behaviour is to use __DATA__ and
its filehandle, and rewind it, as in:

seek DATA,0,0;
print '-'x20,"\n";
print for <DATA>;
print '-'x20,"\n";
__DATA__


p5pRT commented Aug 6, 2009

The RT System itself - Status changed from 'new' to 'open'


p5pRT commented Aug 26, 2009

From schmorp@schmorp.de

On Thu, Aug 06, 2009 at 04:57:51PM +0200, Rafael Garcia-Suarez <rgarciasuarez@gmail.com> wrote:

> > I think this should either be fixed, or the parse file handle be made
> > accessible (perl programs can't really be responsible for implementation
> > details they have no access to - if all the open parser fh's would be
> > accessible it could be made the responsibility of the perl program to get
> > it right).
>
> Note that a simple workaround to this behaviour is to use __DATA__ and
> its filehandle, and rewind it, as in:

Just stumbled over your reply by accident (you didn't send it to me, of
course).

Please note that your example does not work, because DATA is not available
in a BEGIN block, nor does it work when the code is used in a module.

The workaround I use in AnyEvent::Watchdog is this, which is of course rather
painful, but works fine:

Before fork:

  our %SEEKPOS;
  # due to bugs in perl, try to remember file offsets for all fds, and restore them later
  # (the parser otherwise exhausts the input files)

  # this causes perlio to flush its handles internally, so
  # seek offsets become correct.
  exec "."; # toi toi toi

  # now record all fd positions
  for (0 .. 1023) {
     open my $fh, "<&$_" or next;
     $SEEKPOS{$_} = (sysseek $fh, 0, 1 or next);
  }

After each fork:

  # restore seek offsets
  while (my ($k, $v) = each %SEEKPOS) {
     open my $fh, "<&$k" or next;
     sysseek $fh, $v, 0;
  }

The code is so ugly because there is no way to access the file handles in
any other way (the parser doesn't expose them), and because I need an exec to
reify the file offsets so I can query them (a dummy fork would work, too,
and is probably cleaner).
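
The record-and-restore idea can be sketched on a single, explicitly opened handle. This is only an illustration under stated assumptions, not the AnyEvent::Watchdog code: the temporary file and all variable names are hypothetical, it uses the dummy fork mentioned above instead of the failing exec, and it assumes a Unix-like system with the default PerlIO layers:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use POSIX ();

# Hypothetical stand-in for a partly read source file.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "line $_\n" for 1 .. 3;
close $tmp or die "close: $!";

open my $in, '<', $path or die "open: $!";
my $first = <$in>;    # buffered read; the descriptor offset now runs ahead

# Dummy fork: Perl flushes its handles before forking, which brings the
# descriptor offset back in line with what has actually been consumed.
my $dummy = fork();
die "fork: $!" unless defined $dummy;
POSIX::_exit(0) if $dummy == 0;
waitpid($dummy, 0);

# Record the offset through a duplicate descriptor, as in the workaround.
open my $dup, '<&' . fileno($in) or die "dup: $!";
my $pos = sysseek($dup, 0, 1);
die "sysseek: $!" unless defined $pos;
$pos += 0;    # turn "0 but true" into a plain number

# Real fork: the child drains the shared descriptor.
my $pid = fork();
die "fork: $!" unless defined $pid;
if ($pid == 0) {
    my @drained = <$in>;
    POSIX::_exit(0);
}
waitpid($pid, 0);

# Restore the shared offset, then read on in the parent.
sysseek($dup, $pos, 0) or die "sysseek: $!";
my @rest = <$in>;
print "restored to byte $pos, parent read ", scalar(@rest), " line(s)\n";
```

The workaround in the message has to loop over every possible descriptor number because the parser's handles are not reachable by name; here the handle is known, so one duplicate is enough.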

Since it seems to work, I am fine with that as long as I do not have to
look at the code. It would be nice if the perl parser would support
forking, however.

--
  The choice of a Deliantra, the free code+content MORPG
  -----==- _GNU_ http://www.deliantra.net
  ----==-- _ generation
  ---==---(_)__ __ ____ __ Marc Lehmann
  --==---/ / _ \/ // /\ \/ / pcg@goof.com
  -=====/_/_//_/\_,_/ /_/\_\


p5pRT commented Aug 26, 2009

From @rgs

2009/8/26 Marc Lehmann <schmorp@schmorp.de>:

> On Thu, Aug 06, 2009 at 04:57:51PM +0200, Rafael Garcia-Suarez <rgarciasuarez@gmail.com> wrote:
>
> > > I think this should either be fixed, or the parse file handle be made
> > > accessible (perl programs can't really be responsible for implementation
> > > details they have no access to - if all the open parser fh's would be
> > > accessible it could be made the responsibility of the perl program to get
> > > it right).
> >
> > Note that a simple workaround to this behaviour is to use __DATA__ and
> > its filehandle, and rewind it, as in:
>
> Just stumbled over your reply by accident (you didn't send it to me, of
> course).

No. Your mail address wasn't in the From or in the Reply-To headers.
RT should have forwarded my email to you, but apparently that did not
happen. Am I understanding correctly what RT should do here?

> Please note that your example does not work, because DATA is not available
> in a BEGIN block, nor does it work when the code is used in a module.

Yes. That was a specific workaround for simple cases.

> The workaround I use in AnyEvent::Watchdog is this, which is of course rather
> painful, but works fine:
>
> Before fork:
>
>   our %SEEKPOS;
>   # due to bugs in perl, try to remember file offsets for all fds, and restore them later
>   # (the parser otherwise exhausts the input files)
>
>   # this causes perlio to flush its handles internally, so
>   # seek offsets become correct.
>   exec "."; # toi toi toi
>
>   # now record all fd positions
>   for (0 .. 1023) {
>      open my $fh, "<&$_" or next;
>      $SEEKPOS{$_} = (sysseek $fh, 0, 1 or next);
>   }
>
> After each fork:
>
>   # restore seek offsets
>   while (my ($k, $v) = each %SEEKPOS) {
>      open my $fh, "<&$k" or next;
>      sysseek $fh, $v, 0;
>   }
>
> The code is so ugly because there is no way to access the file handles in
> any other way (the parser doesn't expose them), and because I need an exec to
> reify the file offsets so I can query them (a dummy fork would work, too,
> and is probably cleaner).
>
> Since it seems to work, I am fine with that as long as I do not have to
> look at the code. It would be nice if the perl parser would support
> forking, however.

I agree. I think that IlyaZ encountered the same problem some years
ago and even proposed the start of the solution.


p5pRT commented Aug 26, 2009

From schmorp@schmorp.de

On Wed, Aug 26, 2009 at 11:06:55AM +0200, Rafael Garcia-Suarez <rgarciasuarez@gmail.com> wrote:

> > Just stumbled over your reply by accident (you didn't send it to me, of
> > course).
>
> No. Your mail address wasn't in the From or in the Reply-To headers.

Oh right, rt.cpan.org has this annoying habit of removing e-mail addresses.

> RT should have forwarded my email to you, but apparently that did not
> happen. Am I understanding correctly what RT should do here?

I don't know - normally I receive replies to perlbug-reports. Not a big
deal in any case.

> > Since it seems to work, I am fine with that as long as I do not have to
> > look at the code. It would be nice if the perl parser would support
> > forking, however.
>
> I agree. I think that IlyaZ encountered the same problem some years
> ago and even proposed the start of the solution.

Well, I can live with having to do some extra work - the problem isn't
exactly high-priority. It's just that my workaround is beyond evil
(especially the dummy exec, relying on even more internals).

But as it seems to work, I can live with it for the time being.

(Maybe it should just be documented - that fork doesn't work in BEGIN
under Windows is actually mentioned somewhere).

--
  The choice of a Deliantra, the free code+content MORPG
  -----==- _GNU_ http://www.deliantra.net
  ----==-- _ generation
  ---==---(_)__ __ ____ __ Marc Lehmann
  --==---/ / _ \/ // /\ \/ / pcg@goof.com
  -=====/_/_//_/\_,_/ /_/\_\
