Report information
Id: 128227
Status: open
Priority: 0/
Queue: perl5

Owner: Nobody
Requestors: e [at] 80x24.org
Cc:
AdminCc:

Operating System: (no value)
PatchStatus: (no value)
Severity: low
Type: unknown
Perl Version: (no value)
Fixed In: (no value)



To: perlbug [...] perl.org
CC: e [...] 80x24.org
From: e [...] 80x24.org
Subject: vfork should be used for spawning external processes
Date: Mon, 23 May 2016 23:53:57 +0000 (UTC)
This is a bug report for perl from e@80x24.org,
generated with the help of perlbug 1.40 running under perl 5.24.0.

-----------------------------------------------------------------
Perl currently uses fork + exec for spawning processes with
system(), pipe open(), and `backtick` operators.

Despite the implementation of copy-on-write (CoW) under Linux,
fork performance degrades as process size grows.  vfork avoids
the issue by pausing the parent thread and sharing the heap
until execve() or _exit() is called.

In my attached example (vfork.perl) using Inline::C, it only takes
around 3 seconds to vfork+execve+waitpid "/bin/true" 10000 times
on my system.  With the Perl CORE::system() function, it takes
11-12 seconds(!).

This is with a 10MB string in memory.  Increasing the size of the
$mem string in the attached script degrades performance further.

Keep in mind vfork is tricky to use and some of the caveats
are documented at: https://ewontfix.com/7/

The mainline Ruby implementation has been using vfork since
Ruby 2.2 (released December 2014) and AFAIK we have not had
major problems related to it; even in multi-threaded (pthreads)
environments.

Disclaimer: I'm a member of the ruby-core team, but I probably
use Perl more than Ruby :)

Unfortunately my knowledge of Perl internals is weak at this
point, but I will try my best to help answer questions related
to implementing vfork.

Thank you for the many years of Perl 5!
-----------------------------------------------------------------
---
Flags:
    category=core
    severity=wishlist
---
Site configuration information for perl 5.24.0:

Configured by ew at Mon May 23 22:06:43 UTC 2016.

Summary of my perl5 (revision 5 version 24 subversion 0) configuration:
  Commit id: be2c0c650b028f54e427f2469a59942edfdff8a9
  Platform:
    osname=linux, osvers=4.5.3-x86_64-linode67, archname=x86_64-linux-64int
    uname='linux dcvr 4.5.3-x86_64-linode67 #3 smp tue may 10 10:22:44 edt 2016 x86_64 gnulinux '
    config_args='-des -Dprefix=/home/eee/p -Duse64bitint'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    use64bitint=define, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2',
    optimize='-O2',
    cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
    ccversion='', gccversion='4.9.2', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678, doublekind=3
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12, longdblkind=3
    ivtype='long long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/gcc/i586-linux-gnu/4.9/include-fixed /usr/include/i386-linux-gnu /usr/lib /lib/i386-linux-gnu /lib/../lib /usr/lib/i386-linux-gnu /usr/lib/../lib /lib
    libs=-lpthread -lnsl -lgdbm -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    libc=libc-2.19.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.19'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong'

---
@INC for perl 5.24.0:
    /home/eee/p/lib/perl5/site_perl/5.24.0/x86_64-linux-64int
    /home/eee/p/lib/perl5/site_perl/5.24.0
    /home/eee/p/lib/perl5/5.24.0/x86_64-linux-64int
    /home/eee/p/lib/perl5/5.24.0
    .

---
Environment for perl 5.24.0:
    HOME=/home/eee
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/eee/p/bin:/usr/bin:/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash
Download vfork.perl
text/x-patch 3.1k

Message body is not shown because sender requested not to inline it.

From: Zefram <zefram [...] fysh.org>
Subject: Re: [perl #128227] vfork should be used for spawning external processes
To: perl5-porters [...] perl.org
Date: Tue, 24 May 2016 14:35:45 +0100
via RT wrote:
>In my attached example (vfork.perl) using Inline::C, it only takes
>around 3 seconds to vfork+execve+waitpid "/bin/true" 10000 times
>on my system. With the Perl CORE::system() function, it takes
>11-12 seconds(!).

This is not a fair comparison. You should compare standard CORE::system()
against an equivalent implementation of the system op that uses vfork
instead of fork, because that's the change that you're proposing we
should make.

-zefram
Date: Tue, 24 May 2016 22:50:39 +0000
Subject: Re: [perl #128227] vfork should be used for spawning external processes
From: Eric Wong <e [...] 80x24.org>
To: Zefram via RT <perlbug-followup [...] perl.org>
Zefram via RT <perlbug-followup@perl.org> wrote:
> via RT wrote:
> >In my attached example (vfork.perl) using Inline::C, it only takes
> >around 3 seconds to vfork+execve+waitpid "/bin/true" 10000 times
> >on my system. With the Perl CORE::system() function, it takes
> >11-12 seconds(!).
>
> This is not a fair comparison. You should compare standard CORE::system()
> against an equivalent implementation of the system op that uses vfork
> instead of fork, because that's the change that you're proposing we
> should make.

Oops, missed the pipe creation and probably a few other things
system() does.

I tried dumbly swapping out fork() for vfork() in the perl
sources but of course that doesn't work, yet: the child process
is modifying its heap in several places and probably doing other
vfork-incompatible things.

I still haven't had time to digest and learn much about the perl
internals; but I figured I'd start with a wishlist report to get
things started.

Anyways, attached is a tiny standalone C program with most
error-checking omitted. I've only tested it on Debian GNU/Linux
but hopefully it runs on most POSIX-like systems with vfork.

    $ gcc -o vfork-test -Wall -O2 vfork-test.c
    $ time ./vfork-test : normal fork()
    real 0m6.444s
    user 0m0.023s
    sys 0m1.713s

    $ time ./vfork-test vfork
    real 0m2.725s
    user 0m0.013s
    sys 0m0.243s

That's only with 10M malloc-ed, increasing the malloc-ed size
will show vfork performance remains stable as process size
increases.

Anyways, I hope the above numbers are convincing enough to have
somebody more familiar than I to change Perl to use vfork
instead of fork whenever possible.

Thank you.
Download vfork-test.c
text/x-csrc 634b

Message body is not shown because sender requested not to inline it.
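
Editor's note: the attachment is not shown inline, so the following is only a minimal sketch of the kind of fork-versus-vfork comparison described above, not the attached vfork-test.c. The helper names spawn_once and spawn_many are hypothetical, and error handling is deliberately thin.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* ask glibc to declare vfork() */
#endif
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] (an absolute path) once and wait for it.
 * use_vfork selects vfork() instead of fork().
 * Returns the child's exit code, or -1 on error or abnormal exit. */
static int spawn_once(char *const argv[], char *const envp[], int use_vfork)
{
    int status;
    pid_t pid = use_vfork ? vfork() : fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: after vfork() it borrows the parent's memory, so it
         * does nothing here except execve() or _exit(). */
        execve(argv[0], argv, envp);
        _exit(127);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Hypothetical benchmark loop: spawn the same command `count` times
 * and return how many runs did not exit with code 0. */
static int spawn_many(char *const argv[], char *const envp[],
                      int count, int use_vfork)
{
    int failures = 0;

    for (int i = 0; i < count; i++)
        if (spawn_once(argv, envp, use_vfork) != 0)
            failures++;
    return failures;
}
```

To approximate the numbers quoted above, a caller would presumably malloc() and memset() about 10MB first, then call spawn_many() with an argv of "/bin/true" and a count of 10000 under time(1), once with use_vfork set to 0 and once with it set to 1.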

To: Perl5 Porters <perl5-porters [...] perl.org>
CC: "bugs-bitbucket [...] rt.perl.org" <bugs-bitbucket [...] rt.perl.org>
From: Leon Timmermans <fawaka [...] gmail.com>
Subject: Re: [perl #128227] vfork should be used for spawning external processes
Date: Wed, 25 May 2016 10:36:07 +0200
On Tue, May 24, 2016 at 1:54 AM, via RT <perlbug-followup@perl.org> wrote:
> Perl currently uses fork + exec for spawning processes with
> system(), pipe open(), and `backtick` operators.
>
> Despite the implementation of copy-on-write (CoW) under Linux,
> fork performance degrades as process size grows.  vfork avoids
> the issue by pausing the parent thread and sharing the heap
> until execve() or _exit() is called.
>
> In my attached example (vfork.perl) using Inline::C, it only takes
> around 3 seconds to vfork+execve+waitpid "/bin/true" 10000 times
> on my system.  With the Perl CORE::system() function, it takes
> 11-12 seconds(!).
>
> This is with a 10MB string in memory.  Increasing the size of the
> $mem string in the attached script degrades performance further.
>
> Keep in mind vfork is tricky to use and some of the caveats
> are documented at: https://ewontfix.com/7/
>
> The mainline Ruby implementation has been using vfork since
> Ruby 2.2 (released December 2014) and AFAIK we have not had
> major problems related to it; even in multi-threaded (pthreads)
> environments.
>
> Disclaimer: I'm a member of the ruby-core team, but I probably
> use Perl more than Ruby :)
>
> Unfortunately my knowledge of Perl internals is weak at this
> point, but I will try my best to help answer questions related
> to implementing vfork.
>
> Thank you for the many years of Perl 5!

It appears we removed vfork support 15 years ago in 52e18b1f. I think the concept of it is out-of-date, and using it correctly is fickle. I don't think it's a good idea.

It may actually be sane to implement system on top of posix_spawn though (which can use vfork internally). That at least can be reasoned about, and seems to generally do what we want (I've never actually used it, there may be pitfalls that aren't obvious yet).

Not sure either option is really worth the development effort though, any program where fork is a serious bottleneck is likely to be a mess anyway unless you're writing a shell or make.

Leon
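
Editor's note: as a rough illustration of the posix_spawn idea above, here is a minimal sketch of a list-form system() built on posix_spawnp(). It is not Perl's implementation. The name spawn_system is hypothetical, and the sketch assumes the caller wants the raw wait status back. It leaves out the signal handling a full system() performs while waiting.

```c
#include <errno.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical stand-in for a list-form system(): run argv[0], looked up
 * in PATH, with the current environment. Returns the wait status from
 * waitpid(), or -1 with errno set if the spawn or the wait fails. */
static int spawn_system(char *const argv[])
{
    pid_t pid;
    int status;
    /* No file actions and no spawn attributes: the child inherits
     * the parent's descriptors and signal setup unchanged. */
    int err = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);

    if (err != 0) {
        errno = err;    /* posix_spawnp() returns the error number itself */
        return -1;
    }
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;  /* retry only when interrupted by a signal */
    }
    return status;
}
```

A caller would decode the result with WIFEXITED() and WEXITSTATUS(), much as Perl code inspects $? after system(). Whether the C library implements posix_spawnp() with vfork, clone, or plain fork is up to the library, so any speedup is platform-dependent.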


Date: Wed, 25 May 2016 13:00:25 +0300
Subject: Re: [perl #128227] vfork should be used for spawning external processes
From: Ivan Pozdeev via perl5-porters <perl5-porters [...] perl.org>
To: Leon Timmermans <fawaka [...] gmail.com>, Perl5 Porters <perl5-porters [...] perl.org>
CC: "bugs-bitbucket [...] rt.perl.org" <bugs-bitbucket [...] rt.perl.org>, perlbug-followup [...] perl.org
Configure has a question about whether to use vfork(), warning that "perl can only use vfork() that doesn't suffer from severe limitations". If you're going down this road, you should start your research by exploring this lead. Besides, posix_spawn() is the current POSIX standard that is intended to replace vfork() that apparently didn't do well enough in practice.
From: Eric Wong <e [...] 80x24.org>
Subject: Re: [perl #128227] vfork should be used for spawning external processes
CC: Perl5 Porters <perl5-porters [...] perl.org>, "bugs-bitbucket [...] rt.perl.org" <bugs-bitbucket [...] rt.perl.org>
To: Leon Timmermans <fawaka [...] gmail.com>
Date: Thu, 26 May 2016 07:40:04 +0000
Leon Timmermans <fawaka@gmail.com> wrote:
> It appears we removed vfork support 15 years ago in 52e18b1f. I think the
> concept of it is out-of-date, and using it correctly is fickle. I don't
> think it's a good idea.

Sadly, in the past 15 years process sizes have gotten bigger and
make it more expensive to fork.

> It may actually be sane to implement system on top of posix_spawn though
> (which can use vfork internally). That at least can be reasoned about, and
> seems to generally do what we want (I've never actually used it, there may
> be pitfalls that aren't obvious yet).

Good point, I'll give posix_spawn a shot if nobody beats me to
it. It could be a few months or even years for me, though.

Ruby doesn't use posix_spawn since it needs to support
chdir/rlimits/umask and a bunch of other things in its generic
Process.spawn API (which system() shares code with); but Perl
doesn't do those things, so I suppose it's fine for Perl.

> Not sure either option is really worth the development effort though, any
> program where fork is a serious bottleneck is likely to be a mess anyway
> unless you're writing a shell or make.

Who knows what I'll be working on :> But yeah, I plan to look
at getting the dash shell to support posix_spawn or vfork, too;
and maybe bash/GNU make, too (but FSF copyright assignment is a
pain).
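
Editor's note: the original report also mentions pipe open() and backticks. posix_spawn has no chdir, rlimit, or umask hooks, but its file actions do cover descriptor redirection. The sketch below shows a backtick-like capture under that assumption. The name spawn_capture is hypothetical. For brevity it assumes the parent's stdout is already open, so the pipe does not land on descriptor 1.

```c
#include <errno.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical backtick-like helper: run argv[0] (looked up in PATH) and
 * copy up to cap-1 bytes of its stdout into buf, NUL-terminated.
 * Returns the wait status from waitpid(), or -1 on failure. */
static int spawn_capture(char *const argv[], char *buf, size_t cap)
{
    posix_spawn_file_actions_t fa;
    int fds[2], status, err;
    pid_t pid;
    size_t len = 0;

    if (cap == 0 || pipe(fds) < 0)
        return -1;
    if (posix_spawn_file_actions_init(&fa) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    /* In the child: stdout becomes the pipe's write end, then both
     * original pipe descriptors are closed. */
    posix_spawn_file_actions_adddup2(&fa, fds[1], STDOUT_FILENO);
    posix_spawn_file_actions_addclose(&fa, fds[0]);
    posix_spawn_file_actions_addclose(&fa, fds[1]);
    err = posix_spawnp(&pid, argv[0], &fa, NULL, argv, environ);
    posix_spawn_file_actions_destroy(&fa);
    close(fds[1]);              /* parent keeps only the read end */
    if (err != 0) {
        close(fds[0]);
        errno = err;
        return -1;
    }
    for (;;) {
        char scratch[4096];
        int full = (len + 1 >= cap);
        /* Once buf is full, keep draining so the child never blocks. */
        ssize_t n = full ? read(fds[0], scratch, sizeof scratch)
                         : read(fds[0], buf + len, cap - 1 - len);
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0)
            break;              /* EOF or read error */
        if (!full)
            len += (size_t)n;
    }
    buf[len] = '\0';
    close(fds[0]);
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

The file actions run in the child between process creation and exec, which is how this sketch gets by without any hand-written code in a vfork child.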
Subject: Re: [perl #128227] perlbug AutoReply: vfork should be used for spawning external processes
To: perlbug-followup [...] perl.org
From: Eric Wong <e [...] 80x24.org>
Date: Mon, 28 Jan 2019 09:17:51 +0000
Another note: I hit ENOMEM when attempting to spawn one small subprocess from a giant Perl process. I had enough RAM for the small subprocess (but not enough for a full fork of the giant Perl process). The kernel could not know the fork() was for an execve() shortly afterwards when preparing for CoW, so it bailed with ENOMEM on fork(). Using posix_spawn (where glibc uses CLONE_VFORK behind-the-scenes) or vfork for spawning subprocesses would've avoided this problem.
Subject: Re: [perl #128227] perlbug: vfork should be used for spawning external processes
Date: Mon, 28 Jan 2019 11:09:27 +0100
CC: perlbug-followup [...] perl.org
To: Eric Wong <e [...] 80x24.org>
From: Mark Overmeer <mark [...] overmeer.net>
* Eric Wong (e@80x24.org) [190128 09:23]:
> I hit ENOMEM when attempting to spawn one small subprocess from
> a giant Perl process. I had enough RAM for the small subprocess
> (but not enough for a full fork of the giant Perl process).

Well, reading fork(2) tells me

    ENOMEM fork() failed to allocate the necessary kernel structures
           because memory is tight.

It's not directly about the normal RAM used by the giant process, but
about the RAM used for kernel administration, which may have restrictions.

You may be able to find an error or warning in dmesg about which resource
is really the limitation the kernel bounces into. That might show that
your processor is too small for the task (number of page tables, DMA
range, ...)
--
greetz,
               MarkOv

------------------------------------------------------------------------
       Mark Overmeer MSc                                MARKOV Solutions
       Mark@Overmeer.net                          solutions@overmeer.net
http://Mark.Overmeer.net                   http://solutions.overmeer.net
RT-Send-CC: perl5-porters [...] perl.org, bugs-bitbucket [...] rt.perl.org, Mark [...] Overmeer.net, zefram [...] fysh.org, perl5-porters [...] perl.org
On Mon, 28 Jan 2019 02:55:53 -0800, Mark@Overmeer.net wrote:
> * Eric Wong (e@80x24.org) [190128 09:23]:
> > I hit ENOMEM when attempting to spawn one small subprocess from
> > a giant Perl process. I had enough RAM for the small subprocess
> > (but not enough for a full fork of the giant Perl process).
>
> Well, reading fork(2) tells me
>
> ENOMEM fork() failed to allocate the necessary kernel structures
> because memory is tight.
>
> It's not directly about the normal RAM used by the giant process, but
> about the RAM used for kernel administration, which may have restrictions.
>
> You may be able to find an error or warning in dmesg about which resource
> is really the limitation the kernel bounces into. That might show that
> your processor is too small for the task (number of page tables, DMA
> range, ...)

I doubt it's about kernel memory. I just tried an experiment on a recent Linux system (debian/stretch) with 32GB RAM + 1GB swap (memory used by other processes at the moment: < 1GB). The following script just allocates memory and then tries a simple system() call:

    #!/usr/bin/perl
    use strict;
    my $alloc_gb = shift || 4;
    my $mb = " " x 1024**2;
    my $buf;
    for (1..$alloc_gb*1024) { $buf .= $mb; }
    system 'echo', 'system() was successful';
    print "Return values of system() call: \$?=$?, \$!=$!\n";
    __END__

Running the script allocating 16GB of RAM:

    $ perl /tmp/m.pl 16
    Return values of system() call: $?=0, $!=Cannot allocate memory

Nothing is logged to any of the files in /var/log at this time. Running the script with 15GB is successful.

Now another script using spawn() provided by POSIX::RT::Spawn (which unfortunately does not build anymore for perl >= 5.28.0):

    #!/usr/bin/perl
    use strict;
    use POSIX::RT::Spawn;
    my $alloc_gb = shift || 4;
    my $mb = " " x 1024**2;
    my $buf;
    for (1..$alloc_gb*1024) { $buf .= $mb; }
    my $pid = spawn 'echo', 'spawn() was successful' or die "failed to spawn: $!";
    waitpid $pid, 0;
    die "command failed with status: ", $?>>8 if $?;
    __END__

No problem running this script with 28GB.

Regards,
Slaven