Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

malloc deadlock in unsafe signal handler #15014

Open
p5pRT opened this issue Oct 28, 2015 · 11 comments
Open

malloc deadlock in unsafe signal handler #15014

p5pRT opened this issue Oct 28, 2015 · 11 comments

Comments

@p5pRT
Copy link

p5pRT commented Oct 28, 2015

Migrated from rt.perl.org#126474 (status was 'open')

Searchable as RT126474$

@p5pRT
Copy link
Author

p5pRT commented Oct 28, 2015

From kazuhooku@gmail.com

Created by kazuhooku@gmail.com

This is a bug report for perl from kazuhooku@​gmail.com,
generated with the help of perlbug 1.40 running under perl 5.23.5.

-----------------------------------------------------------------
Unsafe signal handler calls malloc (via newSVsv), therefore causes
deadlock when signal is received while malloc is running.

When I run the following script, and repeatedly send SIGUSR1 to the
script (e.g. while 1 ; do kill -USR1 proc-id ; sleep 1; done), then
the script enters a dead lock within the signal handler.

  use strict;
  use warnings;
  use POSIX qw(SIGUSR1);

  warn "pid​:$$\n";

  POSIX​::sigaction(SIGUSR1, POSIX​::SigAction->new(sub {}))
  or die "POSIX​::sigaction failed​:$!";

  while (1) {
  my $a = [];
  for my $i (1..1000000) {
  push @​$a, "​:$i";
  }
  print "hello\n";
  }

Looking at gdb backtrace, it is likely that the unsafe signal handler
(triggered while malloc is running) is calling malloc (which is not a
reentrant function).

  (gdb) bt
  #0 __lll_lock_wait_private () at
../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S​:95
  #1 0x00007f51774bfdca in _L_lock_12779 () at malloc.c​:5206
  #2 0x00007f51774bd7a5 in __GI___libc_malloc (bytes=10) at malloc.c​:2887
  #3 0x000000000048f775 in Perl_safesysmalloc ()
  #4 0x00000000004bd230 in Perl_sv_grow ()
  #5 0x00000000004bc1b8 in Perl_sv_setsv_flags ()
  #6 0x00000000004c5f13 in Perl_newSVsv ()
  #7 0x000000000049b383 in Perl_sighandler ()
  #8 <signal handler called>
  #9 malloc_consolidate (av=av@​entry=0x7f51777f9760 <main_arena>)
at malloc.c​:4173
  #10 0x00007f51774ba56d in _int_free (av=0x7f51777f9760
<main_arena>, p=<optimized out>, have_lock=0) at malloc.c​:4057
  #11 0x00000000004b5881 in Perl_sv_clear ()
  #12 0x00000000004b5929 in Perl_sv_free2 ()
  #13 0x00000000004dbb8c in Perl_leave_scope ()
  #14 0x00000000004aa545 in Perl_pp_unstack ()
  #15 0x00000000004a9b83 in Perl_runops_standard ()
  #16 0x00000000004412d0 in perl_run ()
  #17 0x000000000041fff5 in main ()

I believe this regression was introduced in 5.17.2 when #45173
(Arriving signals no longer clear $@​) got fixed.

Perl Info

Flags:
    category=core
    severity=low

Site configuration information for perl 5.23.5:

Configured by kazuho at Wed Oct 28 09:17:29 JST 2015.

Summary of my perl5 (revision 5 version 23 subversion 5) configuration:
  Commit id: 6504068eb895d4fb6161ddbb0249e59c19afa707
  Platform:
    osname=linux, osvers=3.13.0-65-generic, archname=x86_64-linux
    uname='linux ubuntu1404 3.13.0-65-generic #106-ubuntu smp fri oct
2 22:08:27 utc 2015 x86_64 x86_64 x86_64 gnulinux '
    config_args=''
    hint=previous, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fwrapv -fno-strict-aliasing -pipe
-fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector
-I/usr/local/include'
    ccversion='', gccversion='4.8.2', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8,
byteorder=12345678, doublekind=3
    d_longlong=define, longlongsize=8, d_longdbl=define,
longdblsize=16, longdblkind=3
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib
/usr/lib/gcc/x86_64-linux-gnu/4.8/include-fixed
/usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu
/lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.8/include-fixed
/usr/include/x86_64-linux-gnu /usr/lib
    libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    libc=libc-2.19.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.19'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib
-fstack-protector'



@INC for perl 5.23.5:
    /usr/local/perl-bleed/lib/site_perl/5.23.5/x86_64-linux
    /usr/local/perl-bleed/lib/site_perl/5.23.5
    /usr/local/perl-bleed/lib/5.23.5/x86_64-linux
    /usr/local/perl-bleed/lib/5.23.5
    .


Environment for perl 5.23.5:
    HOME=/home/kazuho
    LANG=en_US.UTF-8
    LANGUAGE=en_US:en
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
    PERL_BADLANG (unset)
    SHELL=/bin/bash

@p5pRT
Copy link
Author

p5pRT commented Oct 28, 2015

From @tonycoz

On Wed Oct 28 03​:44​:27 2015, kazuhooku@​gmail.com wrote​:

Unsafe signal handler calls malloc (via newSVsv), therefore causes
deadlock when signal is received while malloc is running.

When I run the following script, and repeatedly send SIGUSR1 to the
script (e.g. while 1 ; do kill -USR1 proc-id ; sleep 1; done), then
the script enters a dead lock within the signal handler.

I don't think this is a bug - it's the reason safe signal handlers were introduced.

Tony

@p5pRT
Copy link
Author

p5pRT commented Oct 28, 2015

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Oct 28, 2015

From kazuhooku@gmail.com

Thank you for your response.

I disagree. It is true that the use of unsafe signal is discouraged,
but people have written code relying on unsafe signal handlers and IMO
they should continue to work.

My understanding is that the introduction of safe signals in Perl 5.8
was to protect people from writing corrupt signal handlers (causing
deadlocks or segfaults). But even after that, Perl has provided
freedom to use unsafe signal handlers at one's own risk, as stated in
perldoc perlipc, and cautiously-coded signal handlers (call only
async-signal-safe C functions, or do nothing at all) have continued to
work.

However (as I believe) in Perl 5.17, the implementation of the signal
handler in the perl core has been altered to _always_ call malloc (via
newSVsv).

In other words, starting from 5.17, it has become _impossible_ to
write an unsafe signal handler without fear of deadlock. And that is
what I see as a regression.

2015-10-29 7​:34 GMT+09​:00 Tony Cook via RT <perlbug-followup@​perl.org>​:

On Wed Oct 28 03​:44​:27 2015, kazuhooku@​gmail.com wrote​:

Unsafe signal handler calls malloc (via newSVsv), therefore causes
deadlock when signal is received while malloc is running.

When I run the following script, and repeatedly send SIGUSR1 to the
script (e.g. while 1 ; do kill -USR1 proc-id ; sleep 1; done), then
the script enters a dead lock within the signal handler.

I don't think this is a bug - it's the reason safe signal handlers were introduced.

Tony

--
Kazuho Oku

@p5pRT
Copy link
Author

p5pRT commented Oct 29, 2015

From @Leont

On Thu, Oct 29, 2015 at 12​:50 AM, Kazuho Oku <kazuhooku@​gmail.com> wrote​:

I disagree. It is true that the use of unsafe signal is discouraged,
but people have written code relying on unsafe signal handlers and IMO
they should continue to work.

There is a very limited set of circumstances in which they can be relied
upon to work. It assumes the «*+ptr = foo» is atomic and requires the
value, mark and the save stacks all to be pre-allocated (not impossible,
not even unlikely, but still far from a guarantee), as well as a signal
handler doing just about nothing that may malloc.

Signal handlers in POSIX have this concept of an alternative stack (mainly
useful in case of a stack overflow), it may be useful here too.

My understanding is that the introduction of safe signals in Perl 5.8
was to protect people from writing corrupt signal handlers (causing
deadlocks or segfaults). But even after that, Perl has provided
freedom to use unsafe signal handlers at one's own risk, as stated in
perldoc perlipc, and cautiously-coded signal handlers (call only
async-signal-safe C functions, or do nothing at all) have continued to
work.

I have a hard time imagining async-signal-safe perl code, given that stack
manipulation isn't guaranteed to be (even if statistically the odds of a
stack reallocation are low).

However (as I believe) in Perl 5.17, the implementation of the signal
handler in the perl core has been altered to _always_ call malloc (via
newSVsv).

In other words, starting from 5.17, it has become _impossible_ to
write an unsafe signal handler without fear of deadlock. And that is
what I see as a regression.

I think you're referring to 100c03a. Possibly an alternative $@​ is the
solution here too.

Leon

@p5pRT
Copy link
Author

p5pRT commented Oct 30, 2015

From kazuhooku@gmail.com

Thank you for your response.

2015-10-29 18​:30 GMT+09​:00 Leon Timmermans via RT <perlbug-followup@​perl.org>​:

On Thu, Oct 29, 2015 at 12​:50 AM, Kazuho Oku <kazuhooku@​gmail.com> wrote​:

I disagree. It is true that the use of unsafe signal is discouraged,
but people have written code relying on unsafe signal handlers and IMO
they should continue to work.

There is a very limited set of circumstances in which they can be relied
upon to work. It assumes the «*+ptr = foo» is atomic and requires the
value, mark and the save stacks all to be pre-allocated (not impossible,
not even unlikely, but still far from a guarantee), as well as a signal
handler doing just about nothing that may malloc.

Thank you for the clarification.

Signal handlers in POSIX have this concept of an alternative stack (mainly
useful in case of a stack overflow), it may be useful here too.

Agreed.

My understanding is that the introduction of safe signals in Perl 5.8
was to protect people from writing corrupt signal handlers (causing
deadlocks or segfaults). But even after that, Perl has provided
freedom to use unsafe signal handlers at one's own risk, as stated in
perldoc perlipc, and cautiously-coded signal handlers (call only
async-signal-safe C functions, or do nothing at all) have continued to
work.

I have a hard time imagining async-signal-safe perl code, given that stack
manipulation isn't guaranteed to be (even if statistically the odds of a
stack reallocation are low).

For example, you can call POSIX​::write in the unsafe signal handler to
wake up `select` reliably. The code will look like​:

  pipe(my $rfh, $wfh);
  my $wfd = fileno $wfh;
  POSIX​::sigaction(SIGTERM, POSIX​::SigAction->new(sub {
  POSIX​::write $wfd, "1", 1;
  }));

  my $rbits = '';
  vec($rbits, fileno $rfh, 1) = 1;
  ...
  select($rbits, ...);

This is essentially same as using signaled provided by Linux, but is
more portable.

However (as I believe) in Perl 5.17, the implementation of the signal
handler in the perl core has been altered to _always_ call malloc (via
newSVsv).

In other words, starting from 5.17, it has become _impossible_ to
write an unsafe signal handler without fear of deadlock. And that is
what I see as a regression.

I think you're referring to 100c03a. Possibly an alternative $@​ is the
solution here too.

Agreed.

And the fact is that the odds are far more high for malloc to deadlock
(than to crash due to other race conditions such as stack relocation).

Leon

--
Kazuho Oku

@p5pRT
Copy link
Author

p5pRT commented Nov 4, 2015

From @Leont

On Fri, Oct 30, 2015 at 3​:26 AM, Kazuho Oku <kazuhooku@​gmail.com> wrote​:

My understanding is that the introduction of safe signals in Perl 5.8
was to protect people from writing corrupt signal handlers (causing
deadlocks or segfaults). But even after that, Perl has provided
freedom to use unsafe signal handlers at one's own risk, as stated in
perldoc perlipc, and cautiously-coded signal handlers (call only
async-signal-safe C functions, or do nothing at all) have continued to
work.

I have a hard time imagining async-signal-safe perl code, given that
stack
manipulation isn't guaranteed to be (even if statistically the odds of a
stack reallocation are low).

For example, you can call POSIX​::write in the unsafe signal handler to
wake up `select` reliably. The code will look like​:

pipe\(my $rfh\, $wfh\);
my $wfd = fileno $wfh;
POSIX&#8203;::sigaction\(SIGTERM\, POSIX&#8203;::SigAction\->new\(sub \{
    POSIX&#8203;::write $wfd\, "1"\, 1;
\}\)\);

my $rbits = '';
vec\($rbits\, fileno $rfh\, 1\) = 1;
\.\.\.
select\($rbits\, \.\.\.\);

This is essentially same as using signaled provided by Linux, but is
more portable.

A small amount of XS would be much more reliable IMO. You may want to check
out my Signal-Pipe distribution at https://github.com/Leont/signal-pipe
(not on CPAN yet for lack of documentation and tests, patches welcome).

Leon

@p5pRT
Copy link
Author

p5pRT commented Nov 9, 2015

From @tonycoz

On Thu Oct 29 19​:27​:04 2015, kazuhooku@​gmail.com wrote​:

For example, you can call POSIX​::write in the unsafe signal handler to
wake up `select` reliably. The code will look like​:

pipe(my $rfh, $wfh);
my $wfd = fileno $wfh;
POSIX​::sigaction(SIGTERM, POSIX​::SigAction->new(sub {
POSIX​::write $wfd, "1", 1;

POSIX​::write() calls sv_newmortal() for its return value, so there's a small change of a call to malloc() there.

}));

my $rbits = '';
vec($rbits, fileno $rfh, 1) = 1;
...
select($rbits, ...);

This is essentially same as using signaled provided by Linux, but is
more portable.

However (as I believe) in Perl 5.17, the implementation of the
signal
handler in the perl core has been altered to _always_ call malloc
(via
newSVsv).

In other words, starting from 5.17, it has become _impossible_ to
write an unsafe signal handler without fear of deadlock. And that
is
what I see as a regression.

I think you're referring to 100c03a. Possibly an alternative $@​ is
the
solution here too.

Agreed.

Possibly $@​ should be set to a sv_newmortal() instead of a copy of its current value.

Tony

@p5pRT
Copy link
Author

p5pRT commented Jul 10, 2019

From @tonycoz

On Sun, 08 Nov 2015 19​:43​:06 -0800, tonyc wrote​:

On Thu Oct 29 19​:27​:04 2015, kazuhooku@​gmail.com wrote​:

For example, you can call POSIX​::write in the unsafe signal handler
to
wake up `select` reliably. The code will look like​:

pipe(my $rfh, $wfh);
my $wfd = fileno $wfh;
POSIX​::sigaction(SIGTERM, POSIX​::SigAction->new(sub {
POSIX​::write $wfd, "1", 1;

POSIX​::write() calls sv_newmortal() for its return value, so there's a
small change of a call to malloc() there.

Both for allocating the SV itself and for expanding the tmps stack.

}));

my $rbits = '';
vec($rbits, fileno $rfh, 1) = 1;
...
select($rbits, ...);

This is essentially same as using signaled provided by Linux, but is
more portable.

However (as I believe) in Perl 5.17, the implementation of the
signal
handler in the perl core has been altered to _always_ call malloc
(via
newSVsv).

In other words, starting from 5.17, it has become _impossible_ to
write an unsafe signal handler without fear of deadlock. And that
is
what I see as a regression.

I think you're referring to 100c03a. Possibly an alternative $@​ is
the
solution here too.

Agreed.

Possibly $@​ should be set to a sv_newmortal() instead of a copy of its
current value.

That doesn't help, the CLEAR_ERRSV() in Perl_call_sv() will still allocate a new PV (via SvPVCLEAR() which is a wrapper around Perl_sv_setpv_bufsize()).

Several other macros can end up allocating memory too, depending on whether the stack involved has enough space or not.

We can't use G_KEEPERR to skip the CLEAR_ERRSV() since the signal handler uses the value of ERRSV to check if the sub died.

The only way I can see to make it work would be to pre-create a SV that we keep in an interpreter global and substitute that into GvSV(PL_errgv) when we call_sv() the signal handler.

This doesn't prevent all of the other potential allocations.

Tony

Tony

@p5pRT
Copy link
Author

p5pRT commented Jul 10, 2019

From @demerphq

On Wed, 10 Jul 2019 at 03​:48, Tony Cook via RT
<perlbug-followup@​perl.org> wrote​:

On Sun, 08 Nov 2015 19​:43​:06 -0800, tonyc wrote​:

On Thu Oct 29 19​:27​:04 2015, kazuhooku@​gmail.com wrote​:

For example, you can call POSIX​::write in the unsafe signal handler
to
wake up `select` reliably. The code will look like​:

pipe(my $rfh, $wfh);
my $wfd = fileno $wfh;
POSIX​::sigaction(SIGTERM, POSIX​::SigAction->new(sub {
POSIX​::write $wfd, "1", 1;

POSIX​::write() calls sv_newmortal() for its return value, so there's a
small change of a call to malloc() there.

Both for allocating the SV itself and for expanding the tmps stack.

}));

my $rbits = '';
vec($rbits, fileno $rfh, 1) = 1;
...
select($rbits, ...);

This is essentially same as using signaled provided by Linux, but is
more portable.

However (as I believe) in Perl 5.17, the implementation of the
signal
handler in the perl core has been altered to _always_ call malloc
(via
newSVsv).

In other words, starting from 5.17, it has become _impossible_ to
write an unsafe signal handler without fear of deadlock. And that
is
what I see as a regression.

I think you're referring to 100c03a. Possibly an alternative $@​ is
the
solution here too.

Agreed.

Possibly $@​ should be set to a sv_newmortal() instead of a copy of its
current value.

That doesn't help, the CLEAR_ERRSV() in Perl_call_sv() will still allocate a new PV (via SvPVCLEAR() which is a wrapper around Perl_sv_setpv_bufsize()).

Several other macros can end up allocating memory too, depending on whether the stack involved has enough space or not.

We can't use G_KEEPERR to skip the CLEAR_ERRSV() since the signal handler uses the value of ERRSV to check if the sub died.

The only way I can see to make it work would be to pre-create a SV that we keep in an interpreter global and substitute that into GvSV(PL_errgv) when we call_sv() the signal handler.

Surely this is a reasonable thing to do? Do we need one per thread for
threaded builds?

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"

@p5pRT
Copy link
Author

p5pRT commented Jul 22, 2019

From @tonycoz

On Wed, 10 Jul 2019 03​:38​:53 -0700, demerphq wrote​:

On Wed, 10 Jul 2019 at 03​:48, Tony Cook via RT
<perlbug-followup@​perl.org> wrote​:

On Sun, 08 Nov 2015 19​:43​:06 -0800, tonyc wrote​:

On Thu Oct 29 19​:27​:04 2015, kazuhooku@​gmail.com wrote​:

For example, you can call POSIX​::write in the unsafe signal
handler
to
wake up `select` reliably. The code will look like​:

pipe(my $rfh, $wfh);
my $wfd = fileno $wfh;
POSIX​::sigaction(SIGTERM, POSIX​::SigAction->new(sub {
POSIX​::write $wfd, "1", 1;

POSIX​::write() calls sv_newmortal() for its return value, so
there's a
small change of a call to malloc() there.

Both for allocating the SV itself and for expanding the tmps stack.

}));

my $rbits = '';
vec($rbits, fileno $rfh, 1) = 1;
...
select($rbits, ...);

This is essentially same as using signaled provided by Linux, but
is
more portable.

However (as I believe) in Perl 5.17, the implementation of the
signal
handler in the perl core has been altered to _always_ call
malloc
(via
newSVsv).

In other words, starting from 5.17, it has become _impossible_
to
write an unsafe signal handler without fear of deadlock. And
that
is
what I see as a regression.

I think you're referring to 100c03a. Possibly an alternative
$@​ is
the
solution here too.

Agreed.

Possibly $@​ should be set to a sv_newmortal() instead of a copy of
its
current value.

That doesn't help, the CLEAR_ERRSV() in Perl_call_sv() will still
allocate a new PV (via SvPVCLEAR() which is a wrapper around
Perl_sv_setpv_bufsize()).

Several other macros can end up allocating memory too, depending on
whether the stack involved has enough space or not.

We can't use G_KEEPERR to skip the CLEAR_ERRSV() since the signal
handler uses the value of ERRSV to check if the sub died.

The only way I can see to make it work would be to pre-create a SV
that we keep in an interpreter global and substitute that into
GvSV(PL_errgv) when we call_sv() the signal handler.

Surely this is a reasonable thing to do? Do we need one per thread for
threaded builds?

Well, it would fix one point on non-safety, for it to actually be safe we'd also need to switch in at least​:

- a pool of SVs to allocate from
- preallocated temp, save and argument stacks

since all of these are used either by the signal -> perl sub delivery code, or by the C<< POSIX​::write $wfd, "1", 1; >> in the suggested signal handler​:

- POSIX​::write() allocates a SV and makes it mortal for its return value (temp stack)
- the call to the signal handler sub and to POSIX​::write both use the argument stack
- the signal handling code uses the save stack
- the signal handling code pushes a new argument stack (possibly using an existing one, but creating one if needed). This code may race if the non-signal handler code is also pushing a stack

Tony

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants