Perl complains and dies on receiving a signal #13111

p5pRT opened this issue Jul 18, 2013 · 3 comments

p5pRT commented Jul 18, 2013

Migrated from rt.perl.org#118929 (status was 'open')

Searchable as RT118929$

p5pRT commented Jul 18, 2013

From sab123@hotmail.com

Created by sab123@hotmail.com

This is a bug report for perl from sab123@hotmail.com,
generated with the help of perlbug 1.39 running under perl 5.19.0.

-----------------------------------------------------------------

I have an application (the Triceps module, available on CPAN)
that uses SIGUSR2 to communicate between threads: the signal
asks a thread to interrupt its current system call so that the
thread can detect that it needs to exit. This worked fine with
Perl 5.10, but with 5.19 I see the following error, and once in
a while Perl also dumps core:

Signal SIGUSR2 received, but no signal handler set.

An example of a stack from the core file:

(gdb) bt
#0 0x000000000048e16d in Perl_csighandler ()
#1 <signal handler called>
#2 0x00000038c500e01b in accept () from /lib64/libpthread.so.0
#3 0x00000000004f2305 in Perl_pp_accept ()
#4 0x00000000004a41d6 in Perl_runops_standard ()
#5 0x00000000004342bc in Perl_call_sv ()
#6 0x00007fe4f062a521 in S_ithread_run () from /home/babkin/perl519/lib/5.19.0/x86_64-linux-thread-multi/auto/threads/threads.so
#7 0x00000038c500686a in start_thread () from /lib64/libpthread.so.0
#8 0x00000038c44de3bd in clone () from /lib64/libc.so.6
#9 0x0000000000000000 in ?? ()

(The stack varies between runs.) As far as I can tell, the
core dumps happen because Perl_sighandler() calls exit() rather
than _exit() after printing the error message, so the other
threads run into half-destroyed state while exit() performs its
cleanup.

Whether the message and the core dump appear depends on timing
and varies between runs.

There are multiple threads that get sent this signal at about the
same time. The signal is sent from the XS code, so it's a real
signal, not the pseudo-signal of the Perl threads (because the
whole point of this signal is to interrupt the system call).

The signal handler is set at the start of the program:

$SIG{USR2} = sub {};

There is no race with thread creation: killing is enabled
only after the thread's Perl code is running. For debugging,
I added a print of $SIG{USR2} before enabling the killing
for each thread, and in every case it prints as a valid code reference.

An example of a log from such a run:

$ PERL_TEST_DIFF='diff -u ' ~/perl519/bin/perl t/xTqlMt.t
1..7
# Running under perl version 5.019000 for linux
# Current time local: Thu Jul 18 00:09:42 2013
# Current time GMT: Thu Jul 18 04:09:42 2013
# Using Test.pm version 1.26
ok 1
XXX starting thread global SIGUSR2 CODE(0x291f7e8)
XXX starting thread collector SIGUSR2 CODE(0x30f5ab8)
XXX starting thread tqlListener SIGUSR2 CODE(0x7f0d0c167890)
XXX starting thread send_c1 SIGUSR2 CODE(0x33c6148)
XXX starting thread tql1 SIGUSR2 CODE(0x7f0cfc15dd80)
XXX starting thread recv_c1 SIGUSR2 CODE(0x37bbe18)
XXX starting thread tql1.rd SIGUSR2 CODE(0x7f0cf415d1b0)
XXX killed thread global
XXX killed thread tql1
ok 2
XXX killed thread collector
XXX killed thread recv_c1
XXX killed thread send_c1
XXX killed thread tql1.rd
XXX killed thread tqlListener
XXX starting thread global SIGUSR2 CODE(0x3124018)
XXX starting thread collector SIGUSR2 CODE(0x3b988b8)
XXX starting thread tqlListener SIGUSR2 CODE(0x7f0d0415cc80)
XXX starting thread tql1 SIGUSR2 CODE(0x7f0ce81531a0)
XXX starting thread recv_c1 SIGUSR2 CODE(0x2b3a9f8)
XXX starting thread tql1.rd SIGUSR2 CODE(0x7f0d0c1518d0)
XXX starting thread send_c1 SIGUSR2 CODE(0x3b3f4c8)
XXX killed thread global
XXX killed thread tql1
ok 3
XXX killed thread collector
XXX killed thread recv_c1
XXX killed thread send_c1
XXX killed thread tql1.rd
XXX killed thread tqlListener
Signal SIGUSR2 received, but no signal handler set.
Segmentation fault (core dumped)

It looks like something goes wrong with initializing the handler table
PL_psig_ptr, leaving it empty, but I'm not sure how that can happen. My
guess is that the signal arrives while the thread is already exiting on
its own (because another thread that feeds it has exited, so the
dependent threads start exiting too), and by that time Perl may have already
cleaned up PL_psig_ptr. In that case, perhaps the correct reaction would be to
return silently from Perl_sighandler() instead of printing an error
message and exiting.

Perl Info

Flags:
    category=core
    severity=high

Site configuration information for perl 5.19.0:

Configured by babkin at Tue Jun 18 21:38:41 EDT 2013.

Summary of my perl5 (revision 5 version 19 subversion 0) configuration:
   
  Platform:
    osname=linux, osvers=2.6.30.10-105.2.23.fc11.x86_64, archname=x86_64-linux-thread-multi
    uname='linux babkin.myhomedomain 2.6.30.10-105.2.23.fc11.x86_64 #1 smp thu feb 11 07:06:34 utc 2010 x86_64 x86_64 x86_64 gnulinux '
    config_args='-Dprefix=/home/babkin/perl519'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.4.1 20090725 (Red Hat 4.4.1-2)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib/../lib64 /usr/lib/../lib64 /lib /usr/lib /lib64 /usr/lib64 /usr/local/lib64
    libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=/lib/libc-2.10.1.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.10.1'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'

Locally applied patches:
    


@INC for perl 5.19.0:
    /home/babkin/perl519/lib/site_perl/5.19.0/x86_64-linux-thread-multi
    /home/babkin/perl519/lib/site_perl/5.19.0
    /home/babkin/perl519/lib/5.19.0/x86_64-linux-thread-multi
    /home/babkin/perl519/lib/5.19.0
    .


Environment for perl 5.19.0:
    HOME=/home/babkin
    LANG=en_US.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/java/bin:/home/babkin/bin:/usr/java/bin:/home/babkin/bin:/usr/lib64/qt-3.3/bin:/usr/kerberos/bin:/usr/lib64/ccache:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/sbin:/usr/sbin:/home/babkin/pile/photo/bin:/opt/aleri/devtools/i686/apache-ant/current/bin:/sbin:/usr/sbin:/home/babkin/pile/photo/bin:/opt/aleri/devtools/i686/apache-ant/current/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash
 		 	   		  

p5pRT commented Jan 5, 2017

From @jkeenan

On Thu, 18 Jul 2013 04:28:05 GMT, sab123@hotmail.com wrote:

[snip]

Several questions:

1. Do you still experience this problem in the environment where you originally saw it?

2. Does the problem persist with more recent versions of Perl (5.20, 5.22, 5.24)?

3. Would you be able to supply a small program, independent of Triceps, that reproduces the problem?

Thank you very much.

--
James E Keenan (jkeenan@cpan.org)

p5pRT commented Jan 5, 2017

The RT System itself - Status changed from 'new' to 'open'
