-
Notifications
You must be signed in to change notification settings - Fork 571
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
close and dup hang in threads #11805
Comments
From patrik.h.hagglund@ericsson.comCreated by patrik.h.hagglund@ericsson.comThis is a bug report for perl from patrik.h.hagglund@ericsson.com, ----------------------------------------------------------------- Filehandle duplication 'open(STDOUT, '>&', $tmph)', and 'close($tmph)' sometimes I don't get this hang on a single-core machine. Only with 2 or more cores. This When line 31-39 below is replaced with # test close => hangs sometimes I also see hangs on 'close', but not on 'POSIX::close'. /Patrik Hägglund use v5.8.9; use strict; use File::Temp 'tempfile', 'unlink0'; # for debug, pass trival programs that can be spotted using 'watch pstree' # execute a test command and check the output my ($tmph, $tmpname) = tempfile(); my $pid = fork(); if ($pid == 0) { # test fork/exec => no hangs # test dup => hangs sometimes # test dup => no? ### excecute (here a dummy cmd) # in parent waitpid($pid, 0); ### check output } # run a test in a number of variants # wait for thread completion # run all tests Perl Info
|
From @cpansproutOn Wed Dec 14 07:35:12 2011, patrik.h.hagglund@ericsson.com wrote:
I can reproduce the hang, but when I run it under gdb it never gets that Program received signal EXC_SOFTWARE, Software generated exception. I have no idea what that means. The backtrace is: (gdb) bt -- Father Chrysostomos |
The RT System itself - Status changed from 'new' to 'open' |
From @cpansproutOn Mon Jan 16 08:49:12 2012, sprout wrote:
And it seems that this script has *really* confused things, because when (gdb) q warning: error on line 2179 of -- Father Chrysostomos |
From patrik.h.hagglund@ericsson.comI have now tried to fix this myself. See patch and updated test-case However, I'm not sure if I fulfill the requriement in atfork_lock(): /* locks must be held in locking order (if any) */ (Also, I find the comment in atfork_unlock strange, saying that the same /Patrik Hägglund From fb0fe87ffa55b8121f0610b44a9c87fef519a2f0 Mon Sep 17 00:00:00 2001 This is a multi-part message in MIME format. Using threads + fork() on Linux, and IO operations in the threads, the Therefore, ensure that the PL_perlio_mutex is unlocked in the child util.c | 2 ++ --------------1.8.1.3 Inline Patchdiff --git a/util.c b/util.c
index a3fbd3c..4c25a44 100644
--- a/util.c
+++ b/util.c
@@ -2798,6 +2798,7 @@ Perl_atfork_lock(void)
dVAR;
#if defined(USE_ITHREADS)
/* locks must be held in locking order (if any) */
+ MUTEX_LOCK(&PL_perlio_mutex);
# ifdef MYMALLOC
MUTEX_LOCK(&PL_malloc_mutex);
# endif
@@ -2812,6 +2813,7 @@ Perl_atfork_unlock(void)
dVAR;
#if defined(USE_ITHREADS)
/* locks must be released in same order as in atfork_lock() */
+ MUTEX_UNLOCK(&PL_perlio_mutex);
# ifdef MYMALLOC
MUTEX_UNLOCK(&PL_malloc_mutex);
# endif
--------------1.8.1.3--
# Skeleton version of a test driver program utilizing Perl threads. use threads; # execute a "test" # run a number of "tests" sequentially # wait for thread completion # run all |
From patrik.h.hagglund@ericsson.com0001-Add-PL_perlio_mutex-to-atfork_lock-unlock-to-fix-bug.patchFrom fb0fe87ffa55b8121f0610b44a9c87fef519a2f0 Mon Sep 17 00:00:00 2001
From: Patrik Hagglund <patrik.h.hagglund@ericsson.com>
Date: Sat, 2 Feb 2013 20:21:05 +0100
Subject: [PATCH] Add PL_perlio_mutex to atfork_lock/unlock, to fix bug
#106212.
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------1.8.1.3"
This is a multi-part message in MIME format.
--------------1.8.1.3
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit
Using threads + fork() on Linux, and IO operations in the threads, the
PL_perlio_mutex may be left in a locked state at the call of fork(),
potentially leading to deadlock in the child process at subsequent IO
operations. (Threads are pre-empted and not continued in the child
process after the fork.)
Therefore, ensure that the PL_perlio_mutex is unlocked in the child
process, right after fork(), by using atfork_lock/unlock.
---
util.c | 2 ++
1 file changed, 2 insertions(+)
--------------1.8.1.3
Content-Type: text/x-patch; name="0001-Add-PL_perlio_mutex-to-atfork_lock-unlock-to-fix-bug.patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment; filename="0001-Add-PL_perlio_mutex-to-atfork_lock-unlock-to-fix-bug.patch"
diff --git a/util.c b/util.c
index a3fbd3c..4c25a44 100644
--- a/util.c
+++ b/util.c
@@ -2798,6 +2798,7 @@ Perl_atfork_lock(void)
dVAR;
#if defined(USE_ITHREADS)
/* locks must be held in locking order (if any) */
+ MUTEX_LOCK(&PL_perlio_mutex);
# ifdef MYMALLOC
MUTEX_LOCK(&PL_malloc_mutex);
# endif
@@ -2812,6 +2813,7 @@ Perl_atfork_unlock(void)
dVAR;
#if defined(USE_ITHREADS)
/* locks must be released in same order as in atfork_lock() */
+ MUTEX_UNLOCK(&PL_perlio_mutex);
# ifdef MYMALLOC
MUTEX_UNLOCK(&PL_malloc_mutex);
# endif
--------------1.8.1.3--
|
From [Unknown Contact. See original ticket]I have now tried to fix this myself. See patch and updated test-case However, I'm not sure if I fulfill the requriement in atfork_lock(): /* locks must be held in locking order (if any) */ (Also, I find the comment in atfork_unlock strange, saying that the same /Patrik Hägglund From fb0fe87ffa55b8121f0610b44a9c87fef519a2f0 Mon Sep 17 00:00:00 2001 This is a multi-part message in MIME format. Using threads + fork() on Linux, and IO operations in the threads, the Therefore, ensure that the PL_perlio_mutex is unlocked in the child util.c | 2 ++ --------------1.8.1.3 Inline Patchdiff --git a/util.c b/util.c
index a3fbd3c..4c25a44 100644
--- a/util.c
+++ b/util.c
@@ -2798,6 +2798,7 @@ Perl_atfork_lock(void)
dVAR;
#if defined(USE_ITHREADS)
/* locks must be held in locking order (if any) */
+ MUTEX_LOCK(&PL_perlio_mutex);
# ifdef MYMALLOC
MUTEX_LOCK(&PL_malloc_mutex);
# endif
@@ -2812,6 +2813,7 @@ Perl_atfork_unlock(void)
dVAR;
#if defined(USE_ITHREADS)
/* locks must be released in same order as in atfork_lock() */
+ MUTEX_UNLOCK(&PL_perlio_mutex);
# ifdef MYMALLOC
MUTEX_UNLOCK(&PL_malloc_mutex);
# endif
--------------1.8.1.3--
# Skeleton version of a test driver program utilizing Perl threads. use threads; # execute a "test" # run a number of "tests" sequentially # wait for thread completion # run all |
From @LeontOn Mon, Feb 11, 2013 at 4:56 PM, Patrik Hagglund via RT
The fix makes perfect sense to me, this class of issue is exactly why
Me neither, it would require a deeper look at the code. What is
I would find reverse more intuitive too, but I don't think the unlock Leon |
From @LeontOn Mon, Feb 11, 2013 at 4:56 PM, Patrik Hagglund via RT
One small issue with the patch is that it wouldn't compile on a Leon |
From patrik.h.hagglund@ericsson.com
I took a quick look, and couldn't find any interdependencies between the three mutex locks in atfork_lock(). However, that may be due to my lack of understanding of the perl internals. Here are updated patches. The first one now just adds #ifdef USE_PERLIO, and the second one alters the order of locks (potentially dangerous!), and updates the comments. /Patrik Hägglund -----Original Message----- On Mon, Feb 11, 2013 at 4:56 PM, Patrik Hagglund via RT <perlbug-comment@perl.org> wrote:
The fix makes perfect sense to me, this class of issue is exactly why at_fork was invented.
Me neither, it would require a deeper look at the code. What is important is that *any* code that uses at least two of those three locks (memory, io, ops) does it in the same order (not just in core but also on cpan). As long as that invariant is held, the specific order doesn't matter.
I would find reverse more intuitive too, but I don't think the unlock order semantically matters much since it will never block, it's the lock order that does. Leon |
From patrik.h.hagglund@ericsson.com0001-Add-PL_perlio_mutex-to-atfork_lock-unlock-to-fix-bug.patchFrom 57ad65b03bdd5ab163534b39a6cbd22fc6e7d74c Mon Sep 17 00:00:00 2001
From: Patrik Hagglund <patrik.h.hagglund@ericsson.com>
Date: Sat, 2 Feb 2013 20:21:05 +0100
Subject: [PATCH 1/2] Add PL_perlio_mutex to atfork_lock/unlock, to fix bug
#106212.
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------1.8.1.3"
This is a multi-part message in MIME format.
--------------1.8.1.3
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit
Using threads + fork() on Linux, and IO operations in the threads, the
PL_perlio_mutex may be left in a locked state at the call of fork(),
potentially leading to deadlock in the child process at subsequent IO
operations. (Threads are pre-empted and not continued in the child
process after the fork.)
Therefore, ensure that the PL_perlio_mutex is unlocked in the child
process, right after fork(), by using atfork_lock/unlock.
---
util.c | 6 ++++++
1 file changed, 6 insertions(+)
--------------1.8.1.3
Content-Type: text/x-patch; name="0001-Add-PL_perlio_mutex-to-atfork_lock-unlock-to-fix-bug.patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment; filename="0001-Add-PL_perlio_mutex-to-atfork_lock-unlock-to-fix-bug.patch"
diff --git a/util.c b/util.c
index a3fbd3c..bb35d97 100644
--- a/util.c
+++ b/util.c
@@ -2798,6 +2798,9 @@ Perl_atfork_lock(void)
dVAR;
#if defined(USE_ITHREADS)
/* locks must be held in locking order (if any) */
+# ifdef USE_PERLIO
+ MUTEX_LOCK(&PL_perlio_mutex);
+# endif
# ifdef MYMALLOC
MUTEX_LOCK(&PL_malloc_mutex);
# endif
@@ -2812,6 +2815,9 @@ Perl_atfork_unlock(void)
dVAR;
#if defined(USE_ITHREADS)
/* locks must be released in same order as in atfork_lock() */
+# ifdef USE_PERLIO
+ MUTEX_UNLOCK(&PL_perlio_mutex);
+# endif
# ifdef MYMALLOC
MUTEX_UNLOCK(&PL_malloc_mutex);
# endif
--------------1.8.1.3--
|
From patrik.h.hagglund@ericsson.com0002-Update-ordering-of-locks-in-atfork_lock-unlock-and-c.patchFrom fcac73e4e40e5c4770628dda6512155ba7e36716 Mon Sep 17 00:00:00 2001
From: Patrik Hagglund <patrik.h.hagglund@ericsson.com>
Date: Tue, 12 Feb 2013 11:18:53 +0100
Subject: [PATCH 2/2] Update ordering of locks in atfork_lock/unlock, and
corresponding commments.
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------1.8.1.3"
This is a multi-part message in MIME format.
--------------1.8.1.3
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit
---
util.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)
--------------1.8.1.3
Content-Type: text/x-patch; name="0002-Update-ordering-of-locks-in-atfork_lock-unlock-and-c.patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment; filename="0002-Update-ordering-of-locks-in-atfork_lock-unlock-and-c.patch"
diff --git a/util.c b/util.c
index bb35d97..47c5e2d 100644
--- a/util.c
+++ b/util.c
@@ -2797,14 +2797,17 @@ Perl_atfork_lock(void)
{
dVAR;
#if defined(USE_ITHREADS)
- /* locks must be held in locking order (if any) */
+ /* Locks must be held in locking order (if any), to avoid
+ deadlock. (It is not clear to me if there are any lock
+ interdependencies. I can't find any. Anyway, use an approximate
+ high-level to low-level ordering.) */
+ OP_REFCNT_LOCK;
# ifdef USE_PERLIO
MUTEX_LOCK(&PL_perlio_mutex);
# endif
# ifdef MYMALLOC
MUTEX_LOCK(&PL_malloc_mutex);
# endif
- OP_REFCNT_LOCK;
#endif
}
@@ -2814,13 +2817,15 @@ Perl_atfork_unlock(void)
{
dVAR;
#if defined(USE_ITHREADS)
- /* locks must be released in same order as in atfork_lock() */
-# ifdef USE_PERLIO
- MUTEX_UNLOCK(&PL_perlio_mutex);
-# endif
+ /* As long as locking is absent, ordering is not needed to avoid
+ deadlock. However, for clarity, use the reverse order of
+ atfork_lock() */
# ifdef MYMALLOC
MUTEX_UNLOCK(&PL_malloc_mutex);
# endif
+# ifdef USE_PERLIO
+ MUTEX_UNLOCK(&PL_perlio_mutex);
+# endif
OP_REFCNT_UNLOCK;
#endif
}
--------------1.8.1.3--
|
From @LeontOn Tue, Feb 12, 2013 at 8:36 AM, Leon Timmermans <fawaka@gmail.com> wrote:
To explain what's going on: When a fork is performed in a multi-threaded application, only the One solution to this is to lock all mutexes right before the lock and Do note in particular that any shared variable may still show this The sane solution is to just never combine fork and threads, unless Leon |
From patrik.h.hagglund@ericsson.comFYI: There is a discussion in POSIX, pthread_atfork, section RATIONALE:
I originally used the following code, copied from perlfaq8, to capture a program's STDOUT but discard its STDERR, in a thread: use IPC::Open3;
I'm not sure I follow you completely. However, this is what the RATIONALE says:
/Patrik Hägglund -----Original Message----- On Tue, Feb 12, 2013 at 8:36 AM, Leon Timmermans <fawaka@gmail.com> wrote:
To explain what's going on: When a fork is performed in a multi-threaded application, only the calling thread is copied into the new one. So when one thread holds a mutex and another forks, the mutex is never unlocked in the new thread. One solution to this is to lock all mutexes right before the lock and unlock them right after. This way you can be sure no other thread holds those mutexes. Interestingly this means sacrificing fork's signal safety for thread safety, as useful fork handlers are pretty much never signal safe. Do note in particular that any shared variable may still show this issue, unless guarded by POSIX::AtFork. Also note you can't blindly lock all shared variables, as this may produce deadlocks. It's not impossible to do this correctly, but you're likely to get it wrong. The sane solution is to just never combine fork and threads, unless maybe when followed immediately by an exec. Anything in between the two should be handled mostly as if you're writing an (unsafe) signal handler, unless guarded by an at_fork handler (which malloc fortunately is on any sane platform). And even then there are some signaling complications you should take into account (for example, the SIGCHLD may end up being handled by another thread). There's a reason POSIX considers system() not to be thread-safe. Leon |
From @LeontOn Tue, Feb 12, 2013 at 1:38 PM, Patrik Hägglund H
Sounds like the kind of problem you just reported ;-)
That's two different things. The signal safety issue is this: fork used to be async-signal-safe (it The second thing boils down to this: 1) Mutexes are not multi-threaded fork safe, unless properly secured 1 + 2 + 3 → So async-signal safe is also multi-threaded-fork safe, Leon |
From @druud62On 2013-02-12 08:36, Leon Timmermans wrote:
So we should all do them in (for example) alphabetical order, -- |
From @LeontOn Wed, Feb 13, 2013 at 8:46 PM, Dr.Ruud <rvtol+usenet@isolution.nl> wrote:
To make matters more complex, there is also: * PL_hints_mutex /* Mutex for refcounted he refcounting */ We don't have three but seven mutexes to take into account. Doing this Leon |
From patrik.h.hagglund@ericsson.com(I have been busy with other duties, and is now returning to this.)
Fine. So is my 0001 patch attached to the ticket [1] good enough? If yes, can you apply it? If no, what is the next step to get this fixed? /Patrik Hägglund -----Original Message----- On Wed, Feb 13, 2013 at 8:46 PM, Dr.Ruud <rvtol+usenet@isolution.nl> wrote:
To make matters more complex, there is also: * PL_hints_mutex /* Mutex for refcounted he refcounting */ We don't have three but seven mutexes to take into account. Doing this Leon |
From @khwilliamsonThanks, applied as commit which made it into 5.17.10, and hence 5.18 |
@khwilliamson - Status changed from 'open' to 'resolved' |
Migrated from rt.perl.org#106212 (status was 'resolved')
Searchable as RT106212$
The text was updated successfully, but these errors were encountered: