New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
win32_msgwait runs forever with non-infinite timeout #7714
Comments
From chris@fatorange.comCreated by chris@fatorange.comThis is a bug report for perl from chris@fatorange.com, ----------------------------------------------------------------- Perl Info
|
From @bulk88On Fri Dec 17 19:11:29 2004, cbecker wrote:
I was able to create a deadlock with a C debugger by doing a Perl "sleep I have a patch nearly complete to prevent the race condition above, but 1. Keep GetTickCount and put the 49.7 days of sleep actual seconds slept 2. Replace GetTickCount with QueryPerformanceCounter 3. Replace GetTickCount with GetSystemTimeAsFileTime 4. If sleep delta > (49.7 days to seconds/2) assume we overflowed and The Perl community's input is needed and appreciated. |
The RT System itself - Status changed from 'new' to 'open' |
From @bulk88On Mon Aug 20 23:21:46 2012, bulk88 wrote:
5. If timeout is > 50% of 2^32 ms or 90% or 99% of 2^32 ms, croak as |
From @cpansproutOn Mon Aug 20 23:21:46 2012, bulk88 wrote:
PL_check is process-global. I don’t fully understand how the #ifdef PERL_GLOBAL_STRUCT and opcode.h: #ifdef PERL_GLOBAL_STRUCT_INIT -- Father Chrysostomos |
From [Unknown Contact. See original ticket]On Mon Aug 20 23:21:46 2012, bulk88 wrote:
PL_check is process-global. I don’t fully understand how the #ifdef PERL_GLOBAL_STRUCT and opcode.h: #ifdef PERL_GLOBAL_STRUCT_INIT -- Father Chrysostomos |
From @bulk88On Mon Aug 20 23:30:05 2012, bulk88 wrote:
I wrote a C level hooker that hooked GetTickCount from perl517.dll to With Visual Studio debugger, something interesting happens. The deadlock
I'm not sure how well I will be able to diagnose this since my knowledge |
From @bulk88On Tue Sep 04 20:37:00 2012, bulk88 wrote:
I caught a call in the Perl process that registers (or
|
From @bulk88The problem with VS Debugger is it loads msctf.dll into the Perl process According to my disassembly of msctf.dll, function "OnForegroundChange" From the knowledge above, a hang from GetTickCount overflow I was able |
From @bulk88 |
From @bulk88On Fri Sep 07 16:00:03 2012, bulk88 wrote:
Egh, again I didn't forward my posts to P5P list from RT interface. I wrote a patch that solves the overflow and underflow issues I I'm not sure if this bug should remain open (more than 49 days of |
From @bulk880001-perl-33096-fix-over-underflow-issues-in-win32_msgwai.patchFrom 656effa89f7bb0559f15061ebba096665eb5a3a6 Mon Sep 17 00:00:00 2001
From: Daniel Dragan <bulk88@hotmail.com>
Date: Sat, 8 Sep 2012 03:03:24 -0400
Subject: [PATCH] [perl #33096] fix over/underflow issues in win32_msgwait
This commit does not completely fix 33096, since the real problem is that
VS IDE's debugger causes a Windows message to be sent to perl on each
breakpoint. Depending on where the breakpoint was set. The BP may make it
impossible to exit the loop because of the Visual Studio IDE Debugger's
design. Various overflow and underflow issues were fixed in win32_msgwait.
Specifically, the time count rolling forwards through zero
(GetSystemTimeAsFileTime), and the time count running ahead of the end
time (rolling backwards through zero) were fixed ("<=" check).
---
pod/perldelta.pod | 5 +++++
win32/win32.c | 45 ++++++++++++++++++++++++++++++++++-----------
2 files changed, 39 insertions(+), 11 deletions(-)
diff --git a/pod/perldelta.pod b/pod/perldelta.pod
index d9b2cc6..19009ff 100644
--- a/pod/perldelta.pod
+++ b/pod/perldelta.pod
@@ -349,6 +349,11 @@ files in F<ext/> and F<lib/> are best summarized in L</Modules and Pragmata>.
=item *
+On Windows, a rare race condition that would lead to L<sleep|perlfunc/sleep>
+taking more time than requested, and upto a hang has been fixed [perl #33096].
+
+=item *
+
XXX
=back
diff --git a/win32/win32.c b/win32/win32.c
index 78b55f6..93c2102 100644
--- a/win32/win32.c
+++ b/win32/win32.c
@@ -2154,13 +2154,32 @@ DllExport DWORD
win32_msgwait(pTHX_ DWORD count, LPHANDLE handles, DWORD timeout, LPDWORD resultp)
{
/* We may need several goes at this - so compute when we stop */
- DWORD ticks = 0;
+ unsigned __int64 ticks = 0;
+ unsigned __int64 endtime = timeout;
if (timeout != INFINITE) {
- ticks = GetTickCount();
- timeout += ticks;
- }
- while (1) {
- DWORD result = MsgWaitForMultipleObjects(count,handles,FALSE,timeout-ticks, QS_POSTMESSAGE|QS_TIMER|QS_SENDMESSAGE);
+ GetSystemTimeAsFileTime((LPFILETIME)&ticks);
+ ticks /= 10000;
+ endtime += ticks;
+ }
+ /* This was a race condition. Do not let a non INFINITE timeout to
+ * MsgWaitForMultipleObjects roll under 0 creating a near
+ * infinity/~(UINT32)0 timeout which will appear as a deadlock to the
+ * user who did a CORE perl function with a non infinity timeout,
+ * sleep for example. This is 64 to 32 truncation minefield.
+ *
+ * This scenario can only be created if the timespan from the return of
+ * MsgWaitForMultipleObjects to GetTickCount exceeds 1 ms. To generate
+ * the scenario, manual breakpoints in a C debugger are required, or a
+ * context switch occured in win32_async_check in PeekMessage, or random
+ * messages are delivered to the *thread* message queue of the Perl thread
+ * from another process (msctf.dll doing IPC among its instances, VS debugger
+ * causes msctf.dll to be loaded into Perl by kernel)
+ */
+ while (ticks <= endtime) {
+ /* if timeout's type is lengthened, remember to split 64b timeout
+ * into multiple non-infinity runs of MWFMO */
+ DWORD result = MsgWaitForMultipleObjects(count,handles,FALSE,(DWORD)(endtime-ticks)
+ , QS_POSTMESSAGE|QS_TIMER|QS_SENDMESSAGE);
if (resultp)
*resultp = result;
if (result == WAIT_TIMEOUT) {
@@ -2170,8 +2189,9 @@ win32_msgwait(pTHX_ DWORD count, LPHANDLE handles, DWORD timeout, LPDWORD result
return 0;
}
if (timeout != INFINITE) {
- ticks = GetTickCount();
- }
+ GetSystemTimeAsFileTime((LPFILETIME)&ticks);
+ ticks /= 10000;
+ }
if (result == WAIT_OBJECT_0 + count) {
/* Message has arrived - check it */
(void)win32_async_check(aTHX);
@@ -2181,10 +2201,13 @@ win32_msgwait(pTHX_ DWORD count, LPHANDLE handles, DWORD timeout, LPDWORD result
break;
}
}
- /* compute time left to wait */
- ticks = timeout - ticks;
/* If we are past the end say zero */
- return (ticks > 0) ? ticks : 0;
+ if(!ticks || ticks > endtime)
+ return 0;
+ /* compute time left to wait */
+ ticks = endtime - ticks;
+ /*if more ms than DWORD, then return max DWORD*/
+ return ticks <= UINT_MAX ? (DWORD)ticks:UINT_MAX;
}
int
--
1.7.9.msysgit.0
|
From @bulk88I also never figured out if this line |
From @b2gillsOn Sat, Sep 8, 2012 at 2:33 AM, bulk 88 via RT
PeekMessage without PM_REMOVE doesn't remove messages from the queue. https://www.google.com/search?q=site%3Amsdn.microsoft.com+PeekMessage&btnI=1
|
From @bulk88On Sat Sep 08 00:20:25 2012, bulk88 wrote:
Can someone please review this patch? |
From @steve-m-haybulk 88 via RT wrote on 2012-09-13:
I hope to have a look very soon... (Apologies for the slow responses to all your investigations and patches, btw. They really are appreciated, but Windows porters are thin on the ground so it can take a while to get round to them. I have numerous others sat in my Inbox too; they haven't been forgotten.) |
From @steve-m-hayOn Thu Sep 13 10:11:32 2012, Steve.Hay@verosoftware.com wrote:
I've applied the patch locally and have done some testing. I reproduced the hang using your test scripts and the original code. bad time t1 1347611551.84041 t2 1347611561.84291 10.0025000572205 at Is this anything to be concerned about? PS: I also tried out your Win32::API 0.71 in the process in order to run |
From @steve-m-haynofork.patchdiff -ruN bulk88-perl5-win32-api-70da72a.orig/Callback/t/iat.t bulk88-perl5-win32-api-70da72a/Callback/t/iat.t
--- bulk88-perl5-win32-api-70da72a.orig/Callback/t/iat.t 2012-09-03 01:32:52.000000000 +0100
+++ bulk88-perl5-win32-api-70da72a/Callback/t/iat.t 2012-09-14 09:09:54.494412500 +0100
@@ -86,6 +86,10 @@
Win32::GetProcAddress(Win32::LoadLibrary("kernel32.dll"), 'QueryPerformanceCounter'),
"GetOriginalFunctionPtr returns real QPC");
+my $can_fork = $Config{d_fork} || $Config{d_pseudofork} ||
+ (($^O eq 'MSWin32' || $^O eq 'NetWare') and
+ $Config{useithreads} and $Config{ccflags} =~ /-DPERL_IMPLICIT_SYS/);
+
SKIP: {
Win32::API::Type->typedef('PRTL_PROCESS_MODULES', 'char *');
my $LdrQueryProcessModuleInformation =
@@ -93,8 +97,8 @@
"NTSTATUS NTAPI LdrQueryProcessModuleInformation(".
"PRTL_PROCESS_MODULES ModuleInformation,
ULONG Size, PULONG ReturnedSize)");
- skip("This Perl doesn't have ithreads and/or this Windows OS doesn't have "
- ."LdrQueryProcessModuleInformation", 6) if ! $Config{'useithreads'}
+ skip("This Perl doesn't have fork and/or this Windows OS doesn't have "
+ ."LdrQueryProcessModuleInformation", 6) if ! $can_fork
|| ! $LdrQueryProcessModuleInformation; #Native API changed, thats ok
is(GetAPITestDLLLoadCount($LdrQueryProcessModuleInformation), 1,
"DLL load count is 1 before fork");
@@ -213,4 +217,4 @@
return $_->{LoadCount};
}
}
-}
\ No newline at end of file
+}
diff -ruN bulk88-perl5-win32-api-70da72a.orig/Callback/t/ithreads.t bulk88-perl5-win32-api-70da72a/Callback/t/ithreads.t
--- bulk88-perl5-win32-api-70da72a.orig/Callback/t/ithreads.t 2012-09-03 01:32:52.000000000 +0100
+++ bulk88-perl5-win32-api-70da72a/Callback/t/ithreads.t 2012-09-14 09:10:33.786912500 +0100
@@ -13,8 +13,12 @@
#HeapBlock class is not public API
+my $can_fork = $Config{d_fork} || $Config{d_pseudofork} ||
+ (($^O eq 'MSWin32' || $^O eq 'NetWare') and
+ $Config{useithreads} and $Config{ccflags} =~ /-DPERL_IMPLICIT_SYS/);
+
SKIP: {
- skip("This Perl doesn't have ithreads", 1) if ! $Config{'useithreads'};
+ skip("This Perl doesn't have fork", 1) if ! $can_fork;
#50 megs should be enough to force a VirtualAlloc and a VirtualFree
my $ptrobj = new Win32::API::Callback::HeapBlock 5000000;
my $pid = fork();
|
From @bulk88On Fri Sep 14 01:36:46 2012, shay wrote:
Thanks for the fork detection patch to Win32::API. If there is something wrong about my assumptions of C integer I changed from GetTickCount (32 bits) to GetSystemTimeAsFileTime (64 I also created a GetSystemTimeAsFileTime equivalent (plus 1 second |
From @bulk88On Fri Sep 14 18:33:39 2012, bulk88 wrote:
http://perl5.git.perl.org/perl.git/blob/b71c54b89aa7c965d23cc7cfdb7dff179d8f4163:/t/op/alarm.t#l35
http://perl5.git.perl.org/perl.git/blob/b71c54b89aa7c965d23cc7cfdb7dff179d8f4163:/t/op/sleep.t
RT web post didn't forward to P5P so I am forwarding. |
From @steve-m-hayOn Sat Sep 15 22:56:21 2012, bulk88 wrote:
http://perl5.git.perl.org/perl.git/blob/b71c54b89aa7c965d23cc7cfdb7dff179d8f4163:/t/op/alarm.t#l35
http://perl5.git.perl.org/perl.git/blob/b71c54b89aa7c965d23cc7cfdb7dff179d8f4163:/t/op/sleep.t
Ok. [...]
I think the use of (LPFILETIME)&ticks as the argument to It would be safer to use a real FILETIME and then convert it to __int64 __int64 qw = ft->dwHighDateTime; (see filetime_to_clock()) or else via ULARGE_INTEGER, or better yet use
I think a note in perlport.pod would be fine.
The new hook64.pl is working fine for me with your patch applied. I'm Thanks. |
From @bulk88On Sun Sep 16 09:23:33 2012, shay wrote:
I changed the patch to use a FT_t. An auto FILETIME is aligned to
I'm thinking rather than a note in perlport.pod, should I add a croak (F) Perl limits the representation of decimal numbers in programs to Integer overflow in %s number (W overflow) The hexadecimal, octal or binary number you have specified Type of arg %d to &CORE::%s must be %s (F) The subroutine in question in the CORE package requires its argument The croak condition would be added to win32_sleep as "t >
What is the standard for Perl, a tab or spaces for indentation? Looking I will upload another patch after the issues brought up by you and me |
From @steve-m-haybulk 88 via RT wrote on 2012-09-17:
I don't think a croak would be very friendly for the user. The gmtime message which you refer to is a warning not a croak (that's what the "(W)" in perldiag signifies). A similar new *warning* with the wording you suggested would be fine, but I'd still be inclined to add a note in perlport too.
I think the fact that alarm won't interrupt a sleep is already covered by the perlport entry for alarm: "Emulated using timers that must be explicitly polled whenever Perl I see no need to remove the 49 day limit from sleep as part of this bug fix. The (existing) limitation will be clearly documented now, so anyone encountering the problem in the future will know that it's a known issue, and can try fixing it if they wish.
The existing tabbing isn't perfect by any means, but I at least try to add new code according to the style specified in perlhack: * 8-wide tabs (no exceptions!) |
From @janduboisOn Mon, 17 Sep 2012, Steve Hay wrote:
Sorry, haven't found time yet to read this thread, but alarm() emulation C:\>perl -E"$SIG{ALRM}=sub{die};say time;alarm 2;eval{sleep 10};say time" sleep() is also emulated, so doesn't count as a "blocking system call". Cheers, |
From @steve-m-hayOn Mon Sep 17 12:22:28 2012, jdb wrote:
Thanks, I hadn't realized that. Would something like this be suitable Inline Patchdiff --git a/pod/perlport.pod b/pod/perlport.pod
index a37bc7c..023a8c9 100644
--- a/pod/perlport.pod
+++ b/pod/perlport.pod
@@ -1921,6 +1921,11 @@ Not implemented. (S<Plan 9>)
Not implemented. (Win32, VMS, S<RISC OS>, VOS)
+=item sleep
+
+Emulated using synchronization functions such that it can be
+interrupted by alarm(). (Win32)
+
=item sockatmark
A relatively recent addition to socket functions, may not |
From @bulk88
But sleep was interrupted by the signal, the sleep when *poof* with the _____________________________________ C:\p517\perl>r I also attached the new patch to this post. |
From @bulk880001-perl-33096-fix-over-underflow-issues-in-win32_msgwai.patchFrom ccb927cbd4253b51de2e5a90d2b03bc8afe37dce Mon Sep 17 00:00:00 2001
From: Daniel Dragan <bulk88@hotmail.com>
Date: Tue, 18 Sep 2012 04:17:21 -0400
Subject: [PATCH] [perl #33096] fix over/underflow issues in win32_msgwait
This commit does not completely fix 33096, since the real problem is that
VS IDE's debugger causes a Windows message to be sent to perl on each
breakpoint. Depending on where the breakpoint was set. The BP may make it
impossible to exit the loop because of the Visual Studio IDE Debugger's
design. Various overflow and underflow issues were fixed in win32_msgwait.
Specifically, the time count rolling forwards through zero
(GetSystemTimeAsFileTime), and the time count running ahead of the end
time (rolling backwards through zero) were fixed ("<=" check).
---
pod/perldelta.pod | 3 +++
pod/perlport.pod | 6 ++++++
win32/win32.c | 45 ++++++++++++++++++++++++++++++++++-----------
3 files changed, 43 insertions(+), 11 deletions(-)
diff --git a/pod/perldelta.pod b/pod/perldelta.pod
index c1b7be1..d20f95a 100644
--- a/pod/perldelta.pod
+++ b/pod/perldelta.pod
@@ -629,6 +629,9 @@ Fixed a problem where perl could crash while cleaning up threads (including the
main thread) in threaded debugging builds on Win32 and possibly other platforms
[perl #114496].
+A rare race condition that would lead to L<sleep|perlfunc/sleep>
+taking more time than requested, and upto a hang has been fixed [perl #33096].
+
=item Solaris
In Configure, avoid running sed commands with flags not supported on Solaris.
diff --git a/pod/perlport.pod b/pod/perlport.pod
index a37bc7c..839468b 100644
--- a/pod/perlport.pod
+++ b/pod/perlport.pod
@@ -1921,6 +1921,12 @@ Not implemented. (S<Plan 9>)
Not implemented. (Win32, VMS, S<RISC OS>, VOS)
+=item sleep
+
+On Win32 C<sleep> is limited to a maximum of 4294967 seconds, approximately 49
+days, and also signals do not cause C<sleep> to return early once a signal
+fires. C<sleep> will always wait the full time period before returning.
+
=item sockatmark
A relatively recent addition to socket functions, may not
diff --git a/win32/win32.c b/win32/win32.c
index c2ad58f..47127d9 100644
--- a/win32/win32.c
+++ b/win32/win32.c
@@ -2217,13 +2217,32 @@ DllExport DWORD
win32_msgwait(pTHX_ DWORD count, LPHANDLE handles, DWORD timeout, LPDWORD resultp)
{
/* We may need several goes at this - so compute when we stop */
- DWORD ticks = 0;
+ FT_t ticks = {0};
+ unsigned __int64 endtime = timeout;
if (timeout != INFINITE) {
- ticks = GetTickCount();
- timeout += ticks;
- }
- while (1) {
- DWORD result = MsgWaitForMultipleObjects(count,handles,FALSE,timeout-ticks, QS_POSTMESSAGE|QS_TIMER|QS_SENDMESSAGE);
+ GetSystemTimeAsFileTime(&ticks.ft_val);
+ ticks.ft_i64 /= 10000;
+ endtime += ticks.ft_i64;
+ }
+ /* This was a race condition. Do not let a non INFINITE timeout to
+ * MsgWaitForMultipleObjects roll under 0 creating a near
+ * infinity/~(UINT32)0 timeout which will appear as a deadlock to the
+ * user who did a CORE perl function with a non infinity timeout,
+ * sleep for example. This is 64 to 32 truncation minefield.
+ *
+ * This scenario can only be created if the timespan from the return of
+ * MsgWaitForMultipleObjects to GetSystemTimeAsFileTime exceeds 1 ms. To
+ * generate the scenario, manual breakpoints in a C debugger are required,
+ * or a context switch occured in win32_async_check in PeekMessage, or random
+ * messages are delivered to the *thread* message queue of the Perl thread
+ * from another process (msctf.dll doing IPC among its instances, VS debugger
+ * causes msctf.dll to be loaded into Perl by kernel), see [perl #33096].
+ */
+ while (ticks.ft_i64 <= endtime) {
+ /* if timeout's type is lengthened, remember to split 64b timeout
+ * into multiple non-infinity runs of MWFMO */
+ DWORD result = MsgWaitForMultipleObjects(count,handles,FALSE,(DWORD)(endtime-ticks.ft_i64)
+ , QS_POSTMESSAGE|QS_TIMER|QS_SENDMESSAGE);
if (resultp)
*resultp = result;
if (result == WAIT_TIMEOUT) {
@@ -2233,8 +2252,9 @@ win32_msgwait(pTHX_ DWORD count, LPHANDLE handles, DWORD timeout, LPDWORD result
return 0;
}
if (timeout != INFINITE) {
- ticks = GetTickCount();
- }
+ GetSystemTimeAsFileTime(&ticks.ft_val);
+ ticks.ft_i64 /= 10000;
+ }
if (result == WAIT_OBJECT_0 + count) {
/* Message has arrived - check it */
(void)win32_async_check(aTHX);
@@ -2244,10 +2264,13 @@ win32_msgwait(pTHX_ DWORD count, LPHANDLE handles, DWORD timeout, LPDWORD result
break;
}
}
- /* compute time left to wait */
- ticks = timeout - ticks;
/* If we are past the end say zero */
- return (ticks > 0) ? ticks : 0;
+ if(!ticks.ft_i64 || ticks.ft_i64 > endtime)
+ return 0;
+ /* compute time left to wait */
+ ticks.ft_i64 = endtime - ticks.ft_i64;
+ /*if more ms than DWORD, then return max DWORD*/
+ return ticks.ft_i64 <= UINT_MAX ? (DWORD)ticks.ft_i64:UINT_MAX;
}
int
--
1.7.9.msysgit.0
|
From @steve-m-haybulk 88 via RT wrote on 2012-09-18:
I don't think Win32 is alone in that. This is as documented in perlfunc/alarm: If you want to use C<alarm> to time out a system call you need to use an eval {
Thanks, I will take a look later. |
From @steve-m-hayOn Tue Sep 18 01:44:54 2012, Steve.Hay@verosoftware.com wrote:
Thanks, applied as 001e9f8. I removed the perlport bit about sleep not being interruptible by alarm Also, were you going to add a warning (like gmtime has) in the case of |
From @bulk88On Tue Sep 18 09:49:03 2012, shay wrote:
I decided not to since I've never used the warnings category system in C Thanks for applying the patch. |
From @jkeenanOn Tue Sep 18 11:03:19 2012, bulk88 wrote:
A patch has been applied. Can we close this ticket which (hint!) was Thank you very much. |
From @steve-m-hayThe patch applied fixes the main bug here. The issues with VS debugger |
From [Unknown Contact. See original ticket]The patch applied fixes the main bug here. The issues with VS debugger |
@steve-m-hay - Status changed from 'open' to 'resolved' |
From @steve-m-hayOn Wed Sep 19 01:06:35 2012, shay wrote:
Warning now added in commit 3b9aea0. I also added the note about |
Migrated from rt.perl.org#33096 (status was 'resolved')
Searchable as RT33096$
The text was updated successfully, but these errors were encountered: