performance bug: perl Thread::Queue is 20x slower than Unix pipe #13196
From johnh@isi.edu
Created by johnh@isi.edu
This is a bug report for perl from johnh@isi.edu,
Why is Thread::Queue *so* slow?
I understand it has to do locking and be careful about data
Thread::Queue is correct, but I suggest that 20x slower is a performance bug.
One would think that IPC through memory would be at least as fast as a
Here's timing of a test program that sends 500k integers between two threads,
$ ./thread_ipc_perf.pl -m queue
$ ./thread_ipc_perf.pl -m pipe
Here's a larger run (1M integers) with the same kind of results.
$ ./thread_ipc_perf.pl -N 1000000 -m queue
$ ./thread_ipc_perf.pl -N 1000000 -m pipe
Source code for the above simple benchmark is at
We can quibble over the exact multiplier (maybe it's only 15x slower),
Any suggestions?
I get similar results if I simplify Thread::Queue to
To speculate, I'm thinking the cost is in making all IPC data shared.
Thanks for any suggestions,
Perl Info
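For readers who want to reproduce the kind of comparison described in this report, here is a minimal sketch. It is not the reporter's thread_ipc_perf.pl; the sub names `sum_via_queue` and `sum_via_pipe` and the item count are hypothetical, and it assumes a perl built with ithreads and a Thread::Queue recent enough to provide `end()`. Each sub sends the integers 1..N to a consumer thread, which returns their sum.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;
use Time::HiRes qw(time);

# Send 1..$n to a consumer thread through a Thread::Queue; return the sum.
sub sum_via_queue {
    my ($n) = @_;
    my $q = Thread::Queue->new();
    my $consumer = threads->create(sub {
        my $sum = 0;
        while (defined(my $item = $q->dequeue())) {
            $sum += $item;
        }
        return $sum;
    });
    $q->enqueue($_) for 1 .. $n;
    $q->end();    # dequeue() returns undef once the queue has drained
    return $consumer->join();
}

# Send 1..$n to a consumer thread as text lines over a pipe; return the sum.
sub sum_via_pipe {
    my ($n) = @_;
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $consumer = threads->create(sub {
        close $writer;    # drop this thread's handle on the write end
        my $sum = 0;
        while (my $line = <$reader>) {
            chomp $line;
            $sum += $line;
        }
        close $reader;
        return $sum;
    });
    close $reader;        # the main thread only writes
    print {$writer} "$_\n" for 1 .. $n;
    close $writer;        # lets the consumer see end-of-file
    return $consumer->join();
}

my $n = 100_000;          # hypothetical size; the report used 500k and 1M
for my $case ([queue => \&sum_via_queue], [pipe => \&sum_via_pipe]) {
    my ($name, $code) = @$case;
    my $start = time();
    my $sum   = $code->($n);
    printf "%-5s sum=%d elapsed=%.2fs\n", $name, $sum, time() - $start;
}
```

The absolute times depend on the platform and perl build, so the sketch prints whatever it measures rather than assuming a particular ratio.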
From @jkeenan
On Fri Aug 23 17:28:00 2013, johnh@isi.edu wrote:
That's a lot of configuration options.
While I don't doubt that you
Would it be possible for you to try this again with the absolute minimum
Thank you very much. |
The RT System itself - Status changed from 'new' to 'open' |
From @iabyn
On Sun, Aug 25, 2013 at 05:37:39PM -0700, James E Keenan via RT wrote:
Because it is nothing like a UNIX pipe.
A UNIX pipe takes a stream of bytes, and reads and writes chunks of them
A T::Q buffer takes a stream of perl "things", which might be objects or
But T::Q is built upon a shared array, and is designed to handle shared
I think the performance you are seeing is the performance I would expect,
-- |
From johnh@isi.edu
On Mon, 26 Aug 2013 08:11:12 -0700, "Dave Mitchell via RT" wrote:
I understand that Thread::Queue and perl threads allow shared data, and that
My concern is that Thread::Queue also *forces* shared data, even when
From perlthrtut, the "Pipeline" model
The pipeline model divides up a task into a series of steps, and passes
For the pipeline model, one does not need repeated sharing, just a
But one does not *want* sharing (for the pipeline model) there if it's a
If the statement is that queues should require shared data and the
Alternatively, I'd love some mechanism to share data between threads
-John |
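The point about forced sharing can be checked with a small sketch. It assumes a perl built with ithreads; the helper `shared_flag` and the sample record are hypothetical names used only for illustration. It enqueues an ordinary hash and then inspects what comes out, using `is_shared` from threads::shared.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue;

# Hypothetical helper: 1 if the referenced data is shared between threads, else 0.
sub shared_flag {
    my ($ref) = @_;
    return is_shared($ref) ? 1 : 0;
}

my $q = Thread::Queue->new();
my %record = (id => 1, tags => ['a', 'b']);    # ordinary, unshared data

$q->enqueue(\%record);     # the queue stores a shared copy, not \%record itself
$record{id} = 2;           # later changes to the original do not reach the queue
my $item = $q->dequeue();

printf "original shared? %d\n", shared_flag(\%record);
printf "dequeued shared? %d\n", shared_flag($item);
printf "nested shared?   %d\n", shared_flag($item->{tags});
printf "dequeued id:     %d\n", $item->{id};
```

Even in a one-way hand-off, every item passes through a shared copy, which is the cost being questioned here.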
From johnh@isi.edu
On Sun, 25 Aug 2013 17:37:39 -0700, "James E Keenan via RT" wrote:
Thanks for the reply.
I don't build perl myself, those are the default configure options for
I can build perl if you really want, but let me suggest an alternative
I provided source code to my benchmark program at:
http://www.isi.edu/~johnh/SOFTWARE/FSDB/thread_ipc_perf.pl.txt
and the two invocations that clearly show the difference on my platform:
The benchmark is 293 lines long, but it's mostly POD documentation and
If some other platform or build has much different performance, I'll
-John |
From @ikegami
How does Thread::Queue::Any compare?
On Mon, Aug 26, 2013 at 11:58 AM, John Heidemann <johnh@isi.edu> wrote:
From @lizmat
On Aug 26, 2013, at 5:58 PM, John Heidemann <johnh@isi.edu> wrote:
You should realize that the perl ithreads implementation does *not* have any real shared variables at all. Each thread has its own *copy* of the world. Variables with the :shared trait are simply tied() variables to some internal logic that will STORE values in yet another, hidden thread. And will FETCH them from that hidden thread again when needed. There is some locking involved there, I would assume. But I think the biggest bottleneck is really that the slow tie() interface is used for shared variables. The forks module does not do this differently. However, instead of making a copy of the world each time a thread is started, the forks module just does a fork() and lets the OS take care of any Copy-On-Write needed. This makes starting a thread *much* faster, especially if you have something like Moose and its dependencies loaded. Reading and writing shared variables are done by using pipes, Unix pipes if possible. Thread::Queue::Any is simply a wrapper around Thread::Queue, and thus suffers from the same performance issues. In other words: don't use Perl 5's ithreads for performance, use it for asynchronous jobs only where not having to wait for something slow, Liz |
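The "each thread has its own copy" behaviour described above can be seen in a short sketch. It assumes a perl built with ithreads; the variable names and the sub `bump_both` are hypothetical. An ordinary variable is copied into each new thread, while one marked `:shared` is visible to all of them.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $private = 0;           # copied into every new thread
my $shared :shared = 0;    # a single value visible to every thread

# Increment both variables and report the thread-local value of $private.
sub bump_both {
    $private++;            # changes only the calling thread's copy
    lock($shared);         # serialise updates to the shared value
    $shared++;
    return $private;
}

# Start four threads, then collect what each one saw.
my @threads = map { threads->create(\&bump_both) } 1 .. 4;
my @seen    = map { $_->join() } @threads;

print "seen in threads: @seen\n";
print "private in main: $private\n";
print "shared in main:  $shared\n";
```

The main thread's `$private` is untouched by the workers, while `$shared` accumulates their increments. Every access to `$shared` goes through the sharing machinery, which is where the thread above locates the overhead.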
From @Leont
On Mon, Aug 26, 2013 at 5:58 PM, John Heidemann <johnh@isi.edu> wrote:
Actually I did write a queue implementation for threads::lite that should
I don't think that would be faster than a queue, given perl's memory model Leon |
From @nwc10
On Mon, Aug 26, 2013 at 08:58:14AM -0700, John Heidemann wrote:
Yes, I agree that that's a valid concern, and we could document that better. As someone rather too close to the code, it's not easy to pull back far Do you have a suggestion for where we should document this, such that you
Agree, I'd love this too.
It would permit a lot of effective higher level
I don't think that this is particularly a Perl problem.
I'm not aware of any
Nicholas Clark
* such as the rather nice constructions that Jonathan Worthington demonstrated |
From johnh@isi.edu
On Tue, 27 Aug 2013 11:18:57 +0100, Nicholas Clark wrote:
A proposed patch to perlthrtut is attached at the end of this message.
I don't know anything about C-level internals of perl.
I agree these are inherent in *shared* variables independent of language.
It's too bad there's no way to move data between two threads without
What I'll do for now is to get this effect by printing it to pipe and
-John
Inline Patch
--- perlthrtut.pod- 2013-08-27 08:47:16.347167972 -0700
+++ perlthrtut.pod 2013-08-27 08:53:26.159772710 -0700
@@ -465,6 +465,13 @@
data inconsistency and race conditions. Note that Perl will protect its
internals from your race conditions, but it won't protect you from you.
+=head2 Thread Pitfalls: Performance
+
+Shared data is and locking expensive, slowing down access.
+As of perl 5.18, one should expect sharing data between threads
+with tools such as L<Thread::Queue> to be about 15-20x slower
+than copying the data through L<pipe(2)>.
+
=head1 Synchronization and control
Perl provides a number of mechanisms to coordinate the interactions |
From @tamias
On Tue, Aug 27, 2013 at 05:15:09PM -0700, John Heidemann wrote:
I think this sentence got a bit mixed up. Ronald |
From @nwc10
On Tue, Aug 27, 2013 at 05:15:09PM -0700, John Heidemann wrote:
Thanks
Agree that it's frustrating.
That paper seems to predate Perl 1 by about 5 weeks, but I don't think that
I feel that it's the same fundamental problem as attempting to retrofit
On Wed, Aug 28, 2013 at 11:30:58PM -0400, Ronald J Kimball wrote:
I think also that it should mention your insight about what's not obvious
Shared data and locking are expensive, slowing down access.
If in the future someone does radically improve thread performance, then I'd
Nicholas Clark |
From @Leont
On Tue, Aug 27, 2013 at 12:11 PM, Leon Timmermans <fawaka@gmail.com> wrote:
You can find it on github at https://github.com/Leont/thread-channel, it Leon |
From johnh@isi.edu
On Fri, 30 Aug 2013 20:27:08 +0200, Leon Timmermans wrote:
That sounds great. Should it be Thread::Queue::Fast -John |
Migrated from rt.perl.org#119445 (status was 'open')
Searchable as RT119445$