New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
check if %hash 500x times slower than if keys %hash #12350
Comments
From mgrigoriev@nvidia.comCreated by mgrigoriev@smtp.nvidia.comif you run this benchmark: use Benchmark qw( cmpthese :hireswallclock ); my $hash = { 1 .. 10000 }; cmpthese( -10, { if ( %hash ) 136924/s -- -0% -0% -0% -99% -99% -99% Perl Info
|
From @jkeenanOn Fri Aug 24 16:32:12 2012, mgrigoriev@nvidia.com wrote:
AFAICT, all this shows is that correct Perl runs faster than incorrect Perl. By placing the various versions of $hash and %hash inside the expression Take this simple program: ######### print "Finished\n"; Now run it through the Perl debugger, using the 'x' debug command to ######### main::(hash.pl:1): my $hashref = { 1 .. 10000 }; What's happening here? Camel book, 4th ed., page 85 states: ########## That's what we're doing in the first two variants. However, just before ########## 'perldoc perldata' contains much of the same argument. Now, I myself can't say why 'if (%hash)' is so much slower than 'if Unless others want to provide more detail, I recommend that this RT be Thank you very much. |
The RT System itself - Status changed from 'new' to 'open' |
From @cpansproutOn Fri Aug 24 18:52:13 2012, jkeenan wrote:
%hash takes a long time because it has to iterate through the buckets There is an optimisation known as boolkeys, added in 5.11.1 ‘if’ and ‘unless’ are implemented in terms of && and ||, so the The examples above will show that if(%hash) is faster than keys, if an In the case of ||, we can’t optimise it in non-void context, because In the case of &&, we can optimise it, even in non-void context, because I made that change in commit 20e53f5. scalar(%hash) was not optimised at all, so I did that in commit %hash?...:... (aka if/else) was not optimised either, so I did that in -- Father Chrysostomos |
@cpansprout - Status changed from 'open' to 'resolved' |
From @demerphqOn 25 August 2012 09:31, Father Chrysostomos via RT
Wow. Thanks a lot for all the hard work FC. I remember my original Yves -- |
From @demerphqOn 25 August 2012 09:31, Father Chrysostomos via RT
Rereading this I wonder about something, doesn't this indicate a The sub is called in in void context. Shouldn't that propagate to the if()? Why does the user have to stick a raw return in? Are there technical reasons why in this case we cant check the context Yves -- |
From @rurbanOn Sat, Aug 25, 2012 at 11:07 AM, demerphq <demerphq@gmail.com> wrote:
Unfortunately we only store the LVALUE context in PUSHSTUB, not the There's a technical reason, there's not much room for the 3 missing bits. |
From @cpansproutOn Sat Aug 25 02:08:34 2012, demerphq wrote:
The boolkeys optimisation happens at compile time. Am I missing -- Father Chrysostomos |
From @cpansproutOn Sat Aug 25 02:34:45 2012, rurban wrote:
But every sub also does a PUSHBLOCK which sets cx->blk_gimme.
But the sub isn’t called till run time. -- Father Chrysostomos |
From @demerphqOn 25 August 2012 16:03, Father Chrysostomos via RT
Well there were cases where you said we couldn't do boolkeys because My point was if we could determine if we were called in void context Or is it me that is missing something obvious? Yves -- |
From @rurbanOn Sat, Aug 25, 2012 at 4:07 PM, Father Chrysostomos via RT
Oh, right. So we'd need to put the 3 context bits into the entersub op flag, |
From @cpansproutOn Sat Aug 25 08:39:29 2012, demerphq wrote:
Thank you for persisting, because you have made me realising I was In sub { the void context cannot be detected until run time, at which point it is But then I realised that %hash itself should be detecting void context, See commit 6ea72b3. -- Father Chrysostomos |
From @cpansproutOn Sat Aug 25 14:47:01 2012, sprout wrote:
Now I’m wondering whether we even need the boolkeys op. -- Father Chrysostomos |
From mgrigoriev@nvidia.comAnd all just asked was the expected behavior from if %hash. -----Original Message----- On Sat Aug 25 14:47:01 2012, sprout wrote:
Now I’m wondering whether we even need the boolkeys op. -- Father Chrysostomos This email message is for the sole use of the intended recipient(s) and may contain |
From @cpansproutOn Sat Aug 25 14:47:26 2012, sprout wrote:
I’m disabled it with commit c8fe3bd and moved the optimisation I did not remove the boolkeys op type. Can op types be removed, or does -- Father Chrysostomos |
From [Unknown Contact. See original ticket]On Sat Aug 25 14:47:26 2012, sprout wrote:
I’m disabled it with commit c8fe3bd and moved the optimisation I did not remove the boolkeys op type. Can op types be removed, or does -- Father Chrysostomos |
From @cpansproutOn Sat Aug 25 19:52:35 2012, mgrigoriev@nvidia.com wrote:
Yes, it is a pity that the recommended alternative can be much slower. -- Father Chrysostomos |
From @rurbanOn Aug 26, 2012 8:04 AM, "Father Chrysostomos via RT" <
Only little problems, mostly for me. Several have been removed before. Just |
From perl@profvince.com
Cool. Is it really necessary to set OPpTRUEBOOL and OpMAYBE_TRUEBOOL at Vincent |
From @cpansproutOn Sun Aug 26 10:16:29 2012, perl@profvince.com wrote:
The context information returned by GIMME_V is for the rv2hv op itself. BTW, do you know the answer to the question in the subject? -- Father Chrysostomos |
From perl@profvince.com
I hadn't think of that. Thanks for explaining.
Since op types aren't explicitely documented anywere, I consider that Of course there would more to discuss if a really basic op (like Vincent |
From @cpansproutOn Aug 26, 2012, at 12:05 PM, Vincent Pit wrote:
They are listed in Opcode.pm, but I’ve gone ahead and removed it in commit 605fa6b, following the precedent set by 5edb5b2. |
From @LeontOn Sun, Aug 26, 2012 at 8:04 AM, Father Chrysostomos via RT
Given that we modules that manipulate the op-tree, in particular Leon |
From @tseeOn 08/26/2012 11:27 PM, Leon Timmermans wrote:
As long as the op tree is shared between threads and a thread might be --Steffen |
From @doyOn Mon, Aug 27, 2012 at 07:53:18AM +0200, Steffen Mueller wrote:
This is only a problem for runtime optree manipulation, and not for new -doy |
From @rurbanOn Mon, Aug 27, 2012 at 9:51 AM, Jesse Luehrs <doy@tozt.net> wrote:
Independent from new or change. There is even now a hard check. Before it was only an ASSERT, which |
From @LeontOn Mon, Aug 27, 2012 at 9:51 AM, Jesse Luehrs <doy@tozt.net> wrote:
Yeah. During compile-time an OP-tree isn't shared yet, that shouldn't Leon |
From @doyOn Mon, Aug 27, 2012 at 11:35:10AM +0200, Leon Timmermans wrote:
Right, so for things like parser hooks, having a stable optree interface -doy |
From @iabynOn Mon, Aug 27, 2012 at 09:57:01AM +0200, Reini Urban wrote:
But PL_curcop == PL_compiling is just perl's way of determining whether -- |
From @nwc10On Tue, Aug 28, 2012 at 07:22:51AM -0700, Father Chrysostomos via RT wrote:
If one uses lexical hashes, that's been giving a deprecation warning for $ cat hash.pl my $generate_report = defined %sales || defined %purchases; if ($generate_report) {
Yes, I'd not thought of this. Yes, it is obvious.
No, but I can see several options, and more variants may exist 1) just reverse the change to the structure, and make every object bigger 2) The first time it's requested, count the hash buckets, and from then on This will keep pretty much every object smaller. 3) If it's requested, count the hash buckets. Cache the answer. Similar trade offs to above. 3a) Cache the string form of the scalar hash value 4) Make the scalar return value from keys be linked via magic to the hash, If it's used in boolean context (hard to tell? hard to implement?) return This feels like too much work. It's interesting that this changed behaviour has been in 5.12.0 etc, 5.14.0 Nicholas Clark |
@cpansprout - Status changed from 'resolved' to 'open' |
From @cpansproutI’m reopening this, as there are still unresolved issues. -- Father Chrysostomos |
From @nwc10On Wed, Aug 29, 2012 at 08:44:13PM -0700, Father Chrysostomos via RT wrote:
I have a change that I believe fixes the slowdown without also increasing However, it missed the code freeze. Nicholas Clark |
From @demerphqOn 10 April 2013 17:42, Nicholas Clark <nick@ccl4.org> wrote:
And I have a patch somewhere (i forget the branch name right now) which It also is waiting on the code thaw. Yves -- |
From @jkeenanOn Wed Apr 10 11:22:02 2013, demerphq wrote:
Nicholas, Yves: Can you take a look at the patches you referred to above? Thank you very much. |
From @nwc10On Thu, Jun 13, 2013 at 07:19:51PM -0700, James E Keenan via RT wrote:
I've merged mine to blead, but after the previous monthly release. Before I believe that Yves was waiting for a CPAN smoke run before proceeding further. Nicholas Clark |
From @maukeOn Fri Jun 14 02:42:11 2013, nicholas wrote:
What's the status of this ticket? It's listed in perl5200delta. Can it be closed? |
From @iabynOn Mon, Feb 22, 2016 at 02:13:50PM -0800, l.mai@web.de via RT wrote:
Nicholas's work, which caches the value of hv_fill, is merged.
I'm not sure what this refers to, but I suspect it hasn't been merged. -- |
From @demerphqOn 25 Mar 2016 14:16, "Dave Mitchell" <davem@iabyn.com> wrote:
No, it hasn't. Fwiw, The idea was to stop exposing hv_fill via scalar (%hash), and let It is only used to construct the left hand value in "3/8" which might be If scalar %hash became the same as 0+keys, then the first use case is Its worth noting that the current behaviour is an abstraction violation, it If we change it to be the count of keys we get rid of these problems. The It's probably worth revisiting this subject. Ill see what I can do to find Yves |
From @iabynOn Fri, Mar 25, 2016 at 07:15:58PM +0100, demerphq wrote:
I like in principle the idea of simplifying %hash in scalar context. %hash =~ m{^(\d+)/(\d+)/$} All we're then doing is lying about how efficiently HEs are being Conversely, just returning an int rather than a "1/2" fraction means that -- |
From @demerphqOn 28 March 2016 at 11:28, Dave Mitchell <davem@iabyn.com> wrote:
I certainly would welcome such a change as a positive step, and My view is that the denominator of the fraction is a leakage of the This is not a pie-in-the-sky objection either. Tied hashes already So the patch sequence I have been working on aims to do the following: 1. make scalar(%hash) return the count of the keys. The patch sequence is structured so we can fallback to the proposal
Yes, I agree this is a reasonable "middle way" to address this issue,
Yes, which is much faster. Anyway, to repeat, I definitely agree your proposal is a reasonable FWIW I do not think we should go through a deprecation cycle on this. This is predicated on my conclusion that there really are only the 1. simple demo scripts showing how perls hash function works - better So I think this is something that is so rarely used, and so unlikely Yves |
From @demerphqOn 28 March 2016 at 12:37, demerphq <demerphq@gmail.com> wrote:
Although this implies the change is only suitable for a major release. Yves |
From @cpansproutOn Mon Mar 28 03:37:52 2016, demerphq wrote:
Absolutely. +1 -- Father Chrysostomos |
From @demerphqOn 28 March 2016 at 18:15, Father Chrysostomos via RT
Note that boolean context (thanks iirc to your efforts) actually does It is only code that uses scalar(%hash) outside of boolean context Yves -- |
From @iabynOn Mon, Mar 28, 2016 at 06:27:05PM +0200, demerphq wrote:
Now that 5.24.0 is out, are you likely to implement this for 5.25.x? -- |
From @demerphqOn 17 June 2016 at 15:27, Dave Mitchell <davem@iabyn.com> wrote:
If you mean changing the signature of scalar(%hash). IMO yes, we -- |
From @demerphqOn 17 June 2016 at 22:15, demerphq <demerphq@gmail.com> wrote:
then yes I am interested in implementing this for 5.25.x. I actually planned to address this in a post about my vision of hash Yves -- |
From @iabynOn Fri, Jun 17, 2016 at 10:17:09PM +0200, demerphq wrote:
yes, that's what I meant.
+1 -- |
From @maukeAm 17.06.2016 um 22:17 schrieb demerphq:
I'd like that. -- |
From @demerphqOn 20 June 2016 at 20:14, Lukas Mai <plokinom@gmail.com> wrote:
I mostly have a patch ready for this. My plan is to make it return the count of keys in the hash. Yves -- |
From @xsawyerxOn 06/20/2016 08:14 PM, Lukas Mai wrote:
If we're voting, I'll throw a +1 on that too. :) |
From @demerphqOn 21 June 2016 at 16:44, Sawyer X <xsawyerx@gmail.com> wrote:
I just pushed a branch that contains code to do this as: smoke-me/no_xhv_fill The patch relevant to this thread is commit 28ecb97 Change scalar(%hash) to be the same as 0+keys(%hash) I am going to start a new thread to discuss this. Yves -- |
From @jkeenanOn Tue, 21 Jun 2016 18:51:42 GMT, demerphq wrote:
This ticket was originally filed back in 2012 and was a complaint about performance. Have we addressed the original poster's concerns? Is the ticket closeable? Thank you very much. -- |
From @iabynOn Mon, Dec 26, 2016 at 05:24:14PM -0800, James E Keenan via RT wrote:
As far as I'm aware, yes. FC fixed up some of the boolean context stuff, then Yves added: 8bf4c40 Change scalar(%hash) to be the same as 0+keys(%hash) then I reworked the boolean context stuff to be more general. So in 5.25.10, the OP's benchmark code is now fast for all cases (I had to I'll close this ticket now. -- |
@iabyn - Status changed from 'open' to 'resolved' |
Migrated from rt.perl.org#114576 (status was 'resolved')
Searchable as RT114576$
The text was updated successfully, but these errors were encountered: