Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PATCH] 3db6cbfc - SIPHASH is very slow on 32 bit x86 #12638

Closed
p5pRT opened this issue Dec 12, 2012 · 9 comments
Closed

[PATCH] 3db6cbfc - SIPHASH is very slow on 32 bit x86 #12638

p5pRT opened this issue Dec 12, 2012 · 9 comments

Comments

@p5pRT
Copy link

p5pRT commented Dec 12, 2012

Migrated from rt.perl.org#116054 (status was 'open')

Searchable as RT116054$

@p5pRT
Copy link
Author

p5pRT commented Dec 12, 2012

From @bulk88

Created by @bulk88

summary​: SIPHASH on 32 bit processors without native CPU 64 bit integers
(such as 32 bit x86) is atleast an order of magnitude slower than one at
a time hash

The discussion in http​://www.nntp.perl.org/group/perl.perl5.porters/2012/12/msg196172.html
about commit 3db6cbf "Switch default
hash to SIPHASH on 64 bit builds and ONE_AT_A_TIME on 32 bit builds"
seems to have flickered out. I feel its time for the issue to be a bug
report and not a ML discussion so its not forgotten. I don't think
Configure has a way of figuring out what is the largest native CPU
integral type, and I am not sure if there are any GCC or partially
portable macros that can be tested to learn the CPU type or learn if a
particular compiler O level has native 64 bit ints or not. IDK if any
other FOSS project has a framework for figuring these things out (I
imagine a large hand edited header file would work if there are enough
maintainers with enough CPUs and compilers/platforms, for example take
something out of GCC's code base where I'm sure (guess) all this info
exists). As the next best thing, instead of checking for U64 type/macro,
maybe check if sizeof(void *) == 8 and U64, only then use SIPHASH, but
that is bias towards traditional x86 and doesn't cover clever tricks
like http​://en.wikipedia.org/wiki/X32_ABI (where SIPHASH should be
picked even though sizeof(void *) == 4) or the equivalent on other CPUs.

Perl Info

Flags:
     category=core
     severity=medium

Site configuration information for perl 5.17.7:

Configured by Owner at Thu Dec  6 20:21:41 2012.

Summary of my perl5 (revision 5 version 17 subversion 7 patch blead 
2012-12-06.16:42:20 93a641ae382638ffd1980378be4810244d04f4b0 
v5.17.6-186-g93a641a) configuration:
   Snapshot of: 93a641ae382638ffd1980378be4810244d04f4b0
   Platform:
     osname=MSWin32, osvers=5.1, archname=MSWin32-x86-multi-thread
     uname=''
     config_args='undef'
     hint=recommended, useposix=true, d_sigaction=undef
     useithreads=define, usemultiplicity=define
     useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
     use64bitint=undef, use64bitall=undef, uselongdouble=undef
     usemymalloc=n, bincompat5005=undef
   Compiler:
     cc='cl', ccflags ='-nologo -GF -W3 -MD -Zi -DNDEBUG -O1 -GL -G7 
-DWIN32 -D_CONSOLE -DNO_STRICT  -DPERL_TEXTMODE_SCRIPTS 
-DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO 
-D_USE_32BIT_TIME_T',
     optimize='-MD -Zi -DNDEBUG -O1 -GL -G7',
     cppflags='-DWIN32'
     ccversion='13.10.6030', gccversion='', gccosandvers=''
     intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
     d_longlong=undef, longlongsize=8, d_longdbl=define, longdblsize=8
     ivtype='long', ivsize=4, nvtype='double', nvsize=8, 
Off_t='__int64', lseeksize=8
     alignbytes=8, prototype=define
   Linker and Libraries:
     ld='link', ldflags ='-nologo -nodefaultlib -debug -opt:ref,icf 
-ltcg  -libpath:"c:\perl517\lib\CORE"  -machine:x86'
     libpth="C:\Program Files\Microsoft Visual Studio .NET 2003\VC7\lib"
     libs=oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib 
comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib 
netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib  version.lib 
odbc32.lib odbccp32.lib comctl32.lib msvcrt.lib
     perllibs=oldnames.lib kernel32.lib user32.lib gdi32.lib 
winspool.lib  comdlg32.lib advapi32.lib shell32.lib ole32.lib 
oleaut32.lib  netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib 
version.lib odbc32.lib odbccp32.lib comctl32.lib msvcrt.lib
     libc=msvcrt.lib, so=dll, useshrplib=true, libperl=perl517.lib
     gnulibc_version=''
   Dynamic Linking:
     dlsrc=dl_win32.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
     cccdlflags=' ', lddlflags='-dll -nologo -nodefaultlib -debug 
-opt:ref,icf -ltcg  -libpath:"c:\perl517\lib\CORE"  -machine:x86'

Locally applied patches:



@INC for perl 5.17.7:
     C:/perl517/site/lib
     C:/perl517/lib
     .


Environment for perl 5.17.7:
     HOME (unset)
     LANG (unset)
     LANGUAGE (unset)
     LD_LIBRARY_PATH (unset)
     LOGDIR (unset)
     PATH=C:\perl517\bin;C:\Program Files\Microsoft Visual Studio .NET 
2003\Common7\IDE;C:\Program Files\Microsoft Visual Studio .NET 
2003\VC7\BIN;C:\Program Files\Microsoft Visual Studio .NET 
2003\Common7\Tools;C:\Program Files\Microsoft Visual Studio .NET 
2003\Common7\Tools\bin\prerelease;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\system32\wbem;
     PERL_BADLANG (unset)
     SHELL (unset)

@p5pRT
Copy link
Author

p5pRT commented Dec 12, 2012

From @demerphq

My current branch checks if HAS_QUAD is defined.

Yves

@p5pRT
Copy link
Author

p5pRT commented Dec 12, 2012

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Dec 23, 2012

From @bulk88

On Tue Dec 11 18​:19​:54 2012, bulk88 wrote​:

This is a bug report for perl from bulk88@​hotmail.com,
generated with the help of perlbug 1.39 running under perl 5.17.7.

-----------------------------------------------------------------
[Please describe your issue here]

summary​: SIPHASH on 32 bit processors without native CPU 64 bit
integers
(such as 32 bit x86) is atleast an order of magnitude slower than one
at
a time hash

See attached patch. I tried it on VC 2003 32 bit Win32 Perl, no Quads,
and on VC 2003 32 bit Win32 Perl, with Quads. Asm showed one_at_a_time
being used on the with Quads Perl. I didn't do a make test on the Perls
for this patch.

--
bulk88 ~ bulk88 at hotmail.com

@p5pRT
Copy link
Author

p5pRT commented Dec 23, 2012

From @bulk88

0001-don-t-use-SIPHASH-on-x86-32-and-non-native-int-64-pl.patch
From 22fb3e894dc7ba2f922861c955e0d5239fb4cf5a Mon Sep 17 00:00:00 2001
From: Daniel Dragan <bulk88@hotmail.com>
Date: Sun, 23 Dec 2012 03:03:12 -0500
Subject: [PATCH] don't use SIPHASH on x86-32 and non-native int 64 platforms

SIPHASH performance 7x slower than ONE_AT_A_TIME on i386 in my testing. On
AMD64 its 2x slower. This commit is related to commit 3db6cbfca3
and fixes Perl RT #116054. Details on speed claims are in
http://www.nntp.perl.org/group/perl.perl5.porters/2012/12/msg196172.html .
Any compiler can emulate any integer. Since the x86-32/i386 instruction
set has no 64 bit instructions, all compilers will emulate 64 bit integers
using multiple 32 bit instructions, and possibly calls to a static linked
math library. This severely degrades performance of the hashing function.

On a 32 bit Perl with Quads on i386, the slower SIPHASH was used
previously. Now, 32 bit Perl with Quads on i386 will use the faster native
ONE_AT_A_TIME instead of SIPHASH. Other CPUs and macros should be added
in the future to the list, but it should not be done without looking at the
assembly output of compilers and knowledge of the typical compiler flags
for that platform, and knowledge of the instruction set for that platform.
Obviously any CPU with 64 bit pointers has 64 bit operand instructions. If
a CPU or platform has emulated 64 bit pointers that will be delt with
when such a bug ticket comes. 32 bit CPUs that I dont know the details
of were put in the comments.
---
 hv.h |   15 ++++++++++++++-
 1 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/hv.h b/hv.h
index 3ee2399..ac395f7 100644
--- a/hv.h
+++ b/hv.h
@@ -156,7 +156,20 @@ struct xpvhv {
         || defined(PERL_HASH_FUNC_ONE_AT_A_TIME_OLD) \
         || defined(PERL_HASH_FUNC_BUZZHASH16) \
     )
-#ifdef U64
+/* Porting note, add CPUs and macros which indicate native 64 bit ints,
+   or no native 64 bit ints. This is to not use SIPHASH on platforms with
+   compiler emulated 64 bit ints, ONE_AT_A_TIME works well with 32 bit CPUs.
+   See http://www.nntp.perl.org/group/perl.perl5.porters/2012/12/msg196172.html
+   for discussion on speed of these hash functions.
+
+   sparc, powerpc, alpha, mips and arm with 32 bit pointers are missing from
+   this list, 32 bit x86 with AVX might one day have native 64 bit ints
+*/
+#if defined(U64) && \
+    (PTRSIZE == 8 \
+    || defined(__ILP32__) \
+    || ( !defined(i386) && !defined(__i386) && !defined(__i386__) \
+        && !defined(_M_IX86) && !defined(__X86__) && !defined(_X86_)))
 #define PERL_HASH_FUNC_SIPHASH
 #else
 #define PERL_HASH_FUNC_ONE_AT_A_TIME
-- 
1.7.9.msysgit.0

@p5pRT
Copy link
Author

p5pRT commented Jul 26, 2013

From @tonycoz

On Sun Dec 23 00​:09​:52 2012, bulk88 wrote​:

On Tue Dec 11 18​:19​:54 2012, bulk88 wrote​:

This is a bug report for perl from bulk88@​hotmail.com,
generated with the help of perlbug 1.39 running under perl 5.17.7.

-----------------------------------------------------------------
[Please describe your issue here]

summary​: SIPHASH on 32 bit processors without native CPU 64 bit
integers
(such as 32 bit x86) is atleast an order of magnitude slower than one
at
a time hash

See attached patch. I tried it on VC 2003 32 bit Win32 Perl, no Quads,
and on VC 2003 32 bit Win32 Perl, with Quads. Asm showed one_at_a_time
being used on the with Quads Perl. I didn't do a make test on the Perls
for this patch.

Perl no longer uses SIPHASH by default, even for 64-bit builds, so this
patch is obsolete.

I discussed this briefly in #p5p, and bulk88 would like the ticket kept
open for now.

Tony

@p5pRT
Copy link
Author

p5pRT commented Oct 19, 2013

From @bulk88

On Thu Jul 25 18​:39​:34 2013, tonyc wrote​:

Perl no longer uses SIPHASH by default, even for 64-bit builds, so this
patch is obsolete.

I discussed this briefly in #p5p, and bulk88 would like the ticket kept
open for now.

Tony

Since this ticket is about which is the best/fastest hash algo for perl,
I did some research on the issue. I published
http​://www.cpantesters.org/distro/B/Benchmark-Perl-CoreHashes.html and
testing DJB2 which was usually the fastest on my machine and most of the
machines on cpantesters, I discovered perl is severely broken with it.
So this ticket is now blocked by #120208 now IMO.

--
bulk88 ~ bulk88 at hotmail.com

@p5pRT
Copy link
Author

p5pRT commented Jul 20, 2015

From @tonycoz

On Sat Oct 19 13​:18​:45 2013, bulk88 wrote​:

On Thu Jul 25 18​:39​:34 2013, tonyc wrote​:

Perl no longer uses SIPHASH by default, even for 64-bit builds, so this
patch is obsolete.

I discussed this briefly in #p5p, and bulk88 would like the ticket kept
open for now.

Tony

Since this ticket is about which is the best/fastest hash algo for perl,
I did some research on the issue. I published
http​://www.cpantesters.org/distro/B/Benchmark-Perl-CoreHashes.html and
testing DJB2 which was usually the fastest on my machine and most of the
machines on cpantesters, I discovered perl is severely broken with it.
So this ticket is now blocked by #120208 now IMO.

#120208 was fixed in 54e07e2.

DJB2 probably isn't suitable as a default algorithm, since the 32-bit space
of seeds is too small to search for even an exhaustive search.

Tony

@toddr
Copy link
Member

toddr commented Feb 13, 2020

Given this ticket subject is not "What's the best hashing algorithm" and this discussion has died, I'm closing this case. If we want to discuss that in github issues, I suggest a new case.

@toddr toddr closed this as completed Feb 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants