[PATCH 4/11] perl_hash_crc32: use HW hasher as default with SSE4.2 #14583
Comments
From @rurban

This is a bug report for perl from rurban@cpanel.net: perl_hash_crc32: use HW hasher as default with SSE4.2. You cannot trust any of these fast hash functions for security at all.

Perl Info

From @rurban

Attachment: 0001-perl_hash_crc32-use-HW-hasher-as-default-with-SSE4.2.patch

From 57fd1b00ed72266d1f18de751c9c0a793e136bba Mon Sep 17 00:00:00 2001
From: Reini Urban <rurban@x-ray.at>
Date: Mon, 14 Apr 2014 17:18:51 -0500
Subject: [PATCH] perl_hash_crc32: use HW hasher as default with SSE4.2
can be enabled with the following ccflags:
-march=native or -mcrc32 or -msse42
also with untested AARCH64_FL_CRC support
https://gcc.gnu.org/ml/gcc-patches/2014-01/msg01012.html
See https://github.com/rurban/perl-hash-stats#perl-hash-stats for the analysis.
iSCSI CRC32-C is by far the fastest hash function tested, and it also produces the fewest
collisions on average, so it is also the best for the average case.
                     collisions  cycles/hash
CRC32                1.066       29.78
ONE_AT_A_TIME_HARD   1.092       83.75
See https://github.com/rurban/smhasher#smhasher for a more detailed analysis of the hash
function qualities. 'doc/crc32_hw' is far better than the current 'doc/JenkinsOOAT'.
Worst-case:
You cannot trust any of these fast hash functions for security at all. Collision complexity
attacks need to be guarded against in other ways. Without our randomized hash seeds and
iterators it is trivial to generate such attacks on all of these hash functions.
Without those measures CRC32 is even more trivial to attack than siphash, city, jenkins
or murmur.
Eventually we will need to switch to a better hash table scheme, one that is faster
(more cache-friendly), has better key checks, and avoids complexity attacks against our
linear linked lists entirely. Sorted buckets, double hashing, universal hashing, or adding
a sleep after >10 collisions are all better schemes that I am working on.
---
hv_func.h | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 74 insertions(+), 4 deletions(-)
diff --git hv_func.h hv_func.h
index 24ebf56..6a25014 100644
--- hv_func.h
+++ hv_func.h
@@ -23,9 +23,14 @@
|| defined(PERL_HASH_FUNC_ONE_AT_A_TIME_OLD) \
|| defined(PERL_HASH_FUNC_MURMUR_HASH_64A) \
|| defined(PERL_HASH_FUNC_MURMUR_HASH_64B) \
+ || defined(PERL_HASH_FUNC_CRC32) \
)
+#if (defined(__SSE4_2__) || defined(AARCH64_FL_CRC))
+#define PERL_HASH_FUNC_CRC32
+#else
#define PERL_HASH_FUNC_ONE_AT_A_TIME_HARD
#endif
+#endif
#if defined(PERL_HASH_FUNC_SIPHASH)
# define PERL_HASH_FUNC "SIPHASH_2_4"
@@ -67,6 +72,14 @@
# define PERL_HASH_FUNC "MURMUR_HASH_64B"
# define PERL_HASH_SEED_BYTES 8
# define PERL_HASH_WITH_SEED(seed,hash,str,len) (hash)= S_perl_hash_murmur_hash_64b((seed),(U8*)(str),(len))
+#elif defined(PERL_HASH_FUNC_CRC32)
+# define PERL_HASH_FUNC "CRC32"
+# define PERL_HASH_SEED_BYTES 4
+# define PERL_HASH_WITH_SEED(seed,hash,str,len) (hash)= S_perl_hash_crc32((seed),(U8*)(str),(len))
+#elif defined(PERL_HASH_FUNC_CITY)
+# define PERL_HASH_FUNC "CITY"
+# define PERL_HASH_SEED_BYTES 4
+# define PERL_HASH_WITH_SEED(seed,hash,str,len) (hash)= S_perl_hash_city((seed),(U8*)(str),(len))
#endif
#ifndef PERL_HASH_WITH_SEED
@@ -93,6 +106,8 @@
#define PERL_HASH(hash,str,len) PERL_HASH_WITH_SEED(PERL_HASH_SEED,hash,str,len)
+#include <assert.h>
+
/*-----------------------------------------------------------------------------
* Endianess, misalignment capabilities and util macros
*
@@ -274,6 +289,7 @@ S_perl_hash_superfast(const unsigned char * const seed, const unsigned char *str
int rem= len & 3;
len >>= 2;
+ assert(hash);
for (;len > 0; len--) {
hash += U8TO16_LE (str);
tmp = (U8TO16_LE (str+2) << 11) ^ hash;
@@ -468,8 +484,10 @@ PERL_STATIC_INLINE U32
S_perl_hash_djb2(const unsigned char * const seed, const unsigned char *str, const STRLEN len) {
const unsigned char * const end = (const unsigned char *)str + len;
U32 hash = *((U32*)seed) + (U32)len;
+ assert(hash);
while (str < end) {
- hash = ((hash << 5) + hash) + *str++;
+ hash = ((hash << 5) + hash) + *str;
+ str++;
}
return hash;
}
@@ -478,8 +496,10 @@ PERL_STATIC_INLINE U32
S_perl_hash_sdbm(const unsigned char * const seed, const unsigned char *str, const STRLEN len) {
const unsigned char * const end = (const unsigned char *)str + len;
U32 hash = *((U32*)seed) + (U32)len;
+ assert(hash);
while (str < end) {
- hash = (hash << 6) + (hash << 16) - hash + *str++;
+ hash = (hash << 6) + (hash << 16) - hash + *str;
+ str++;
}
return hash;
}
@@ -504,6 +524,7 @@ PERL_STATIC_INLINE U32
S_perl_hash_one_at_a_time(const unsigned char * const seed, const unsigned char *str, const STRLEN len) {
const unsigned char * const end = (const unsigned char *)str + len;
U32 hash = *((U32*)seed) + (U32)len;
+ assert(hash);
while (str < end) {
hash += *str++;
hash += (hash << 10);
@@ -519,7 +540,7 @@ PERL_STATIC_INLINE U32
S_perl_hash_one_at_a_time_hard(const unsigned char * const seed, const unsigned char *str, const STRLEN len) {
const unsigned char * const end = (const unsigned char *)str + len;
U32 hash = *((U32*)seed) + (U32)len;
-
+ assert(hash);
while (str < end) {
hash += (hash << 10);
hash ^= (hash >> 6);
@@ -554,6 +575,7 @@ PERL_STATIC_INLINE U32
S_perl_hash_old_one_at_a_time(const unsigned char * const seed, const unsigned char *str, const STRLEN len) {
const unsigned char * const end = (const unsigned char *)str + len;
U32 hash = *((U32*)seed);
+ assert(hash);
while (str < end) {
hash += *str++;
hash += (hash << 10);
@@ -628,7 +650,7 @@ S_perl_hash_murmur_hash_64a (const unsigned char * const seed, const unsigned ch
a 32 bit value
Note uses unaligned 32 bit loads - will NOT work on machines with
- strict alginment requirements.
+ strict alignment requirements.
Also this code may not be suitable for big-endian machines.
*/
@@ -692,6 +714,54 @@ S_perl_hash_murmur_hash_64b (const unsigned char * const seed, const unsigned ch
}
#endif
+
+#if defined(PERL_HASH_FUNC_CRC32) && (defined(__SSE4_2__) || defined(AARCH64_FL_CRC))
+#include <smmintrin.h>
+
+/* Byte-boundary alignment issues */
+#define ALIGN_SIZE 0x08UL
+#define ALIGN_MASK (ALIGN_SIZE - 1)
+#define CALC_CRC(op, crc, type, buf, len) \
+ do { \
+ for (; (len) >= sizeof (type); (len) -= sizeof(type), buf += sizeof (type)) { \
+ (crc) = op((crc), *(type *) (buf)); \
+ } \
+ } while(0)
+
+PERL_STATIC_INLINE U32
+S_perl_hash_crc32(const unsigned char * const seed, const unsigned char *str, STRLEN len) {
+ const char* buf = (const char*)str;
+ U32 hash = *((U32*)seed); /* tested nok + len in variant .1 much higher collision costs */
+
+ /* Align the input to the word boundary */
+ for (; (len > 0) && ((size_t)buf & ALIGN_MASK); len--, buf++) {
+ hash = _mm_crc32_u8(hash, *buf);
+ }
+
+#ifdef __x86_64__
+ CALC_CRC(_mm_crc32_u64, hash, uint64_t, buf, len);
+#endif
+ CALC_CRC(_mm_crc32_u32, hash, uint32_t, buf, len);
+ CALC_CRC(_mm_crc32_u16, hash, uint16_t, buf, len);
+ CALC_CRC(_mm_crc32_u8, hash, uint8_t, buf, len);
+
+ return hash;
+}
+#endif
+
+#if 0
+/* Should be the best on non-x86 CPUs. NYI */
+PERL_STATIC_INLINE U32
+S_perl_hash_city(const unsigned char * const seed, const unsigned char *str, const STRLEN len) {
+ U32 hash = *((U32*)seed);
+#define k2 0x9ae16a3b2f90404fULL
+ return HashLen16(CityHash64(str, len) - k2, hash);
+#undef k2
+ return (hash ^ 0xFFFFFFFF);
+}
+#endif
+
+
/* legacy - only mod_perl should be doing this. */
#ifdef PERL_HASH_INTERNAL_ACCESS
#define PERL_HASH_INTERNAL(hash,str,len) PERL_HASH(hash,str,len)
--
2.1.4
From @bulk88

To whatever committer will review this: there are also a couple more hash algos, including a hardware AES hash, on an abandoned/stalled branch http://perl5.git.perl.org/perl.git/shortlog/refs/heads/yves/hash_wip_lastet, specifically commit http://perl5.git.perl.org/perl.git/commitdiff/694bbdcff696678ca9df69c15dea62a6e6e0df2a. Notice rurban's patch is a hardware CRC hash. Hardware CRC and hardware AES did not come out together in Intel CPUs, so they won't share the exact same CPP macros controlling their usage. There might need to be some thought given (maybe in Configure, IDK) to how to support hardware-specific code in the Perl core, and to whether both of these patches are using the "right" macros to control their availability. I am not expecting rurban to write such an infrastructure, but if someone else (maybe a committer) wants to, they should.
The RT System itself - Status changed from 'new' to 'open'
From @rurban

On second thought, I won't feel comfortable enough to have this enabled as the default now. I'm rewriting the hash tables soon and will not need HW CRC32-C. I'd prefer glibc double hashing (hsearch) with their short hashers, or even universal hashing for maximum security (i.e. two random hash functions). And I haven't worked on security with the new hash table using CRC32-C, which is known to be easily attacked.
From @kentfredric

On 16 March 2015 at 23:38, bulk88 via RT <perlbug-followup@perl.org> wrote:

+1. Hardware-specific codepaths indeed need to be configurable, and not 100% automatic. Ideally it can be expressed in terms of a specific CPU-flag oriented toggle, not --with-cpu-avx, for instance.

-- *KENTNL* - https://metacpan.org/author/KENTNL
From @rurban

On 03/17/2015 06:04 PM, Kent Fredric via RT wrote:

IMHO not needed. The only theoretical problem is with packagers who compile i386 with

For cross-compilation it is current practice that the target config.sh

A run-time check for sse4.2/aarch64 cpu features is only needed in a jit.
From @tonycoz

On Sat Mar 14 03:50:13 2015, rurban@cpanel.net wrote:

I think these assertions (the `assert(hash)` lines added in hunks like `@@ -504,6 +524,7 @@ PERL_STATIC_INLINE U32`) are just plain wrong. What happens when (U32)(*seed + len) just happens to be zero? I don't think crc32 has enough hash seed length to be the default hashing algorithm.

Tony
From @rurban

On 03/30/2015 02:37 AM, Tony Cook via RT wrote:

Yes, this was only there for debugging, to catch the common zero-collapse

Hash seed length? This is a global for all hash functions. What do you mean?

Having calculated the statistical average case, I'd say your sentiment is wrong. You can argue about the security via known attacks on the CRC scheme.
From @tonycoz

On Fri Apr 03 02:50:43 2015, rurban@cpanel.net wrote:

No, the problem is that (U32)(seed+len) will be zero in 1 in 2**32 cases[1] (consider a seed of 0xFFFFFFF0 and len = 0x10), so the assert() shouldn't be there at all. assert() is only tested under -DDEBUGGING by default anyway.

The problem is, even with hash order randomization, I expect a 32-bit seed to be fairly cheap to brute-force[2]. I'm not claiming that ONE_AT_A_TIME_HARD is cryptographically strong, but the 64-bit seed does provide some protection against more casual attackers.

I'd have no problem accepting a patch that added CRC32 hashes as an option, which would make it simpler for you to build your own perls with that hash. I won't accept a patch that makes CRC32 the default hash algorithm.

Tony

[1] complicated by the distribution of len values, which tend to be short, but seed should be fairly evenly distributed.

[2] the original 32-bit seeds were easily brute-forced on an Intel Q9660.
From @rurban

On Sun Aug 02 17:56:25 2015, tonyc wrote:

I preferred zero hashes to the assert, as they are invariant against \0 attacks.

I wrote a brute-forcer, and the faster the hash, the faster the brute-forcer.

I see. 64-bit seeds are trivial to add to all hash funcs.

I made it much simpler for me to use different hashes with a

But I switched to FNV1A now, which makes the fastest HT, and doesn't have

Because? It creates the best distribution? It is the fastest?

Exactly. That's why the whole security theatre is bogus.
From @tonycoz

On Tue Aug 25 07:41:42 2015, rurban wrote:

asserts should only trigger when the code is in error. Since a zero (seed+len) can occur during normal operation, the asserts are inappropriate.

Because it has only a 32-bit seed space, which makes it trivial to brute-force, as you confirmed above.

We use a 64-bit seed to make such brute-forcing harder (not impossible).

Tony
From @demerphq

On 26 August 2015 at 01:58, Tony Cook via RT <perlbug-followup@perl.org> wrote:

It is not just that. The hash Reini was advocating has a trivial differential attack, so the same set of blocks breaks all of the following functions:

crc32_hw 32 32 32 FAILED 45 of 129 tests. Differential:

The only hash functions that use CRC that do not get tripped up by

You can see detailed test reports, summaries, graphs, etc at https://github.com/demerphq/smhasher

cheers,
From @kentfredric

On 16 March 2017 at 21:02, demerphq <demerphq@gmail.com> wrote:

Can we expect to see -DPERL_HASH_FUNC_CRC in the future anyway, with sufficient

Obviously this is a security risk if your Perl instance is exposed on

But I'm very much not concerned about that threat with no public

How likely is it I'm going to run 160k blocks locally in a non-threat

How bad are the side effects of that going to be realistically?

To the extent that, if it were feasible, I'd champion adding 2 top level

-Dprofile=fast-but-insecure (no taint support maybe, a fast hash

(where the default would be neither of those things and would give

Cos it's frustrating, the on-going conflict we have between speed and

We don't have to force everyone to compromise, surely?!

NB: Currently using PERL_HASH_FUNC_SDBM because I /do not care/, but

-- KENTNL - https://metacpan.org/author/KENTNL
From @demerphq

On 16 March 2017 at 19:48, Kent Fredric <kentfredric@gmail.com> wrote:

Er, no. I don't think so.

I don't think you understood the point. An attacker need only look at

Excessive collisions will noticeably affect performance.

Currently you can select the hash function you want to build with.

And where we lean towards secure over fast. I understand your concern.

I find that quite surprising. Also unfortunate; I removed support for

I hope to provide some better alternatives soon.

Yves
From @kentfredric

On 17 March 2017 at 09:20, demerphq <demerphq@gmail.com> wrote:

The question is: how does this attacker mount their attack against my code? If they're relying on some other exploit that allows executing

The only path I can think of here would be a malicious user shipping

But I'm not running any services or software that deals with 3rd

I understand the theoretical problem, I just can't find any place it

To be clear, it's not a "leaning" for me, it's a "use case". For local-access-only, developmental-only tasks, fast is preferable.

But if I *was* building a web service that wanted external access,

Just the tools at my disposal to take this to these extremes is a

Hash key taint support is something nobody can presently opt in to,

But on the other side, fast-but-insecure hash algorithms aren't

Why can't it just be "give us both of these things and then fence them

You could very much argue if I wanted those things I could spend time

-- KENTNL - https://metacpan.org/author/KENTNL
From @demerphq

On 16 March 2017 at 21:34, Kent Fredric <kentfredric@gmail.com> wrote:

By getting you to hash something that causes degenerate behavior in your code.

Yes, I get you.

I understand; in many ways the *security* aspect only applies to

Well, I can definitely think of many use cases where "attack" is not

I am not sure we are on the same page here. What I mean is that Perl

Having said that, I think there are some different issues here that

First off, we want to find a happy medium between "secure enough" and

Another thing is that you seem to think the trade-off is "speed"

Also, just for the record, there is a distinction in my mind and usage

Also, I think it's worth considering that there is a relationship

Playing devil's advocate here, how can you say that when you already

Well, if speed is your concern then IMO using sdbm is a poor choice.

The front line can be something as simple as rpm, or a sysadmin

Well, no. Fast but insecure hash algorithms are rejected because

That is what I have always been trying to do, however I would not put

I mean, if you consider something much less controversial than

Anyway, my feeling is that we should provide three options (compounded

We should provide a "believed to be really secure" hash function.

But really, I wonder at choosing sdbm. From what I know, pretty much

Yves
From @kentfredric

On 17 March 2017 at 21:28, demerphq <demerphq@gmail.com> wrote:

It's been a long time now, so it's hard to say. I recall I used Benchmark::Perl::CoreHashes at some point and recall SDBM was substantially quicker at that use case for me, so I just used that one.

I'll clearly have to re-evaluate though, because apparently, now that

(I suspect that's a perk of Reini's approach to configure, that it

-- KENTNL - https://metacpan.org/author/KENTNL
From @demerphq

On 17 March 2017 at 10:28, Kent Fredric <kentfredric@gmail.com> wrote:

Interesting. It's a pity you don't have a record of better numbers.

I will review INSTALL. I removed a bunch of the hashes as maintaining

Nod.

Yves
From @kentfredric

On 17 March 2017 at 23:07, demerphq <demerphq@gmail.com> wrote:

The really odd thing now is how this happens: I updated the patches of

# time for PERL_HASH_FUNC_SIPHASH is 4.51946496963501

^ These are "time taken for 5 runs of some loop entirely in C", and I believe these say, of the hash algos in core, for the test case it

Granted, this benchmark is _not_ factoring in collision cost; it's

But the good news is I did record the outcome of a previous run in an

# time for PERL_HASH_FUNC_SIPHASH is 12.3072559833527

Maybe I'm just gambling on most hashes not having even as many as 20

I mean... a hash algorithm that always returns 4 is fine if your hash

It's really weird how much the baseline speed has changed though in the

-- KENTNL - https://metacpan.org/author/KENTNL
From @demerphq

On 17 March 2017 at 11:22, Kent Fredric <kentfredric@gmail.com> wrote:

I have no idea what those benchmarks say beyond that they are such a

You can't reduce a complex discussion about performance to a single

The performance of a hash function is dependent on the length of the

This is not a new observation. We know that. It's why we have the

On the other hand, it can be ameliorated somewhat. I have the

# JenkinsOAATH 0 byte keys 11.000 c/h
# SipHash13 0 byte keys 20.000 c/h

So SipHash13 *should* be beating JenkinsOAATH already, however I

Yes. A benchmark like this is fraught with issues. That being one of them.

In the limited testing I have done, hash keys are in the 6-24 byte lengths.

I really wouldn't put much stock in this benchmark. Our hash functions

When we decided to accept the latest hash function choices it was

If you want to do some benchmarks then do some real benchmarks: build

But relying on *that* benchmark alone to decide which hash function

I have lots of performance data and graphs in my version of smhasher:

https://github.com/demerphq/smhasher/tree/master/doc/graph

cheers,
From @kentfredric

On 18 March 2017 at 21:52, demerphq <demerphq@gmail.com> wrote:

Yeah, the point is I wasn't. It was just an interesting data point for

These benchmarks represent a limit, because they demonstrate the

All the other stuff on top of that will only slow the algorithm down.

It's interesting in the "for X iterations of a data set with average

Throughput in reality _WILL_ be lower than all of the stated rates.

But if you look at this:

# PERL_HASH_FUNC_SDBM is 1.7642650604248

^ that says to me that for the given sample, I can afford to have

It's not something to base a final decision on, but it's still

-- KENTNL - https://metacpan.org/author/KENTNL
From @demerphq

On 18 March 2017 at 10:18, Kent Fredric <kentfredric@gmail.com> wrote:

No, it is not interesting; it is misleading to the point of being

You keep talking about "maximum performance". To the extent that there

You keep trying to reduce the comparison of two linear equations down

Either you say, "I have sampled a large number of keys, and this is

The benchmark you keep quoting does neither. The keys it hashes are

Add some long strings to the benchmark. Make the benchmark update the

That benchmark is literally not worth the paper it is written on.

Further, if you are going to quote benchmarks the way you are, you

Cheers,

perl -Mre=debug -e "/just|another|perl|hacker/"
From @kentfredric

On 19 March 2017 at 01:22, demerphq <demerphq@gmail.com> wrote:

So I built a few Perls just to compare their test run times, as that

5.25.10 Siphash 13
5.25.10 Siphash 13-Hybrid
5.25.10 OAAT
5.24.1 SDBM
5.24.1 OAAT

A difference of 105 CPU seconds is nothing to sneeze at :) and I

CFLAGS: -fstack-protector-strong -fno-stack-protector -O3

The first two of those are just the only way to actually disable the

Non-debugging, non-threaded perl.

The only other things I do that are slightly interesting are turning on

x86-64 Linux on Intel(R) Core(TM) i5-2410M CPU @ 2.30GHz
gcc version 5.4.0 (Gentoo 5.4.0-r3 p1.3, pie-0.6.5)

-- KENTNL - https://metacpan.org/author/KENTNL
From @kentfredric

On 19 March 2017 at 23:00, Kent Fredric <kentfredric@gmail.com> wrote:

I've now run 5x runs of Moose's test suite on all of the target perls.

Graph comparing times in user-seconds: http://i.imgur.com/jRs4EBZ.png

I can only really afford to do so many runs: 5 targets * 5 runs * 2

-- KENTNL - https://metacpan.org/author/KENTNL
From @kentfredric

On 20 March 2017 at 01:02, Kent Fredric <kentfredric@gmail.com> wrote:

Test times for Dzil are inconclusive, mostly because you've got 2

- 5.25.10 has newer Test::More (slower)

Graph: http://i.imgur.com/Mow6oyt.png

However, dzil *build* times get the Test::More slowdown out of the mix.

This reveals Siphash13 to be promising in this scenario, but

It also demonstrates SDBM hash to be faster than OAATH for this use case.

Graph: http://i.imgur.com/ghMCFB5.png

The only curiosity would be how SDBM compares to SIPHASH13 with all

(which is why I double-accounted OAATH everywhere, because it was the

So if SDBM ain't returning, I'll be switching to SIPHASH13.

25% off `dzil build`? Yes please :)

-- KENTNL - https://metacpan.org/author/KENTNL
@demerphq is there any further action to take on this ticket?
On Fri, 31 Jan 2020 at 06:33, Todd Rinaldo ***@***.***> wrote:

@demerphq <https://github.com/demerphq> is there any further action to take on this ticket?

Hrm... Let's leave the ticket open for a bit longer. I think we should make Siphash 1-3 the default these days, but I haven't had time to lobby for it.

Yves
Migrated from rt.perl.org#124064 (status was 'open')
Searchable as RT124064$