About the implementation of Java HashMap

Go To StackoverFlow.com

2

Why the capacity must be a multiple or 2? Why use "&" in the indexFor functions? Why recompute the hash in the hash function instead of directly using the key's hash code?

I think there are some important differences between the this implementation and the description on the "Introduction to Algorithm".

What does ">>>" mean?

static int hash(int h) {
        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
}

Can anyone give me some guide ? I appreciate If some one can explain the hash algorithm. Thanks a lot!

2012-04-03 21:28
by lingguang1997
I know using "&", the key can be mapped to the limited slots. What about influence to the collision in the hash map - lingguang1997 2012-04-03 21:30
>>> is unsigned right shift. Regular >> in Java will preserve and propagate the sign bit and leave a negative number negative. >>> will fill the sign bit with zeros as shifting occurs - Hot Licks 2012-04-04 16:10


5

This is a performance optimization. The usual way to map a hash code to a table index is

table_index = hash_code % table_length;

The % operator is expensive. If table_length is a power of 2, then the calculation:

table_index = hash_code & (table_length - 1);

is equivalent to the (much) more expensive modulo operation.

2012-04-03 21:34
by Ted Hopp
This is why it's a good idea to learn to code in assembly on an old processor - biziclop 2012-04-03 21:53
Danger Will Robinson! 50% of hash codes are negative numbers, so hash_code % table_length could be negative, resulting in an ArrayOutOfBoundsException. You need to use Math.abs(hash_code % table_length) to get a usable (ie safe) inde - Bohemian 2012-04-03 23:20
@Bohemian - Yes, negative numbers are an added complication. That's another advantage of the second method, which works just fine with negative hash code values - Ted Hopp 2012-04-04 00:18


0

Pay no attention to the man behind the curtain.

The actual algorithm is no doubt a combination of "what feels good" to the developer, fixes for some odd degenerate cases, and simple tradition (for which users often develop obscure dependencies).

And note this:

 * Applies a supplemental hash function to a given hashCode, which
 * defends against poor quality hash functions.  This is critical
 * because HashMap uses power-of-two length hash tables, that
 * otherwise encounter collisions for hashCodes that do not differ
 * in lower bits. Note: Null keys always map to hash 0, thus index 0.

Net: So long as it works and the performance is good, you don't care.

2012-04-03 21:46
by Hot Licks
Thanks for everyone's answer. They are answered my question - lingguang1997 2012-04-04 16:03