Skip to content

Java 7 hashing drastically better than Java 6

April 6, 2013

How does Java 6 hashing work?

java.lang.String#hashCode() is calculated by iterating over each character in the string, and executing this function:

h = 31*h + val[i]

Now let’s examine how the hash code is used by a popular data structure: HashMap. Looking at java.util.HashMap#put(K,V) runs HashMap#hash(int) on the key’s hashCode. That function is a supplemental hash that provides some protection against bad hash function.

h ^= (h >>> 20) ^ (h >>> 12);
return h ^ (h >>> 7) ^ (h >>> 4);

This provides no randomization or protection from collision attacks. In Java 6, every JVM running everywhere follows these 2 mathematical functions. Some might think, Java hash code should always be the same whenever you call hashCode on the same string, but that’s only true within a single invocation of the JVM. There is no requirement that if 2 different JVM instances running should resolve to the same hashCode for the same string.

Java 7 finally fixed hashing.

java.lang.String has add a static initializer for the single purpose of creating a HASHING_SEED. This is brand new for Java 7. The “seed materials” used to randomize hash codes so that different JVMs don’t have predictable hash codes are: System.currentTimeMillis and System.nanoTime and others.

But wait, if you look at java.lang.String#hashCode() in Java 7 source it looks identical to Java 6’s implementation! So let’s checkout java.util.HashMap in Java 7, and we find that the hash() function there has changed. It takes an Object instead of an int, and if the Object is an instanceof String, then through sun.misc.Hashing.stringHash32() it actually called a new method inside java.lang.String#hash32(). The hash32() function passes the character data plus the HASHING_SEED computed above to sun.misc.Hashing.murmur3_32(HASHING_SEED, value, 0, value.length)

The Murmur3 hashing function with the hashing seed data which is based on the time that java.lang.String’s static initializer is executed provides excellent protection against collision attacks.

public class Hash {
public static void main(String[] args) {
String str = "Java Programming Language";
System.out.println(str + " hashCode(): "
+ str.hashCode()
+ " new Java 7 hashCode used by collections: "
+ sun.misc.Hashing.stringHash32(str));

view raw
hosted with ❤ by GitHub

From → Java

Leave a Comment

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: