Archive

Archive for the ‘Java’ Category

Single-word Wang/Jenkins Hash in ConcurrentHashMap

March 19th, 2010 No comments

ConcurrentHashMap is hash table supporting full concurrency of retrievals and adjustable expected concurrency for updates. I recently came across this code during testing, and one part really got my attention. To generate the hash, ConcurrentHashMap uses an algorithm based on bitshifting and bitwise operations.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
========================================
Variant of single-word Wang/Jenkins hash
========================================
private static int hash(int h) {
 
// Spread bits to regularize both segment and index locations,
// using variant of single-word Wang/Jenkins hash.
h += (h <<  15) ^ 0xffffcd7d;
h ^= (h >>> 10);
h += (h <<   3);
h ^= (h >>>  6);
h += (h <<   2) + (h << 14);
 
return h ^ (h >>> 16);
}

According to the comment in the code, this method applies a supplemental hash function to a given hashCode, which defends against poor quality hash functions.

Good hash functions are important as a hash table effectively turns from a map to a linked list, in the worst case, all keys in the same bucket. There are also other considerations that come into play such as the performance of hash calculation and the number of buckets. Dr. Heinz M. Kabutz explains the power of “power-of-two number of buckets” which gives us some good starting point to understand what is really going on here.

Let’s look at the code above and see how things change, line-by-line. To make things simple, I use int 1 to perform all the operations.

In Java, the int data type is a 32-bit signed two’s complement integer. To represent int 1 in binary code, we have the following:

1
h=1 > 0000-0000-0000-0000-0000-0000-0001

Now, let’s dissect the following line:

1
h += (h << 15) ^ 0xffffcd7d

First, let’s re-write this into an easier-to-read format.. at least for me :) .

1
2
3
4
h1 = h << 15      =  0000-0000-0000-0000-1000-0000-0000-0000
hex = 0xffffcd7d  =  1111-1111-1111-1111-1100-1101-0111-1101
h2 = h1 ^ hex     =  1111-1111-1111-1111-0100-1101-0111-1101
h2 + h            =  1111-1111-1111-1111-0100-1101-0111-1110

Using the same thought processing and applying it to each line, we end-up with:

1
2
3
4
5
6
h += (h << 15) ^ 0xffffcd7d = 1111-1111-1111-1111-0100-1101-0111-1110
h ^= (h >>> 10)	            = 1111-1111-1100-0000-1011-0010-1010-1101
h += (h << 3)		    = 1111-1101-1100-0110-0100-1000-0001-0101
h ^= (h >>> 6)              = 1111-1110-0011-0001-0101-0001-0011-0101
h += (h << 2) + (h << 14)   = 0100-1011-0100-0011-1101-0110-0000-1001
h ^= (h >>> 16)             = 0100-1011-0100-0011-1001-1101-0100-1010

Result:
Bin = 0100-1011-0100-0011-1001-1101-0100-1010
Decimal = 1,262,722,378

JProfiler: Getting Started

February 4th, 2010 No comments

JProfiler helps you find performance bottlenecks, pin down memory leaks and resolve threading issues in your Java application. JProfiler combines CPU, Memory and Thread profiling in one application and is developed by ej-technologies. The latest version at the time of this article is 6.0.

Java: GC Options

June 24th, 2009 No comments

Verbose GC options

Option Default value Max Value Description
-XX:+PrintGCDetails false PrintGC details
-XX:+PrintGCTimeStamps false Adds timestamp info to GC details
-XX:+PrintHeapAtGC false Prints detailed GC info including heap occupancy before and after GC
-XX:+PrintTenuringDistribution false Prints object aging or tenuring information
-XX:+PrintHeapUsageOverTime false Print heap usage and capacity with timestamps
-Xloggc:filename false Prints GC info to a log file
-verbose:gc false Prints some GC info
-XX:+PrintTLAB false Print TLAB information

Read more…