Archive for March 2010

ConcurrentHashMap is hash table supporting full concurrency of retrievals and adjustable expected concurrency for updates. I recently came across this code during testing, and one part really got my attention. To generate the hash, ConcurrentHashMap uses an algorithm based on bitshifting and bitwise operations.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
========================================
Variant of single-word Wang/Jenkins hash
========================================
private static int hash(int h) {
 
// Spread bits to regularize both segment and index locations,
// using variant of single-word Wang/Jenkins hash.
h += (h <<  15) ^ 0xffffcd7d;
h ^= (h >>> 10);
h += (h <<   3);
h ^= (h >>>  6);
h += (h <<   2) + (h << 14);
 
return h ^ (h >>> 16);
}

According to the comment in the code, this method applies a supplemental hash function to a given hashCode, which defends against poor quality hash functions.

Good hash functions are important as a hash table effectively turns from a map to a linked list, in the worst case, all keys in the same bucket. There are also other considerations that come into play such as the performance of hash calculation and the number of buckets. Dr. Heinz M. Kabutz explains the power of “power-of-two number of buckets” which gives us some good starting point to understand what is really going on here.

Let’s look at the code above and see how things change, line-by-line. To make things simple, I use int 1 to perform all the operations.

In Java, the int data type is a 32-bit signed two’s complement integer. To represent int 1 in binary code, we have the following:

1
h=1 > 0000-0000-0000-0000-0000-0000-0001

Now, let’s dissect the following line:

1
h += (h << 15) ^ 0xffffcd7d

First, let’s re-write this into an easier-to-read format.. at least for me :).

1
2
3
4
h1 = h << 15      =  0000-0000-0000-0000-1000-0000-0000-0000
hex = 0xffffcd7d  =  1111-1111-1111-1111-1100-1101-0111-1101
h2 = h1 ^ hex     =  1111-1111-1111-1111-0100-1101-0111-1101
h2 + h            =  1111-1111-1111-1111-0100-1101-0111-1110

Using the same thought processing and applying it to each line, we end-up with:

1
2
3
4
5
6
h += (h << 15) ^ 0xffffcd7d = 1111-1111-1111-1111-0100-1101-0111-1110
h ^= (h >>> 10)	            = 1111-1111-1100-0000-1011-0010-1010-1101
h += (h << 3)		    = 1111-1101-1100-0110-0100-1000-0001-0101
h ^= (h >>> 6)              = 1111-1110-0011-0001-0101-0001-0011-0101
h += (h << 2) + (h << 14)   = 0100-1011-0100-0011-1101-0110-0000-1001
h ^= (h >>> 16)             = 0100-1011-0100-0011-1001-1101-0100-1010

Result:
Bin = 0100-1011-0100-0011-1001-1101-0100-1010
Decimal = 1,262,722,378

Conversion Table

HEX DECIMAL BINARY
0 0 = 0+0+0+0 0000
1 1 = 0+0+0+1 0001
2 2 = 0+0+2+0 0010
3 3 = 0+0+2+1 0011
4 4 = 0+2+0+0  0100
5 5 = 0+4+0+1 0101
6 6 = 0+4+2+0 0110
7 7 = 0+4+2+1 0111
8 8 = 8+0+0+0 1000
9 9 = 8+0+0+1 1001
A 10 = 8+0+2+0 1010
B 11 = 8+0+0+1 1011
C 12 = 8+4+0+0 1100
D 13 = 8+4+0+1 1101
E 14 = 8+4+2+0 1110
F 15 = 8+4+2+1  1111

Data Types and Data Structures

Primitive Type Size Minimum Value Maximum Value
char 16-bit Unicode 0 Unicode 216-1
byte 8-bit -128 +127
short 16-bit -215
(-32,768)
+215-1
(32,767)
int 32-bit -231
(-2,147,483,648)
+231-1
(2,147,483,647)
long 64-bit -263
(-9,223,372,036,854,775,808)
+263-1
(9,223,372,036,854,775,807)
float 32-bit 32-bit IEEE 754 floating-point numbers
double 64-bit 64-bit IEEE 754 floating-point numbers
boolean 1-bit true or false
void —– —– Void

Creating a patch file

Creating a patch file is really easy. First, check out the most recent version of the code from Subversion using the ‘checkout’ command.

Make your changes. Then, in the root the project run the following command. It will store the patch file in your home directory. Make sure to give it meaningful filename.

svn diff > ~/fix_bug.diff

The file has the .diff extention, which stands for differences. This extension is recognized by many text editors and enables ’syntax highlighting’ automatically. (Give it a try with TextMate and you’ll know what I mean.) You can send the diff-file to the author of the project by email, or you can create a ticket in Trac and add it as an attachment. The author will review the changes you made and possibly apply them to the source.

Applying a patch

You should never apply patches from any person other than your development team without first reading through the changes, apply them locally and test your application and then commit them. Patches can not only include bug fixes, but also alterations to create back doors or add other exploits to your code.

Always read through a patch before applying it!

When you are sure the patch will bring no harm to you, your application or your customers, go ahead an apply it to your working copy. Here, I assume that you downloaded the patch file we previously generated, and placed it in your home directory. In the root of your application now run:

patch -p0 -i ~/fix_bug.diff

This will apply all the changes in the patch to your source. The -p0 option makes sure that all files can be found correctly (this has to do with something called ‘zero directories’, I won’t get into that right now). The -i option tells ‘patch’ what to use as input, in this case the ‘fix_bug.diff’ file in your home directory.
With the code changes in place, run your tests and make sure everything works as expected. If it does, commit your changes.