Friday, November 25, 2011

Hidden facts about HashMap and HashSet

We all know that hashcode is used when you store objects in map or set. And we all know that the hashcode() method of the object is used to uniquely identify the object stored in the set and that HashSet will not store duplicate values.

What do we derive from these? Shall we say, "Objects having same hashcode are considered as same"?!
Guess what is the output of the following program.
public class HashCodeTest {

static class Person {

int hashCode = 1;

@Override
public int hashCode() {
return hashCode;
}
}

public static void main(String[] args) {
Set persons = new HashSet();
persons.add(new Person());
persons.add(new Person());
persons.add(new Person());
persons.add(new Person());
System.out.println("Set size: " + persons.size());

Map map = new HashMap();
map.put(new Person(), 1);
map.put(new Person(), 2);
map.put(new Person(), 3);
map.put(new Person(), 4);
System.out.println("Map size: " + map.size());

}
}OUTPUT:
Set size: 4
Map size: 4

Why?? Isn't supposed to be 1? Try the following program.
public class HashCodeTest {

static class Person {

int hashCode = 1;

@Override
public int hashCode() {
return hashCode;
}

@Override
public boolean equals(Object obj) {
return hashCode == ((Person) obj).hashCode;
}
}

public static void main(String[] args) {
Set persons = new HashSet();
persons.add(new Person());
persons.add(new Person());
persons.add(new Person());
persons.add(new Person());
System.out.println("Set size: " + persons.size());

Map map = new HashMap();
map.put(new Person(), 1);
map.put(new Person(), 2);
map.put(new Person(), 3);
map.put(new Person(), 4);
System.out.println("Map size: " + map.size());

}
}
OUTPUT:
Set size: 1
Map size: 1

Surprised what equals() to do with hash set or map? Let us look at the source code of HashMap as HashSet is backed by HashMap. Here it goes,...

public V put(K key, V value) {
if (key == null)
return putForNullKey(value);
int hash = hash(key.hashCode());
int i = indexFor(hash, table.length);
for (Entry e = table[i]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}

modCount++;
addEntry(hash, key, value, i);
return null;
}

The line if (e.hash == hash && ((k = e.key) == key || key.equals(k))) answers our question.
So just having hashcode alone to be unique will not identify the object uniquely.