Wednesday 29 July 2015

hashCode contract

Objects that are equal must have the same hash code within a running process.

Common misconceptions:
Unequal objects must have different hash codes – WRONG!

Objects with the same hash code must be equal – WRONG!


Whenever you implement equals, you MUST also implement hashCode.

An object’s hashCode() method must take the same fields into account as its equals method.
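As a minimal sketch of this rule, the hypothetical Point class below (not taken from any library) compares x and y in equals() and hashes exactly those two fields in hashCode(). Objects.hash is just one convenient way to combine fields; any formula that uses the same fields would do.

```java
import java.util.Objects;

class PointDemo {
    // Hypothetical value class, used only for illustration.
    static final class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Point)) return false;
            Point p = (Point) other;
            return x == p.x && y == p.y;   // equality is based on x and y
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);     // the hash uses the same fields: x and y
        }
    }

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```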

By overriding the equals() method, you declare some objects as equal to other objects, but the inherited Object.hashCode() method still treats every object as distinct (the JVM derives the default hash code from the object's identity, traditionally described as its memory address, not from its fields). So you will end up with equal objects that have different hash codes.

For example, calling contains() on a HashSet (or containsKey() on a HashMap) will return false for such an object, even though an equal object has been added.
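The failure can be reproduced with a small sketch. The Name class here is hypothetical and deliberately broken: it overrides equals() but inherits Object.hashCode(). The last lookup is not guaranteed to fail, since two identity hash codes could coincide by chance, but in practice it almost always does.

```java
import java.util.HashSet;
import java.util.Set;

class BrokenKeyDemo {
    // Hypothetical class that overrides equals() but NOT hashCode().
    static final class Name {
        final String value;

        Name(String value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Name && value.equals(((Name) other).value);
        }
        // hashCode() is deliberately missing, so Object.hashCode() is inherited.
    }

    public static void main(String[] args) {
        Set<Name> names = new HashSet<>();
        Name stored = new Name("Alice");
        names.add(stored);

        Name probe = new Name("Alice");
        System.out.println(probe.equals(stored));   // true: the objects are equal
        System.out.println(names.contains(stored)); // true: same instance, same hash code
        // Almost certainly false: probe has a different identity hash code,
        // so the lookup never reaches the stored entry.
        System.out.println(names.contains(probe));
    }
}
```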

HashCode collisions

Whenever two different objects have the same hash code, we call this a collision.
This is not critical: it just means that a single bucket holds more than one object, so a HashMap lookup has to compare the objects in that bucket with equals() to find the right one.

A lot of collisions will degrade the performance of a system, but they won’t lead to incorrect results.

For example: the Strings "Aa" and "BB" produce the same hashCode: 2112.
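A short sketch of this collision: both strings hash to 2112, yet a HashMap still keeps them apart because it falls back on equals() inside the bucket. The class and method names are made up for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    // Builds a map whose two keys share the same hash code.
    static Map<String, Integer> buildMap() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);
        return map;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode()); // 2112
        System.out.println("BB".hashCode()); // 2112

        Map<String, Integer> map = buildMap();
        System.out.println(map.get("Aa"));   // 1
        System.out.println(map.get("BB"));   // 2
        System.out.println(map.size());      // 2
    }
}
```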

HashCodes can change

Important and surprising: the hashCode contract does not guarantee the same result across different executions.
For some classes, such as String, the hash code will always be the same because the exact formula is part of their documented specification.
There are Java libraries that actually return different hashCode values in different processes and this tends to confuse people.
Example: Google’s Protocol Buffers.

Do not use hashCode in distributed applications

A remote object may have a different hash code than a local one, even if the two are equal. Therefore, you should not use the hash code in distributed applications.

We must be aware that the implementation of a hashCode() method may change from one version to another. Therefore your code should not depend on any particular hash code values.

Example: we should not use the hash code to persist state. Because the hashCode() implementation may change in a later version, the hash codes of the “same” objects may differ after an upgrade.
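If a value that stays stable across processes and versions is needed, one option is to compute it with an explicitly chosen algorithm instead of hashCode(). The sketch below is only an illustration of that idea, not something the hashCode contract prescribes: stableKey is a hypothetical helper, and SHA-256 over UTF-8 bytes is an assumed choice.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class StableKeyDemo {
    // Hypothetical helper: derives a key from the content with a fixed,
    // explicitly named algorithm, so it does not depend on hashCode().
    static String stableKey(String content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] bytes = digest.digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(bytes.length * 2);
            for (byte b : bytes) {
                hex.append(String.format("%02x", b)); // two lowercase hex digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Java platform implementations are required to support SHA-256.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same input, same key, regardless of process or library version.
        System.out.println(stableKey("abc"));
    }
}
```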

Important point:

“Don’t use hash code at all”, except when you create hash-based algorithms.
