Dataslope logoDataslope

Equality and Hashing

The equals / hashCode contract — what every Set, Map, and contains() check secretly depends on, and the bugs that lurk when you get it wrong

Almost every interesting method on a Set or Map answers the same question: "have I seen this before?" That question is answered by two methods that every Java object inherits from Object:

public boolean equals(Object other);
public int     hashCode();

They are tiny. They are also the single most-bugged corner of the whole framework. This page is about why, and how to get them right once and for all.

What collections actually do with equals

List.contains(x), List.indexOf(x), List.remove(x), Set.contains(x), Set.add(x), Map.containsKey(k), Map.get(k) — all of them, under the hood, call equals to compare candidates.

== compares identity (same object in memory). .equals compares value (logically the same thing). For a sane container, equality has to mean value.

Code Block
Java 8 (Update 492)

If String only implemented ==, List<String>.contains("hello") would essentially never work.

What hashCode is for

Hashing turns an arbitrary object into an int "fingerprint." A HashSet or HashMap uses that fingerprint to jump directly to the bucket where the element should live — instead of scanning all buckets.

So a hash collection's lookup is two steps:

  1. hashCode() → which bucket to look in
  2. equals() → which candidate in that bucket is actually the target

If hashCode lies — say, returns a different value for two objects that are .equals — the second step never even runs, because the collection looks in the wrong bucket and concludes "not present."

The contract

This is the contract every override must satisfy. It's worth internalizing.

RuleWhat it says
Reflexivex.equals(x) is true
Symmetricx.equals(y) iff y.equals(x)
TransitiveIf x.equals(y) and y.equals(z), then x.equals(z)
ConsistentRepeated calls return the same answer if neither object changes
x.equals(null) is falseNever throw, just return false
equals implies same hashCodeIf x.equals(y), then x.hashCode() == y.hashCode()

That last rule is the one people get wrong, with painful consequences inside hash-based collections.

A class that breaks the contract

Code Block
Java 8 (Update 492)

Run that. The set says false. Why? Because Point inherited Object.hashCode, which is essentially the object's memory address. Two Point(1,2)s are equal by .equals, but their hashCodes are different — so the second one lands in a different bucket, and the set never even reaches the .equals step.

Whenever you override equals, override hashCode. Always.

The fix

Code Block
Java 8 (Update 492)

Objects.hash(...) is the standard way to combine fields into a hashcode. It is not the fastest hash in the world, but it is correct and clear, which is what almost every domain class needs.

Records: the modern shortcut

Java 16+ has records, which generate a correct equals, hashCode, and toString for you. For value-like data, prefer them:

Code Block
Java 8 (Update 492)

Records were added partly because getting equals/hashCode right by hand is such a constant source of bugs.

Mutable keys: the other classic disaster

A hash-based collection assumes that its keys' hashcodes do not change while they are in the collection. If you mutate a field that participates in hashCode, the element becomes "lost" — it is still there, but the collection looks for it in the wrong bucket.

Code Block
Java 8 (Update 492)

Rule: anything you put in a HashSet or use as a HashMap key should be immutable (or at least, you must not change the fields that contribute to equals / hashCode).

The other contract: equals consistency with Comparable

Hash-based collections care about equals/hashCode. Sorted collections (TreeSet, TreeMap) care about compareTo. The strongly recommended rule is:

If x.compareTo(y) == 0, then x.equals(y) should be true.

If this is violated, a TreeSet will silently behave differently from a HashSet for the same data — almost always a bug. We'll meet Comparable in detail in its own chapter.

A multi-file challenge: a uniqueness service

This is one of the most common real uses of equals/hashCode: a "have I seen this before?" service.

Challenge
Java 8 (Update 492)
Implement Email value equality

Write a class Email that wraps a String address and a UniquenessTracker that uses a Set<Email> to detect duplicates.

  • Email is constructed from one String. Two Emails are equal if their addresses are equal case-insensitively ("Ada@x.com" equals "ada@x.com").
  • The hashCode must agree — use the lowercased address.
  • UniquenessTracker has boolean register(Email e) returning true on the first time the email is seen and false afterward.

Expected output:

first  Ada@x.com? true
again  Ada@x.com? false
again  ada@x.com? false
new    bob@x.com? true
total unique = 2

Test your understanding

QuestionSelect one

What goes wrong if you override equals but not hashCode?

The code fails to compile

All collections stop working

Hash-based collections (HashSet, HashMap) can fail to find equal objects because they look in the wrong bucket

equals calls go into infinite recursion

QuestionSelect one

Why is it dangerous to use a mutable object as a HashMap key?

Mutable objects can't be stored at all

They double the memory cost

If a field that participates in hashCode changes while the object is a key, the entry becomes unreachable

HashMap deep-copies keys, so the cost is performance

QuestionSelect one

What's the cleanest modern way to get a correct equals / hashCode for a small immutable value type?

Hand-write both methods using Objects.hash

Inherit them from Object

Declare the class as a record

Implement Comparable

On this page