Equality and Hashing
The equals / hashCode contract — what every Set, Map, and contains() check secretly depends on, and the bugs that lurk when you get it wrong
Almost every interesting method on a Set or Map answers the same
question: "have I seen this before?" That question is answered by
two methods that every Java object inherits from Object:
public boolean equals(Object other);
public int hashCode();They are tiny. They are also the single most-bugged corner of the whole framework. This page is about why, and how to get them right once and for all.
What collections actually do with equals
List.contains(x), List.indexOf(x), List.remove(x),
Set.contains(x), Set.add(x), Map.containsKey(k), Map.get(k)
— all of them, under the hood, call equals to compare candidates.
== compares identity (same object in memory). .equals compares
value (logically the same thing). For a sane container,
equality has to mean value.
If String only implemented ==, List<String>.contains("hello")
would essentially never work.
What hashCode is for
Hashing turns an arbitrary object into an int "fingerprint." A
HashSet or HashMap uses that fingerprint to jump directly to
the bucket where the element should live — instead of scanning all
buckets.
So a hash collection's lookup is two steps:
hashCode()→ which bucket to look inequals()→ which candidate in that bucket is actually the target
If hashCode lies — say, returns a different value for two objects
that are .equals — the second step never even runs, because the
collection looks in the wrong bucket and concludes "not present."
The contract
This is the contract every override must satisfy. It's worth internalizing.
| Rule | What it says |
|---|---|
| Reflexive | x.equals(x) is true |
| Symmetric | x.equals(y) iff y.equals(x) |
| Transitive | If x.equals(y) and y.equals(z), then x.equals(z) |
| Consistent | Repeated calls return the same answer if neither object changes |
x.equals(null) is false | Never throw, just return false |
equals implies same hashCode | If x.equals(y), then x.hashCode() == y.hashCode() |
That last rule is the one people get wrong, with painful consequences inside hash-based collections.
A class that breaks the contract
Run that. The set says false. Why? Because Point inherited
Object.hashCode, which is essentially the object's memory address.
Two Point(1,2)s are equal by .equals, but their hashCodes are
different — so the second one lands in a different bucket, and the
set never even reaches the .equals step.
Whenever you override equals, override hashCode. Always.
The fix
Objects.hash(...) is the standard way to combine fields into a
hashcode. It is not the fastest hash in the world, but it is correct
and clear, which is what almost every domain class needs.
Records: the modern shortcut
Java 16+ has records, which generate a correct equals,
hashCode, and toString for you. For value-like data, prefer them:
Records were added partly because getting equals/hashCode right
by hand is such a constant source of bugs.
Mutable keys: the other classic disaster
A hash-based collection assumes that its keys' hashcodes do not
change while they are in the collection. If you mutate a field that
participates in hashCode, the element becomes "lost" — it is still
there, but the collection looks for it in the wrong bucket.
Rule: anything you put in a HashSet or use as a HashMap key
should be immutable (or at least, you must not change the fields
that contribute to equals / hashCode).
The other contract: equals consistency with Comparable
Hash-based collections care about equals/hashCode. Sorted
collections (TreeSet, TreeMap) care about compareTo. The
strongly recommended rule is:
If
x.compareTo(y) == 0, thenx.equals(y)should be true.
If this is violated, a TreeSet will silently behave differently from
a HashSet for the same data — almost always a bug. We'll meet
Comparable in detail in its own chapter.
A multi-file challenge: a uniqueness service
This is one of the most common real uses of equals/hashCode: a
"have I seen this before?" service.
Write a class Email that wraps a String address and a UniquenessTracker that uses a Set<Email> to detect duplicates.
Emailis constructed from oneString. TwoEmails are equal if their addresses are equal case-insensitively ("Ada@x.com" equals "ada@x.com").- The
hashCodemust agree — use the lowercased address. UniquenessTrackerhasboolean register(Email e)returningtrueon the first time the email is seen andfalseafterward.
Expected output:
first Ada@x.com? true
again Ada@x.com? false
again ada@x.com? false
new bob@x.com? true
total unique = 2
Test your understanding
What goes wrong if you override equals but not hashCode?
The code fails to compile
All collections stop working
Hash-based collections (HashSet, HashMap) can fail to find equal objects because they look in the wrong bucket
equals calls go into infinite recursion
Why is it dangerous to use a mutable object as a HashMap key?
Mutable objects can't be stored at all
They double the memory cost
If a field that participates in hashCode changes while the object is a key, the entry becomes unreachable
HashMap deep-copies keys, so the cost is performance
What's the cleanest modern way to get a correct equals / hashCode for a small immutable value type?
Hand-write both methods using Objects.hash
Inherit them from Object
Declare the class as a record
Implement Comparable
Iterables and Iterators
The single contract — Iterable / Iterator — that unifies every container in the framework, and the for-each loop that rides on top of it
Generic Classes and Methods
How to declare your own generic types and methods, and the small-but-deep set of design choices that come with them