Third Normal Form (3NF)
Learn how third normal form removes transitive dependencies so non-key columns depend on the key and nothing else.
A table can pass 1NF and 2NF and still store a fact in the wrong place. The warning sign is subtler: a non-key column explains another non-key column.
Third normal form (3NF) says non-key columns should depend on the key, the whole key, and nothing but the key.
The transitive dependency problem
A transitive dependency happens when a non-key column depends on another non-key column instead of depending directly on the key.
Imagine an orders table like this:
The key is order_id. The customer_id tells you which customer placed
the order. But customer_name and customer_city are not facts about
the order itself. They are facts about the customer.
The dependency travels through another column: order_id determines
customer_id, and customer_id determines customer_city. That is why
it is called transitive.
Why 3NF matters
If Ada moves from London to Oxford, every order row containing Ada's city must be updated. Miss one row and the database claims the same customer lives in two cities.
That is the same disease you saw earlier: one fact has many homes. 3NF finds this particular version of the disease by looking for non-key columns that depend on other non-key columns.
Quick check
In an orders table keyed by order_id, customer_city is stored because customer_id identifies the customer. What is the 3NF problem?
customer_city is a text column.
customer_city depends on customer_id, a non-key column, rather than directly on order_id.
order_id is too short.
The table contains more than one row.
Which phrase is the classic memory aid for 3NF?
One cell, one value.
The left key joins the right key.
The key, the whole key, and nothing but the key.
Denormalize early and often.
The fix: extract the dependent group
If customer details depend on the customer, give customers their own table. Orders should keep the customer foreign key, not a copied set of customer attributes.
The customer facts now have one home. The order row keeps the fact that really belongs to the order: which customer placed it.
A step-by-step 3NF test
Use this walkthrough after a table has reached 1NF and 2NF:
For the bad orders table:
customer_idbelongs inordersbecause it identifies who placed the order.customer_namebelongs incustomersbecause it describes the customer.customer_citybelongs incustomersfor the same reason.
How 3NF differs from 2NF
2NF and 3NF both ask dependency questions, but they catch different mistakes.
2NF says a non-key column cannot depend on only part of a composite key. 3NF says a non-key column cannot depend on another non-key column. Together, they push every fact toward the table where it naturally belongs.
Check your understanding
What is a transitive dependency?
A column contains a comma-separated list.
A non-key column depends on part of a composite key.
A non-key column depends on another non-key column.
A primary key references a foreign key.
In an orders table, customer_city depends on customer_id. Where should customer_city usually be stored?
In every order row for that customer.
In the customers table.
In a table with no key.
In the order_items table.
Which design best matches 3NF for orders and customers?
orders(order_id, customer_id, customer_name, customer_city)
orders(order_id, customer_city) only
customers(customer_id, name, city) and orders(order_id, customer_id, order_date)
customers(order_id, name, city)
Why does 3NF still allow joins?
Joins are a sign that normalization failed.
Tables store facts once, and joins rebuild the combined view when needed.
3NF requires every query to join every table.
Joins replace the need for primary keys.