Dataslope logoDataslope

Why Normalization Exists

Learn why duplicated facts create update, insertion, and deletion anomalies, and why normalization gives each fact one home.

A spreadsheet-style table can feel comforting at first: every order, customer, product, and price appears in one row. You can scroll across and see the whole story. So why do database designers so often split that story into several tables?

Because duplication has a cost. Before learning the cure, you need to recognize the disease.

The disease: duplicated facts

Imagine a shop stores orders in one flat table:

The table is easy to read, but look carefully. Ada's email appears in two rows. The notebook price appears in two rows. Those are not two separate facts; they are the same facts copied into several places.

A good design tries to store each fact in exactly one place. When a fact has more than one home, the database can start disagreeing with itself.

Duplication is not the same as repetition in a result

A query result may repeat values because a join is showing related data. That is fine. The danger here is storing the same fact repeatedly in base tables, where every copy must be kept consistent by hand.

The three anomalies

An anomaly is a strange or unsafe behavior caused by the shape of the table. Duplicated data creates three classic anomalies.

Update anomaly

Ada changes her email address. In a flat orders table, her email is copied into every order row. If you update one row but miss another, Ada now has two emails in the same database.

SQL
PostgreSQL 17

The update statement is legal SQL. The problem is not syntax; the problem is the design allowed one fact to have multiple homes.

Insertion anomaly

Suppose the shop wants to record a new product before anyone orders it. Where does that row go? The only table is about orders, so a product cannot exist unless an order exists too.

That is an insertion anomaly: you cannot insert one kind of fact without inventing another kind of fact.

Deletion anomaly

Now suppose Grace has only one order, and the shop deletes that order because it was canceled. If Grace's name and email live only in that order row, deleting the order also deletes the only stored fact about Grace.

That is a deletion anomaly: removing one fact accidentally removes a different fact.

Quick check

QuestionSelect one

In the flat orders_flat table, Ada's email appears on every order row. What makes that dangerous?

It prevents SQL from filtering rows.

A single real-world fact must be changed in several stored places.

It means Ada can place only one order.

It means every product must have the same price.

QuestionSelect one

Which situation is the clearest deletion anomaly?

A query returns no rows because the WHERE clause is too strict.

A customer changes email and one old row is missed.

Deleting a customer's last order also removes the only record of the customer's email.

A table has a primary key column.

The cure: give each fact one home

Normalization starts with a simple question: what things are we really tracking? In this example, there are customers, products, and orders. Those are different kinds of facts, so they deserve different homes.

After the split, Ada's email lives once in customers. The notebook price lives once in products. An order points to the customer and the product instead of copying their details.

This does not mean you can never display a flat view again. You can join the tables whenever you need the readable report. The key difference is that the repeated-looking output is computed, not stored as duplicate truth.

Where normal forms fit

The normal forms are a step-by-step path away from messy, duplicated storage and toward single-source-of-truth tables.

This page focuses on the reason normalization exists. The next pages walk through the cure one stage at a time.

Check your understanding

QuestionSelect one

What is the main design goal behind normalization?

To make every table contain every fact needed by the application.

To store each fact in one place so the database has a single source of truth.

To avoid all joins forever.

To remove primary keys from tables.

QuestionSelect one

A shop cannot add a new product until someone buys it because products are stored only inside order rows. Which anomaly is this?

Update anomaly.

Insertion anomaly.

Deletion anomaly.

Join anomaly.

QuestionSelect one

After splitting customers, products, and orders into separate tables, how can you show the familiar all-in-one order report?

Store the customer email again in orders for convenience.

Join the related tables when reading the report.

Delete the product table after orders are inserted.

Put all columns into the primary key.

QuestionSelect one

Which statement best describes an update anomaly?

A new row cannot be inserted because another fact is missing.

A row deletion accidentally removes another kind of fact.

A fact changes, but the database stores several copies that must all be updated.

A foreign key points to an existing primary key.

On this page