Evolving a Schema Safely
Learn design-focused migration patterns for changing schemas without breaking existing data or code.
Changing a schema is easy when no one uses it yet. It is harder after real rows, real queries, and real application code depend on it.
A safe schema change treats the migration as a sequence of deliberate steps, not one dramatic rewrite.
Migrations are ordered changes
A migration is an intentional change to the schema. It might add a column, create a table, copy data, add a constraint, or remove an old column after the new design is ready.
Good migrations are:
- small enough to understand
- ordered so each step has a reason
- reversible when possible
- tested against existing data
This page stays at the design level. The point is not database operations; it is how to protect meaning while the model changes.
Additive changes are safer
An additive change adds something without removing the old shape. Examples include adding a nullable column, adding a new table, or adding a new relationship table.
A destructive change removes or changes something existing. Dropping a column, renaming a table, or changing a required type can break code and lose data.
Why are additive schema changes usually safer than destructive ones?
They never require design thinking.
Existing data and code can often keep using the old shape while the new shape is introduced.
PostgreSQL refuses to run destructive changes.
Additive changes automatically normalize every table.
Add a NOT NULL column safely
A common mistake is trying to add a required column in one step when old rows do not have values yet.
A safer pattern is:
- Add the column as nullable.
- Backfill values for existing rows.
- Check that every row now has a value.
- Set the column
NOT NULL.
The end state has a required, constrained column. The path avoids asking old rows to magically have a value before you provide one.
Backfilling preserves meaning
A backfill fills new structure using existing data or a chosen default. The design question is: what value should old rows mean?
Backfills should be explicit. If every old task should start as
normal priority, write that rule down in the migration.
In the safe NOT NULL pattern, why do you backfill before setting NOT NULL?
Backfilling deletes rows with missing values.
Existing rows need valid values before the database can enforce the required rule.
PostgreSQL requires every nullable column to be backfilled daily.
It makes the column disappear from queries.
Expand and contract
The expand/contract pattern, also called parallel change, avoids a big-bang switch.
First you expand the schema so both old and new shapes can exist. Then you move data and code toward the new shape. Finally, after the old shape is no longer needed, you contract by removing it.
For example, a customers.full_name column might become separate
first_name and last_name columns.
The middle version is temporary, but important. It gives the product a safe bridge from old meaning to new meaning.
Before and after a relationship change
Suppose each task originally has one owner.
Later, tasks can have multiple assignees. The expanded schema adds a new table before the old column is removed.
Only after data and code use task_assignees would a later migration
remove tasks.owner_id.
Check your understanding
What is the safest order for adding a required column to a table with existing rows?
Add it as NOT NULL, then decide values later.
Add nullable, backfill, verify, then set NOT NULL.
Drop the table and recreate it from memory.
Add a foreign key to an unrelated table first.
What does the "expand" phase do in expand/contract?
Removes the old schema immediately.
Adds new structure while keeping the old structure available.
Deletes all existing data.
Prevents any future constraints.
Why is a backfill a design decision?
It chooses what old rows mean in the new schema.
It only changes table names.
It is required only for empty tables.
It replaces primary keys.
When should the old column usually be removed in a parallel change?
Before the new table exists.
Immediately after writing the first diagram.
After data and dependent code have moved to the new structure.
Never, even if nothing depends on it.