Dataslope logoDataslope

Case Study: Designing a Blog

Apply the design process to a blog schema with users, posts, threaded comments, tags, and a junction table.

Suppose you are asked to design the database for a small blog.

The first request sounds simple: users write posts, readers comment, and posts have tags. But those few sentences already contain several important design decisions.

This page applies the design process end to end.

Start with requirements

Write the requirements as plain sentences before you write tables:

  • A user has an email and display name.
  • A user writes many posts.
  • A post has a title, body, status, and publish time.
  • A post can have many comments.
  • A comment belongs to one post and one user.
  • A comment may reply to another comment.
  • A post can have many tags, and a tag can label many posts.

The nouns become candidates. The verbs tell us how those candidates connect.

Choose entities

The main entities are users, posts, comments, and tags. The many-to-many relationship between posts and tags needs a fifth table: post_tags.

post_tags is not just a technical trick. It represents the fact that one post can have many tags and one tag can appear on many posts.

QuestionSelect one

Why does the blog need a post_tags table?

Because every blog table must have a junction table.

Because posts and tags have a many-to-many relationship.

Because tags cannot have primary keys.

Because comments are optional.

Assign attributes

Each attribute belongs with the entity it describes.

The post does not store the author's email. It stores author_id, a foreign key to users. That keeps user facts in one place.

Draw the full model

Now the ER diagram can show the whole shape of the schema.

The self-reference on comments.parent_comment_id is optional. It lets a comment reply to another comment while still keeping every comment in the same table.

Turn the model into tables

The SQL below creates the complete blog schema and inserts a tiny sample dataset.

SQL
PostgreSQL 17

The query joins across the design: post to author, post to junction table, and junction table to tags.

Why these decisions?

The design choices come from the rules:

  • users.email is unique because one account should own one email.
  • posts.author_id is required because every post needs an author.
  • posts.status is constrained because only known states are valid.
  • comments.parent_comment_id is nullable because top-level comments are not replies.
  • post_tags has a composite primary key so the same tag cannot be attached to the same post twice.
SQL
PostgreSQL 17

Check your understanding

QuestionSelect one

In the blog design, why does posts store author_id instead of author_email?

Email addresses cannot be stored in PostgreSQL.

The post should reference the user row, while user facts stay in users.

The author is a many-to-many relationship.

The post title already identifies the author.

QuestionSelect one

What does nullable comments.parent_comment_id allow?

Every comment must be a reply.

Some comments can be top-level, while others can reply to a comment.

Comments can belong to no post.

A comment can have many authors.

QuestionSelect one

Which constraint prevents the same tag from being attached to the same post twice?

posts.author_id REFERENCES users(id)

tags.name UNIQUE

PRIMARY KEY (post_id, tag_id) on post_tags.

comments.parent_comment_id REFERENCES comments(id)

On this page