Common Table Expressions

CTEs (WITH clauses) turn one tangled query into a readable pipeline of named steps. The single most important tool for writing analytical SQL that humans can read and trust.

Analytical questions get complicated fast: filter, then aggregate, then rank, then compare. Cramming all of that into one nested query is how SQL earns its reputation for being unreadable. Common Table Expressions (CTEs) — the WITH clause — are the cure. They let you build a query as a sequence of named, readable steps, like a recipe. This is arguably the most important style skill in analytical SQL, so it gets a page of its own.

The problem: queries that nest inward

Without CTEs, a multi-step analysis nests subqueries inside subqueries. You have to read it inside-out, and the logic is buried:

-- Hard to read: logic nested inside out
SELECT region, avg_order
FROM (
  SELECT region, AVG(amount) AS avg_order
  FROM (
    SELECT * FROM orders WHERE status = 'paid'
  )
  GROUP BY region
)
WHERE avg_order > 100;

The actual steps — keep paid orders, average per region, keep big averages — are scrambled. A reader has to mentally unwrap it.

The fix: a pipeline of named steps

A CTE pulls each step out, names it, and stacks the steps top to bottom in the order you think about them. The same logic becomes a readable pipeline:

Read it straight down: define paid, then per_region built on paid, then the final query built on per_region. Each step has a name that says what it is. This is how analysts write queries they can revisit a month later and still understand.

Why CTEs matter for analysis specifically

CTEs are not just cosmetic. For exploratory analytical work they are a genuine accelerator:

Readability. Named steps document the logic; the query is its own explanation.
Iteration. You can run a CTE pipeline and inspect any step by temporarily selecting from it — perfect for the question-query-look loop.
Reuse within a query. A CTE can be referenced multiple times below its definition, so you compute something once and use it twice.
Composability. Complex analyses become a stack of simple transformations, each easy to verify on its own.

Reusing a CTE: compute once, compare twice

A classic pattern: compute a summary, then compare each row to that summary. With a CTE you compute the summary once and join or cross-reference it. Here, each region's revenue is compared to the company-wide average.

The region_rev step is computed once and reused; overall summarizes it; the final query compares them. Three small, verifiable steps replace one inscrutable nest.

A note on recursive CTEs

CTEs can also be recursive (WITH RECURSIVE), which lets a query reference itself to walk hierarchies — org charts, category trees, graph paths. It is a powerful corner of SQL, but it leans toward data structure rather than the summarizing analytics this course focuses on, so we only flag it here. When you meet a "find all descendants" problem, remember recursive CTEs exist.

Check your understanding

QuestionSelect one

What is the main benefit of rewriting a deeply nested query using CTEs?

CTEs always make queries run faster.

They turn inside-out nesting into a top-to-bottom pipeline of named steps that is far easier to read and verify.

They remove the need for GROUP BY.

They convert the query into multiple permanent tables.

QuestionSelect one

In a WITH a AS (...), b AS (...) SELECT ... query, what can b reference?

Only the original tables, never a.

The CTE a defined above it, as well as base tables — CTEs can build on earlier CTEs.

Only b itself.

The final SELECT's result.

QuestionSelect one

Why are CTEs especially useful during exploratory analysis?

They prevent you from ever changing the query.

You can inspect any named step by temporarily selecting from it, making the question-query-look loop fast.

They hide the intermediate results so you only see the final answer.

They automatically visualize the data.

CTEs are the named way to build sub-results. SQL also has unnamed, inline ones — subqueries — each with its own best use. That contrast is the next page.

Pivoting and Reshaping

Long vs. wide data, and how analysts move between them — manual pivots with conditional aggregation, DuckDB's PIVOT and UNPIVOT, and when each shape is the right one.

Subqueries in Analytics

Scalar, IN, and correlated subqueries — where each fits in analytical work, how they compare to CTEs and joins, and how to choose the clearest tool for the question.

Common Table Expressions

The problem: queries that nest inward

The fix: a pipeline of named steps

Why CTEs matter for analysis specifically

Reusing a CTE: compute once, compare twice

A note on recursive CTEs

Check your understanding

On this page