Common Table Expressions
CTEs (WITH clauses) turn one tangled query into a readable pipeline of named steps. The single most important tool for writing analytical SQL that humans can read and trust.
Analytical questions get complicated fast: filter, then aggregate, then
rank, then compare. Cramming all of that into one nested query is how SQL
earns its reputation for being unreadable. Common Table Expressions
(CTEs) — the WITH clause — are the cure. They let you build a query as
a sequence of named, readable steps, like a recipe. This is arguably the
most important style skill in analytical SQL, so it gets a page of its
own.
The problem: queries that nest inward
Without CTEs, a multi-step analysis nests subqueries inside subqueries. You have to read it inside-out, and the logic is buried:
-- Hard to read: logic nested inside out
SELECT region, avg_order
FROM (
SELECT region, AVG(amount) AS avg_order
FROM (
SELECT * FROM orders WHERE status = 'paid'
)
GROUP BY region
)
WHERE avg_order > 100;The actual steps — keep paid orders, average per region, keep big averages — are scrambled. A reader has to mentally unwrap it.
The fix: a pipeline of named steps
A CTE pulls each step out, names it, and stacks the steps top to bottom in the order you think about them. The same logic becomes a readable pipeline:
Read it straight down: define paid, then per_region built on paid,
then the final query built on per_region. Each step has a name that
says what it is. This is how analysts write queries they can revisit a
month later and still understand.
Why CTEs matter for analysis specifically
CTEs are not just cosmetic. For exploratory analytical work they are a genuine accelerator:
- Readability. Named steps document the logic; the query is its own explanation.
- Iteration. You can run a CTE pipeline and inspect any step by temporarily selecting from it — perfect for the question-query-look loop.
- Reuse within a query. A CTE can be referenced multiple times below its definition, so you compute something once and use it twice.
- Composability. Complex analyses become a stack of simple transformations, each easy to verify on its own.
Reusing a CTE: compute once, compare twice
A classic pattern: compute a summary, then compare each row to that summary. With a CTE you compute the summary once and join or cross-reference it. Here, each region's revenue is compared to the company-wide average.
The region_rev step is computed once and reused; overall summarizes
it; the final query compares them. Three small, verifiable steps replace
one inscrutable nest.
A note on recursive CTEs
CTEs can also be recursive (WITH RECURSIVE), which lets a query
reference itself to walk hierarchies — org charts, category trees, graph
paths. It is a powerful corner of SQL, but it leans toward data
structure rather than the summarizing analytics this course focuses on,
so we only flag it here. When you meet a "find all descendants" problem,
remember recursive CTEs exist.
Check your understanding
What is the main benefit of rewriting a deeply nested query using CTEs?
CTEs always make queries run faster.
They turn inside-out nesting into a top-to-bottom pipeline of named steps that is far easier to read and verify.
They remove the need for GROUP BY.
They convert the query into multiple permanent tables.
In a WITH a AS (...), b AS (...) SELECT ... query, what can b reference?
Only the original tables, never a.
The CTE a defined above it, as well as base tables — CTEs can build on earlier CTEs.
Only b itself.
The final SELECT's result.
Why are CTEs especially useful during exploratory analysis?
They prevent you from ever changing the query.
You can inspect any named step by temporarily selecting from it, making the question-query-look loop fast.
They hide the intermediate results so you only see the final answer.
They automatically visualize the data.
CTEs are the named way to build sub-results. SQL also has unnamed, inline ones — subqueries — each with its own best use. That contrast is the next page.
Pivoting and Reshaping
Long vs. wide data, and how analysts move between them — manual pivots with conditional aggregation, DuckDB's PIVOT and UNPIVOT, and when each shape is the right one.
Subqueries in Analytics
Scalar, IN, and correlated subqueries — where each fits in analytical work, how they compare to CTEs and joins, and how to choose the clearest tool for the question.