Why Window Functions Exist
The problem GROUP BY cannot solve — keeping every row while also seeing a summary. Window functions explained from the ground up, with the OVER clause demystified.
Window functions are the technique that separates casual SQL users from fluent analysts — and the one most people find genuinely hard at first. This page goes slowly and builds the why before any syntax, because the mechanics only make sense once you see the problem they solve. Because this is a foundational, often-confusing concept, this page has extra practice questions.
The problem GROUP BY cannot solve
GROUP BY is powerful but it destroys the individual rows. When you
group orders by region, you get one row per region — the orders
themselves are gone. That is fine for a summary, but it fails the moment
you want to keep every row and also show a summary beside it.
Consider a deceptively simple request:
"Show every order, and next to each one, what percentage of its region's total it represents."
You need both at once: the individual order (a row-level fact) and
the region total (a group-level fact). GROUP BY can give you one or the
other, never both in the same row.
A window function is the answer: it computes a summary across a set of related rows but attaches the result to every row instead of collapsing them. You keep all your detail and gain the summary.
The OVER clause is what makes it a window
Any aggregate becomes a window function when you add an OVER (...)
clause. SUM(amount) collapses; SUM(amount) OVER (...) does not —
it computes a sum over a "window" of rows and writes it onto each row.
Start with the simplest possible window: OVER () with empty parentheses
means "the whole result set is one window." Every row gets the grand
total beside it.
Notice all five rows survived, each carrying the grand total 505. A
plain SUM(amount) would have returned a single row. The OVER () is the
entire difference.
PARTITION BY — windows per group
OVER () uses the whole table as one window. PARTITION BY region splits
the rows into one window per region, so each row gets its own
region's total. This is the window-function echo of GROUP BY — but the
rows stay.
Now the original request is one step away — divide each amount by its
region_total:
That pct_of_region is impossible with GROUP BY alone in a single
query, because it needs the row and the group total side by side.
Window functions make it one line.
GROUP BY vs window — the key contrast
GROUP BY | Window function (OVER) | |
|---|---|---|
| Rows in result | One per group | One per original row |
| Detail rows | Collapsed away | Preserved |
| Sees the group summary? | Yes | Yes |
| Sees each individual row? | No | Yes |
| Typical use | "Total per region" | "Each order vs its region total" |
The one-sentence summary: GROUP BY summarizes instead of the rows; a
window function summarizes alongside the rows.
Anatomy of a window function
SUM(amount) OVER ( PARTITION BY region ORDER BY order_date )
└────┬────┘ └──────┬──────────┘ └────────┬─────────┘
what to which rows share order within
compute a window the window- The function (
SUM,AVG,COUNT,ROW_NUMBER, ...) is what to compute. PARTITION BYdecides which rows belong to the same window (optional — omit it for one big window).ORDER BYorders rows inside each window, which matters for running totals and rankings (next pages).
Check your understanding
What is the fundamental difference between a window function and GROUP BY?
Window functions can only count, while GROUP BY can sum.
A window function keeps every original row and attaches a summary to each; GROUP BY collapses rows into one per group.
GROUP BY keeps every row while window functions collapse them.
They always return identical results.
What does the empty OVER () clause mean in SUM(amount) OVER ()?
Compute the sum of just the current row.
Treat the entire result set as one window, so every row receives the grand total.
It is invalid syntax.
Sort the rows ascending.
What does PARTITION BY region do inside an OVER clause?
It permanently splits the table into separate region tables.
It divides the rows into one window per region, so each row's window calculation uses only its own region's rows.
It removes the region column from the output.
It is required in every window function.
Why is "each order's percentage of its region's total" hard with GROUP BY alone?
Because GROUP BY cannot compute a SUM.
It needs the individual order and the region total in the same row, but GROUP BY collapses the orders away.
Because percentages are not allowed in SQL.
Because regions cannot be grouped.
Which clause turns an ordinary aggregate like AVG(score) into a window function?
GROUP BY
HAVING
OVER (...)
DISTINCT
Now that the OVER clause makes sense, the next pages put it to work:
ranking rows, then computing running totals and moving averages.
Filtering Groups with HAVING
WHERE filters rows before grouping; HAVING filters groups after aggregating. Why analysts need both, and how to keep them straight.
Ranking and ROW_NUMBER
Numbering and ranking rows within groups — ROW_NUMBER, RANK, DENSE_RANK — and the per-group top-N pattern that powers "best per category" questions.