Dataslope logoDataslope

Ranking and ROW_NUMBER

Numbering and ranking rows within groups — ROW_NUMBER, RANK, DENSE_RANK — and the per-group top-N pattern that powers "best per category" questions.

A huge family of analytical questions is about position: "the top 3 products per category", "each customer's first order", "the runner-up in each region." All of them rank rows within a window and then act on that rank. This page covers the three ranking functions and the per-group top-N pattern they unlock. Because ranking is foundational and the RANK/DENSE_RANK distinction trips people up, there are extra questions at the end.

Numbering rows with ROW_NUMBER

ROW_NUMBER() assigns 1, 2, 3, ... to rows in the order you specify. It requires an ORDER BY inside OVER, because numbering only makes sense once you have decided the order.

SQL
DuckDB 1.32.0

The biggest order is rn = 1, the next rn = 2, and so on — a clean sequential rank over the whole table.

Ranking within groups with PARTITION BY

Add PARTITION BY and the numbering restarts in each group. This is how you ask "rank within each region" — every region gets its own 1, 2, 3.

SQL
DuckDB 1.32.0

Per-group top-N: the pattern to memorize

The most common use of ranking is top-N per group: keep only the rows whose rank is at or below N. Because you cannot filter on a window function in WHERE (it is computed too late), DuckDB gives you QUALIFY, a WHERE-like clause that filters after window functions run.

SQL
DuckDB 1.32.0

This is the canonical "best N per category" shape: partition by the category, order by the measure, rank with ROW_NUMBER, and keep ranks ≤ N with QUALIFY. Internalize it — you will use it constantly.

ROW_NUMBER vs RANK vs DENSE_RANK

The three ranking functions differ only in how they handle ties (rows with equal ORDER BY values). The difference matters and is a frequent source of confusion.

  • ROW_NUMBER — always distinct: 1, 2, 3, 4. Ties are broken arbitrarily; no two rows share a number.
  • RANK — ties share a number, then it skips: 1, 1, 3, 4. (Two golds, no silver.)
  • DENSE_RANK — ties share a number, no skip: 1, 1, 2, 3.
SQL
DuckDB 1.32.0

Choosing between them:

  • Use ROW_NUMBER when you need a unique number per row (e.g. "exactly one top row per group").
  • Use RANK when ties should share a place and you want gaps to reflect how many tied (competition-style standings).
  • Use DENSE_RANK when ties share a place but you want consecutive rank numbers (e.g. "the top 3 distinct scores").

Check your understanding

QuestionSelect one

Why must a ROW_NUMBER() OVER (...) include an ORDER BY inside the OVER?

Because ROW_NUMBER cannot be used with PARTITION BY.

Numbering rows 1, 2, 3 only makes sense once an order is defined, so the window needs an ORDER BY.

Because ORDER BY sorts the final output.

It does not — ORDER BY is optional for ROW_NUMBER.

QuestionSelect one

Players score 100, 100, 90. What does RANK() (ordering by score descending) assign?

1, 2, 3

1, 1, 3 — the tied 100s both get rank 1, and the next rank skips to 3.

1, 1, 2

1, 2, 2

QuestionSelect one

You want the single highest-paid employee per department, exactly one row each even if salaries tie. Which function fits best?

RANK, because ties should share the top spot.

ROW_NUMBER, because it gives a unique number per row so you can keep just = 1.

DENSE_RANK, because it never skips numbers.

COUNT, because it counts employees.

QuestionSelect one

Why does the top-N-per-group pattern use QUALIFY instead of WHERE to filter on the rank?

Because WHERE cannot compare numbers.

Window functions are computed after WHERE runs, so you cannot filter on a window result in WHERE; QUALIFY filters after windows are computed.

Because QUALIFY sorts the rows.

Because WHERE only works on the first partition.

QuestionSelect one

What does PARTITION BY category change about ROW_NUMBER() OVER (PARTITION BY category ORDER BY price DESC)?

It numbers all rows 1..N across the whole table ignoring category.

The numbering restarts at 1 within each category, ranking rows separately per group.

It removes the category column.

It sorts categories alphabetically.

Ranking answers "where does this row stand?" The next page answers a different positional question — "how does this row's value accumulate or compare over time?" — with running totals and moving averages.

On this page