Dataslope logoDataslope

Sorting and Top-N

ORDER BY as an analytical tool — surfacing extremes, ranking results, and answering "top 10" questions, plus DuckDB's QUALIFY for per-group top-N.

"Which are the biggest? The worst? The most recent?" An enormous share of analytical questions are really top-N questions, and they all reduce to sort, then look at the head. This page treats ORDER BY not as presentation but as a way to find the rows that matter, and introduces the per-group top-N pattern analysts use constantly.

Sorting surfaces the extremes

The fastest way to learn something about a dataset is to look at its biggest and smallest values. ORDER BY ... DESC brings the largest to the top; LIMIT lets you read just those.

SQL
DuckDB 1.32.0

Flip DESC to ASC (or drop it — ascending is the default) to find the smallest. The "top 5 / bottom 5" glance is often the very first thing an analyst does after slicing.

Sorting by more than one key

When the first sort key ties, a second key breaks the tie. Analysts use this to organize a result meaningfully — e.g. newest-first within each region.

SQL
DuckDB 1.32.0

Read it as "group the eye by region, and within each, descend by amount." The result reads like a small report.

Sorting by a computed measure

You can sort by an expression that is not even in the table — a derived measure. "Most efficient", "best value", "highest rate" are all ORDER BY <expression>.

SQL
DuckDB 1.32.0

Be careful: NULLs and ties

Two subtleties that quietly trip up analysts:

  • NULLs have to go somewhere. In DuckDB, NULLs sort last by default in ASC and you can control this with NULLS FIRST / NULLS LAST. If your "smallest" query is returning blanks at the top, this is why.
  • Ties are arbitrary unless you break them. If ten rows share the top amount, LIMIT 5 returns some five of them, unpredictably. Add a tiebreaker column to ORDER BY to make results stable and reproducible — which matters for analysis you intend to repeat.

Per-group top-N with QUALIFY

"Top 5 orders overall" is easy. "Top 2 orders per region" is the question analysts actually ask — and a plain LIMIT cannot do it, because LIMIT caps the whole result, not each group. The clean DuckDB answer pairs a window function (which you will meet properly soon) with QUALIFY, a filter on the window result.

SQL
DuckDB 1.32.0

Do not worry about the OVER (...) mechanics yet — the Window Functions section explains them in full. For now, note the shape of the question: rank within each group, then keep the top of each. It is one of the most common analytical patterns there is, and QUALIFY expresses it without an extra subquery.

Check your understanding

QuestionSelect one

What is the most analytical reason to use ORDER BY ... DESC LIMIT 10?

To permanently store the table in sorted order.

To surface the ten largest values — answering a "top 10" question.

To delete all but ten rows.

To shuffle rows randomly before display.

QuestionSelect one

Why can a tie in the sort column make ORDER BY amount DESC LIMIT 5 give unstable results?

Because LIMIT randomly deletes rows from the table.

When several rows share the same amount, their relative order is undefined, so which five appear can vary unless you add a tiebreaker.

Because DuckDB cannot sort decimal columns.

Because DESC is not a valid keyword.

QuestionSelect one

You need the top 2 orders within each region. Why does a plain ORDER BY amount DESC LIMIT 2 fail?

It works perfectly for per-region top-N.

LIMIT restricts the whole result to 2 rows total, not 2 per group; you need per-group ranking (e.g. ROW_NUMBER + QUALIFY).

Because ORDER BY cannot be combined with LIMIT.

Because regions cannot be sorted.

You have now finished the exploration toolkit: profile, slice, sort. The real analytical power, though, comes from summarizing — turning many rows into a few meaningful numbers. That is the next section.

On this page