Sorting and Ranking

Ordering rows by one or more columns, and assigning ranks within groups.

After filtering, ordering is the second most common manipulation you will perform on a DataFrame. Pandas keeps it simple.

`sort_values`

Sort by a single column:
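A minimal sketch, assuming a small hypothetical DataFrame with `name`, `dept`, and `salary` columns (the data is invented for illustration):

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy", "Dee"],
    "dept": ["eng", "ops", "eng", "ops"],
    "salary": [120, 90, 100, 95],
})

# Ascending is the default.
by_salary = df.sort_values("salary")

# Pass ascending=False to put the largest values first.
by_salary_desc = df.sort_values("salary", ascending=False)

print(by_salary)
print(by_salary_desc)
```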

Sort by multiple columns — earlier columns take priority:
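As a sketch on the same kind of hypothetical data, passing a list of columns and a matching list of `ascending` flags sorts by department first, then by salary within each department:

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy", "Dee"],
    "dept": ["eng", "ops", "eng", "ops"],
    "salary": [120, 90, 100, 95],
})

# dept ascending, then salary descending within each dept.
by_dept_then_salary = df.sort_values(
    ["dept", "salary"], ascending=[True, False]
)

print(by_dept_then_salary)
```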

sort_values returns a new DataFrame — it does not modify the original. If you want to overwrite, reassign:

df = df.sort_values("salary", ascending=False)

`sort_index`

Sort by the index instead of a column:
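For instance, with a hypothetical DataFrame whose string index is out of order:

```python
import pandas as pd

# Hypothetical data with an unordered string index.
df = pd.DataFrame({"salary": [100, 120, 90]}, index=["c", "a", "b"])

by_index = df.sort_index()                       # a, b, c
by_index_desc = df.sort_index(ascending=False)   # c, b, a

print(by_index)
```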

This is especially common with time-series data: df.sort_index() on a DatetimeIndex gives you chronological order.
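A short sketch of the time-series case, using made-up dates and prices:

```python
import pandas as pd

# Hypothetical price data whose dates arrive out of order.
ts = pd.DataFrame(
    {"price": [10.5, 10.1, 10.9]},
    index=pd.to_datetime(["2024-03-01", "2024-01-01", "2024-02-01"]),
)

# Sorting a DatetimeIndex puts the rows in chronological order.
chronological = ts.sort_index()

print(chronological)
```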

Stable sorts and ties

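A stable sort keeps rows with equal values in their original relative order. The default algorithm for a single-column sort_values (kind="quicksort") does not guarantee this, so pass kind="stable" when tie order matters. This matters when you want to "sort by A, breaking ties by B" with two successive calls: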
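The sketch below compares the two forms on hypothetical columns `A` and `B`. It passes `kind="stable"` explicitly so that the tie order is guaranteed rather than incidental:

```python
import pandas as pd

# Hypothetical data with ties in column A.
df = pd.DataFrame({"A": [2, 1, 2, 1], "B": [1, 2, 0, 1]})

# Single call: sort by A, break ties by B.
single = df.sort_values(["A", "B"])

# Two-call form: sort by the tie-breaker first, then stably by the main key.
two_step = (
    df.sort_values("B", kind="stable")
      .sort_values("A", kind="stable")
)

print(single.equals(two_step))
```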

The two-call form is occasionally useful, but the single multi-column call is usually clearer.

`nlargest` / `nsmallest`

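A common task: find the top (or bottom) N rows by some column. nlargest(N, col) is equivalent to sort_values(col, ascending=False).head(N), and nsmallest(N, col) is equivalent to sort_values(col).head(N), but both are more direct and slightly faster.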
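A brief sketch on hypothetical salary data, checking each method against its `sort_values` counterpart:

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy", "Dee"],
    "salary": [120, 90, 100, 95],
})

top2 = df.nlargest(2, "salary")      # two highest salaries
bottom2 = df.nsmallest(2, "salary")  # two lowest salaries

# The sort-then-head equivalents (no ties in this data).
top2_sorted = df.sort_values("salary", ascending=False).head(2)
bottom2_sorted = df.sort_values("salary").head(2)

print(top2)
print(bottom2)
```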

Handling missing values in sorts

By default, NaN values go to the end of the result, in both ascending and descending sorts. You can change this with the na_position argument:
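A small sketch with one missing salary in otherwise hypothetical data:

```python
import numpy as np
import pandas as pd

# Hypothetical data; Ben's salary is missing.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy"],
    "salary": [100, np.nan, 90],
})

nan_last = df.sort_values("salary")                         # default: NaN at the end
nan_last_desc = df.sort_values("salary", ascending=False)   # NaN still at the end
nan_first = df.sort_values("salary", na_position="first")   # NaN at the start

print(nan_first)
```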

Ranking

Sorting reorders rows. Ranking assigns each row a number representing its position in the sort, leaving row order alone.
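To illustrate, this sketch adds a rank column to hypothetical salary data; `salary_rank` is an invented column name, and the rows stay in their original order:

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy", "Dee"],
    "salary": [120, 90, 100, 95],
})

# Rank 1 goes to the highest salary; ranks are returned as floats.
df["salary_rank"] = df["salary"].rank(ascending=False)

print(df)
```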

The rank method has a method argument controlling how ties are broken:

"average" (default) — tied rows get the average of their ranks.
"min" — tied rows all get the lowest rank.
"dense" — like "min" but no gaps between ranks.
"first" — break ties by original order.

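The four tie-breaking options listed above can be compared on a small hypothetical Series in which the value 20 appears twice:

```python
import pandas as pd

# Hypothetical values with one tie (20 appears twice).
s = pd.Series([10, 20, 20, 30])

ranks = pd.DataFrame({
    "value": s,
    "average": s.rank(method="average"),  # tied rows share the mean of ranks 2 and 3
    "min": s.rank(method="min"),          # tied rows both get 2, next rank is 4
    "dense": s.rank(method="dense"),      # tied rows both get 2, next rank is 3
    "first": s.rank(method="first"),      # ties broken by order of appearance
})

print(ranks)
```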
Rank within groups

A common business question: "Within each department, who are the top 3 earners?" That is a sort + group + head problem.
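One way to write this, as a sketch assuming a DataFrame with hypothetical columns `name`, `department`, and `income` (the data is invented, and `kind="stable"` is added to the final sort so that the income order within each department is preserved):

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
df = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E", "F", "G", "H"],
    "department": ["eng", "eng", "eng", "eng", "ops", "ops", "ops", "ops"],
    "income": [150, 120, 110, 90, 80, 95, 70, 85],
})

top3 = (
    df.sort_values("income", ascending=False)   # highest incomes first
      .groupby("department")                    # split by department
      .head(3)                                  # first 3 rows of each group
      [["department", "name", "income"]]        # keep only these columns
      .sort_values("department", kind="stable") # tidy display, ties keep order
)

print(top3)
```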

Reading this line by line:

Sort everyone by income, descending.
Group by department.
From each group, take the first 3 rows (which are now the top 3 because of the sort).
Project to the columns we care about.
Sort by department for a tidy display.

This is a great example of how a few Pandas building blocks compose into something powerful.

Check your understanding

Question (select one)

What does df.sort_values(["dept", "salary"], ascending=[True, False]) do?

Sorts by salary first

Sorts only by department

Sorts by department ascending; within each department, sorts by salary descending

Throws an error

Question (select one)

What is the practical difference between df.sort_values("x", ascending=False).head(3) and df.nlargest(3, "x")?

They give different rows

One returns a Series, one a DataFrame

They give the same rows, but nlargest is slightly faster and reads more directly

nlargest requires the data to be pre-sorted

Question (select one)

In rank(method="dense"), what does "dense" mean for the rank that follows a group of tied rows?

It produces unique floating-point ranks

It removes ties

Tied rows share a rank, and the next distinct rank value is one higher (no gaps), unlike "min" which leaves gaps

The result type is dense
