Dataslope logoDataslope

Pivot Tables

pivot_table — Pandas's answer to spreadsheet pivot tables — with aggregation, multi-level indexes, and totals.

If you've ever made a pivot table in Excel — drag region to rows, product to columns, sum of sales to values — Pandas's pivot_table is the exact same idea, expressed in code.

The shape of a pivot

You're saying: "I want a grid. The rows come from this column, the columns come from that one, and the cells aggregate this third one."

A first pivot

Code Block
Python 3.13.2

Each cell is "sum of amount where region AND product match that row/column".

Pivot vs pivot_table

Featurepivotpivot_table
Duplicate (index, columns) allowed❌ no✅ yes
Aggregation functionn/ayes (sum, mean, ...)
Multiple aggregationsn/ayes (list of funcs)
Margins (row/column totals)n/amargins=True
Handling NaN cellsn/afill_value=

In practice, pivot_table is almost always the right choice because it gracefully handles duplicates.

Multiple aggregation functions

Code Block
Python 3.13.2

The result has a MultiIndex on the columns. Beautiful for reporting; sometimes annoying for further computation (you may want to flatten).

Margins — row and column totals

Code Block
Python 3.13.2

This produces a row and column called Total containing the overall sums — exactly like an Excel pivot's grand totals.

Multiple index or column levels

Code Block
Python 3.13.2

Hierarchical indexes are powerful for grouping (year is the outer level, region is the inner). You can later slice with .loc[(2024, "S")] or do per-level computations.

Flattening for downstream use

When you need a "flat" DataFrame for export:

Code Block
Python 3.13.2

This is the right shape if you want to write the result back to CSV with no weird header rows.

Pivot table as a quick summary

Code Block
Python 3.13.2
Initialization code (Python)read-only

A single call replaces a multi-step groupby + unstack.

Mini challenge

Challenge
Python 3.13.2
Build a regional product report

Given the DataFrame sales (provided), build a DataFrame report where:

  • Rows are region (alphabetical)
  • Columns are product (alphabetical)
  • Cells are the mean of amount
  • Missing (region, product) combinations should be 0, not NaN
  • There must be a "Total" row and a "Total" column at the end containing the means across the whole region/product respectively

Check your understanding

QuestionSelect one

What is the key advantage of pivot_table over plain pivot?

It is newer

It accepts an aggregation function, so it works even when (index, columns) combinations have duplicate rows

It is faster

It supports CSV

QuestionSelect one

What does margins=True add to a pivot table?

Padding around the result

Borders

A "Total" row and column that contain the aggregation applied to the entire row/column slice (typically grand totals)

Markdown formatting

QuestionSelect one

After pivoting, you want to write the result to CSV without weird "product" header rows. What's the right cleanup?

Convert to JSON instead

Delete the index manually

Call .reset_index() (and optionally clear columns.name) to flatten the pivot back into a regular DataFrame

Use pivot instead of pivot_table

On this page