Correlation Heatmaps

A scatter plot compares two variables. But a dataset with six numeric columns has fifteen pairs — far too many scatter plots to scan one at a time. A heatmap collapses all of those pairwise relationships into a single colored grid you can read in one glance.

sns.heatmap takes a 2-D matrix of numbers and draws each cell as a colored square, with the color encoding the cell's value. Its most common job by far is rendering a correlation matrix — a square table whose cell at row i, column j is the correlation between variable i and variable j.

Building a correlation matrix

A correlation coefficient (Pearson's r) measures how tightly two numeric variables move together on a scale from −1 to +1: +1 is a perfect upward line, −1 a perfect downward line, and 0 no linear relationship. pandas computes the whole matrix at once with .corr(numeric_only=True) (the flag tells it to skip non-numeric columns rather than error on them):
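A minimal sketch of that call. It assumes a small synthetic table with penguin-style column names (the values are made up so the example runs offline; they are not the real penguins data), plus one text column to show what the flag skips.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the penguins table: made-up values, not the real data.
rng = np.random.default_rng(0)
size = rng.normal(size=200)  # hidden "overall body size" that drives the columns
penguins = pd.DataFrame({
    "species": rng.choice(["Adelie", "Chinstrap", "Gentoo"], size=200),  # non-numeric
    "bill_length_mm": 44 + 3.0 * size + rng.normal(scale=4.0, size=200),
    "bill_depth_mm": 17 - 1.5 * size + rng.normal(scale=1.0, size=200),
    "flipper_length_mm": 200 + 12.0 * size + rng.normal(scale=5.0, size=200),
    "body_mass_g": 4200 + 700.0 * size + rng.normal(scale=300.0, size=200),
})

# numeric_only=True leaves "species" out before computing Pearson's r.
corr = penguins.corr(numeric_only=True)
print(corr.round(2))
```

With the real dataset, the same .corr(numeric_only=True) call applies unchanged to the loaded DataFrame.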

The result is square and symmetric, with 1.0 down the diagonal (every variable correlates perfectly with itself). Now hand it to heatmap and turn on annot=True to print each number inside its cell:
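As a hedged sketch, the annotated heatmap might be drawn as follows. The table is again synthetic (made-up values with penguin-style column names), and fmt=".2f" is an optional choice added here to keep the printed numbers short.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop it in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the penguins table: made-up values, not the real data.
rng = np.random.default_rng(0)
size = rng.normal(size=200)
penguins = pd.DataFrame({
    "bill_length_mm": 44 + 3.0 * size + rng.normal(scale=4.0, size=200),
    "bill_depth_mm": 17 - 1.5 * size + rng.normal(scale=1.0, size=200),
    "flipper_length_mm": 200 + 12.0 * size + rng.normal(scale=5.0, size=200),
    "body_mass_g": 4200 + 700.0 * size + rng.normal(scale=300.0, size=200),
})
corr = penguins.corr(numeric_only=True)

# annot=True writes each correlation inside its cell; fmt controls the number format.
ax = sns.heatmap(corr, annot=True, fmt=".2f")
ax.set_title("Correlation matrix (synthetic data)")
```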

To read it: pick a row and a column, find where they cross, and that cell's color and number are their correlation. flipper_length_mm and body_mass_g light up strongly — long-flippered penguins are heavier — while bill_depth_mm runs negative against the size variables.

The key lesson: choose a colormap that matches the data

A heatmap's whole message is carried by color, so the colormap is not decoration — it is the encoding. And correlation is a special kind of number: it is signed, running from −1 to +1, with 0 as a meaningful midpoint that means "no linear relationship."

That demands a diverging colormap: two contrasting hues meeting at a neutral color in the middle. Center it at zero and fix the limits to the full range so the colors are anchored to their true meaning:
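One way to set this up, sketched on a small hypothetical DataFrame (the column names and values are invented for illustration) that contains one positive, one negative, and one near-zero relationship:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop it in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: one column rises with x, one falls with x, one ignores it.
rng = np.random.default_rng(1)
x = rng.normal(size=300)
df = pd.DataFrame({
    "x": x,
    "rises_with_x": x + rng.normal(scale=0.5, size=300),
    "falls_with_x": -x + rng.normal(scale=0.5, size=300),
    "unrelated": rng.normal(size=300),
})
corr = df.corr()

# cmap="vlag" is a diverging map; center=0 pins the neutral color to zero,
# and vmin/vmax fix the color scale to the full correlation range.
ax = sns.heatmap(corr, annot=True, fmt=".2f",
                 cmap="vlag", center=0, vmin=-1, vmax=1)
```

Fixing vmin and vmax matters as much as the colormap: without them the color scale stretches to whatever range the data happens to cover, so the same color can mean different correlations in different plots.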

Now the colors mean something consistent: one hue for positive, the other for negative, pale neutral at zero, and saturation growing toward each extreme. A strong negative and a strong positive correlation look clearly different, not merely "light" versus "dark."

Why a sequential colormap misleads here

A sequential map (like "viridis" or "Blues") runs from low to high along a single direction — it is built for data with a natural floor, such as counts or magnitudes. Drop a correlation matrix onto it and −0.9 and +0.9 — opposite relationships — get pushed to opposite ends of one ramp, as if −0.9 were simply "less" than +0.9. The crucial sign distinction, and the special status of zero, both vanish. Signed data needs a diverging map centered at its midpoint. (We dig much deeper into choosing palettes in the color-palettes chapter.)

So the reading rules for a correlation heatmap are:

Saturation / brightness → strength (pale ≈ 0, vivid ≈ ±1).
Hue (which color) → direction (one color positive, the other negative).
The diagonal is always 1.0 — every variable with itself — so ignore it; the story is in the off-diagonal cells.

Question (select one)

You are drawing a heatmap of a correlation matrix (values from −1 to +1). Which colormap is the right choice, and why?

A sequential map like "viridis", because it looks modern and is easy to read.

A diverging map like "vlag" centered at 0, because correlation is signed with a meaningful zero midpoint.

Any colormap works; the annotations carry the meaning, so color is just decoration.

A reversed sequential map, so high values are dark.

A second use: heatmaps of a pivot table

Correlation is the famous case, but heatmap will color any 2-D matrix where both axes are meaningful. A natural source is a pivot table: reshape a tidy table into a grid and color the cells. The flights data — monthly airline passengers over a span of years — is perfect:
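A sketch of the reshape-then-plot step. It uses a small synthetic table shaped like the flights data (the years, month numbers, and passenger counts are invented for illustration), with months as rows and years as columns.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop it in a notebook
import pandas as pd
import seaborn as sns

# Synthetic tidy table: one row per (year, month), with made-up passenger counts
# that grow each year and bump up in the summer months.
records = [
    {"year": year, "month": month,
     "passengers": 100 + 25 * (year - 2001) + (40 if month in (6, 7, 8) else 0)}
    for year in range(2001, 2005)
    for month in range(1, 13)
]
flights_like = pd.DataFrame(records)

# Reshape long -> wide: months become rows, years become columns.
wide = flights_like.pivot(index="month", columns="year", values="passengers")

# Non-negative magnitudes, so the default sequential colormap is left in place.
ax = sns.heatmap(wide)
```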

Two patterns jump out of the colors immediately. With months as rows and years as columns, every row gets brighter from left to right (passenger numbers grew year over year), and the same summer rows stay bright across every year (a seasonal travel peak). That is two-dimensional structure you would struggle to see in a table of raw numbers.

Heatmaps want WIDE / matrix data

Notice the input shape. Most Seaborn plots want tidy/long data, but a heatmap is the exception we flagged back on the tidy-data page: it needs a wide matrix, because its two axes are the row and column labels and each cell is a value. pivot (the reverse of melt) is how you reshape long data into that grid. Passenger counts here are non-negative magnitudes, so the default sequential colormap is the correct choice — sequential for one-directional magnitudes, diverging only for signed data like correlations.
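To make the long-versus-wide distinction concrete, here is a tiny round trip on a hypothetical four-row table (values invented for illustration): pivot turns the tidy rows into a grid, and melt turns the grid back into tidy rows.

```python
import pandas as pd

# Tidy/long form: one row per observation.
long = pd.DataFrame({
    "month": ["Jan", "Jan", "Jul", "Jul"],
    "year": [2001, 2002, 2001, 2002],
    "passengers": [100, 125, 140, 165],
})

# Wide/matrix form: row labels = month, column labels = year, cells = passengers.
wide = long.pivot(index="month", columns="year", values="passengers")
print(wide)

# melt reverses the reshape, returning one row per (month, year) pair.
back = wide.reset_index().melt(id_vars="month", var_name="year",
                               value_name="passengers")
print(back)
```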

Pitfalls

Correlation only sees STRAIGHT-line relationships

Pearson's r measures linear association only. A relationship can be strong, clean, and obvious to the eye yet have r ≈ 0 — for example a perfect U-shape (y = x²), where the upward and downward halves cancel out. This is the heatmap's blind spot: a near-zero, pale cell does not mean "no relationship," only "no straight-line relationship." (Remember Anscombe's quartet: wildly different scatters can share the same correlation.) A heatmap is a fast triage tool — confirm anything interesting with an actual scatter plot.
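The U-shape claim is easy to check numerically. The sketch below assumes x values spread symmetrically around zero, which is what makes the two halves cancel; a straight line over the same x values is included for contrast.

```python
import numpy as np

# x is symmetric around 0, so the falling and rising halves of the U cancel out.
x = np.linspace(-3, 3, 101)
y_curve = x ** 2       # perfect U-shape: y is fully determined by x
y_line = 2 * x + 1     # perfect straight line, for comparison

r_curve = np.corrcoef(x, y_curve)[0, 1]
r_line = np.corrcoef(x, y_line)[0, 1]

print(f"U-shape r = {r_curve:.3f}")
print(f"line    r = {r_line:.3f}")
```

The first value comes out at essentially zero even though y is an exact function of x, while the second is essentially 1.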

A second, practical pitfall: a huge matrix is unreadable. Thirty variables make a 30×30 grid of 900 tiny cells, far too dense to scan and too small for the annotations to stay legible. Two fixes:

Select the variables that matter before computing .corr(), rather than dumping every column in.
Mask the redundant half. A correlation matrix is symmetric, so the upper and lower triangles are mirror images and half the grid is repeated. You can hide the upper triangle by passing a boolean mask built with NumPy's np.triu (short for "triangle, upper"). Cells where the mask is True are left blank:
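A sketch of the masking step, assuming a hypothetical eight-column DataFrame of random numbers (names and values invented for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop it in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: eight unrelated random columns.
rng = np.random.default_rng(2)
df = pd.DataFrame(rng.normal(size=(100, 8)),
                  columns=[f"var_{i}" for i in range(8)])
corr = df.corr()

# True on and above the diagonal, False below it.
mask = np.triu(np.ones_like(corr, dtype=bool))

# Cells where mask is True are not drawn, leaving only the lower triangle.
ax = sns.heatmap(corr, mask=mask, cmap="vlag", center=0, vmin=-1, vmax=1)
```

By default np.triu includes the diagonal, so the always-1.0 cells are hidden too. Passing k=1 to np.triu would keep the diagonal visible if you prefer it.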

With the duplicated half hidden, each pair appears exactly once and the eye has far less to wade through — a small change that makes a large matrix legible.

Your turn

Using the mpg dataset:

Compute its correlation matrix with mpg.corr(numeric_only=True) and store it in corr.
Draw an annotated heatmap of corr using a diverging colormap (e.g. cmap="vlag") centered at 0.
Assign the Axes that sns.heatmap returns to a variable named ax (i.e. ax = sns.heatmap(...)).

Check your understanding

Question (select one)

Why is a diverging colormap (centered at 0) the right choice for a correlation heatmap?

Because diverging colormaps have more colors, so they show more detail.

Because correlation is signed (−1 to +1) with 0 as a meaningful midpoint, and a diverging map represents both direction and the neutral center.

Because diverging colormaps are colorblind-safe and sequential ones never are.

Because Seaborn refuses to draw a correlation matrix with a sequential map.

Question (select one)

A cell in your correlation heatmap is pale and shows r ≈ 0.02. What is the safest interpretation?

The two variables are completely unrelated.

There is little linear relationship; a curved relationship could still exist, so check a scatter plot.

The data must contain an error, because real variables are always correlated.

The colormap is wrong and is hiding the true value.

Question (select one)

Your dataset has 25 numeric columns and the resulting 25×25 heatmap is an illegible wall of tiny cells. What is a reasonable fix?

Make the figure smaller so it all fits on screen at once.

Select the variables you actually care about before computing .corr(), and/or mask the redundant upper triangle.

Turn off annot and rely on guessing the values from color.

Switch to a sequential colormap to compress the range.

You can now read every pairwise linear relationship in a dataset from a single grid — and you know exactly where that view goes blind. Next we keep exploring many variables together with pairwise relationship plots, which restore the scatter that the heatmap had to leave out.
