Heatmaps

Encoding a two-dimensional table with color to reveal patterns that numbers hide

A heatmap is a grid of colored cells, where each cell's color encodes a numeric value. It's the natural chart for any data that sits in a two-dimensional table — categories on rows and columns, numbers in the cells.

You will see heatmaps used for correlation matrices, schedule calendars, hourly traffic, missing-data maps, and confusion matrices in ML — all the same chart, applied to different data shapes.

When a heatmap is right

Use a heatmap when:

  • You have a 2-D table of values: rows × columns of numbers.
  • You want to spot patterns, clusters, or anomalies across both axes simultaneously.
  • The number of rows and columns isn't huge — a 20×20 grid is comfortable, a 1000×1000 grid is a different kind of chart.

A simple heatmap

Plotly Express offers px.imshow (for image-like 2-D matrices) and px.density_heatmap (which bins two continuous variables on the fly).

Where a scatter plot draws every point, the density heatmap shows how many points fall in each cell. The hot cells (yellow) are where most diners fall: small-to-medium bill, small tip.

A correlation matrix heatmap

A classic use is a correlation matrix — every numeric column correlated against every other.

Three Plotly Express tricks worth noticing:

  • text_auto=True writes the numeric value in each cell.
  • color_continuous_scale="RdBu_r" picks a diverging palette.
  • color_continuous_midpoint=0 centers the white at 0, so with "RdBu_r" positive correlations are red and negative ones are blue (plain "RdBu" flips this).

For diverging data (anything with a meaningful zero, like correlations or year-over-year change), always use a diverging palette centered at zero.

A calendar/schedule heatmap

Heatmaps also show counts across two categorical axes:

The hot cell (Sat dinner) leaps out immediately. A table of the same numbers would require the reader to scan; the heatmap is read in a single glance.

Choosing a color scale

This is the single most important decision when making a heatmap.

  • Sequential (a steady lightness ramp from one end to the other, in one hue or several): "Viridis", "Plasma", "Blues", "YlOrRd". Use for values that go from low to high with no special midpoint.
  • Diverging (color → white → other color): "RdBu", "RdBu_r" (reversed), "PiYG". Use for values with a meaningful zero (correlations, deviations, gain/loss).
  • Avoid Jet / rainbow. It is perceptually non-uniform — bright cyan and yellow bands look like ridges in the data even where there aren't any. Viridis is its modern replacement.

Pitfalls of heatmaps

  • Too many cells (1000×1000) make individual cells invisible. Either aggregate or switch to a density-based representation.
  • Wrong palette (rainbow / Jet) introduces false structure.
  • Forgetting the colorbar legend — without it, the chart's colors mean nothing to the reader. Always show the legend.
  • Truncating the color range can mask outliers, similar to a truncated y-axis on a bar chart.

Check your understanding

Question (select one)

A heatmap is best suited for visualizing:

  • A single numeric variable's distribution.
  • A trend over time.
  • A 2-D table of numeric values (rows × columns) where you want to spot patterns or hotspots.
  • A part-of-a-whole composition.

Question (select one)

For a correlation matrix heatmap (values from -1 to +1), which color palette is most appropriate?

  • A sequential palette like Viridis.
  • A categorical palette like Plotly.
  • A diverging palette like RdBu or RdBu_r, centered at zero.
  • A grayscale ramp.

Question (select one)

Why should you avoid the Jet (rainbow) colormap for heatmaps?

  • It is copyrighted.
  • It is only available on Windows.
  • It is perceptually non-uniform: bright cyan and yellow bands appear as visual "ridges" even where data is flat, leading viewers to see structure that isn't there.
  • It uses too many colors.
