Graphical Encodings

The handful of visual channels that every chart in the world is built from

You have now met encoding twice — first as the formal definition of what a visualization is, and then through the Cleveland-McGill ranking of how accurately different encodings convey magnitude. This page consolidates everything into a single working vocabulary.

By the end of this page you will be able to look at any chart and say: "This chart maps column A to position-on-x, column B to position-on-y, column C to color hue, and column D to marker size." That sentence is the anatomy of a Plotly Express call.

The seven (or so) visual channels

Every chart that has ever been made encodes data through some combination of these channels:

Channel         | What it shows well                  | Plotly Express argument
----------------|-------------------------------------|----------------------------------------
Position (x)    | Quantity or ordered category        | x="..."
Position (y)    | Quantity or ordered category        | y="..."
Length          | Quantity (bar charts)               | implicit from y in px.bar
Color hue       | Categorical grouping                | color="..."
Color luminance | Sequential quantity                 | color="..." with continuous scale
Size            | Quantity (less precise)             | size="..."
Shape           | Categorical grouping (small N)      | symbol="..."
Facets          | Categorical grouping (split panels) | facet_col="...", facet_row="..."
Hover text      | Identification of a single point    | hover_name="...", hover_data=[...]

That is essentially the entire Plotly Express argument vocabulary for "what data maps to what." Master these and you can describe any Plotly Express chart in one sentence.
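To make the "one sentence" idea concrete, here is a minimal sketch in plain Python. The helper `describe_encoding` and the `CHANNEL_NAMES` lookup are hypothetical names invented for illustration; they are not part of Plotly. The sketch only turns Plotly Express-style keyword arguments into the kind of sentence described above.

```python
# Hypothetical helper, not part of Plotly: maps Plotly Express argument
# names to the channel names used in the table above.
CHANNEL_NAMES = {
    "x": "position-on-x",
    "y": "position-on-y",
    "color": "color",
    "size": "marker size",
    "symbol": "marker shape",
    "facet_col": "facet columns",
    "facet_row": "facet rows",
    "hover_name": "hover title",
}


def describe_encoding(**encodings):
    """Build a one-sentence description from argument=column pairs."""
    parts = [
        f"{column} to {CHANNEL_NAMES[arg]}"
        for arg, column in encodings.items()
        if arg in CHANNEL_NAMES  # ignore arguments that are not encodings
    ]
    return "This chart maps " + ", ".join(parts) + "."


sentence = describe_encoding(
    x="gdpPercap", y="lifeExp", color="continent", size="pop"
)
print(sentence)
```

The keyword arguments passed to `describe_encoding` are the same ones you would pass to a Plotly Express function, which is the point: the call and the sentence carry the same information.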

The grammar in action

Here is a four-encoding chart. Notice how each = is mapping a column name to a visual channel.


Read the call out loud:

"Scatter plot. GDP on x. Life expectancy on y. Color by continent. Size by population."

The function call is the chart specification. There is nothing hidden.

Which channel is right for which kind of data?

Data comes in several flavors, and each suits different channels. A bit more formally, there are four common data types:

  • Quantitative (numbers with a meaningful magnitude): age, temperature, dollars. Use position or length.
  • Ordinal (ordered but not numeric): low / medium / high. Use position along a ranked axis, or a sequential color scale.
  • Nominal (unordered categories): country, color, sport. Use color hue, shape, or facets.
  • Temporal (time): dates and times. Use the x-axis, almost always. Time is sacred — keep it on x.
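The list above can be written down as a small lookup, which is handy as a checklist when planning a chart. This is only a sketch: `RECOMMENDED_CHANNELS` and `suggest_channels` are hypothetical names, and the contents are transcribed from the four bullets rather than taken from any library.

```python
# Recommended channels per data type, transcribed from the list above.
RECOMMENDED_CHANNELS = {
    "quantitative": ["position", "length"],
    "ordinal": ["position (ranked axis)", "sequential color scale"],
    "nominal": ["color hue", "shape", "facets"],
    "temporal": ["position (x)"],
}


def suggest_channels(data_type):
    """Return the channels recommended for a data type (case-insensitive)."""
    try:
        return RECOMMENDED_CHANNELS[data_type.lower()]
    except KeyError:
        raise ValueError(f"unknown data type: {data_type!r}") from None


for kind in RECOMMENDED_CHANNELS:
    print(f"{kind:>12}: {', '.join(suggest_channels(kind))}")
```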

Encodings tell a story about importance

Recall the perceptual ranking. Whichever variable you put on the most accurate channel implicitly becomes the most important one in the chart. The reader's eye will go to it first.

This means you can convey your narrative through encoding choice. Consider three variations of the same scatter:

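import plotly.express as px

# Assumption: df is the Gapminder sample bundled with Plotly Express,
# filtered to a single year (the column names below match that dataset).
df = px.data.gapminder().query("year == 2007")

# Variation A: GDP on x, life expectancy on y
px.scatter(df, x="gdpPercap", y="lifeExp", color="continent")

# Variation B: continent on x, life expectancy on y
px.scatter(df, x="continent", y="lifeExp", color="gdpPercap")

# Variation C: GDP on x, life expectancy on y, but use color for population
px.scatter(df, x="gdpPercap", y="lifeExp", color="pop")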
Variation A tells a story about the relationship between GDP and life expectancy. Variation B tells a story about how continents differ in life expectancy. Variation C makes population the prominent third variable. Same data; three different stories.

Encoding mistakes to avoid

  • Using color for a quantity that needs precise comparison. Use position or length instead.
  • Using shape with more than ~6 categories. People can't tell ● ■ ▲ ▼ ◆ ★ ✦ ✚ apart in a crowded plot.
  • Doubling up two channels redundantly when one would do. If you set both color="continent" and facet_col="continent", you have spent two channels on one variable, which is wasteful and busy.
  • Using size for a negative meaning (bigger dot = worse). The eye reads "bigger" as "more important" by default; flipping this confuses readers.
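Two of these mistakes are mechanical enough to check in code. The sketch below is a hypothetical checker, not a Plotly feature: `lint_encoding`, `MAX_SHAPES`, and the category counts in the example are all invented for illustration. It flags shape encodings with too many categories and columns that are mapped to more than one channel.

```python
# Hypothetical checker, not a Plotly feature.
MAX_SHAPES = 6  # assumption: the "~6 categories" rule of thumb from the list


def lint_encoding(encodings, category_counts):
    """Return a list of warning strings.

    encodings: dict such as {"x": "gdpPercap", "color": "continent"}
    category_counts: dict mapping column name -> number of distinct values
    """
    warnings = []

    # Mistake: shape used for a column with too many categories.
    shape_col = encodings.get("symbol")
    if shape_col is not None and category_counts.get(shape_col, 0) > MAX_SHAPES:
        warnings.append(
            f"symbol={shape_col!r} has more than {MAX_SHAPES} categories"
        )

    # Mistake: one column spent on two channels.
    seen = {}
    for channel, column in encodings.items():
        if not isinstance(column, str):
            continue  # skip list-valued arguments such as hover_data
        if column in seen:
            warnings.append(
                f"{column!r} is mapped to both {seen[column]} and {channel}"
            )
        else:
            seen[column] = channel
    return warnings


# Illustrative input: the counts are made up for the example.
problems = lint_encoding(
    {
        "x": "gdpPercap",
        "y": "lifeExp",
        "color": "continent",
        "facet_col": "continent",
        "symbol": "country",
    },
    {"continent": 5, "country": 142},
)
for problem in problems:
    print(problem)
```

A warning about a doubled-up column is a prompt to reconsider, not a hard error: redundant encoding is sometimes a deliberate choice.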

A vocabulary check

Test yourself: read the chart below and say, in one sentence, which column is mapped to which channel.


The answer: x = sepal width, y = sepal length, color and shape = species, size = petal length. The chart spends both color and shape on species. This redundant encoding can help color-blind viewers, but it can also be argued to waste ink. It is a tradeoff.

Check your understanding

Question (select one)

Which Plotly Express argument maps a column to horizontal position?

color="..."

size="..."

x="..."

facet_col="..."

Question (select one)

You want the reader to judge precisely how price relates to square footage. Which mapping puts those two variables on the strongest (most accurate) channels?

Put square footage on x and price on color.

Put square footage on x and price on y.

Put square footage on size and price on color.

Question (select one)

Which kind of data is best encoded with color hue (rainbow categorical colors like the default Plotly palette)?

Continuous prices (e.g., $0 – $1,000,000).

An ordered scale of customer satisfaction (1-5).

A small number (~3-7) of distinct, unordered categories such as continents or product lines.

Time stamps over a 100-year span.

Question (select one)

You facet a Plotly Express chart by facet_col="region" and also color by color="region". What is the result?

The chart will fail to render.

Plotly will crash.

The chart will work, but you've spent two visual channels on a single variable — a missed opportunity that often adds clutter without adding information.

Plotly will use a random color per facet.
