Graphical Encodings

The handful of visual channels that every chart in the world is built from

You have now met encoding twice — first as the formal definition of what a visualization is, and then through the Cleveland-McGill ranking of how accurately different encodings convey magnitude. This page consolidates everything into a single working vocabulary.

By the end of this page you will be able to look at any chart and say: "This chart maps column A to position-on-x, column B to position-on-y, column C to color hue, and column D to marker size." That sentence is the anatomy of a Plotly Express call.

The seven (or so) visual channels

Every chart that has ever been made encodes data through some combination of these channels:

Channel	What it shows well	Plotly Express argument
Position (x)	Quantity or ordered category	`x="..."`
Position (y)	Quantity or ordered category	`y="..."`
Length	Quantity (bar charts)	implicit from `y` in `px.bar`
Color hue	Categorical grouping	`color="..."`
Color luminance	Sequential quantity	`color="..."` with a numeric column (continuous color scale)
Size	Quantity (less precise)	`size="..."`
Shape	Categorical grouping (small N)	`symbol="..."`
Facets	Categorical grouping (split panels)	`facet_col="..."`, `facet_row="..."`
Hover text	Identification of a single point	`hover_name="..."`, `hover_data=[...]`

That is essentially the entire Plotly Express argument vocabulary for "what data maps to what." Master these and you can describe any Plotly Express chart in one sentence.

The grammar in action

Here is a four-encoding chart. Notice how each `=` maps a column name to a visual channel.

Read the call out loud:

"Scatter plot. GDP on x. Life expectancy on y. Color by continent. Size by population."

The function call is the chart specification. There is nothing hidden.
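To illustrate that nothing is hidden, the spoken description can be generated mechanically from the keyword arguments. The `describe` helper and the `CHANNEL_NAMES` table below are hypothetical, written only for this illustration; they are not part of Plotly Express.

```python
# Hypothetical helper (not a Plotly Express function): turn the keyword
# arguments of a px call into a one-sentence description.
CHANNEL_NAMES = {
    "x": "position on x",
    "y": "position on y",
    "color": "color",
    "size": "marker size",
    "symbol": "marker shape",
    "facet_col": "panel columns",
    "facet_row": "panel rows",
}


def describe(chart_type, **encodings):
    """Describe which column is mapped to which visual channel."""
    parts = [
        f"{column} to {CHANNEL_NAMES[arg]}"
        for arg, column in encodings.items()
        if arg in CHANNEL_NAMES  # ignore arguments that are not encodings
    ]
    return f"{chart_type}: maps " + ", ".join(parts) + "."


sentence = describe(
    "scatter", x="gdpPercap", y="lifeExp", color="continent", size="pop"
)
print(sentence)
```

The same keyword arguments could be passed unchanged to `px.scatter`, so the description and the chart share one specification.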

Which channel is right for which kind of data?

Data comes in several flavors, and each suits different channels. More formally, there are four common data types:

Quantitative (numbers with a meaningful magnitude): age, temperature, dollars. Use position or length.
Ordinal (ordered but not numeric): low / medium / high. Use position along a ranked axis, or a sequential color scale.
Nominal (unordered categories): country, color, sport. Use color hue, shape, or facets.
Temporal (time): dates and times. Use the x-axis, almost always. Time is sacred — keep it on x.
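The four types above can be roughly guessed from a pandas column's dtype. The sketch below is a heuristic under that assumption, not a rule: the `data_kind` function, the `SUGGESTED` table, and the example columns are all made up for illustration, and a dtype can mislead (a numeric ZIP code is nominal, for instance).

```python
import pandas as pd

# Suggested Plotly Express arguments per data type, following the list above.
# Length is not listed because px.bar derives it from x or y.
SUGGESTED = {
    "quantitative": ["x", "y"],
    "ordinal": ["x", "y", "color (sequential scale)"],
    "nominal": ["color", "symbol", "facet_col", "facet_row"],
    "temporal": ["x"],
}


def data_kind(series):
    """Guess the data type of a column from its dtype (heuristic only)."""
    dtype = series.dtype
    if pd.api.types.is_datetime64_any_dtype(dtype):
        return "temporal"
    if isinstance(dtype, pd.CategoricalDtype):
        return "ordinal" if dtype.ordered else "nominal"
    if pd.api.types.is_numeric_dtype(dtype):
        return "quantitative"
    return "nominal"


# Made-up example data, one column per data type.
df = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
    "price": [9.5, 12.0, 7.25],
    "rating": pd.Categorical(
        ["low", "high", "medium"],
        categories=["low", "medium", "high"],
        ordered=True,
    ),
    "country": ["NZ", "PE", "KE"],
})

kinds = {col: data_kind(df[col]) for col in df.columns}
for col, kind in kinds.items():
    print(col, kind, SUGGESTED[kind])
```

Treat the result as a starting point and override it whenever you know more about a column than its dtype reveals.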

Encodings tell a story about importance

Recall the perceptual ranking: position is judged most accurately, then length, with size and color further down. Whichever variable you put on the most accurate channel implicitly becomes the most important one in the chart, because the reader's eye goes to it first.

This means you can convey your narrative through encoding choice. Consider three variations of the same scatter:

import plotly.express as px

# Assumption: df is the gapminder sample data that ships with Plotly Express
df = px.data.gapminder()

# Variation A: GDP on x, life expectancy on y
px.scatter(df, x="gdpPercap", y="lifeExp", color="continent")

# Variation B: continent on x, life expectancy on y
px.scatter(df, x="continent", y="lifeExp", color="gdpPercap")

# Variation C: GDP on x, life expectancy on y, but use color for population
px.scatter(df, x="gdpPercap", y="lifeExp", color="pop")

Variation A tells a story about the relationship between GDP and life expectancy. Variation B tells a story about how continents differ in life expectancy. Variation C makes population the prominent third variable. Same data; three different stories.

Encoding mistakes to avoid

Using color for a quantity that needs precise comparison. Use position or length instead.
Using shape with more than ~6 categories. People can't tell ● ■ ▲ ▼ ◆ ★ ✦ ✚ apart in a crowded plot.
Doubling up two channels redundantly when one would do. If you set `color="continent"` and `facet_col="continent"`, you've used two channels for one variable, which is wasteful and busy.
Using size for a negative meaning (bigger dot = worse). The eye reads "bigger" as "more important" by default; flipping this confuses readers.
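Two of these mistakes can be checked mechanically before plotting. The sketch below is a hypothetical `lint_encodings` helper, not a Plotly Express feature; it assumes a pandas DataFrame and takes the rough six-category limit from the list above.

```python
import pandas as pd

MAX_SHAPES = 6  # rough limit from the list above, not a hard rule


def lint_encodings(df, **encodings):
    """Return warnings for too many shapes and for redundant channels."""
    warnings = []

    # Mistake: shape used for a column with too many categories.
    symbol_col = encodings.get("symbol")
    if symbol_col is not None and df[symbol_col].nunique() > MAX_SHAPES:
        warnings.append(
            f"symbol={symbol_col!r} has more than {MAX_SHAPES} categories"
        )

    # Mistake: the same column mapped to more than one channel.
    channels_by_column = {}
    for arg, column in encodings.items():
        channels_by_column.setdefault(column, []).append(arg)
    for column, args in channels_by_column.items():
        if len(args) > 1:
            warnings.append(f"{column!r} is mapped to {' and '.join(args)}")

    return warnings


# Made-up example data.
df = pd.DataFrame({
    "continent": ["Asia", "Europe", "Africa"],
    "lifeExp": [70.0, 78.0, 55.0],
})
issues = lint_encodings(
    df, y="lifeExp", color="continent", facet_col="continent"
)
print(issues)
```

A warning here is a prompt to reconsider, not a verdict; redundant encoding is sometimes a deliberate choice.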

A vocabulary check

Test yourself: read the chart below and say, in one sentence, which column is mapped to which channel.

The answer: x = sepal width, y = sepal length, color and shape = species, size = petal length. The chart spends both color and shape on species. This redundant encoding is sometimes useful for color-blind viewers, but it can also be seen as wasted ink. Tradeoffs.

Check your understanding

Question (select one)

Which Plotly Express argument maps a column to horizontal position?

color="..."

size="..."

x="..."

facet_col="..."

Question (select one)

You want the reader to judge precisely how price relates to square footage. Which mapping puts those two variables on the strongest (most accurate) channels?

Put square footage on x and price on color.

Put square footage on x and price on y.

Put square footage on size and price on color.

Question (select one)

Which kind of data is best encoded with color hue (distinct categorical colors like the default Plotly palette)?

Continuous prices (e.g., $0 – $1,000,000).

An ordered scale of customer satisfaction (1-5).

A small number (~3-7) of distinct, unordered categories such as continents or product lines.

Time stamps over a 100-year span.

Question (select one)

You facet a Plotly Express chart with `facet_col="region"` and also set `color="region"`. What is the result?

The chart will fail to render.

Plotly will crash.

The chart will work, but you've spent two visual channels on a single variable — a missed opportunity that often adds clutter without adding information.

Plotly will use a random color per facet.
