Mapping vs. Setting
The single most common ggplot2 confusion — putting a property inside aes() versus outside it — explained with a clear mental model.
This page covers the mistake every ggplot2 learner makes at least once. It is worth its own page because once it clicks, a whole class of "why is my chart the wrong color?" bugs disappears forever.
The two situations
There are two completely different things you might want to do with a visual property like color:
- Map it: let a column control it, so the value varies row by row. → goes inside aes().
- Set it: fix it to a constant, the same for every mark. → goes outside aes().
Setting: a constant for every mark
When you want every point blue, you set color outside aes():
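A minimal sketch of setting, assuming the mpg dataset bundled with ggplot2 and its displ and hwy columns for the axes (both are illustrative choices, not required by the rule):

```r
library(ggplot2)

# Assumption: mpg ships with ggplot2; displ and hwy are only example axes.
p_set <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(color = "blue")  # constant passed to the geom, outside aes()

p_set
```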
color = "blue" lives in geom_point(), outside aes(). Result:
every point is exactly blue, and there is no legend — because no
variable is being encoded.
Mapping: a column controls it
When you want color to encode a variable, you map it inside
aes():
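A minimal sketch of mapping, again assuming the mpg dataset bundled with ggplot2, with displ and hwy as illustrative axes and drv as the column that drives color:

```r
library(ggplot2)

# Assumption: mpg ships with ggplot2; displ and hwy are only example axes.
p_map <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(color = drv))  # column name inside aes(): one color per drv value

p_map
```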
Here color varies with drv, ggplot2 chooses the palette via a scale,
and a legend appears automatically. You do not get to pick the exact
colors here — that is the scale's job (a later chapter).
The classic bug
Now the trap. What happens if you put a constant color inside
aes()?
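A minimal sketch of the mistake, assuming the mpg dataset bundled with ggplot2 with displ and hwy as illustrative axes:

```r
library(ggplot2)

# Assumption: mpg ships with ggplot2; displ and hwy are only example axes.
p_bug <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(color = "blue"))  # constant string placed inside aes()

p_bug
```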
Run it. The points are red (or salmon), not blue — and there is a
useless legend labeled "blue".
Why? Because everything inside aes() is interpreted as data to be
mapped. ggplot2 sees color = "blue" and thinks: "Ah, a new
variable whose value is the constant string 'blue' for every row.
Map that variable to color through the color scale." The scale then
assigns the first default color (reddish) to the single category
"blue", and builds a legend for it.
The mnemonic
Inside aes() = the name of a column / a variable to encode.
Outside aes() = a literal value to use as-is.
If you ever see an unwanted legend whose only entry is a literal color name
like "blue" or "red", you have placed a constant inside aes() by
mistake.
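One way to confirm the diagnosis without eyeballing the plot is to inspect the values ggplot2 computed for the layer. The sketch below uses layer_data() on a deliberately buggy plot; the mpg dataset and the displ and hwy axes are illustrative assumptions:

```r
library(ggplot2)

# Assumption: mpg ships with ggplot2; displ and hwy are only example axes.
p_bug <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(color = "blue"))  # the mistake: a constant inside aes()

# layer_data() returns the per-row values ggplot2 computed for the first layer.
computed <- unique(layer_data(p_bug)$colour)

# A single color that is not the string we typed means the constant
# went through the color scale instead of being used literally.
computed
identical(computed, "blue")
```

If the constant had been set outside aes(), the colour column would contain the literal string "blue" instead.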
Side by side
The fix for the bug is simply to move the constant out of aes():
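A minimal side-by-side sketch, assuming the mpg dataset bundled with ggplot2 with displ and hwy as illustrative axes:

```r
library(ggplot2)

# Assumption: mpg ships with ggplot2; displ and hwy are only example axes.
base <- ggplot(mpg, aes(x = displ, y = hwy))

# Bug: the constant is inside aes(), so it is mapped through the color scale.
p_wrong <- base + geom_point(aes(color = "blue"))

# Fix: the constant is a geom argument, so it is used literally.
p_right <- base + geom_point(color = "blue")

p_wrong
p_right
```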
Does this apply to all aesthetics? Yes.
The same rule governs size, shape, alpha, fill — everything:
- geom_point(size = 4) → every point size 4 (setting).
- aes(size = cty) → size varies with cty, legend appears (mapping).
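The same contrast for size can be sketched as follows, assuming the mpg dataset bundled with ggplot2 with displ and hwy as illustrative axes:

```r
library(ggplot2)

# Assumption: mpg ships with ggplot2; displ and hwy are only example axes.
base <- ggplot(mpg, aes(x = displ, y = hwy))

# Setting: every point drawn at size 4, no legend.
p_size_set <- base + geom_point(size = 4)

# Mapping: point size varies with the cty column, and a size legend appears.
p_size_map <- base + geom_point(aes(size = cty))

p_size_set
p_size_map
```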
What is the difference between geom_point(color = "red") and geom_point(aes(color = "red"))?
- They are identical; aes() makes no difference here.
- The first throws an error.
- The first sets every point to the literal color red; the second maps a constant variable "red" to color, producing scale-chosen colors and a spurious legend.
- The second is the correct way to make points red.
You want every point in your plot to be exactly the same shade of dark green. Where does color = "darkgreen" belong?
- Inside aes(), as aes(color = "darkgreen").
- Outside aes(), as an argument to the geom: geom_point(color = "darkgreen").
- In a theme() call.
- It cannot be done; ggplot2 always chooses colors itself.
A learner's plot shows reddish points and a legend whose only entry is the literal text "blue". What almost certainly happened?
- The data has a column literally named blue.
- ggplot2 is broken.
- They wrote aes(color = "blue"), so the constant "blue" was treated as a one-category variable and mapped through the color scale.
- They forgot to call library(ggplot2).
Key takeaways
- Map (inside aes()) when a column should control a property: the value varies per row and a legend appears.
- Set (outside aes(), as a geom argument) when you want a constant: applied literally, no legend.
- Everything inside aes() is treated as data. A constant placed there becomes a bogus one-category variable.
- A legend labeled with a literal value (like "blue") is the signature of this mistake.
- The rule applies to all aesthetics: color, fill, size, shape, alpha, linetype.