Your First ggplot
Build a ggplot from nothing, one component at a time, and learn to read the ggplot() + aes() + geom_*() skeleton like a sentence.
Now we slow down and construct a plot from an empty canvas, watching each component earn its place. By the end you will be able to read any basic ggplot as an English sentence.
The skeleton
Almost every ggplot follows the same shape:
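A minimal sketch of that shape, assuming the mpg data set that ships with ggplot2 (the same data used in the steps that follow). The labs() call stands in for any optional extra component, and the name p is only illustrative.

```r
library(ggplot2)

# Template (not runnable as-is):
#   ggplot(<data>, aes(<mappings>)) + geom_<type>() + ...

# A concrete instance of the template
p <- ggplot(mpg, aes(x = displ, y = hwy)) +                # data and mappings
  geom_point() +                                           # at least one geometry
  labs(x = "Engine displacement", y = "Highway MPG")       # optional extra component
p
```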
Three pieces:
- ggplot(data, aes(...)) names the data and the mappings.
- + geom_*() adds at least one geometry.
- + ... optionally adds more components.
Step 1: an empty plot
What does ggplot() with only data and mappings produce?
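The call in question, written out as a sketch. It assumes the mpg data set from ggplot2, with displ mapped to x and hwy mapped to y; the name p_empty is illustrative.

```r
library(ggplot2)

# Data and mappings only: no geom_*() layer has been added yet
p_empty <- ggplot(mpg, aes(x = displ, y = hwy))
p_empty  # printing the object renders the plot
```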
Run it. You get axes but no points. That is not a bug — it is the grammar being honest. You told ggplot2 what the data is and what maps to x and y, so it drew the coordinate system and scaled the axes. But you never said what marks to draw, so it draws no marks.
A blank panel is informative
The empty plot proves that data + mappings is a real, separate stage. The axes already span the right range because the scales were computed from the data — before any geometry existed.
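One way to check this claim is to ask the built plot for its scale limits. The sketch below relies on layer_scales(), a ggplot2 helper that builds the plot and returns its x and y scales; the variable names are illustrative.

```r
library(ggplot2)

p_empty <- ggplot(mpg, aes(x = displ, y = hwy))  # still no geometry

# Build the plot and read the limits of the trained x and y scales
scales <- layer_scales(p_empty)
x_limits <- scales$x$get_limits()
y_limits <- scales$y$get_limits()

# Compare the limits with the ranges of the mapped columns
all.equal(x_limits, range(mpg$displ))
all.equal(y_limits, range(mpg$hwy))
```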
Step 2: add a geometry
Give it a geom and the marks appear:
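A sketch of the plot with a single geometry added, again assuming the mpg data set and the displ and hwy mappings from Step 1:

```r
library(ggplot2)

# Same data and mappings as the empty plot, plus one geometry
p_points <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()  # draw one point per row
p_points
```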
Read this aloud as a sentence:
"Take mpg; map engine displacement to x and highway MPG to y; draw a point for each row."
Every row of mpg becomes one dot positioned by its displ and hwy.
That is the entire chart.
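As a rough check of the one-dot-per-row claim, the data that the point layer actually draws can be compared with the input. This sketch uses layer_data(), a ggplot2 helper that returns a layer's computed data; the variable names are illustrative.

```r
library(ggplot2)

p_points <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

drawn <- layer_data(p_points)  # computed data for the first layer

nrow(drawn) == nrow(mpg)   # one drawn point per input row
all(drawn$x == mpg$displ)  # x positions come from displ
all(drawn$y == mpg$hwy)    # y positions come from hwy
```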
Step 3: add another mapping
Want color to encode drivetrain? Add one mapping. You do not touch the geometry, the axes, or anything else.
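A sketch of that single change, assuming the drv column of mpg holds the drivetrain:

```r
library(ggplot2)

# Only the aes() call changes: color now encodes drivetrain (drv)
p_color <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point()
p_color
```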
One new word — color = drv — and you get colored points plus a
legend, for free. This is the payoff from the intro: mappings produce
their own legends.
Step 4: add a second geometry
Layers stack. Add a smoother on top of the points by adding a second
geom with +:
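A sketch of the stacked version. It assumes the plain x and y mappings from Step 2 (no color), so that geom_smooth() fits a single trend line. With its default settings, geom_smooth() chooses a smoothing method itself and prints a message naming it.

```r
library(ggplot2)

p_layers <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +  # layer 1: one point per row
  geom_smooth()   # layer 2: a smoothed trend line drawn on top
p_layers
```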
The points and the trend line share the same data and the same
mappings (declared once in ggplot()), but draw different marks. We
will explore this layering idea fully on the next page.
Two ways to write the same plot
The mappings can live in ggplot() (shared by all layers) or inside a
specific geom (used by that layer only). These two are equivalent:
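The two forms might look like this, assuming the scatter plot from Step 2; the names p_shared and p_local are illustrative.

```r
library(ggplot2)

# Mappings declared once in ggplot(): shared by every layer
p_shared <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# Mappings declared inside the geom: used by that layer only
p_local <- ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy))

# Both forms hand the same positions to the point layer
all.equal(layer_data(p_shared)[c("x", "y")], layer_data(p_local)[c("x", "y")])
```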
Putting shared mappings in ggplot() is the common style, because
most layers want the same x and y. Put a mapping inside a geom when
only that layer should use it.
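As a sketch of a layer-specific mapping, the plot below keeps x and y in ggplot() and puts color = drv inside geom_point() only. The combination with geom_smooth() is an illustrative assumption, not taken from the text above.

```r
library(ggplot2)

p_mixed <- ggplot(mpg, aes(x = displ, y = hwy)) +  # x and y shared by all layers
  geom_point(aes(color = drv)) +                   # color applies to the points only
  geom_smooth()                                    # trend line is not split by drv
p_mixed
```

Because the smoother never sees the color mapping, it fits one line through all the points rather than one line per drivetrain.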
Mind the + placement
The + must end the line, not begin the next one. Write
geom_point() + then a newline, never a line that starts with +.
When a line is already a complete expression, R evaluates it on its
own, so the plot is drawn without the later layers and the line that
starts with + then throws an error.
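A sketch of both placements, assuming code run at the top level of a script or console. The failing form is left commented out so the block still runs.

```r
library(ggplot2)

# Correct: each unfinished line ends with +, so R keeps reading
p_ok <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# Incorrect (commented out because it fails when run at top level):
# ggplot(mpg, aes(x = displ, y = hwy))
#   + geom_point()
```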
Running ggplot(mpg, aes(x = displ, y = hwy)) with no geom produces axes but no points. Why?
- It is an error that happens to render.
- The data failed to load.
- Data and mappings define the coordinate space and scales, but marks are only drawn by a geometry, which has not been added yet.
- ggplot always requires color to be mapped before drawing.
You want a scatter plot where x and y are shared by all layers but a geom_point layer is colored by drv and a geom_smooth layer is not. Where should color = drv go?
- In ggplot(aes(...)), so every layer shares it.
- Nowhere; you cannot do this in ggplot2.
- Inside geom_point(aes(color = drv)), so only that layer uses the color mapping.
- In the theme().
Key takeaways
- The ggplot skeleton is ggplot(data, aes(...)) + geom_*().
- ggplot() alone draws axes from the data and mappings; geometries draw the actual marks.
- Read a ggplot as a sentence: take this data, map these columns, draw these marks.
- Mappings in ggplot() are shared by all layers; mappings inside a geom belong to that layer only.
- The + ends a line; it never begins one.