Anatomy of a Complete Plot
Revisit the intimidating plot from the welcome page and decompose it, component by component, now that you know the whole grammar.
Remember the plot on the welcome page — the one we said would feel "inevitable" by the end? It is time. You now know every component it uses. Let us dissect it and prove the grammar has become second nature.
The plot
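The code for the plot is not reproduced here, so what follows is a reconstruction from the components named in this lesson, not the original listing. The palette choice and every label string are placeholders chosen for illustration; the layers, mappings, facets, and theme follow the component table.

```r
library(ggplot2)

# Reconstruction of the complete plot from the components this lesson names.
# Assumptions: the "Dark2" palette and all label text are placeholders.
p <- ggplot(mpg, aes(displ, hwy, color = drv)) +
  geom_point(alpha = 0.7) +                  # raw points, fixed transparency
  geom_smooth(method = "lm") +               # linear fit; inherits color = drv
  scale_color_brewer(palette = "Dark2") +    # hypothetical palette choice
  facet_wrap(~ class) +                      # one panel per vehicle class
  labs(
    title = "Engine size and highway mileage",   # hypothetical label text
    x = "Engine displacement (L)",
    y = "Highway miles per gallon",
    color = "Drivetrain"
  ) +
  theme_minimal()

p
```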
Decoding it, line by line
Walk down the code and name the component each line belongs to. This is the skill the whole course has been building:
| Code | Component | What it contributes |
|---|---|---|
| ggplot(mpg, ...) | Data | The table being shown |
| aes(displ, hwy, color = drv) | Mappings | Three channels: x, y, color |
| geom_point(alpha = 0.7) | Geometry | Raw points; alpha is set, not mapped |
| geom_smooth(method = "lm") | Geometry + Statistic | A linear fit per drivetrain (color inherited) |
| scale_color_brewer() | Scale | Specific colors for the drv categories |
| facet_wrap(~ class) | Facets | A small-multiple panel per class |
| labs(...) | Labels | The communication layer |
| theme_minimal() | Theme | Non-data styling |
Nothing here is mysterious anymore. Every line is one named component doing one job.
Predicting behavior from the grammar
Because you understand the components, you can predict what edits would do — without running them. Test yourself, then run to confirm.
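The edit to predict is not shown here. Judging from the discussion that follows, it appears to be moving color = drv out of ggplot() and into geom_point(). A minimal sketch under that assumption, with scale, labels, and theme left out because they do not affect grouping; count_fits is a hypothetical helper added only to make the comparison checkable:

```r
library(ggplot2)

# Before the edit: color = drv sits in ggplot(), so every layer inherits it.
p_global <- ggplot(mpg, aes(displ, hwy, color = drv)) +
  geom_point(alpha = 0.7) +
  geom_smooth(method = "lm") +
  facet_wrap(~ class)

# After the edit (assumed): color = drv is mapped only inside geom_point().
p_local <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = drv), alpha = 0.7) +
  geom_smooth(method = "lm") +
  facet_wrap(~ class)

# Hypothetical helper: count the distinct fitted lines in the smoother layer
# (layer 2), identified by their panel/group combination.
count_fits <- function(plot) {
  d <- layer_data(plot, 2)
  nrow(unique(d[c("PANEL", "group")]))
}

count_fits(p_global)
count_fits(p_local)
```

Write down the two counts you expect before running the last two lines.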
If you predicted one smoother per panel (instead of one per drivetrain), you have internalized layer-scoped mappings. With color = drv mapped only in geom_point(), the points are still colored by drv, but the smoother no longer sees color, so it has nothing to group by and fits a single line to each panel's data.
Building it up, not writing it whole
Experienced ggplot users do not type that whole block at once. They grow it, checking each layer:
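One possible progression is sketched below. The step order, the intermediate names p1 to p5, the palette, and the legend title are illustrative assumptions; the components themselves are the ones from the complete plot.

```r
library(ggplot2)

# Step 1: data, mappings, and the raw points. Check axes and colors first.
p1 <- ggplot(mpg, aes(displ, hwy, color = drv)) +
  geom_point(alpha = 0.7)
p1

# Step 2: add the linear fits. Expect one line per drivetrain.
p2 <- p1 + geom_smooth(method = "lm")
p2

# Step 3: split into one panel per class.
p3 <- p2 + facet_wrap(~ class)
p3

# Step 4: choose the colors (the palette is a placeholder).
p4 <- p3 + scale_color_brewer(palette = "Dark2")
p4

# Step 5: labels and theme last, once the data layers are right.
p5 <- p4 +
  labs(color = "Drivetrain") +   # hypothetical legend title
  theme_minimal()
p5
```

Keeping each step in its own object means that when a step looks wrong, the previous object is still there to fall back on.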
This incremental rhythm — add a component, look, add another — is the natural way to work in a grammar where plots are built by addition. It also makes debugging trivial: if something breaks, it is almost always the component you just added.
You can now read any ggplot
Hand yourself any ggplot code in the wild and you can narrate it: "data here, mappings there, two geoms, a custom scale, faceted by X, themed minimally." That narration is fluency in the Grammar of Graphics.
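To check a narration against what ggplot2 itself recorded, one option is to call summary() on a plot object, which prints the data, mappings, faceting, and layers. The sketch below uses a trimmed version of the lesson's plot; the name p_check is hypothetical.

```r
library(ggplot2)

# A trimmed version of the lesson's plot, stored so it can be inspected.
p_check <- ggplot(mpg, aes(displ, hwy, color = drv)) +
  geom_point(alpha = 0.7) +
  geom_smooth(method = "lm") +
  facet_wrap(~ class)

# Print ggplot2's own component breakdown and compare it with your narration.
summary(p_check)
```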
In the complete plot, geom_smooth(method = "lm") draws one line per drivetrain within each class panel. Which two components working together cause that?
- The theme and the labels.
- The data and the coordinate system.
- The color = drv mapping (inherited from ggplot(), so the smoother groups by drivetrain) and facet_wrap(~ class) (which splits the data into per-class panels).
- scale_color_brewer and alpha = 0.7.
Why do experienced users build a complex ggplot incrementally (add a layer, view, add another) rather than writing it all at once?
- Because ggplot2 cannot parse long expressions.
- Because each layer must be saved to disk before the next.
- Because plots are built by addition, so adding one component at a time makes each step easy to verify and any breakage easy to localize to the last component added.
- Because incremental plots use less memory.
Key takeaways
- Any complete ggplot decomposes cleanly into data, mappings, geoms, stats, scales, facets, labels, and theme — you can now name each.
- Understanding components lets you predict the effect of an edit (e.g. moving a mapping into one layer) before running it.
- Build complex plots incrementally — add a component, view, repeat — which mirrors the grammar and makes debugging easy.