Plotting Systems That Don't Scale
Why traditional, command-by-command plotting APIs become unmanageable as charts grow — the problem the Grammar of Graphics was created to solve.
Before we touch any ggplot2 code, it helps to understand the pain
that ggplot2 was designed to remove. If you have used base R graphics,
matplotlib, or a spreadsheet chart wizard, you have felt this pain —
maybe without naming it.
The "draw one thing at a time" model
Most traditional plotting systems work like a drawing program. You issue a sequence of commands, and each command paints something on a canvas:
Draw the axes. Now draw the points. Now add a legend. Now draw a second set of points in red. Now add a title. Now adjust the y-axis limits so nothing gets clipped.
Here is what that looks like in base R — a perfectly capable system that nonetheless makes you manage every detail by hand:
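A minimal sketch of such a script, assuming the built-in mtcars data set (wt for weight, mpg for fuel economy, cyl for cylinder count). The color names, plotting symbol, and labels are illustrative choices, not a canonical listing.

```r
# Assumed data: the built-in mtcars data set.
# Colors, symbol, and labels below are illustrative.

# 1. Hand-built lookup table: cylinder count -> color
cyl_colors <- c("4" = "steelblue", "6" = "goldenrod", "8" = "tomato")

# 2. Convert cyl to character so it indexes the table by name, not by position
point_colors <- cyl_colors[as.character(mtcars$cyl)]

# 3. Draw the axes and the points
plot(mtcars$wt, mtcars$mpg,
     col = point_colors, pch = 19,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "Fuel economy by weight")

# 4. Build the legend by hand, repeating the same colors and symbol.
#    Nothing ties these values to the points drawn above.
legend("topright",
       legend = c("4", "6", "8"),
       col = c("steelblue", "goldenrod", "tomato"),
       pch = 19,
       title = "Cylinders")
```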
This works. But look closely at everything you had to do:
- Manually build a lookup table mapping cylinder values to colors.
- Manually convert `cyl` to a character to index that table.
- Manually construct a legend that repeats the same color and shape information, and which has no automatic connection to the points. If you change the colors above, you must remember to change them in the legend too.
The computer is not helping you. It is taking dictation.
Where it breaks down
A single static scatter plot is fine. The trouble starts when the chart needs to grow: more variables, more groups, more panels, more transformations.
Every new requirement forces you to thread more state by hand: more color tables, more loops over subsets, more legend entries to keep in sync, more axis math. Small changes ripple through the whole script. The plot's appearance and the plot's logic are tangled together.
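To make that rippling concrete, here is a hedged sketch of one such requirement: adding a fitted line for each cylinder group to a base R scatter plot. It assumes the built-in mtcars data set; the color table cyl_colors and the other variable names are hypothetical choices for illustration.

```r
# Assumed data: built-in mtcars. Names and colors are illustrative.
cyl_colors <- c("4" = "steelblue", "6" = "goldenrod", "8" = "tomato")

plot(mtcars$wt, mtcars$mpg,
     col = cyl_colors[as.character(mtcars$cyl)], pch = 19,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")

# New requirement: one fitted line per group.
# That means a loop over subsets, a model per subset, and a draw call per model.
fits <- list()
for (k in names(cyl_colors)) {
  group_rows <- mtcars[mtcars$cyl == as.numeric(k), ]
  fits[[k]] <- lm(mpg ~ wt, data = group_rows)
  # abline() spans the whole plot; clipping each line to its
  # group's x-range would take still more hand-written code.
  abline(fits[[k]], col = cyl_colors[[k]], lwd = 2)
}

# The legend must now describe lines as well as points, again by hand.
legend("topright",
       legend = names(cyl_colors), col = cyl_colors,
       pch = 19, lwd = 2, title = "Cylinders")
```

One added requirement touched the color table, the drawing loop, and the legend at once.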
The core problem
In a command-by-command system, there is no separation between
what the chart means (this column controls color) and how it is
drawn (this point is tomato). Because meaning and drawing are
fused, every change means editing drawing instructions by hand.
What we actually want
Notice that when you describe a chart to a colleague, you do not read out drawing commands. You say something like:
"Plot weight against MPG, color the points by cylinder count, and fit a line through each color group."
That sentence is structured. It names a data set, some mappings from columns to visual properties, and a couple of kinds of marks. It says nothing about color lookup tables or legend coordinates — those should be consequences of the description, not things you manage.
What if you could write the chart the way you describe it, and let the system work out the drawing? That is exactly the promise of a grammar of graphics — and the subject of the next page.
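As a preview only, the spoken description above maps almost word for word onto ggplot2 code. This sketch assumes the ggplot2 package is installed and again uses the built-in mtcars data set; treat it as a glimpse of the idea rather than an explanation of each function.

```r
library(ggplot2)  # assumed to be installed

# "Plot weight against MPG, color the points by cylinder count,
#  and fit a line through each color group."
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +  # data + mappings
  geom_point() +                                                   # marks: points
  geom_smooth(method = "lm", se = FALSE, formula = y ~ x)          # marks: one line per color group

print(p)
```

There is no color lookup table and no legend call here: both follow from the mapping of cyl to color.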
In the base R example above, why did we have to build a legend manually with the same colors and shapes as the points?
Base R cannot draw legends at all.
In a command-by-command system the legend is a separate drawing step with no automatic link to the data mapping, so it must be kept in sync by hand.
Legends are only needed for line charts.
The legend updates itself whenever the colors change.
Key takeaways
- Traditional plotting systems are imperative: you issue drawing commands one at a time.
- They do not separate a chart's meaning from its drawing, so the two get tangled.
- As charts grow (more variables, groups, panels, transformations), the code needed to draw them grows even faster and becomes fragile.
- We want to describe charts structurally and let the system handle the drawing. That desire leads directly to the Grammar of Graphics.
Which statement best captures why command-by-command plotting code becomes hard to scale?
Computers are too slow to redraw large charts.
R is a poorly designed language for graphics.
The chart's meaning and its drawing instructions are fused, so every new requirement forces manual edits to low-level drawing steps.
Scatter plots specifically cannot show more than two variables.