What Scales Do
Scales are the translators between data values and visual values — and the source of every axis and legend in a ggplot.
Mappings say which column controls which aesthetic. Scales decide
the actual visual values that result. They are the quiet workhorses
that turn displ = 4.5 into a pixel position and drv = "f" into a
specific green (under the default palette). Every axis and every legend you have ever seen on a
ggplot is a scale showing its work.
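To make the translation concrete, the sketch below builds a small plot and asks ggplot2 what the color scale actually produced. It assumes the mpg dataset that ships with ggplot2 and uses layer_data() to pull out the post-scale values. The names p, scaled, and drv_colors are illustrative, and pairing rows by position assumes layer_data() keeps the input row order for a single-panel point layer.

```r
library(ggplot2)

# The mapping only names columns: displ -> x, hwy -> y, drv -> color.
p <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point()

# layer_data() returns one layer's data after the scales have been applied.
# The color aesthetic is stored in a column named "colour".
scaled <- layer_data(p, 1)

# Pair each original drv value with the color the default scale chose for it.
# Assumption: rows come back in the same order as the rows of mpg.
drv_colors <- unique(data.frame(drv = mpg$drv, colour = scaled$colour))
print(drv_colors)
```

Each of the three drv values should come back paired with one hex color code. The x and y columns of scaled still hold data units; the conversion to physical positions on the page happens later, when the plot is drawn.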
The mapping/scale division of labor
This pairing mirrors the stat/geom split from earlier. One component describes intent; the other carries it out:
- Mapping: "displ controls x." (intent)
- Scale: "the range 1.6–7.0 stretches across the panel, with ticks at 2, 3, 4, ...; here is the axis." (execution)
Every aesthetic has a scale
There is one scale per aesthetic in use. You rarely see them written out because ggplot2 adds defaults automatically, but they are always there:
| Aesthetic | Default scale (example) | Visible as |
|---|---|---|
| x (continuous) | scale_x_continuous() | the x-axis |
| y (continuous) | scale_y_continuous() | the y-axis |
| color (discrete) | scale_color_hue() | a color legend |
| color (continuous) | scale_color_continuous() | a color bar |
| size | scale_size() | a size legend |
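One way to confirm that the defaults are present even when no scale is written is to inspect a built plot. The sketch below assumes the mpg dataset and uses layer_scales(), which returns the x and y scales for a layer. The get_limits() call is a method on the scale object itself, so treat it as an inspection aid rather than everyday API.

```r
library(ggplot2)

# No scale_*() call appears anywhere in this plot.
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# layer_scales() builds the plot and returns its position scales.
default_scales <- layer_scales(p)

# Both scales exist, supplied by ggplot2 as defaults.
print(class(default_scales$x)[1])
print(class(default_scales$y)[1])

# The x scale has been trained on the data, so it knows the range of displ.
print(default_scales$x$get_limits())
```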
The naming pattern is highly regular: scale_<aesthetic>_<type>().
Once you see it, you can guess almost any scale function's name.
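The pattern can be checked against what ggplot2 actually exports. The sketch below uses a small helper, scale_name(), which is hypothetical and exists only for this illustration. It assembles a few guesses from an aesthetic and a type, then looks them up in the package's export list.

```r
library(ggplot2)

# Hypothetical helper: assemble a name from the scale_<aesthetic>_<type> pattern.
scale_name <- function(aesthetic, type) {
  paste0("scale_", aesthetic, "_", type)
}

guesses <- c(
  scale_name("x", "log10"),
  scale_name("y", "reverse"),
  scale_name("color", "manual"),
  scale_name("fill", "gradient"),
  scale_name("size", "continuous")
)

# Check each guess against the functions ggplot2 exports.
found <- guesses %in% getNamespaceExports("ggplot2")
print(data.frame(guess = guesses, exists = found))
```

Not every aesthetic and type combination exists, which is why the pattern lets you guess almost any name rather than all of them.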
Making a hidden scale visible
These two plots are identical — the second just writes out the scale ggplot2 was adding silently:
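A minimal sketch of such a pair, assuming the mpg dataset and the displ and hwy columns used earlier. The object names p_implicit and p_explicit are illustrative.

```r
library(ggplot2)

# Plot 1: no scale written; ggplot2 supplies scale_x_continuous() itself.
p_implicit <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# Plot 2: the same plot with the default x scale spelled out.
p_explicit <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  scale_x_continuous()

# Draw both; the output should look the same.
print(p_implicit)
print(p_explicit)
```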
You add a scale explicitly only when you want to change something its default does — the breaks, the limits, the labels, the transformation, the palette.
Scales control four things
A scale governs how data becomes visual and how that mapping is labeled. Concretely, you reach for a scale to set:
- Limits: the range of data the scale covers.
- Breaks: where ticks / legend keys appear.
- Labels: the text at those breaks (e.g. $1,000 instead of 1000).
- Transformation: e.g. a log scale.
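As an illustration, the sketch below sets the first three of these on the x scale of a scatterplot. It assumes the mpg dataset, where displ is engine displacement in litres. The specific limits, breaks, and label text are made up for the example. The limits are chosen to contain every displ value, so no points are dropped.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  scale_x_continuous(
    limits = c(1, 8),                 # range the scale covers
    breaks = c(2, 4, 6),              # where the ticks appear
    labels = c("2 L", "4 L", "6 L")   # text shown at those ticks
  )

print(p)
```

A transformation would typically be applied by swapping in a transformed scale such as scale_x_log10() instead.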
The points did not move in meaning; you only changed how the x scale presents itself — which ticks, with which labels, over which range.
Why this matters conceptually
Scales are where "data space" meets "visual space." Keeping this component separate means you can completely restyle an axis or legend — log-transform it, relabel it, recolor it — without touching the data, the mapping, or the geom. Each component stays independent.
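That independence can be sketched by keeping one base plot and adding a different scale each time. The example assumes the mpg dataset. The color choices and the "mpg" label suffix are arbitrary, and the object names are illustrative.

```r
library(ggplot2)

# Data, mapping, and geom are fixed once.
base <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point()

# Each variant changes only a scale.
log_x <- base + scale_x_log10()

recolored <- base + scale_color_manual(
  values = c("4" = "black", "f" = "orange", "r" = "steelblue")
)

relabeled <- base + scale_y_continuous(
  labels = function(v) paste(v, "mpg")
)

print(log_x)
print(recolored)
print(relabeled)
```

In all three variants the data frame, the aes() mapping, and the geom_point() layer are the ones defined in base; only the presentation of one aesthetic differs.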
What is the role of a scale in ggplot2, as distinct from a mapping?
- The mapping draws the marks; the scale chooses the geom.
- They are two words for the same thing.
- The mapping says which column controls an aesthetic; the scale translates those data values into actual visual values and provides the matching axis or legend.
- The scale loads the data into ggplot2.
Adding scale_x_continuous() explicitly to a plot that already had a continuous x changes nothing visually. Why?
- Because scale_x_continuous() is broken.
- Because scales only affect color, never position.
- Because ggplot2 was already adding that scale by default; writing it out just makes the implicit default explicit.
- Because the data has no continuous columns.
Key takeaways
- Scales translate data values into visual values — positions, colors, sizes — and produce the axes and legends.
- There is one scale per aesthetic; ggplot2 adds defaults automatically, named scale_<aesthetic>_<type>().
- You add a scale explicitly to change limits, breaks, labels, or transformations, or the palette for color/fill.
- Scales keep "data space" and "visual space" cleanly separated, so you can restyle one without touching the others.