Color and Fill Scales
How ggplot2 turns data into color — discrete palettes, continuous gradients, the color vs fill distinction, and choosing perceptually honest scales.
Color is the most powerful — and most abused — aesthetic. ggplot2 handles it through color and fill scales, and understanding them turns color from a guessing game into a deliberate choice. This is a foundational page, so it has extra practice questions.
color vs. fill, one more time
Two color aesthetics exist because marks have an outline and an interior:
- color: the outline of points, the stroke of lines, the border of bars.
- fill: the interior of areas such as bars, boxes, ribbons, and tiles.
Each has its own family of scales: scale_color_*() and scale_fill_*(). Use the one that matches the aesthetic you mapped.
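As a minimal sketch of matching the scale family to the mapped aesthetic, the two plots below use the mpg dataset that ships with ggplot2. The choice of variables and of the "Dark2" palette is illustrative, not something the page prescribes.

```r
library(ggplot2)

# Bars are filled areas, so the mapping is fill and the scale is scale_fill_*().
p_fill <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar() +
  scale_fill_brewer(palette = "Dark2")

# Points are drawn with color, so the scale is scale_color_*().
p_color <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  scale_color_brewer(palette = "Dark2")
```

Swapping the two scale calls would leave both plots on their default palettes, because each scale would then target an aesthetic that is not mapped.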
Discrete vs. continuous color
The single biggest fork: is the mapped variable categorical or continuous? It determines the kind of color scale entirely.
Discrete: distinct hues for distinct categories
Unordered categories have no inherent ranking, so use a qualitative palette where colors are easy to tell apart but none looks "bigger":
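A small sketch of a qualitative palette, assuming the mpg dataset bundled with ggplot2 and the ColorBrewer "Set2" palette (both chosen here for illustration):

```r
library(ggplot2)

# class is categorical (7 vehicle classes), so each level gets its own hue.
p_discrete <- ggplot(mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point() +
  scale_color_brewer(palette = "Set2")
```

"Set2" offers eight colors, which is enough for the seven classes; a variable with more levels than the palette has colors would need a different approach.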
Continuous: a gradient for ordered magnitude
A numeric variable has order and magnitude, so use a gradient where the color changes smoothly from low to high:
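A minimal sketch of a continuous gradient, again assuming the mpg dataset from ggplot2. Here the numeric variable cty (city miles per gallon) is mapped to color and drawn with the continuous viridis scale:

```r
library(ggplot2)

# cty is numeric, so the legend becomes a color bar rather than discrete keys.
p_continuous <- ggplot(mpg, aes(x = displ, y = hwy, color = cty)) +
  geom_point() +
  scale_color_viridis_c()
```

The _c suffix selects the continuous variant; scale_color_viridis_d() is the discrete counterpart.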
Why viridis?
The viridis palettes are designed to be perceptually uniform (equal data steps look like equal color steps), to stay legible for viewers with color-vision deficiency, and to survive greyscale printing. A classic rainbow palette fails all three: its uneven brightness creates misleading bright bands, it is hostile to color-blind readers, and it does not reduce to a steadily darkening or lightening grey ramp. Prefer viridis for continuous data.
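One rough way to see the greyscale point is to check whether brightness rises steadily along a palette. The sketch below is an approximation, not a full perceptual model: it uses Rec. 709 luma weights on sRGB values as a stand-in for perceived brightness, and the helper names luma() and is_monotonic() are made up for this example. It assumes the viridisLite package, which is installed alongside ggplot2.

```r
# Approximate brightness of each color: weighted sum of its R, G, B channels.
luma <- function(cols) {
  rgb <- grDevices::col2rgb(cols)              # 3 x n matrix of 0-255 values
  as.numeric(c(0.2126, 0.7152, 0.0722) %*% rgb)
}

# TRUE when brightness strictly increases from the first color to the last.
is_monotonic <- function(x) all(diff(x) > 0)

viridis_luma <- luma(viridisLite::viridis(9))
rainbow_luma <- luma(grDevices::rainbow(9))

viridis_ok <- is_monotonic(viridis_luma)   # brightness climbs steadily
rainbow_ok <- is_monotonic(rainbow_luma)   # brightness rises and falls
```

A palette whose brightness rises and falls maps different data values to the same grey, which is why it becomes ambiguous once printed without color.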
Build-your-own gradients
For full control, scale_color_gradient() interpolates between two colors, and scale_color_gradient2() builds a diverging scale around a meaningful midpoint (great for "above/below average"):
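A sketch of both functions, assuming the mpg dataset from ggplot2. The specific colors and the use of mean highway mileage as the midpoint are illustrative choices:

```r
library(ggplot2)

# Two-color sequential gradient: low values pale, high values dark.
p_seq <- ggplot(mpg, aes(x = displ, y = hwy, color = cty)) +
  geom_point() +
  scale_color_gradient(low = "grey85", high = "darkblue")

# Diverging gradient centered on the average highway mileage:
# below-average cars lean blue, above-average cars lean red.
avg_hwy <- mean(mpg$hwy)

p_div <- ggplot(mpg, aes(x = displ, y = cty, color = hwy)) +
  geom_point() +
  scale_color_gradient2(
    low = "steelblue", mid = "white", high = "firebrick",
    midpoint = avg_hwy
  )
```

The midpoint argument defaults to 0, so it needs to be set explicitly whenever the meaningful center is something else, such as an average or a target.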
A diverging scale only makes sense when the midpoint means something (zero, an average, a target). Using one for data without a natural center misleads the reader into seeing a division that is not there.
Manual scales: exact control
When you need specific brand or semantic colors, map them by hand with scale_*_manual():
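A minimal sketch of a manual scale, assuming the mpg dataset from ggplot2. The hex codes and legend labels below are placeholders standing in for whatever brand or semantic colors a project actually requires:

```r
library(ggplot2)

# Hypothetical color assignments; naming the vector by level ties each
# color to a specific category regardless of level order.
drv_colors <- c("4" = "#1B9E77", "f" = "#D95F02", "r" = "#7570B3")

p_manual <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  scale_color_manual(
    values = drv_colors,
    labels = c("4" = "Four-wheel", "f" = "Front", "r" = "Rear")
  )
```

Using a named vector rather than an unnamed one keeps the mapping stable if a level is dropped or the factor order changes.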
Color is not free
Mapping color always asks the reader to decode a legend. Do not map color to a variable you could show through position instead, and avoid encoding a continuous variable with a rainbow of hues. The right color scale clarifies; the wrong one invents patterns that are not in the data.
You map a categorical variable to color. Which kind of color scale is appropriate?
A continuous gradient from light to dark.
A qualitative palette of distinct, easily distinguished hues (e.g. scale_color_brewer(palette = "Dark2")).
A diverging scale centered on a midpoint.
No scale at all; categorical variables cannot use color.
Why are the viridis palettes recommended for mapping a continuous variable to color?
They use the brightest possible colors.
They are the only palettes ggplot2 includes.
They are perceptually uniform (equal data steps look like equal color steps) and remain readable for color-blind viewers and in greyscale.
They automatically reverse direction for negative numbers.
When is a diverging color scale (e.g. scale_color_gradient2()) the right choice?
Always, for any continuous variable.
For categorical variables with many levels.
When the variable has a meaningful midpoint — like zero, an average, or a target — so values above and below it should read as opposite directions.
Whenever you want the plot to look more colorful.
You mapped fill = drv on a bar chart but tried to style it with scale_color_brewer() and nothing changed. Why?
scale_color_brewer does not exist.
Brewer palettes do not work on bars.
The bars use the fill aesthetic, so they are governed by scale_fill_*; a scale_color_* controls the unused outline aesthetic instead.
You needed to set color = drv as a constant.
Key takeaways
- color styles outlines/lines; fill styles interiors. Match the scale family (scale_color_* vs scale_fill_*) to the aesthetic you mapped.
- Categorical color → a qualitative palette of distinct hues; continuous color → a gradient (prefer viridis).
- Diverging scales suit data with a meaningful midpoint only.
- Use scale_*_manual() for exact, semantic colors.
- Color always costs the reader decoding effort, so map it only when it earns its place.