
The Seven Components

The seven building blocks of every ggplot — data, mappings, geometries, statistics, scales, coordinates, and facets — plus the theme that styles them.

This page is the conceptual spine of the whole course. Read it slowly. Everything after this is a deep dive into one of these components.

The Grammar of Graphics says that every chart is assembled from a small set of components. ggplot2 organizes them into seven components, plus a theme that styles the result.

Let us meet each one with a single sentence and a concrete example using the built-in mpg data set (fuel-economy records for 234 cars).

1. Data

The table of values the chart describes. Without data there is nothing to show. In ggplot2 the data is almost always a data frame (or tibble) — rows are observations, columns are variables.
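As a quick way to see what "rows are observations, columns are variables" means, the sketch below inspects mpg directly. It assumes only that ggplot2 is installed, since mpg ships with that package; the choice of columns to preview is ours.

```r
library(ggplot2)  # the mpg data set ships with ggplot2

# One row per car model, one column per recorded variable.
nrow(mpg)    # number of observations (cars)
names(mpg)   # the variables available for mapping

# Preview the three columns used throughout this page.
head(mpg[, c("displ", "hwy", "drv")])
```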


2. Aesthetic mappings

The rules that connect columns to visual properties. "Put displ on the x-axis. Put hwy on the y-axis. Let drv control color." A mapping is a correspondence, written inside aes().
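A minimal sketch of that idea, assuming the mpg columns named above. It contrasts a mapped color (driven by a column, inside aes()) with a fixed color (a constant, outside aes()). The variable names and the "steelblue" value are illustrative choices, not part of the original page.

```r
library(ggplot2)

# aes() only records which column feeds which visual property;
# it does not choose any actual colors.
mapping <- aes(x = displ, y = hwy, color = drv)

# Mapped: the color of each point depends on its drv value.
p_mapped <- ggplot(mpg, mapping) + geom_point()

# Set: color is a constant given outside aes(), so no column is involved.
p_fixed <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(color = "steelblue")

p_mapped
p_fixed
```

In the first plot a legend appears because a column drives the color; in the second there is nothing to explain, so no legend is drawn.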

3. Geometries (geoms)

The kind of mark used to represent each row — points, lines, bars, boxes. The geom is what you literally see on the canvas.
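To see the geom as an independent choice, the sketch below keeps the data and mappings fixed and swaps only the geometry. The pairing of class with hwy is our own illustrative choice.

```r
library(ggplot2)

# Same data, same mappings; only the mark changes.
base <- ggplot(mpg, aes(x = class, y = hwy))

p_points <- base + geom_point()    # one point per car
p_boxes  <- base + geom_boxplot()  # one box per vehicle class

p_points
p_boxes
```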

4. Statistics (stats)

A transformation applied to the data before it is drawn. Some geoms draw the raw data (a scatter plot's points are the rows themselves). Others summarize first: a histogram bins and counts; a boxplot computes quartiles. That summarizing step is the statistic.
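The summarizing step can be made visible with layer_data(), which returns the data a layer actually draws. This is a sketch; the binwidth of 5 is an arbitrary choice for illustration.

```r
library(ggplot2)

# geom_histogram() does not draw the 234 rows; it draws bins and counts.
p_hist <- ggplot(mpg, aes(x = hwy)) + geom_histogram(binwidth = 5)

# The data after the statistic has run: one row per bin.
binned <- layer_data(p_hist)
binned[, c("xmin", "xmax", "count")]

# Every car lands in exactly one bin, so the counts add up to the row count.
sum(binned$count) == nrow(mpg)

p_hist
```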

5. Scales

The translation from data values into visual values. A scale decides that displ = 2.0 sits at a particular x-pixel, or that drv = "f" becomes a particular shade of blue. Axes and legends are the visible face of scales.
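One way to watch a scale at work is to compare the values in the data with the values that reach the canvas. In the sketch below, the "Set1" palette is an assumption made for illustration; any ColorBrewer palette name would do.

```r
library(ggplot2)

p_scaled <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  scale_color_brewer(palette = "Set1")  # palette chosen for illustration

# Data values: the three drive types.
sort(unique(mpg$drv))

# Visual values: the colors the scale assigned to those drive types.
unique(layer_data(p_scaled)$colour)

p_scaled
```

The mapping (color = drv) stays the same whichever palette is used; only the scale changes which colors appear.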

6. Coordinates

The space the marks are placed in. Usually Cartesian (x across, y up), but it could be flipped, or polar — which, as we saw, turns a stacked bar into a pie.
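The bar-to-pie relationship can be sketched as follows. The empty-string x value is a common trick for producing a single stacked bar and is our choice here, not something the original page specifies.

```r
library(ggplot2)

# A single stacked bar: one segment per drive type.
bar <- ggplot(mpg, aes(x = "", fill = drv)) + geom_bar(width = 1)

# The same layer in polar coordinates, with bar height mapped to angle.
pie <- bar + coord_polar(theta = "y")

bar
pie
```

Nothing about the data, mappings, or geometry changed between the two plots; only the coordinate system did.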

7. Facets

A rule for splitting the data into a grid of small panels, one per group, sharing axes so panels are comparable at a glance.
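A minimal sketch of faceting, assuming we split by drv as elsewhere on this page:

```r
library(ggplot2)

# One panel per drive type; by default the panels share their axes.
p_facets <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  facet_wrap(~ drv)

# Each drawn row is tagged with the panel it belongs to.
table(layer_data(p_facets)$PANEL)

p_facets
```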

And the theme

The theme is everything not tied to data: fonts, background color, gridline style, legend position. It changes how a plot looks without changing what it means.
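The sketch below applies a theme to an existing plot and checks that the plotted positions are untouched. The legend position is an arbitrary styling choice made for illustration.

```r
library(ggplot2)

p_plain <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) + geom_point()

# Styling only: a different look, identical data.
p_styled <- p_plain +
  theme_minimal() +
  theme(legend.position = "bottom")

# The point positions are the same in both plots.
all.equal(
  layer_data(p_plain)[, c("x", "y")],
  layer_data(p_styled)[, c("x", "y")]
)

p_styled
```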

Seeing all seven at once

Here is one plot. Run it, then read the annotated breakdown below — try to point to where each component lives.
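A sketch of such a plot, assembled from the lines listed in the breakdown. The breakdown leaves the arguments of scale_color_brewer() and coord_cartesian() unspecified, so the palette and y-limits used here are assumptions chosen only to make the code runnable.

```r
library(ggplot2)

p_all <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +  # data + mappings
  geom_point() +                                # geometry
  geom_smooth(method = "lm") +                  # geometry carrying a statistic
  scale_color_brewer(palette = "Dark2") +       # scale (palette is an assumption)
  coord_cartesian(ylim = c(10, 45)) +           # coordinates (limits are an assumption)
  facet_wrap(~ drv) +                           # facets
  theme_minimal()                               # theme

p_all
```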

Line                                   Component
ggplot(mpg, ...)                       Data
aes(x = displ, y = hwy, color = drv)   Mappings
geom_point()                           Geometry
geom_smooth(method = "lm")             Geometry carrying a Statistic
scale_color_brewer(...)                Scale
coord_cartesian(...)                   Coordinates
facet_wrap(~ drv)                      Facets
theme_minimal()                        Theme

You only write what you need

You rarely specify all seven. ggplot2 fills in sensible defaults: every geom comes with a default statistic, and an unspecified scale, coordinate system, facet layout, or theme is chosen for you. A minimal plot is often just data + mappings + one geometry, with the other components quietly taking their defaults.
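To make the defaults concrete, the sketch below writes the same scatter plot twice: once minimally, and once with the components that ggplot2 would otherwise choose for a continuous x and y spelled out. The variable names are ours.

```r
library(ggplot2)

# Minimal: data + mappings + one geometry.
p_minimal <- ggplot(mpg, aes(x = displ, y = hwy)) + geom_point()

# The same plot with the default components written explicitly.
p_explicit <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(stat = "identity") +  # default statistic for points: draw rows as-is
  scale_x_continuous() +           # default scales for continuous columns
  scale_y_continuous() +
  coord_cartesian() +              # default coordinate system
  facet_null() +                   # default facet: a single panel
  theme_gray()                     # default theme

p_minimal
p_explicit
```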

This is the most important page in the course, so it gets extra practice. Take your time with these.

Question (select one)

Which component decides that one column controls the color of the marks (as opposed to deciding which actual colors are used)?

The geometry.

The aesthetic mapping.

The theme.

The coordinate system.

Question (select one)

A histogram needs to bin a continuous variable and count how many values fall in each bin before any bars can be drawn. Which component is responsible for that binning-and-counting step?

The geometry.

The scale.

The statistic.

The facet.

Question (select one)

Why can a minimal ggplot often be written as just data + one mapping + one geometry, with nothing else specified?

ggplot2 cannot use scales, coordinates, or themes unless you import extra packages.

Those other components do nothing useful.

ggplot2 supplies sensible defaults for the unspecified components (a default scale, Cartesian coordinates, a default theme).

Minimal plots secretly disable the other components.

Key takeaways

  • Every ggplot is built from data, mappings, geometries, statistics, scales, coordinates, and facets, with a theme for styling.
  • Mappings bind columns to visual properties; scales decide the actual visual values; the two are distinct.
  • Statistics transform data before drawing; some geoms draw raw data, others summarize first.
  • You only write the components you want to change — the rest take sensible defaults.
