Principles of Visualization

Before you learn a plotting library, learn what makes a chart good. A short tour of the timeless rules: encode well, declutter ruthlessly, tell one story.

A good chart is a thought made visible. A bad chart is a distraction at best, a lie at worst. Both can use the same software, the same data, sometimes even the same chart type. The difference lies in the choices, and most of those choices follow a handful of principles refined over a century.

This page is about those principles. We won't write much code on this page. The next page (ggplot2) gives you the tools to apply what you learn here.

Charts are encodings

Every chart maps a variable in the data to a visual property on the page. The toolbox of visual properties is small:

| Visual property | Best for | Notes |
|---|---|---|
| Position (x, y) | numeric variables, comparison | by far the most accurate channel |
| Length | numeric, compared to a common baseline | bars |
| Angle / area | numeric (poorly) | avoid pie charts when accuracy matters |
| Color (hue) | categorical | about 7 distinct values is a practical ceiling |
| Color (intensity) | numeric (ordered) | sequential scales |
| Shape | categorical | low precision; limit to a few values |
| Size | numeric | low precision; only for emphasis |

The most important rule: encode your most important variable in position. People read position far more accurately than color or size, which they judge only crudely.
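As a rough illustration of what "mapping a variable to a visual property" looks like in practice, here is a minimal sketch using ggplot2 (covered on the next page) and R's built-in `mtcars` data. The choice of variables is an assumption made for illustration: the two numeric variables of interest go on position, and a secondary categorical variable goes on color hue.

```r
library(ggplot2)

# mtcars ships with R: wt = weight (1000 lbs), mpg = miles per gallon,
# cyl = number of cylinders (4, 6, or 8).
p_encode <- ggplot(
  mtcars,
  aes(
    x = wt,               # most important variables -> position
    y = mpg,
    color = factor(cyl)   # secondary categorical variable -> hue
  )
) +
  geom_point(size = 2)
```

Printing `p_encode` renders the chart. Only three hues are needed here, well under the limit in the table above.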

Match the chart type to the question

Different questions call for different charts:

| Question | Best chart |
|---|---|
| How is one numeric variable distributed? | histogram, density |
| How do two numeric variables relate? | scatterplot |
| How does a numeric variable differ by group? | boxplot, violin, dot plot |
| How do counts compare across categories? | bar chart |
| How does something change over time? | line chart |
| How is a whole made of parts? | bar chart (yes, almost always better than a pie) |

Bar charts are dramatically underrated. Pie charts and 3D charts are wildly overrated. The eye is bad at angles and worse at volumes.
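Most rows of the table correspond to a single ggplot2 geom. The sketch below is one possible pairing, not the only one; it assumes ggplot2 is installed and uses the `mtcars` dataset from base R and the `economics` dataset bundled with ggplot2.

```r
library(ggplot2)

# One numeric variable's distribution -> histogram
p_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10)

# Two numeric variables -> scatterplot
p_scatter <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Numeric variable by group -> boxplot
p_box <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot()

# Counts across categories -> bar chart (geom_bar counts rows per category)
p_bar <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar()

# Change over time -> line chart
p_line <- ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line()
```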

Less is more: the data-ink ratio

Edward Tufte coined the term data-ink ratio: the proportion of a chart's ink that is actually showing data. Maximize it. Ruthlessly remove:

  • 3D effects on 2D data
  • Background colors that don't encode anything
  • Drop shadows
  • Heavy gridlines (light is fine; thick is noise)
  • Decorative icons
  • Redundant labels and legends

A chart's job is to show data. Everything else is competition.
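In ggplot2, much of this decluttering can be done through the theme. The following is a small sketch under the assumption that a light theme with fewer gridlines suits the chart; which elements to drop is a judgment call, not a fixed recipe.

```r
library(ggplot2)

# Default look: grey background, major and minor gridlines
p_default <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar()

# Decluttered: no background fill, only light horizontal gridlines
p_clean <- p_default +
  theme_minimal() +
  theme(
    panel.grid.minor   = element_blank(),  # drop minor gridlines
    panel.grid.major.x = element_blank()   # vertical lines add nothing to bars
  )
```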

Use color with intention

Color is powerful and easily abused.

  • Categorical hues (red, blue, green, ...): use to distinguish unordered categories. Keep the count low. Most people can tell about 7 colors apart, and far fewer if the chart is small or the points are tiny.
  • Sequential color scales (light → dark single hue): use for ordered numeric variables. Light = low, dark = high.
  • Diverging scales (blue → white → red, etc.): use when the variable has a meaningful midpoint (zero, average).

Two additional rules:

  1. Color is not free. Each color you add is one more thing the reader has to translate. If you can encode something structurally instead (with position, faceting, or order), prefer that.
  2. Be colorblind-friendly. Avoid red/green as the only distinguishing pair. Use palettes like viridis or RColorBrewer's colorblind-safe sets; modern R plotting tools make this almost free.
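The three kinds of scale can be sketched with ggplot2's built-in scale functions. The variables chosen and the derived `mpg_dev` column are assumptions for illustration only. "Dark2" is one of the qualitative palettes that RColorBrewer flags as colorblind-friendly, and `direction = -1` flips viridis so that light means low and dark means high, matching the convention above.

```r
library(ggplot2)

cars <- mtcars
# Deviation from the average mpg: a variable with a meaningful midpoint (0)
cars$mpg_dev <- cars$mpg - mean(cars$mpg)

# Categorical hue: unordered groups, few distinct colors
p_cat <- ggplot(cars, aes(wt, mpg, color = factor(cyl))) +
  geom_point() +
  scale_color_brewer(palette = "Dark2")

# Sequential scale: ordered numeric variable, light = low, dark = high
p_seq <- ggplot(cars, aes(wt, mpg, color = hp)) +
  geom_point() +
  scale_color_viridis_c(direction = -1)

# Diverging scale: centered on the meaningful midpoint
p_div <- ggplot(cars, aes(wt, hp, color = mpg_dev)) +
  geom_point() +
  scale_color_gradient2(low = "blue", mid = "white", high = "red", midpoint = 0)
```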

Order matters

If your categories don't have a natural order (alphabetic doesn't count), sort them by the thing you're showing. A bar chart of sales by region, sorted by sales, is dramatically clearer than the same chart sorted alphabetically. The eye can read "decreasing in this direction" without effort.
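One way to do this in R is to reorder the factor levels before plotting. The sketch below uses base R's `reorder()`; the regions and sales figures are made up purely for illustration.

```r
library(ggplot2)

# Hypothetical sales figures, invented for this example
sales <- data.frame(
  region = c("East", "North", "South", "West", "Central"),
  sales  = c(120, 340, 90, 210, 275)
)

# Order the levels by the plotted value (negated, so largest comes first)
sales$region <- reorder(sales$region, -sales$sales)

p_sorted <- ggplot(sales, aes(x = region, y = sales)) +
  geom_col()

levels(sales$region)
# "North" "Central" "West" "East" "South"
```

Without the `reorder()` call, ggplot2 would place the regions alphabetically, which tells the reader nothing about the sales.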

Aspect ratio and scale

  • Don't truncate axes for bar charts. (Truncating amplifies small differences and is a classic chart crime.)
  • Don't force a zero baseline onto line charts when zero is not a meaningful value. Starting the y-axis at zero is the rule for bars but a judgment call for lines.
  • Use the right aspect ratio so trends look the way they actually are. Stretching a chart wide flattens trends; squishing it tall exaggerates them.
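These choices can be made explicit in ggplot2. The sketch below assumes the `economics` dataset bundled with ggplot2 and picks arbitrary aspect ratios to show the effect; none of the specific numbers are recommendations.

```r
library(ggplot2)

# Line chart: by default the y-axis spans the data range, not zero
p_line <- ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line()

# Opt in to a zero baseline only when zero is meaningful for the reader
p_line_zero <- p_line + expand_limits(y = 0)

# Same data, different shapes: wide flattens the trend, tall exaggerates it
p_wide <- p_line + theme(aspect.ratio = 1 / 3)  # height = 1/3 of width
p_tall <- p_line + theme(aspect.ratio = 2)      # height = 2x width
```

Bars drawn with `geom_col()` or `geom_bar()` start at zero by default, so the first rule is followed unless the axis limits are deliberately overridden.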

Title, axis labels, units

This sounds obvious. It is, and yet many beginner charts skip at least one:

  • Title that says what the chart is showing, not just the variable names ("MPG falls as weight rises" is better than "mpg vs wt")
  • Axis labels with units ("Weight (1000s of lbs)" not just "wt")
  • Legend that doesn't restate something already obvious

If a chart needs paragraphs of caption to make sense, the chart isn't doing its job.
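In ggplot2 all of these are set with `labs()`. A minimal sketch, using the `mtcars` example from the list above (in that dataset `wt` is recorded in thousands of pounds and `mpg` in miles per US gallon):

```r
library(ggplot2)

p_labeled <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(
    title = "MPG falls as weight rises",        # the message, not "mpg vs wt"
    x     = "Weight (1000s of lbs)",            # variable name plus units
    y     = "Fuel economy (miles per gallon)"
  )
```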

One chart, one message

Cram more than one main message into a chart, and most readers will absorb none of them. If you have two things to say, make two charts. The eye is a serial reader; charts are best when they are crisp and singular.

A bad chart and a good chart, same data — look at them and pick the one you'd put in a report.

[Code block: bad chart (R 4.6.0)]
[Code block: good chart (R 4.6.0)]
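A pair like this could be sketched as follows. This is an illustrative reconstruction, not the page's original code: it assumes ggplot2 and uses the first ten cars of the built-in `mtcars` data, with `cars10` as a hypothetical helper name.

```r
library(ggplot2)

# Assumed example data: fuel economy of the first ten cars in mtcars
cars10 <- data.frame(model = rownames(mtcars), mpg = mtcars$mpg)[1:10, ]

# "Bad": alphabetical order, one color per bar, cramped vertical bars,
# no title, raw column names as axis labels
p_bad <- ggplot(cars10, aes(x = model, y = mpg, fill = model)) +
  geom_col()

# "Good": sorted by value, horizontal, single color, informative labels
p_good <- ggplot(cars10, aes(x = reorder(model, mpg), y = mpg)) +
  geom_col(fill = "steelblue") +
  coord_flip() +                       # horizontal bars so names read naturally
  labs(
    title = "The Merc 240D is the most fuel-efficient of these ten cars",
    x = NULL,                          # model names need no axis title
    y = "Miles per gallon"
  ) +
  theme_minimal()
```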

What changed:

  • Sorted (so the order encodes ranking)
  • Horizontal (so the names fit and read naturally)
  • Single color (no false categorical distinctions)
  • Title that says what the chart is about
  • Axis label that names what is measured

The data is the same. The reader's experience is not.

Test your understanding

Question (select one)

Which visual property is read with the highest accuracy by the human eye?

  • Color hue
  • Position along a common scale
  • Area
  • Shape

Question (select one)

Why are pie charts often a poor choice?

  • They take more pixels.
  • They cannot show more than two categories.
  • The eye reads angles and areas poorly; bar charts (which use position/length) are usually clearer and more accurate.
  • R does not support them.

Question (select one)

If you have a bar chart with five regions and no inherent order, you should:

  • Sort alphabetically.
  • Sort by region ID.
  • Sort by the value you're plotting, so the chart's order itself encodes a ranking.
  • Leave them in a random order.

The principles are language-agnostic. The next page is about the language that makes them easy to apply in R: the grammar of graphics as implemented in ggplot2.
