Bars and Histograms

Counting and binning in depth — position adjustments, bin width, and why a histogram is a bar chart of a binned statistic.

Bars are everywhere in analytics, and almost every bar chart is really a statistic made visible. This page goes deep on the two most common counting stats — stat_count (bars) and stat_bin (histograms) — and the position adjustments that decide how bars share space.

A histogram is binning + counting

A histogram answers one question: how is a single continuous variable distributed? The stat slices the range of x into equal-width bins and counts how many values land in each.
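To make the two steps concrete, here is a minimal sketch that bins and counts by hand in base R, then draws the counts as bars. It assumes the `mpg` data set that ships with ggplot2 and an arbitrary bin width of 5; the names `breaks`, `bins`, and `counts` are illustrative only. ggplot2's `stat_bin` chooses its bin boundaries by its own rules, so the per-bin counts will not necessarily match `geom_histogram(binwidth = 5)`.

```r
library(ggplot2)  # used for the mpg example data and the final plot

x <- mpg$hwy

# Step 1: slice the range of x into equal-width bins (width 5 is an assumption).
breaks <- seq(floor(min(x) / 5) * 5, ceiling(max(x) / 5) * 5, by = 5)
bins <- cut(x, breaks = breaks, include.lowest = TRUE)

# Step 2: count how many values land in each bin.
counts <- table(bins)
counts

# Drawing those counts as bars gives a hand-made histogram.
ggplot(as.data.frame(counts), aes(x = bins, y = Freq)) +
  geom_col()
```

Every value falls into exactly one bin, so the counts always add up to the number of rows in the data.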

The single most important knob is the bin width — it changes the story the histogram tells:
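As a hedged illustration, the sketch below draws the same variable with a narrow and a wide bin width. The choice of `mpg$hwy` and the widths 1 and 5 are assumptions made for this example, not values taken from the original page.

```r
library(ggplot2)

# Narrow bins: each 1 mpg step gets its own bar, so the shape tends to look jagged.
p_narrow <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 1)

# Wide bins: 5 mpg per bar gives a smoother shape but can blur finer structure.
p_wide <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 5)

p_narrow
p_wide
```

Both plots count the same rows; only the slicing differs.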

Always choose your bins consciously

ggplot2 defaults to 30 bins and prints a message telling you to pick a better value. The default is rarely ideal. Set bins = or binwidth = deliberately — the same data can look unimodal or jagged depending on this one choice.
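The two ways of setting the binning can be sketched as follows, again assuming the `mpg` data set; the values 20 and 2 are arbitrary examples rather than recommendations.

```r
library(ggplot2)

base <- ggplot(mpg, aes(x = hwy))

# Default: 30 bins, with a message when the plot is built.
p_default <- base + geom_histogram()

# Option 1: fix the number of bins.
p_bins <- base + geom_histogram(bins = 20)

# Option 2: fix the width of each bin in data units (here, 2 mpg per bin).
p_width <- base + geom_histogram(binwidth = 2)

p_bins
p_width
```

If both arguments are supplied, `binwidth` takes precedence over `bins`. `binwidth` is usually easier to reason about because it is expressed in the units of the data.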

Bar charts: counts across categories

geom_bar() is the categorical cousin: it counts rows per category. Mapping a second categorical variable to fill splits each bar, and now we must decide how the sub-bars share space. That decision is the position adjustment.
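A minimal sketch of both ideas, assuming the `mpg` data set with `class` as the category and `drv` as the second variable (the same pairing the quiz below uses):

```r
library(ggplot2)

# geom_bar() uses stat_count by default: one bar per class, height = number of rows.
p_class <- ggplot(mpg, aes(x = class)) +
  geom_bar()

# The same counts computed by hand, for comparison.
table(mpg$class)

# Mapping drv to fill splits each bar into one segment per drivetrain.
p_split <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar()

p_class
p_split
```

With no `position` argument, the split bars are stacked.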

Position adjustments: stack, dodge, fill

When bars (or their sub-pieces) would occupy the same place, a position adjustment controls the arrangement. For bars, the three main options are stack, dodge, and fill.

See all three on the same data:
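The sketch below is one way to do that, assuming the `mpg` data set with `class` on the x-axis and `drv` mapped to fill; only the `position` argument changes between the three plots.

```r
library(ggplot2)

base <- ggplot(mpg, aes(x = class, fill = drv))

# stack (the default for geom_bar): segments pile up to the category total.
p_stack <- base + geom_bar(position = "stack")

# dodge: segments sit side by side within each category.
p_dodge <- base + geom_bar(position = "dodge")

# fill: segments are stacked and rescaled so every bar has height 1.
p_fill <- base + geom_bar(position = "fill")

p_stack
p_dodge
p_fill
```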

Each position answers a different question from the same data:

  • stack — "what is the total, and its composition?"
  • dodge — "how do the groups compare within each category?"
  • fill — "what is the proportion mix, ignoring totals?"

This is the grammar again: you do not switch chart types, you switch one position argument and the chart re-poses to answer a new question.

Position is its own grammar component

Position adjustments also apply beyond bars. `position = "jitter"` on points nudges overlapping dots apart so you can see density — `geom_jitter()` is just `geom_point(position = "jitter")`.
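A short sketch of the jitter position on points, assuming the `mpg` data set and the `cty` and `hwy` columns; the jitter amounts and the seed are arbitrary illustrative values.

```r
library(ggplot2)

base <- ggplot(mpg, aes(x = cty, y = hwy))

# Without jitter, rows with identical cty and hwy values draw on top of each other.
p_overplotted <- base + geom_point()

# These two layers request the same adjustment.
p_jitter_a <- base + geom_point(position = "jitter")
p_jitter_b <- base + geom_jitter()

# position_jitter() controls how far points move; seed makes the noise reproducible.
p_controlled <- base +
  geom_point(position = position_jitter(width = 0.3, height = 0.3, seed = 1))

p_overplotted
p_controlled
```

Jitter adds random noise to the plotted positions, so it suits seeing how many points overlap rather than reading exact values.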

Question (select one)

In a histogram, what does the bin width control, and why does it matter?

The color of the bars.

The number of rows in the data set.

How finely the continuous range is sliced before counting — too wide hides structure, too narrow makes the histogram noisy.

Whether the x-axis is continuous or discrete.

Question (select one)

You map fill = drv on a bar chart of class and want to compare drivetrains side by side within each class. Which position adjustment do you use?

position = "stack".

position = "dodge".

position = "fill".

position = "jitter".

Question (select one)

What does position = "fill" show that position = "stack" does not?

The raw counts within each segment.

Side-by-side bars for direct height comparison.

The proportion (relative composition) within each category, by stretching every bar to the same full height.

Nothing different; they are identical.

Key takeaways

  • A histogram = stat_bin (slice x into bins, count) drawn as bars; bin width is the decisive choice.
  • A bar chart = stat_count (count rows per category) drawn as bars.
  • Position adjustments decide how overlapping bars share space: stack (totals + composition), dodge (side-by-side), fill (proportions).
  • Positions extend beyond bars — jitter separates overlapping points.
  • Switching the question often means switching a position, not the chart type.