Continuous vs. Categorical

The single distinction that drives every chart choice — is a variable a number on a scale, or a label for a group? — and how Seaborn maps each kind.

Almost every decision you make in this course — which chart, which axis, whether color means a gradient or a group — comes down to one question about each column:

Is this variable a number measured on a scale, or a label that names a group?

Get that right and Seaborn's behavior stops feeling mysterious. This page is foundational, so we'll go slowly and check understanding often.

The two kinds of variables

Continuous (also called quantitative or numeric) variables are measured on a scale where the spacing between values is meaningful and there are, in principle, infinitely many possible values in between.

  • body_mass_g (a penguin can weigh 3801 g or 3802 g)
  • total_bill (a restaurant bill of $18.43)
  • flipper_length_mm, horsepower, sepal_length

For these, arithmetic makes sense: a 4000 g penguin is twice the mass of a 2000 g one, and the average mass is a meaningful number.

Categorical variables name a finite set of groups. The value is a label, not a quantity.

  • species — Adelie, Chinstrap, Gentoo
  • day — Thur, Fri, Sat, Sun
  • sex, smoker, island

Averaging a categorical column is nonsense — there is no "mean species." Instead you count them or split the data by them.
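
The difference shows up as soon as you summarize a column. The sketch below uses a tiny made-up sample shaped like the penguins data (the four rows are illustrative values, not the real dataset) to contrast averaging a continuous column with counting and splitting by a categorical one.

```python
import pandas as pd

# A tiny made-up sample shaped like the penguins data (values are illustrative).
penguins = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo", "Gentoo"],
    "body_mass_g": [3750, 3800, 4500, 5200],
})

# Continuous column: arithmetic such as the mean is meaningful.
mean_mass = penguins["body_mass_g"].mean()

# Categorical column: count the labels...
species_counts = penguins["species"].value_counts()

# ...or split the data by them and summarize a continuous column per group.
mass_by_species = penguins.groupby("species")["body_mass_g"].mean()

print(mean_mass)
print(species_counts)
print(mass_by_species)
```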

Nominal vs. ordinal — a useful sub-distinction

Categorical variables come in two flavors:

  • Nominal — no inherent order (species, island). Any ordering you give them is arbitrary.
  • Ordinal — a natural order, but unequal or unmeasured spacing (size = small < medium < large; day = Mon < Tue < ...).

The order matters for how you arrange a chart, even though you still can't do arithmetic on the labels.
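
One way to record an ordinal order in pandas is an ordered categorical. This is a minimal sketch with a made-up size column; the names sizes and size_order are chosen here for illustration.

```python
import pandas as pd

# Made-up ordinal labels, deliberately out of order.
sizes = pd.Series(["medium", "small", "large", "small"])

# Declare the natural order explicitly; pandas cannot guess it from the text.
size_order = ["small", "medium", "large"]
ordered_sizes = pd.Series(
    pd.Categorical(sizes, categories=size_order, ordered=True)
)

# Sorting now follows the declared order rather than the alphabet.
sorted_sizes = list(ordered_sizes.sort_values())
print(sorted_sizes)

# Comparisons respect the order, but arithmetic on the labels is still undefined.
smaller_than_large = list(ordered_sizes < "large")
print(smaller_than_large)
```

The same size_order list could also be handed to the order= parameter of a categorical plot to fix the arrangement of the slots.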

How Seaborn maps each kind

This is the crux. The type of a variable decides how Seaborn turns it into something you can see:

If the variable is... | Seaborn gives it... | Reach for...
Continuous | a continuous axis; a color gradient for hue; marker size | relplot, displot (scatter, line, histogram, KDE)
Categorical | evenly spaced discrete slots; distinct colors for hue; one facet per level | catplot (bar, count, box, violin, strip, swarm)

The clearest place to feel this is the hue parameter, which behaves in two completely different ways depending on the column you give it.

hue on a categorical column → distinct colors
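
The page's original example is not reproduced here, so the following is a stand-in sketch. It assumes a small inline sample shaped like the penguins data (illustrative values) and uses sns.scatterplot with hue set to the text column species.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Illustrative rows shaped like the penguins data (not the real dataset).
penguins = pd.DataFrame({
    "flipper_length_mm": [181, 195, 210, 217, 190, 201],
    "body_mass_g": [3750, 3800, 4500, 5200, 3650, 4050],
    "species": ["Adelie", "Chinstrap", "Gentoo", "Gentoo", "Adelie", "Chinstrap"],
})

fig, ax = plt.subplots()

# species holds text labels, so hue assigns one distinct color per group.
sns.scatterplot(
    data=penguins,
    x="flipper_length_mm",
    y="body_mass_g",
    hue="species",
    ax=ax,
)

# The legend has one entry per species level.
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(legend_labels)
```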

hue on a continuous column → a color gradient
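
Again as a stand-in for the page's original example, this sketch reuses the same illustrative penguins sample but maps hue to the numeric column body_mass_g. Only the hue argument changes.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Illustrative rows shaped like the penguins data (not the real dataset).
penguins = pd.DataFrame({
    "flipper_length_mm": [181, 195, 210, 217, 190, 201],
    "body_mass_g": [3750, 3800, 4500, 5200, 3650, 4050],
    "species": ["Adelie", "Chinstrap", "Gentoo", "Gentoo", "Adelie", "Chinstrap"],
})

fig, ax = plt.subplots()

# body_mass_g is numeric, so hue shades each point along a sequential palette:
# heavier and lighter penguins differ in shade, not in unrelated colors.
sns.scatterplot(
    data=penguins,
    x="flipper_length_mm",
    y="flipper_length_mm",
    hue="body_mass_g",
    ax=ax,
)

# One color per point, derived from that point's body mass.
point_colors = ax.collections[0].get_facecolors()
print(len(point_colors))
```

Plotting flipper length on both axes here is only to keep the focus on color; any two continuous columns would do.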

Same parameter, same dataset, but Seaborn read each column's type and chose distinct colors for the groups and a smooth gradient for the measurement. You did not ask for either; the column's type decided.

Question (select one)

You map hue to a column and Seaborn draws a continuous colorbar instead of a discrete legend with separate swatches. What does that tell you about the column?

  • It is an ordinal categorical variable.
  • Seaborn is treating it as a continuous (numeric) variable.
  • The column contains missing values.
  • You forgot to call set_theme.

The gotcha: numbers that are really categories

Seaborn infers a column's type from its pandas dtype, not from what it means. So a column stored as numbers gets treated as continuous — even when it's really a category in disguise.

A classic example is party size in the tips dataset: it's stored as an integer (1–6), but it behaves like an ordered category. Watch what happens when we treat it each way.
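
As a stand-in for the original example, the sketch below treats size as continuous by putting it on the x axis of a scatter plot. The tips rows are a small illustrative sample, not the real dataset.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Illustrative rows shaped like the tips data (not the real dataset).
tips = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 29.80, 34.30, 48.17],
    "size": [2, 3, 3, 2, 4, 4, 4, 6],
})

fig, ax = plt.subplots()

# size is stored as integers, so a relational plot places it on an ordinary
# numeric axis: points sit at x = 2, 3, 4, 6 and a gap is left at 5.
sns.scatterplot(data=tips, x="size", y="total_bill", ax=ax)

x_positions = sorted(ax.collections[0].get_offsets()[:, 0].tolist())
print(x_positions)
```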

Hand size to a categorical plot and Seaborn gives each party size its own discrete slot — exactly what we want:
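
A minimal sketch of that idea, using sns.countplot as the categorical plot and a small illustrative sample shaped like the tips data (the original example may have used a different plot kind):

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Illustrative rows shaped like the tips data (not the real dataset).
tips = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 29.80, 34.30, 48.17],
    "size": [2, 3, 3, 2, 4, 4, 4, 6],
})

fig, ax = plt.subplots()

# A categorical plot treats each distinct party size as its own slot,
# so the four sizes present (2, 3, 4, 6) get four evenly spaced bars.
sns.countplot(data=tips, x="size", ax=ax)

bar_heights = [patch.get_height() for patch in ax.patches]
print(bar_heights)
```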

When a numeric code is secretly a category

Columns like month (1–12), year, pclass (1/2/3), or a size code are numbers to pandas but categories to you. If a plot treats one as a continuous axis when you wanted discrete groups, either use a categorical plot (catplot), pass an explicit order=, or convert the column with df["col"] = df["col"].astype("category"). The fix is to tell Seaborn the meaning, since it can only see the dtype.
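
The conversion route can be sketched in pandas alone. The sales frame and its revenue column are hypothetical, made up for this illustration.

```python
import pandas as pd

# Hypothetical data: month is stored as integer codes.
sales = pd.DataFrame({
    "month": [1, 2, 2, 12],
    "revenue": [10.5, 12.0, 9.5, 20.0],
})

dtype_before = sales["month"].dtype  # an integer dtype, so it reads as continuous

# Tell pandas (and therefore Seaborn) that the codes are group labels.
sales["month"] = sales["month"].astype("category")

dtype_after = sales["month"].dtype
levels = list(sales["month"].cat.categories)

print(dtype_before, "->", dtype_after)
print(levels)
```

After the conversion, only the months that actually occur become levels, and they are kept in sorted order.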

Why this distinction runs the whole course

Every chapter ahead is organized around this split:

  • Relational plots (scatter, line) expect two continuous variables.
  • Distribution plots (histogram, KDE, ECDF) describe one continuous variable.
  • Categorical plots (bar, count, box, violin, strip, swarm) put a categorical variable against (usually) a continuous one.

So before choosing a chart, label each variable in your head: continuous or categorical? That one word per column tells you which family to open.
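
As a rough aid for that labeling step, here is a sketch of a dtype-based first guess. The helper label_columns is hypothetical (it is not part of Seaborn or pandas), and like Seaborn it only sees the dtype, so it repeats the gotcha above: size comes back as continuous and still needs your judgment.

```python
import pandas as pd
from pandas.api.types import is_bool_dtype, is_numeric_dtype


def label_columns(df):
    """Guess 'continuous' or 'categorical' for each column from its dtype."""
    labels = {}
    for name in df.columns:
        column = df[name]
        # Numbers (but not True/False flags) are guessed to be continuous.
        if is_numeric_dtype(column) and not is_bool_dtype(column):
            labels[name] = "continuous"
        else:
            labels[name] = "categorical"
    return labels


# Illustrative rows shaped like the tips data (not the real dataset).
tips = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01],
    "day": ["Sun", "Sun", "Sat"],
    "size": [2, 3, 3],
})

labels = label_columns(tips)
print(labels)
```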

Your turn

Challenge: Treat a number as a category

In tips, the party size column is stored as an integer but is really an ordered category. Draw a box plot with sns.catplot that shows the distribution of total_bill for each party size:

  • x="size", y="total_bill", kind="box".

Assign the result to g. Because catplot treats x as categorical, you'll get one box per party size.

Check your understanding

Question (select one)

Which list contains only continuous variables?

  • species, island, sex
  • body_mass_g, flipper_length_mm, bill_length_mm
  • day, time, smoker
  • species, body_mass_g, day

Question (select one)

Why can't you take the mean of a nominal categorical variable like species?

  • Because pandas stores it as text, which is slow to average.
  • Because its values are labels for groups, not quantities, so arithmetic on them has no meaning.
  • Because Seaborn forbids it.
  • Because it has fewer than ten unique values.

Question (select one)

Your DataFrame has a month column stored as integers 1–12. You make a plot and Seaborn spreads it along a smooth continuous axis, but you wanted twelve discrete month slots. What's the cleanest fix?

  • Re-download the data in a different format.
  • Map month to hue instead of x.
  • Use a categorical plot (catplot), pass an explicit order=, or convert the column with astype("category").
  • Sort the DataFrame by month.

Question (select one)

You give hue a column with three text labels ("low", "med", "high"). How will Seaborn encode it?

  • As a continuous gradient with a colorbar.
  • As three distinct colors with a category legend.
  • It will raise an error because hue requires numbers.
  • It will average the three groups into one color.

With the continuous/categorical lens in hand, you're ready to actually load data and start drawing. Next we'll meet Seaborn's built-in datasets and set a theme — then learn the one structural idea behind its whole API: the split between figure-level and axes-level functions.
