Line Charts
The chart of *change* — when to use lines, when to add markers, and how to keep them readable
A line chart is the right tool when your data has a meaningful order along the x-axis — almost always time, but sometimes another continuous or ordered variable. Lines connect successive points to show how a value evolves.
When to use a line chart
Use a line chart when:
- The x-axis represents time, or another ordered, continuous variable.
- You want to show trend, change, or rate of change.
- You have multiple series to compare over the same axis (one line per group).
Avoid a line chart when the x-axis is a set of unordered categories like "Apple, Banana, Cherry." Lines drawn between them imply an order or connection that isn't there.
A simple time series
That's the simplest possible line chart: one x, one y, one line. Hover over the line — you'll see Plotly's tooltip pop up showing exact dates and values.
Multiple series with color
To show several lines, reshape your data to long form and use color to encode the series.
Now you have several lines, each in a different color, with an interactive legend you can click to hide/show series.
Markers: show the actual data points
Lines are great for showing the shape of a trend, but sometimes you also want to see exactly where each measurement was taken. Add markers=True:
The markers help when:
- You have sparse data (just a handful of points).
- The reader needs to know which are the measured points.
- The chart will be exported as a static PNG, where hover tooltips are no longer available to reveal individual points.
If you have dense data (thousands of points), omit markers — they'll just clutter the line.
Many lines? Avoid the spaghetti
A common failure mode is the spaghetti chart — twenty or more overlapping lines that no one can read.
There are 30 lines on that chart and even with interactive toggling it's hard to read. Fixes:
- Filter to a smaller set of series (top 5 countries by some metric).
- Use facets (small multiples) — one mini-chart per country.
- Highlight a few lines and gray out the rest.
We'll cover faceting in its own chapter. For now, the rule: if you have more than ~7 lines, plan how you'll keep them readable.
Linear and log scales
When values span many orders of magnitude (e.g., country GDPs, COVID case counts), a log scale for y reveals structure that a linear scale flattens out.
On the linear chart, the USA's huge values squash everyone else. On the log chart, you can compare the relative growth rates of all four countries.
Log scales are powerful but require labeling — always say "log scale" in your axis label, because viewers do not assume it.
A common subtle mistake
Plotly Express orders line points by their order in the DataFrame, not by the x-axis value. If your data isn't sorted by x, you'll get a tangled chart:
# Unsorted data is suspicious! Sort by the x column before plotting:
df = df.sort_values("date")
px.line(df, x="date", y="price")
If a line chart ever looks bizarrely scribbled, the first thing to check is whether the data is sorted by x.
Check your understanding
When is a line chart the right choice?
For comparing values across a handful of unrelated categories.
For showing the distribution of one variable.
For showing how a value changes over an ordered axis — most commonly time.
For showing a part-of-a-whole breakdown.
Why should you almost never draw a line chart with x as a set of unordered categories like cities or product names?
Plotly will refuse to render it.
It uses too much memory.
The line implies a meaningful order/connection between successive x-values; with unordered categories, the line is misleading.
Lines can only show one series at a time.
Your line chart of 30 countries has become an unreadable spaghetti chart. Which is NOT a good fix?
Filter to the top 5 most relevant countries.
Facet into one mini-chart per country.
Highlight 3 countries in color and gray out the rest.
Increase the line thickness for all 30 countries.
When is a log y-axis especially useful for a line chart?
When values are all between 0 and 1.
When the chart only has one line.
When values span many orders of magnitude (e.g., 100 vs 1,000,000) and you want to compare rates of growth across series.
When the data has fewer than 10 points.