
Decomposing the Signals: Dissecting Trend, Seasonality, and Residuals

Splitting a series into trend, seasonal, and residual components with seasonal_decompose — additive vs multiplicative models, why AirPassengers needs multiplicative (or a log), reading the residual as a diagnostic, and deseasonalizing to reveal true growth.

A real time series is several stories told at once: a slow climb, a yearly rhythm, and a haze of randomness on top. Decomposition pulls those apart so you can study each on its own. It's both a diagnostic (what is this series actually made of?) and a preparation step (strip the seasonality so you can see the real growth). The mental model is one line:

Observed = Trend + Seasonal + Residual

See it recover components we built ourselves

The honest way to trust a tool is to feed it data whose answer you already know. We'll construct a series from a known trend, a known seasonal sine, and known noise, then ask seasonal_decompose to hand the pieces back.

Four panels: the original on top, then the smooth trend it extracted, the repeating seasonal wave, and the residual — what's left when you subtract the other two. For data we built additively, the residual is shapeless noise, exactly as it should be.

How seasonal_decompose actually works (the short version)

  1. Trend: estimate it with a centered moving average whose window spans one seasonal period, so the season averages out. For an even period such as 12, the two end points of the window get half weight to keep it centered. This is why the trend has NaNs at both ends (period // 2 on each side, so six months for monthly data): the moving average needs a full window.
  2. Seasonal: subtract the trend (or divide by it, in the multiplicative model), then average the leftovers at each position in the cycle (all Januaries together, all Februaries, ...). That average shape is centered and then repeated across the whole span.
  3. Residual: whatever remains after removing trend and seasonal.

It's a deliberately simple recipe, which makes it fast, transparent, and a great first look — not a forecasting model.
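The three steps above can be sketched by hand with NumPy and pandas. The function name classical_additive and the toy series are hypothetical, and this is an illustration of the recipe for the additive case, not the library's own code. The half weights on the window's end points are how an even-length window such as 12 can stay centered.

```python
import numpy as np
import pandas as pd

def classical_additive(y: pd.Series, period: int):
    """Hand-rolled additive decomposition; an illustration, not the library's code."""
    n = len(y)

    # 1. Trend: centered moving average over one full season. For an even
    #    period the two end points get half weight so the window stays centered.
    if period % 2 == 0:
        weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        weights = np.ones(period) / period
    half = len(weights) // 2
    trend = pd.Series(np.nan, index=y.index)
    trend.iloc[half:n - half] = np.convolve(y.to_numpy(), weights, mode="valid")

    # 2. Seasonal: average the detrended values at each position in the cycle,
    #    center those averages on zero, then repeat them across the whole span.
    position = np.arange(n) % period
    detrended = y - trend
    cycle = detrended.groupby(position).mean()   # NaNs at the ends are skipped
    cycle = cycle - cycle.mean()
    seasonal = pd.Series(cycle.to_numpy()[position], index=y.index)

    # 3. Residual: whatever remains.
    resid = y - trend - seasonal
    return trend, seasonal, resid

# Noise-free toy series: a straight line plus a 12-month pattern that sums to zero.
idx = pd.date_range("2020-01-01", periods=48, freq="MS")
pattern = np.array([-5, -3, 0, 2, 4, 6, 8, 5, 1, -4, -6, -8], dtype=float)
t = np.arange(48)
y = pd.Series(100 + 2 * t + np.tile(pattern, 4), index=idx)

trend, seasonal, resid = classical_additive(y, period=12)
print(int(trend.isna().sum()))                 # 12: six at each end
print(seasonal.iloc[:12].round(6).tolist())    # the recovered yearly pattern
print(resid.abs().max())                       # floating-point dust only
```

Because the toy series has no noise, the recovered trend and seasonal pattern should match the inputs up to floating-point error, which makes the mechanics easy to check.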

Additive or multiplicative? Read the seasonal swings

The choice between the two models isn't a coin flip — the data tells you:

  • Additive (Observed = Trend + Seasonal + Residual): use it when the seasonal swings are roughly the same size regardless of the level. The summer bump is "+30" whether the series sits at 100 or at 500.
  • Multiplicative (Observed = Trend x Seasonal x Residual): use it when the seasonal swings grow with the level. The summer bump is "+30%" — so it's small when the series is low and large when it's high.

The airline series is the textbook multiplicative case: its summer-to-winter swing is tiny in 1949 and huge by 1960. Watch what additive does with that.

The additive residual carries visible leftover structure — its swings are largest where the level sits farthest from its average — because an additive model insists the seasonal swing is a fixed size and can't represent a swing that grows with the level. That structure is the model telling you "wrong assumption." The multiplicative residual, by contrast, is a flat, featureless band around 1.0: the right model leaves behind nothing but noise.

The residual is your report card

A good decomposition leaves a residual that looks like structureless noise — no trend, no repeating wave, no funnel. Any leftover pattern means a component was mis-estimated or the wrong model was chosen. You'll use this exact instinct later to judge forecasting models: fit is good when the residuals are boring.

Question (select one)

A monthly series sits near 200 in its early years with a summer peak about +20 above trend, and near 1000 in its later years with a summer peak about +100 above trend. Additive or multiplicative?

Additive, because the peaks are always above the trend

Multiplicative, because the seasonal swing grows in proportion to the level (~10% of the level in both eras)

Neither; the series has no seasonality

It doesn't matter which you pick

The log trick: turn multiplicative into additive

There's an elegant shortcut. Taking the logarithm of a multiplicative series makes it additive, because log(T x S x R) = log T + log S + log R. A log compresses the big late-period swings and stretches the small early ones until they're the same size — exactly what an additive model wants.

Why we'll keep reaching for the log

Stabilizing a growing variance with a log shows up again and again: it's step one for the airline series before differencing, and it's why forecasters so often model log(sales) instead of sales. A log turns "multiplies by" into "adds to," which is the linear world our classical models live in.

Deseasonalizing: revealing the real growth

One of decomposition's most practical payoffs is seasonal adjustment — removing the seasonal component so the underlying trend isn't drowned out by the yearly wave. "Are sales really up, or is it just December?" is a deseasonalizing question.

The decomposition 'trend' is not a forecast

Just like a rolling mean, the trend component is a smoother of data you already have. Because the moving average is centered, each trend value draws on observations both before and after it, which is why it has NaNs at both ends where the window runs out. It describes the past; it does not project the future. To forecast, you still need a model (ARIMA is coming). Seeing the trend curve and imagining it "continuing" is the same misconception we flagged for moving averages, in a new costume.

Practice

Challenge: Decompose and rebuild (additive reconstruction)

An additively-built monthly series y is loaded. Run an additive seasonal_decompose with period=12 and store it in dec. Then verify the additive identity by reconstructing the series:

  • recon = dec.trend + dec.seasonal + dec.resid
  • max_err = the maximum absolute difference between recon and y, computed only where recon is not NaN (the trend/residual are NaN at the ends).

For a true additive decomposition, the pieces must add back to the original, so max_err should be tiny (below 1e-6).

Challenge: Detect multiplicative seasonality from the data

The decision between additive and multiplicative comes down to one question: do the seasonal swings grow with the level? Measure that directly on the airline air series.

For each calendar year, compute two numbers: the year's mean level and the year's range (its max minus its min — a proxy for the size of the seasonal swing). Then compute the correlation between the yearly means and the yearly ranges.

Produce:

  • yearly_mean: a Series of each year's mean (use air.resample("YE").mean())
  • yearly_range: a Series of each year's (max - min)
  • swing_level_corr: the correlation between yearly_mean and yearly_range (a float)
  • is_multiplicative: True if swing_level_corr > 0.7

A strong positive correlation means the swing scales with the level, so the series is multiplicative (which the airline data clearly is).
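One way to sketch the measurement is shown below on a synthetic stand-in for air, since the real series is only loaded inside the exercise. The stand-in's parameters are illustrative, and the sketch groups by calendar year rather than calling resample("YE"), so its index holds integer years instead of year-end timestamps.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the air series (illustrative values only).
rng = np.random.default_rng(5)
n = 144
idx = pd.date_range("1949-01-01", periods=n, freq="MS")
t = np.arange(n)
air = pd.Series(
    100 * np.exp(0.01 * t)
    * (1 + 0.2 * np.sin(2 * np.pi * t / 12))
    * np.exp(rng.normal(0, 0.02, n)),
    index=idx,
)

# Grouping by calendar year is used here instead of resample("YE") so the
# sketch also runs on older pandas versions that may not know that alias.
by_year = air.groupby(air.index.year)
yearly_mean = by_year.mean()
yearly_range = by_year.max() - by_year.min()   # proxy for the seasonal swing

swing_level_corr = float(yearly_mean.corr(yearly_range))
is_multiplicative = swing_level_corr > 0.7
print(f"swing_level_corr = {swing_level_corr:.3f}, is_multiplicative = {is_multiplicative}")
```

The 0.7 cutoff comes from the challenge text; it is a rule of thumb for this exercise rather than a formal test.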

Check your understanding

Question (select one)

In the equation Observed = Trend + Seasonal + Residual, what should the residual ideally look like?

A clean repeating wave

A steady upward slope

Structureless noise with no visible trend, cycle, or funnel

Exactly zero everywhere

Question (select one)

Why does the classical seasonal_decompose trend component have NaN values at the very start and end of the series?

Because the data is missing there

Because the trend is a centered moving average over the seasonal period, and a full window isn't available at the edges

Because seasonality cannot be computed at the edges

Because of a bug in statsmodels

Question (select one)

Taking log of a series before an additive decomposition is equivalent to what?

Removing the trend entirely

A multiplicative decomposition of the original series, because log(T x S x R) = log T + log S + log R

Converting the data to percentages

Making the series perfectly stationary

Key takeaways

  • Decomposition splits a series into Trend + Seasonal + Residual (additive) or Trend x Seasonal x Residual (multiplicative).
  • Use additive when seasonal swings are a constant size, multiplicative when they grow with the level (e.g. AirPassengers).
  • seasonal_decompose(series, model=..., period=...) estimates the trend by a centered moving average (hence the end NaNs), the seasonal by averaging per cycle-position, and the residual as the remainder.
  • A log turns a multiplicative series additive and tames growing variance — a step we'll reuse for stationarity.
  • The residual is a diagnostic: a funnel or leftover wave means the wrong model or a mis-estimated component.
  • Deseasonalizing (subtract or divide out the seasonal) reveals the true underlying growth — but the trend component is still a smoother, not a forecast.

We keep bumping into the same words — the airline series' growing variance and persistent trend make it "non-stationary," and we keep promising to fix that. It's time to make that idea precise: what stationarity is, why classical models demand it, and how to test for it.
