Detrending Data: Mastering Differencing to Make a Series Stationary

The Augmented Dickey-Fuller test kept handing us the same prescription: difference the series. Differencing is the workhorse transformation that turns a trending, non-stationary series into a stationary one — and it's the operation hiding behind the d in ARIMA. This page is about why it works, how far to take it, and the classic mistake of taking it one step too far.

What differencing is

First differencing replaces each value with the change since the previous value:

y'(t) = y(t) - y(t-1)

You stop modeling the level of the series and start modeling its step-to-step change. That single shift in perspective is what removes a trend — because a series can wander far from its starting level while its changes stay small and well-behaved.

In pandas, y.diff() is exactly y - y.shift(1). The first entry is NaN because the very first observation has no predecessor. Each first difference costs you one row (a seasonal difference at period m costs m rows).
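To make the equivalence concrete, here is a minimal sketch on a toy series. The values are made up for illustration and are not from any dataset used on this page.

```python
import pandas as pd

# Hypothetical toy series; the numbers are illustrative only.
y = pd.Series([10.0, 12.0, 15.0, 19.0, 24.0])

first_diff = y.diff()          # built-in first difference
manual_diff = y - y.shift(1)   # the same operation written out by hand

# Series.equals treats NaNs in matching positions as equal.
same = first_diff.equals(manual_diff)

print(first_diff.tolist())     # the first entry is NaN
print(same)
```

Calling .dropna() on the result is the usual way to discard the leading NaN before testing or modeling.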

Why differencing kills a trend

Here's the intuition made exact: the difference of a straight line is a constant. If a series climbs by the same amount every step, then its changes are all identical — a flat, stationary series. Differencing converts "a steadily rising level" into "a steady rate," and a steady rate has no trend.

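A linear trend vanishes after one difference. A curved (quadratic) trend needs two, because the first difference of a quadratic is still a line and only the second difference is constant. In practice, most series that need differencing at all are stationary after d = 1 or d = 2; you rarely need more.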
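A quick way to check the claim is to difference noise-free trends, where the arithmetic is exact. The sketch below uses synthetic series with arbitrarily chosen coefficients.

```python
import numpy as np
import pandas as pd

t = np.arange(10)

# Synthetic, noise-free trends; coefficients are arbitrary.
linear = pd.Series(3.0 + 2.0 * t)
quadratic = pd.Series(1.0 + 0.5 * t**2)

d1_linear = linear.diff().dropna()            # constant: the slope, 2.0
d1_quad = quadratic.diff().dropna()           # still rising: equals t - 0.5
d2_quad = quadratic.diff().diff().dropna()    # constant: 1.0

print(d1_linear.unique())
print(d1_quad.tolist())
print(d2_quad.unique())
```

On real data the differenced series will not be perfectly flat, since the noise is differenced too. What disappears is the systematic drift.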

After one difference the trend is gone and the ADF p-value collapses. But look closely: a seasonal wobble remains. First differencing removes the trend, not the seasonality; that needs a different cut.

Seasonal differencing: subtract one season ago

To remove a seasonal cycle of period m, subtract the value from one full season earlier: y(t) - y(t-m). For monthly data with a yearly cycle that's y.diff(12) — each month minus the same month last year. The repeating pattern cancels against its own copy.
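Here is a small sketch of that cancellation on a synthetic monthly series (a straight-line trend plus a sine wave with a 12-month period). The series and its parameters are assumptions made for illustration, not the airline data.

```python
import numpy as np
import pandas as pd

t = np.arange(48)
idx = pd.date_range("2015-01-01", periods=48, freq="MS")

# Synthetic series: slope of 2 per month plus a cycle that repeats every 12 months.
cycle = 10.0 * np.sin(2 * np.pi * t / 12)
y = pd.Series(100.0 + 2.0 * t + cycle, index=idx)

d1 = y.diff()        # regular difference: trend gone, wobble still there
sdiff = y.diff(12)   # seasonal difference: each month minus the same month last year

print(d1.dropna().std())      # clearly non-zero: the cycle survived
print(sdiff.dropna().round(6).unique())   # the cycle cancelled, leaving 12 * slope
print(sdiff.isna().sum())     # a period-12 difference costs 12 rows
```

In this noise-free example the seasonal difference also flattens the linear trend, because a year-over-year change in a straight line is just 12 times the slope.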

This is exactly what SARIMA automates

The differencing steps in the recipe "log, then a regular difference, then a seasonal difference" are what a seasonal ARIMA (SARIMA) encodes in its orders; the log is still a transform you apply yourself. We'll keep our hands on the wheel with plain ARIMA and pre-difference manually, but know that the d (regular differences) and D (seasonal differences) parameters are just this page, parameterized. Differencing isn't a side-trick: it's half of what ARIMA is.
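Done by hand, the full recipe might look like the sketch below. It runs on a synthetic stand-in (exponential growth multiplied by a 12-month seasonal factor), not the real airline data, so every number in it is an assumption.

```python
import numpy as np
import pandas as pd

t = np.arange(60)
idx = pd.date_range("2010-01-01", periods=60, freq="MS")

# Synthetic airline-style series: growth times a repeating seasonal factor.
seasonal_factor = 1.0 + 0.2 * np.sin(2 * np.pi * t / 12)
y = pd.Series(100.0 * np.exp(0.01 * t) * seasonal_factor, index=idx)

logged = np.log(y)              # multiplicative structure becomes additive
d1 = logged.diff()              # d = 1: removes the (now linear) trend
d1_d12 = d1.diff(12).dropna()   # D = 1 at period 12: removes the yearly cycle

print(len(y), len(d1_d12))      # 60 47: one row lost to diff(), twelve to diff(12)
print(d1_d12.abs().max())       # essentially zero for this noise-free series
```

With real data the result would be a noisy series hovering around zero rather than exact zeros, and that noisy remainder is what the AR and MA terms are left to model.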

The over-differencing trap

If one difference is good, are two always better? No. Differencing a series that's already stationary doesn't help — it injects artificial structure and inflates the variance. The fingerprint of over-differencing is a strong negative lag-1 autocorrelation (around -0.5) and a variance that went up instead of down.
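The -0.5 is not arbitrary. If e(t) is white noise with variance s², then e(t) - e(t-1) has variance 2s² and lag-1 covariance -s², so its lag-1 autocorrelation is -1/2. The sketch below checks this on simulated white noise; the seed and sample size are arbitrary choices.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# White noise is already stationary, so the correct d here is 0.
noise = pd.Series(rng.normal(size=5000))
over = noise.diff().dropna()     # one difference too many

var_before = noise.var()
var_after = over.var()           # theory says roughly double
ac1_before = noise.autocorr(lag=1)
ac1_after = over.autocorr(lag=1) # theory says roughly -0.5

print(var_before, var_after)
print(ac1_before, ac1_after)
```

Seeing both symptoms together (variance up, lag-1 autocorrelation near -0.5) is a good cue to step back to a smaller d.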

Use the minimum number of differences

More differencing is not safer. Each unnecessary difference adds noise, raises the variance, and stamps a spurious -0.5 lag-1 autocorrelation onto the series that your model will then waste effort "explaining." The goal is the smallest d that makes the series stationary (usually 0, 1, or 2) — not the d that gives the tiniest ADF p-value. If a series is already stationary, the correct d is 0.

Question (select one)

After differencing, a series shows a variance higher than before and a lag-1 autocorrelation of about -0.5. What likely happened?

The series became more stationary; this is ideal

It was over-differenced — differencing an already-stationary series injected a spurious negative autocorrelation and inflated the variance

The seasonal period is wrong

The data must be re-logged

Undoing it: integration (the 'I' in ARIMA)

If you difference a series to model it, you must eventually undo the difference to get a forecast back on the original scale. The inverse of differencing is a cumulative sum plus the starting value — a process called integration. That's literally what the "I" in ARIMA stands for: Integrated, meaning the model works on a differenced series and integrates its forecasts back up.
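As a rough sketch of the round trip: difference a toy series, then rebuild it with a cumulative sum anchored at the first value. The same move turns forecasts of changes into forecasts of levels, anchored at the last observed value. The series and the "forecast" changes below are made-up numbers, not model output.

```python
import numpy as np
import pandas as pd

# Hypothetical toy series; values are illustrative only.
y = pd.Series([100.0, 103.0, 101.0, 106.0, 110.0])

d = y.diff().dropna()                        # changes: 3, -2, 5, 4

# Integrate: starting value plus the running total of the changes.
restored = pd.concat([y.iloc[:1], y.iloc[0] + d.cumsum()])

# Forecasting works the same way. These changes are made up, not fitted.
forecast_changes = np.array([2.0, 2.0, 2.0])
forecast_levels = y.iloc[-1] + np.cumsum(forecast_changes)

print(restored.tolist())
print(forecast_levels.tolist())
```

For a series differenced twice, the same step is applied twice, each time with its own starting value.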

Question (select one)

What does the "I" (Integrated) in ARIMA refer to?

That the model integrates data from multiple sources

That the model is fit on a differenced series and its forecasts are integrated (cumulatively summed) back to the original scale

That it numerically integrates a differential equation

That all components are combined into one number

Practice

The airline air series is loaded. Without using .diff(), build the seasonal difference at period 12 using shift: seasonal(t) = air(t) - air(t-12). Call it manual_sdiff.

Then confirm it equals the built-in air.diff(12) by setting matches to True if they're equal everywhere both are defined (drop NaNs before comparing).

A series s is loaded that is stationary after exactly one difference. For d in 0, 1, 2, 3, compute the ADF p-value of the d-times-differenced series and the lag-1 autocorrelation. Choose best_d: the smallest d whose ADF p-value is below 0.05 (the minimum differences that achieve stationarity — do NOT just pick the smallest p-value).

Also set over_diff_ac to the lag-1 autocorrelation at d = 2 (one difference too many), which should be clearly negative — the over-differencing signature.

Check your understanding

Question (select one)

Why does taking the first difference of a series remove a linear trend?

Because it deletes the largest values

Because the difference of a straight line is a constant, so a steadily-rising level becomes a flat (trend-free) series of changes

Because it converts the data to percentages

Because it makes all values positive

Question (select one)

Your monthly series has both a trend and a strong yearly cycle. First differencing removed the trend but a 12-month wobble remains. What should you do?

Difference again with diff() (a second regular difference)

Apply a seasonal difference, diff(12), to subtract each month from the same month a year earlier

Increase the rolling-window size

Re-run the ADF test until it passes

Question (select one)

A series tests stationary (ADF p = 0.01) with no differencing. What is the appropriate d?

d = 0 — it's already stationary, so differencing would only add noise

d = 1, because every series needs at least one difference

d = 2, to be safe

Whatever gives the smallest p-value

Key takeaways

Differencing (y.diff() = y - y.shift(1)) models the change rather than the level, which removes a trend because the difference of a line is a constant.
A linear trend needs one difference; a curved trend needs two — rarely more.
Seasonal differencing (y.diff(m)) removes a cycle of period m (diff(12) for monthly-yearly). The airline recipe is log → diff(1) → diff(12).
This is the d (and seasonal D) of ARIMA, made explicit.
Over-differencing is real: it inflates variance and stamps a -0.5 lag-1 autocorrelation. Use the minimum d that reaches stationarity; if already stationary, d = 0.
Integration (cumulative sum + starting value) inverts differencing — the "I" in ARIMA, used to put forecasts back on the original scale.

We now have a stationary series. But "stationary" only means the rules are fixed — it doesn't tell us what those rules are. To choose a model we need to read the series' internal echoes: how strongly each value relates to the ones before it. That's the job of the ACF and PACF plots.
