Reading the Echoes: Interpreting Autocorrelation (ACF) and Partial Autocorrelation (PACF) Plots
What the ACF and PACF measure, how 'tails off' vs 'cuts off' distinguishes AR from MA models, the visual cheat-sheet, why you must difference to stationarity before reading them, and how to propose ARIMA orders by eye.
A stationary series still has a personality: each value echoes the ones
before it in a particular way. The ACF and PACF plots are how we
listen to those echoes — and reading them is how forecasters propose
model orders without brute-force search. Master these two plots and the
otherwise-mysterious (p, d, q) of ARIMA becomes something you can often
read off a chart.
Two kinds of correlation with the past
- The ACF (Autocorrelation Function) measures the correlation between the series and itself k steps earlier, for each lag k. It captures the total relationship at that lag, direct and indirect.
- The PACF (Partial Autocorrelation Function) measures the correlation at lag k after removing the influence of all the shorter lags in between. It captures only the direct relationship.
Why does "direct vs total" matter so much? Picture an AR(1) series where
each value is 0.7 × the previous one plus fresh random noise. Then y(t) is directly tied to
y(t-1). But y(t-1) is tied to y(t-2), so y(t) is indirectly
correlated with y(t-2) too — through the chain — even though there's no
direct link. The ACF at lag 2 sees that indirect echo and is nonzero;
the PACF at lag 2 strips the chain away and reveals the truth: no direct
lag-2 relationship.
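The same argument can be written out as a short worked calculation. This is a sketch for the AR(1) example with coefficient 0.7; the symbols \rho(k) for the lag-k autocorrelation and \phi_{22} for the lag-2 partial autocorrelation are notation introduced here, not taken from the text above.

```latex
\begin{align*}
y_t &= \phi\, y_{t-1} + \varepsilon_t, \qquad \phi = 0.7 \\
\rho(k) &= \phi^{k} \;\Rightarrow\; \rho(1) = 0.7, \quad \rho(2) = 0.49 \\
\phi_{22} &= \frac{\rho(2) - \rho(1)^2}{1 - \rho(1)^2}
           = \frac{0.49 - 0.49}{1 - 0.49} = 0
\end{align*}
```

The ACF at lag 2 is 0.49 (the indirect echo through lag 1), while the PACF at lag 2 is exactly zero once the lag-1 link is accounted for.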
The intuition in one line
ACF asks "how related are values k apart, however that relationship
arises?" PACF asks "how related are they directly, once the middlemen are
removed?" The gap between those two questions is exactly what lets us tell an
AR process from an MA process.
The cheat-sheet
Here is the single most useful table in classical forecasting. For a stationary series:
| Pattern | ACF | PACF | Read it as |
|---|---|---|---|
| AR(p) | tails off (gradual decay) | cuts off after lag p | order p = number of PACF spikes |
| MA(q) | cuts off after lag q | tails off (gradual decay) | order q = number of ACF spikes |
| ARMA(p,q) | tails off | tails off | neither cuts cleanly — try small p, q |
| white noise | all ~0 (within bands) | all ~0 | nothing to model |
| non-stationary | decays very slowly / lingers | (irrelevant) | difference first, then re-read |
The mnemonic that sticks
PACF identifies the AR order; ACF identifies the MA order. The plot that
cuts off sharply names its model, and the number of spikes before the cut
is the order. AR → PACF cut → p. MA → ACF cut → q.
Watch the cheat-sheet hold on data we built
Let's manufacture an AR(1), an AR(2), and an MA(1) with known coefficients, then confirm that their ACF/PACF plots follow the table.
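One way to manufacture these series is sketched below using only the Python standard library. The coefficients (0.7 for the AR(1), 0.5 and 0.3 for the AR(2), 0.8 for the MA(1)), the sample size, and the helper names `simulate_arma`, `sample_acf`, and `sample_pacf` are all assumptions chosen for illustration. The PACF is obtained with a hand-rolled Durbin-Levinson recursion on the sample ACF, standing in for a library routine such as statsmodels' `pacf`.

```python
import math
import random

def simulate_arma(ar, ma, n, rng, burn=200):
    """Simulate y[t] = sum(ar[i]*y[t-1-i]) + e[t] + sum(ma[j]*e[t-1-j])."""
    y, e = [], []
    for t in range(n + burn):
        e.append(rng.gauss(0.0, 1.0))
        value = e[t]
        value += sum(a * y[t - 1 - i] for i, a in enumerate(ar) if t - 1 - i >= 0)
        value += sum(m * e[t - 1 - j] for j, m in enumerate(ma) if t - 1 - j >= 0)
        y.append(value)
    return y[burn:]  # drop the warm-up so the start-up transient is gone

def sample_acf(y, nlags):
    """Mean-centered autocorrelation for lags 0..nlags, normalized by lag 0."""
    n = len(y)
    ybar = sum(y) / n
    d = [v - ybar for v in y]
    c0 = sum(v * v for v in d)
    return [sum(d[t] * d[t - k] for t in range(k, n)) / c0 for k in range(nlags + 1)]

def sample_pacf(rho):
    """Durbin-Levinson recursion: PACF for lags 1..len(rho)-1 from the ACF."""
    pacf, prev = [], []
    for k in range(1, len(rho)):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf

rng = random.Random(0)           # fixed seed so the run is repeatable
n = 5000
band = 1.96 / math.sqrt(n)       # approximate 95% significance band

series = {
    "AR(1)": simulate_arma([0.7], [], n, rng),
    "AR(2)": simulate_arma([0.5, 0.3], [], n, rng),
    "MA(1)": simulate_arma([], [0.8], n, rng),
}

results = {}
for name, y in series.items():
    rho = sample_acf(y, 8)
    results[name] = (rho[1:], sample_pacf(rho))  # both lists cover lags 1..8

for name, (acf_vals, pacf_vals) in results.items():
    sig_acf = [k + 1 for k, r in enumerate(acf_vals) if abs(r) > band]
    sig_pacf = [k + 1 for k, r in enumerate(pacf_vals) if abs(r) > band]
    print(name, "| significant ACF lags:", sig_acf, "| significant PACF lags:", sig_pacf)
```

If the cheat-sheet holds, the AR series should show many significant ACF lags but PACF spikes only at the leading one or two lags, and the MA(1) should show the reverse. An occasional extra lag may still poke outside the band by chance.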
AR(1): PACF cuts off at lag 1
The ACF decays gradually (the indirect echoes fade out), while the PACF has a
single spike at lag 1 and collapses inside the significance band afterward.
One PACF spike → p = 1. Exactly the cheat-sheet.
AR(2): PACF cuts off at lag 2
MA(1): the mirror image — ACF cuts off at lag 1
The ACF of a stationary series tails off gradually, while the PACF shows two strong spikes (lags 1 and 2) and then drops inside the bands. Which model does the cheat-sheet suggest?
MA(2)
AR(2)
ARMA(2,2)
White noise
The significance bands: which spikes are real?
Every ACF/PACF plot draws a shaded band, roughly ±1.96/√n for a series of n observations. Spikes inside
the band are statistically indistinguishable from zero — noise. Spikes
poking outside are "significant." When we say a plot "cuts off after lag
p," we mean the spikes are significant up through lag p and then duck
inside the band.
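The band arithmetic is simple enough to sketch directly. The helper names `significance_band` and `significant_lags` and the PACF values below are hypothetical, made up only to show the check.

```python
import math

def significance_band(n, z=1.96):
    """Approximate 95% band half-width for a series of n observations."""
    return z / math.sqrt(n)

def significant_lags(correlations, n):
    """Return the 1-based lags whose |correlation| exceeds the band."""
    band = significance_band(n)
    return [lag for lag, r in enumerate(correlations, start=1) if abs(r) > band]

# Hypothetical PACF values for lags 1..6 of a series with n = 400 observations.
example_pacf = [0.62, 0.31, 0.04, -0.07, 0.02, -0.01]

print(round(significance_band(400), 3))     # 0.098
print(significant_lags(example_pacf, 400))  # [1, 2]
```

Because the band shrinks like 1/√n, a spike of a given height is easier to call significant in a long series than in a short one.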
Don't over-read the bands
With many lags plotted, a few spikes will breach the band by chance alone (plot 40 lags and ~2 false alarms at the 5% level are expected). So don't seize on a lone significant spike at lag 17. Look for the overall pattern — a clean cut-off or a smooth decay — and treat an isolated far-out spike with suspicion unless it's at a meaningful lag (like the seasonal period).
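The "about 2 false alarms in 40 lags" figure can be checked by simulation. The sketch below, standard library only, generates pure white noise many times and counts how many of the first 40 sample autocorrelations land outside ±1.96/√n. The series length, the number of repetitions, and the helper names are assumptions for illustration.

```python
import math
import random

def sample_acf(y, nlags):
    """Mean-centered autocorrelation for lags 1..nlags, normalized by lag 0."""
    n = len(y)
    ybar = sum(y) / n
    d = [v - ybar for v in y]
    c0 = sum(v * v for v in d)
    return [sum(d[t] * d[t - k] for t in range(k, n)) / c0 for k in range(1, nlags + 1)]

def count_false_alarms(n, nlags, rng):
    """Count lags of a white-noise series whose ACF breaches the 95% band."""
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    band = 1.96 / math.sqrt(n)
    return sum(1 for r in sample_acf(y, nlags) if abs(r) > band)

rng = random.Random(42)  # fixed seed so the run is repeatable
counts = [count_false_alarms(500, 40, rng) for _ in range(100)]
avg_false_alarms = sum(counts) / len(counts)

print("average breaches per 40-lag plot:", avg_false_alarms)
print("largest count in a single plot:", max(counts))
```

None of these series has any structure to find, yet the average lands in the neighborhood of 2 breaches per plot, and individual plots can show more.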
You must difference FIRST
The cheat-sheet's fine print — "of the stationary series" — is not optional. On a non-stationary (trending) series, the ACF decays very slowly, lingering near 1 across many lags. That slow decay is not an AR signature; it's the plot screaming "I'm non-stationary — difference me." Reading model orders off an undifferenced series is the most common ACF/PACF mistake.
The raw ACF's slow linear-ish decay is a non-stationarity alarm, not a model order. Only after making the series stationary do the spikes mean what the cheat-sheet says — and notice the lingering spike near lag 12, the fingerprint of seasonality you'd address with a seasonal term.
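The before-and-after effect of differencing can be reproduced on a synthetic series. The sketch below assumes a random walk with drift (step mean 0.5, unit-variance noise), which trends but has no seasonality, so it will not show the lag-12 spike mentioned above. The helper name `sample_acf` and all parameters are illustrative choices.

```python
import random

def sample_acf(y, nlags):
    """Mean-centered autocorrelation for lags 0..nlags, normalized by lag 0."""
    n = len(y)
    ybar = sum(y) / n
    d = [v - ybar for v in y]
    c0 = sum(v * v for v in d)
    return [sum(d[t] * d[t - k] for t in range(k, n)) / c0 for k in range(nlags + 1)]

rng = random.Random(7)  # fixed seed so the run is repeatable

# Trending, non-stationary series: each value is the last one plus drift plus noise.
y = [0.0]
for _ in range(499):
    y.append(y[-1] + 0.5 + rng.gauss(0.0, 1.0))

# First difference: the change from one step to the next.
dy = [y[t] - y[t - 1] for t in range(1, len(y))]

acf_raw = sample_acf(y, 20)
acf_diff = sample_acf(dy, 20)

for k in (1, 5, 10, 20):
    print(f"lag {k:2d}: raw ACF = {acf_raw[k]:.3f}, differenced ACF = {acf_diff[k]:.3f}")
```

The raw ACF should stay high across all printed lags, while the differenced ACF should sit near zero. Only the differenced values are meaningful to compare against the cheat-sheet.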
You plot the ACF of a series and almost every lag out to 30 is large and positive, decaying only very gradually. What is the plot most likely telling you?
The series is an AR(30) process
The series is non-stationary; you should difference it before interpreting ACF/PACF
The series is white noise
The data has been over-differenced
Practice
Implement the autocorrelation yourself and check it against statsmodels. Given the loaded array y, write acf_at(lag) returning the lag-k autocorrelation using the standard definition (mean-centered, normalized by the lag-0 variance term):
For lag k: sum((y[t]-ybar)*(y[t-k]-ybar) for t=k..n-1) / sum((y[t]-ybar)**2 for t=0..n-1).
Fill a list my_acf with acf_at(k) for k = 0..5, and set matches to True if it agrees (within 1e-6) with statsmodels.tsa.stattools.acf(y, nlags=5). By construction acf_at(0) must equal 1.0.
A stationary series s was generated as a pure AR process. Use the PACF to identify its order. Compute the PACF at lags 1..8 (via statsmodels.tsa.stattools.pacf(s, nlags=8, method="ywm")) and the significance band ±1.96/sqrt(len(s)). Then produce:
- sig_lags: the list of lags in 1..8 whose |PACF| exceeds the band
- p_hat: the AR order, taken as the number of the leading lags (1 and 2) that are significant
The PACF here is significant at lags 1 and 2 and then cuts off (later lags fall within the band, give or take the occasional chance spike), so this is an AR(2) and p_hat should be 2.
Check your understanding
What does the partial autocorrelation at lag k measure that the ordinary autocorrelation does not?
The correlation including seasonal effects only
The direct correlation at lag k with the influence of the intermediate lags (1..k-1) removed
The correlation after differencing the series
The average of all autocorrelations up to lag k
For an MA(q) process, what do the ACF and PACF look like?
ACF tails off; PACF cuts off after lag q
ACF cuts off after lag q; PACF tails off
Both cut off after lag q
Both tail off
Why must you difference a trending series to stationarity before reading its ACF/PACF for model orders?
Because statsmodels refuses to plot non-stationary data
On a non-stationary series the ACF decays very slowly and stays high across many lags, masking the real short-lag structure the cheat-sheet relies on
Because differencing always improves forecast accuracy
Because the PACF cannot be computed otherwise
Key takeaways
- The ACF measures total correlation at each lag (direct + indirect); the PACF measures only the direct correlation, with intermediate lags removed.
- Cheat-sheet: AR(p) → ACF tails off, PACF cuts off at p. MA(q) → ACF cuts off at q, PACF tails off. ARMA → both tail off.
- Mnemonic: PACF → AR order; ACF → MA order. The plot that cuts off names the model; the number of spikes is the order.
- Spikes inside the ±1.96/√n band are noise; don't over-read a lone far-out spike (some are false alarms).
- Always difference to stationarity first: a slowly-decaying ACF means "non-stationary," not "high-order AR."
You can now propose (p, d, q) by eye: d from how many differences reach
stationarity, p from the PACF cut-off, q from the ACF cut-off. Time to
hand those orders to an actual model and make a forecast — AR, MA, and their
union, ARIMA.