Approximation and Error

Truncation, discretization, and the art of trading one source of error for another

Most quantities a scientist wants to compute are limits — derivatives, integrals, infinite series, fixed points. A computer can only afford a finite version of a limit. Numerical analysis is largely the study of how to choose that finite version so the answer is good enough.

This page introduces the two flavors of error that show up in every approximation method (truncation error from "stopping early" and round-off error from finite precision), how they interact, and how to think about the order of convergence of a method.

Truncation vs round-off

Pick any approximation parameter $h$ that controls the "resolution" — a step size, the number of terms in a series, the spacing of a grid. Then the total error of a method usually has two contributions:

$E(h) \;=\; \underbrace{C \, h^p}_{\text{truncation}} \;+\; \underbrace{D / h^q}_{\text{round-off}}$

Truncation error dominates when $h$ is large. Use a smaller $h$ to fix it.
Round-off error dominates when $h$ is small. Use a larger $h$ to fix it.

You cannot escape both at once. The minimum total error is the sweet spot, and finding it is a recurring design problem.

The cleanest example is numerical differentiation, which we already glimpsed in the floating-point chapter. Let us draw the full picture.

The curves form a "V". The left side of each V is truncation: a straight slope of $1$ for the forward difference, $2$ for the centered one. The right side is round-off: the errors grow as $h$ shrinks because we are dividing two nearly-equal floats by a nearly-zero $h$ . The bottom of the V is the best you can do.

Theoretical analysis says the sweet spots are roughly:

Method	Truncation order	Optimal $h$	Optimal error
Forward difference $\frac{f(x+h)-f(x)}{h}$	$O(h)$	$\sqrt{\epsilon}$	$\sqrt{\epsilon} \approx 10^{-8}$
Centered difference $\frac{f(x+h)-f(x-h)}{2h}$	$O(h^2)$	$\sqrt[3]{\epsilon}$	$\epsilon^{2/3} \approx 10^{-11}$
Richardson extrapolation	$O(h^4)$ and higher	larger	even smaller

A higher-order method gives you more accuracy and lets you use a larger $h$ — a double win, paid for with a slightly more complex update formula.

Convergence and order

A numerical method converges of order $p$ if its error shrinks like $h^p$ as $h \to 0$ . Higher $p$ means dramatically faster convergence: halving $h$ for an order-1 method halves the error, but halves an order-4 method's error by a factor of $16$ .

We can empirically measure convergence order. Compute the error at two step sizes $h_1$ and $h_2 = h_1 / 2$ :

$p \approx \log_2 \frac{E(h_1)}{E(h_2)}$

The third column hovers around $4$ , because Simpson's rule is $O(h^4)$ . Doubling the number of intervals divides the error by about $16$ until round-off finally takes over near $n \approx 512$ .

Convergence rate matters more than constants

A first-order method needs $10^6$ steps to gain $6$ digits of accuracy. A fourth-order method needs only $30$ . Order beats constants for any problem accurate enough to matter.

This is why SciPy's solve_ivp defaults to RK45 (a 4th–5th order Runge–Kutta pair) rather than to plain Euler.

Series acceleration: when convergence is slow

Some series converge mathematically but agonizingly slowly. The Leibniz formula

$\frac{\pi}{4} \;=\; 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots$

needs $10^6$ terms to get $6$ digits right. That is wasteful when series acceleration techniques can squeeze the same digits out of a handful of terms.

The plain partial sum after 30 terms is off by about $0.03$ . The same 30 terms fed into a repeated-average acceleration give $\pi$ to ten digits. The numerical method is far more important than the number of terms.

Adaptive methods: choose $h$ at runtime

When the answer changes quickly in one region and slowly in another, a fixed step size is wasteful. Adaptive methods estimate the local error on each step and shrink or grow $h$ to keep the error near a user-specified tolerance.

SciPy's quad (general 1-D integration) is adaptive by default. Try integrating something with a sharp spike.

quad converges fast because it concentrates samples near the spike. The uniform Simpson rule wastes most of its samples on the flat regions. Use adaptive methods unless you have a reason not to.

A short challenge

Implement estimate_order(errors) that takes a list of errors produced at step sizes $h, h/2, h/4, \dots$ and returns an estimate of the convergence order $p$ such that error $\propto h^p$.

For consecutive errors $E_1, E_2$ at step sizes $h$ and $h/2$, $p \approx \log_2(E_1 / E_2)$. Return the mean of those estimates across consecutive pairs.

Check your understanding

QuestionSelect one

A finite-difference derivative shows error decreasing as $h$ shrinks until $h \approx 10^{-6}$ , after which the error starts to grow. What is happening?

The function is non-differentiable

Truncation error is shrinking but round-off error is growing; below the optimal $h$ , round-off dominates because $f(x+h) - f(x)$ loses most of its significant digits to cancellation

The CPU is malfunctioning

The numpy library is broken

QuestionSelect one

You have two numerical integrators. Method A has truncation error $O(h)$ ; method B has truncation error $O(h^4)$ . To achieve the same accuracy that method B reaches with $32$ steps, roughly how many steps does method A need?

About 32

About 128

About $32^4 \approx 10^6$

About $32 \cdot 4 = 128$