Dataslope logoDataslope

Approximation and Error

Truncation, discretization, and the art of trading one source of error for another

Most quantities a scientist wants to compute are limits — derivatives, integrals, infinite series, fixed points. A computer can only afford a finite version of a limit. Numerical analysis is largely the study of how to choose that finite version so the answer is good enough.

This page introduces the two flavors of error that show up in every approximation method (truncation error from "stopping early" and round-off error from finite precision), how they interact, and how to think about the order of convergence of a method.

Truncation vs round-off

Pick any approximation parameter hh that controls the "resolution" — a step size, the number of terms in a series, the spacing of a grid. Then the total error of a method usually has two contributions:

E(h)  =  Chptruncation  +  D/hqround-offE(h) \;=\; \underbrace{C \, h^p}_{\text{truncation}} \;+\; \underbrace{D / h^q}_{\text{round-off}}

  • Truncation error dominates when hh is large. Use a smaller hh to fix it.
  • Round-off error dominates when hh is small. Use a larger hh to fix it.

You cannot escape both at once. The minimum total error is the sweet spot, and finding it is a recurring design problem.

The cleanest example is numerical differentiation, which we already glimpsed in the floating-point chapter. Let us draw the full picture.

Code Block
Python 3.13.2

The curves form a "V". The left side of each V is truncation: a straight slope of 11 for the forward difference, 22 for the centered one. The right side is round-off: the errors grow as hh shrinks because we are dividing two nearly-equal floats by a nearly-zero hh. The bottom of the V is the best you can do.

Theoretical analysis says the sweet spots are roughly:

MethodTruncation orderOptimal hhOptimal error
Forward difference f(x+h)f(x)h\frac{f(x+h)-f(x)}{h}O(h)O(h)ϵ\sqrt{\epsilon}ϵ108\sqrt{\epsilon} \approx 10^{-8}
Centered difference f(x+h)f(xh)2h\frac{f(x+h)-f(x-h)}{2h}O(h2)O(h^2)ϵ3\sqrt[3]{\epsilon}ϵ2/31011\epsilon^{2/3} \approx 10^{-11}
Richardson extrapolationO(h4)O(h^4) and higherlargereven smaller

A higher-order method gives you more accuracy and lets you use a larger hh — a double win, paid for with a slightly more complex update formula.

Convergence and order

A numerical method converges of order pp if its error shrinks like hph^p as h0h \to 0. Higher pp means dramatically faster convergence: halving hh for an order-1 method halves the error, but halves an order-4 method's error by a factor of 1616.

We can empirically measure convergence order. Compute the error at two step sizes h1h_1 and h2=h1/2h_2 = h_1 / 2:

plog2E(h1)E(h2)p \approx \log_2 \frac{E(h_1)}{E(h_2)}

Code Block
Python 3.13.2

The third column hovers around 44, because Simpson's rule is O(h4)O(h^4). Doubling the number of intervals divides the error by about 1616 until round-off finally takes over near n512n \approx 512.

Convergence rate matters more than constants

A first-order method needs 10610^6 steps to gain 66 digits of accuracy. A fourth-order method needs only 3030. Order beats constants for any problem accurate enough to matter.

This is why SciPy's solve_ivp defaults to RK45 (a 4th–5th order Runge–Kutta pair) rather than to plain Euler.

Series acceleration: when convergence is slow

Some series converge mathematically but agonizingly slowly. The Leibniz formula

π4  =  113+1517+\frac{\pi}{4} \;=\; 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots

needs 10610^6 terms to get 66 digits right. That is wasteful when series acceleration techniques can squeeze the same digits out of a handful of terms.

Code Block
Python 3.13.2

The plain partial sum after 30 terms is off by about 0.030.03. The same 30 terms fed into a repeated-average acceleration give π\pi to ten digits. The numerical method is far more important than the number of terms.

Adaptive methods: choose hh at runtime

When the answer changes quickly in one region and slowly in another, a fixed step size is wasteful. Adaptive methods estimate the local error on each step and shrink or grow hh to keep the error near a user-specified tolerance.

SciPy's quad (general 1-D integration) is adaptive by default. Try integrating something with a sharp spike.

Code Block
Python 3.13.2

quad converges fast because it concentrates samples near the spike. The uniform Simpson rule wastes most of its samples on the flat regions. Use adaptive methods unless you have a reason not to.

A short challenge

Challenge
Python 3.13.2
Estimate convergence order

Implement estimate_order(errors) that takes a list of errors produced at step sizes $h, h/2, h/4, \dots$ and returns an estimate of the convergence order $p$ such that error $\propto h^p$.

For consecutive errors $E_1, E_2$ at step sizes $h$ and $h/2$, $p \approx \log_2(E_1 / E_2)$. Return the mean of those estimates across consecutive pairs.

Check your understanding

QuestionSelect one

A finite-difference derivative shows error decreasing as hh shrinks until h106h \approx 10^{-6}, after which the error starts to grow. What is happening?

The function is non-differentiable

Truncation error is shrinking but round-off error is growing; below the optimal hh, round-off dominates because f(x+h)f(x)f(x+h) - f(x) loses most of its significant digits to cancellation

The CPU is malfunctioning

The numpy library is broken

QuestionSelect one

You have two numerical integrators. Method A has truncation error O(h)O(h); method B has truncation error O(h4)O(h^4). To achieve the same accuracy that method B reaches with 3232 steps, roughly how many steps does method A need?

About 32

About 128

About 32410632^4 \approx 10^6

About 324=12832 \cdot 4 = 128

On this page