Dataslope logoDataslope

Numerical Stability and Conditioning

The difference between a hard problem and a fragile algorithm — and why both can ruin your answer

The previous chapter showed that every floating-point operation carries a tiny error. This chapter asks the deeper question: what happens when you chain millions of them together?

The answer depends on two independent properties:

  • Conditioning is a property of the problem. A problem is well-conditioned if small input changes cause small output changes. Ill-conditioned problems amplify error no matter how cleverly you compute.
  • Stability is a property of the algorithm. A stable algorithm produces an answer that is the correct answer to a slightly perturbed problem. An unstable algorithm produces garbage even on well-conditioned problems.

You always need to ask both questions.

Conditioning, intuitively

If ff is a function and you evaluate f(x+δx)f(x + \delta x) for small δx\delta x, how big is f(x+δx)f(x)f(x + \delta x) - f(x)? The ratio of relative output change to relative input change is the condition number of ff at xx.

For evaluating ff:

κf(x)xf(x)f(x)\kappa_f(x) \approx \left| \frac{x f'(x)}{f(x)} \right|

For a linear system Ax=bA x = b:

κ(A)=AA1\kappa(A) = \|A\| \, \|A^{-1}\|

A condition number near 11 is excellent. A condition number near 1/ϵ10161/\epsilon \approx 10^{16} means no algorithm can give you more than a couple of correct bits.

Code Block
Python 3.13.2

Near x=πx = \pi, sin(x)\sin(x) is nearly zero, so the relative output change for a tiny input change is enormous. Evaluating sin\sin near a root is intrinsically ill-conditioned — there is no clever algorithm that fixes this.

Stability, intuitively

An algorithm is backward stable if the result it returns is the exact answer to a slightly different problem. The standard example is Gaussian elimination with partial pivoting: the answer you get to Ax=bAx = b is the exact answer to (A+E)x=b(A + E)x = b, with E/A=O(ϵ)\|E\| / \|A\| = O(\epsilon). That is the best you could possibly hope for at finite precision.

An algorithm is unstable if it can produce arbitrarily wrong answers on well-conditioned problems. The classic example: solving a 2×22\times2 linear system with Cramer's rule on a matrix whose determinant is near zero but well-conditioned.

Let us watch both happen.

Code Block
Python 3.13.2

The condition number κ(A)4×1013\kappa(A) \approx 4 \times 10^{13} tells us we have lost roughly 13 digits of precision no matter what. solve recovers the remaining 3 digits cleanly; inv is acceptable but worse; cramer is the worst of the three because it does extra cancellation. None of them can do better than κ\kappa allows.

A canonical instability: forward vs backward recursion

Consider the integrals

In=01xnx+10dxI_n = \int_0^1 \frac{x^n}{x + 10} \, dx

These satisfy a tidy recurrence In=1n10In1I_n = \frac{1}{n} - 10 \, I_{n-1} with I0=ln(1.1)I_0 = \ln(1.1). The forward recurrence is numerically unstable — every step multiplies the error by 10-10 — even though the integrals themselves are perfectly well-defined small positive numbers.

Code Block
Python 3.13.2

The recursion produces negative values, then enormous oscillations — all wrong. The integrand is positive and the integral has to be less than 1/(10n)1/(10n). The error introduced at step 1 is multiplied by 10 at every later step.

The cure: run the recurrence backwards, where the multiplier is 1/101/10 instead of 1010. Start from a guess at large nn — even a wildly wrong guess — and iterate downward. The error shrinks by 1010 per step.

Code Block
Python 3.13.2

Same equation. Same data. Two algorithms; one is useless, the other is correct to all 16 digits. This is the entire content of "numerical stability."

Norms, error bounds, and "how wrong can it be?"

In numerical linear algebra, we measure error with norms. The most useful identities are:

QuantitySymbolUse
Vector 2-normv2\lVert v \rVert_2Euclidean length
Matrix induced 2-normA2\lVert A \rVert_2Largest stretch factor
Frobenius normAF\lVert A \rVert_FSquare root of sum of squares
Condition numberκ(A)=AA1\kappa(A) = \lVert A \rVert \lVert A^{-1} \rVertError amplification

For Ax=bA x = b, the relative error in xx is bounded by:

x^xx    κ(A)(δAA+δbb)\frac{\lVert \hat x - x \rVert}{\lVert x \rVert} \;\le\; \kappa(A) \left( \frac{\lVert \delta A \rVert}{\lVert A \rVert} + \frac{\lVert \delta b \rVert}{\lVert b \rVert} \right)

A condition number of 101410^{14} and a perturbation of 101610^{-16} gives a relative output error of 10210^{-2} — two correct digits. That number is your truth ceiling, set by the problem, not the algorithm.

Code Block
Python 3.13.2

Watch how the relative error scales with κ\kappa. Once κ(H)>1016\kappa(H) > 10^{16}, you have lost all your precision.

Designing stable algorithms

A handful of rules turn most numerical code from fragile to robust.

  1. Avoid subtracting nearly equal numbers. Rewrite the formula if you can (the 1 - cos x example), or compute in higher precision.
  2. Sum in increasing order of magnitude, or use a compensated sum like Kahan summation, when adding many tiny contributions.
  3. Prefer factorizations over inverses. np.linalg.solve(A, b) beats np.linalg.inv(A) @ b every time.
  4. Scale your inputs. If one column of your matrix has values in [109][10^{-9}] and another in [109][10^{9}], condition number explodes for no real reason; rescale them.
  5. Pick algorithms with proven backward error bounds. LAPACK, SciPy, Eigen, and PETSc routines all come with these. Custom code usually does not.

Here is a beautiful illustration of point 2 — Kahan summation recovers most of the digits a naive sum throws away.

Code Block
Python 3.13.2

NumPy's .sum() uses pairwise summation, which is much better than a naive left-to-right loop and almost as good as Kahan, for a fraction of the cost. Trust the library; reach for Kahan only when you really need it.

Check your understanding

QuestionSelect one

A problem has condition number κ1012\kappa \approx 10^{12}. You compute the answer with a backward-stable algorithm at 64-bit precision. Roughly how many correct decimal digits should you expect?

All 16

About 12

About 4

Zero — the problem is unsolvable

QuestionSelect one

Which statement most cleanly distinguishes conditioning from stability?

They mean the same thing

Conditioning is a property of the problem; stability is a property of the algorithm. An ill-conditioned problem cannot be saved by a clever algorithm; an unstable algorithm can ruin a perfectly well-conditioned problem

Conditioning is about speed and stability is about accuracy

Both are properties of the algorithm

On this page