Numerical Stability Guide — Floating Point Analyzer

Understand why floating point arithmetic causes errors, compute condition numbers for any matrix, visualize catastrophic cancellation, detect ill-conditioned systems, and compare single vs double precision. Essential for anyone writing numerical code, training neural networks, or solving linear systems.

1. Condition Number Calculator

Enter a square matrix to compute its condition number, singular values, and stability assessment. The condition number κ(A) = σmax / σmin tells you how many digits of accuracy you can lose when solving Ax = b.


2. Floating Point Precision Demo

See how floating point arithmetic introduces errors. Enter an expression to compare the floating point result with the mathematically exact result.


3. Catastrophic Cancellation Detector

Enter two numbers that are close in value. The tool shows how many significant digits survive after subtraction, and visualizes the digit-by-digit cancellation.



4. Single vs Double Precision Comparison

Compare how the same computation performs in simulated single precision (float32, ~7 digits) vs double precision (float64, ~16 digits). See where single precision breaks down.


5. Stability Tips Per Operation

Click any operation to see its stability profile, common pitfalls, and recommended practices.

Why Numerical Stability Matters

Every number stored in a computer is an approximation. The real number line is continuous and infinite, but a 64-bit floating point number can only represent about 264 distinct values. The gap between representable numbers varies: near zero, values are packed tightly together (the smallest positive double is about 5 × 10-324), but near large numbers, the gaps are enormous (near 10308, adjacent representable values differ by about 10292). This non-uniform spacing is the root cause of most numerical instability.

IEEE 754 double precision floating point uses 64 bits: 1 sign bit, 11 exponent bits, and 52 mantissa bits. This gives approximately 15-16 significant decimal digits of precision. The machine epsilon, approximately 2.22 × 10-16, represents the smallest relative perturbation that can be detected: 1 + ε ≠ 1, but 1 + ε/2 = 1 in floating point. Every arithmetic operation (addition, subtraction, multiplication, division) introduces a relative error of at most ε. Over thousands or millions of operations, these tiny errors can accumulate to produce results that are completely wrong.

The Condition Number: Quantifying Sensitivity

The condition number of a problem measures the inherent sensitivity of the output to perturbations in the input. For a linear system Ax = b, the condition number κ(A) = ||A|| · ||A-1|| quantifies the worst-case amplification of relative error: if the input b is perturbed by a relative amount δ, the output x can be perturbed by as much as κ(A) · δ. The condition number is a property of the mathematical problem, not the algorithm used to solve it. No algorithm can produce a result more accurate than the condition number allows.

For a matrix, the condition number computed using the 2-norm equals the ratio of the largest to smallest singular value: κ2(A) = σmax / σmin. A well-conditioned matrix has a condition number near 1 (the identity matrix has κ = 1). An ill-conditioned matrix has a large condition number. The infamous Hilbert matrix, where H(i,j) = 1/(i+j-1), is a classic example: the 5×5 Hilbert matrix has a condition number of about 476,607, and the 10×10 Hilbert matrix has a condition number of about 1.6 × 1013, meaning you lose 13 of your 16 digits of accuracy when solving a linear system with it.

Catastrophic Cancellation: The Silent Killer

Subtraction of nearly equal numbers is the most dangerous basic operation in floating point arithmetic. When two numbers agree in their first k digits, subtraction eliminates those k digits, leaving only the remaining (16-k) digits. If the numbers agree in 14 of 16 digits, only 2 significant digits survive. This is called catastrophic cancellation, and it appears in many common formulas.

The quadratic formula x = (-b ± sqrt(b² - 4ac)) / (2a) suffers from catastrophic cancellation when b² is much larger than 4ac. In that case, sqrt(b² - 4ac) ≈ |b|, and one of the two roots involves subtracting two nearly equal numbers. The fix is to compute the root that avoids cancellation first, then use the identity x1 · x2 = c/a to get the other root. Similarly, the naive variance formula Var = E[X²] - (E[X])² is numerically catastrophic when the mean is large relative to the standard deviation, because E[X²] and (E[X])² are nearly equal large numbers. Welford's online algorithm avoids this entirely by computing the variance incrementally.

Stability of Matrix Operations

Different matrix decomposition algorithms have different stability properties. Gaussian elimination without pivoting is unstable: it can amplify rounding errors exponentially. Gaussian elimination with partial pivoting (LU with row permutations) is stable in practice, though pathological examples exist where it grows exponentially (such growth has never been observed in practice for random matrices). QR decomposition via Householder reflections is unconditionally backward stable: the computed Q and R satisfy (A + E) = QR where ||E|| is at most a small multiple of ε ||A||. SVD computed by the Golub-Kahan bidiagonalization is also backward stable.

For eigenvalue problems, the symmetric eigenvalue problem (A = AT) is well-conditioned: each eigenvalue has a condition number equal to 1. The non-symmetric eigenvalue problem can be arbitrarily ill-conditioned: eigenvalues of non-symmetric matrices can be extremely sensitive to perturbations. The condition number of an eigenvalue λ is 1 / |yTx|, where x and y are the right and left eigenvectors. When x and y are nearly orthogonal (as happens for defective or near-defective matrices), the eigenvalue is extremely sensitive.

Numerical Stability in Machine Learning

Machine learning practitioners encounter numerical stability issues constantly, often without realizing it. The softmax function softmax(x)i = exp(xi) / sum(exp(xj)) overflows when any xi exceeds about 709 (the limit for double precision exp). The standard fix is to subtract max(x) from all components before exponentiating: this does not change the mathematical result but prevents overflow. The log-sum-exp function, log(sum(exp(x))), uses the same trick: LSE(x) = max(x) + log(sum(exp(x - max(x)))).

Cross-entropy loss, -sum(y log(p)), produces -infinity when any predicted probability p = 0. Clipping p to [ε, 1-ε] prevents this. In practice, frameworks like PyTorch combine softmax and cross-entropy into a single numerically stable operation (LogSoftmax + NLLLoss) that avoids computing intermediate probabilities. Gradient clipping (clamping gradient norms to a maximum value) prevents exploding gradients in deep networks and recurrent architectures. Batch normalization, layer normalization, and weight normalization all improve numerical stability by keeping activations and gradients in a reasonable range throughout training.

Best Practices for Numerically Stable Code

First, avoid subtracting nearly equal numbers whenever possible. Reformulate expressions to use addition, multiplication, or mathematically equivalent forms that avoid cancellation. Second, use compensated summation (Kahan summation) when accumulating many small numbers: it maintains a running compensation term that captures the rounding error of each addition. Third, always use pivoting in Gaussian elimination (partial pivoting at minimum, complete pivoting for maximum safety). Fourth, prefer QR or SVD over normal equations (ATAx = ATb) for least-squares problems, because normal equations square the condition number. Fifth, compute condition numbers before trusting your results: if κ(A) × ε is close to 1, your answer has no reliable digits. Sixth, use appropriate precision: float64 for scientific computing, float32 or float16 with loss scaling for deep learning inference, and arbitrary precision libraries when double precision is insufficient.

Frequently Asked Questions

What is the condition number of a matrix and why does it matter?

The condition number κ(A) = σmax / σmin measures how much errors in the input are amplified in the output when solving Ax = b. A condition number near 1 is stable; κ = 10k means you lose approximately k digits of accuracy. It is the single most important indicator of numerical reliability.

What is catastrophic cancellation in floating point arithmetic?

When subtracting two nearly equal numbers, the leading digits cancel and only the least significant (least accurate) digits remain. For example, 1.234567890123456 - 1.234567890123400 = 5.6e-14, which has only 2 significant digits despite 16-digit inputs. Fixes include reformulating expressions to avoid subtraction of similar values.

What is machine epsilon and how does it relate to floating point precision?

Machine epsilon is the smallest ε such that 1 + ε ≠ 1 in floating point. For float64: ε ≈ 2.22e-16 (~16 digits). For float32: ε ≈ 1.19e-7 (~7 digits). It sets the fundamental precision limit: no algorithm can beat ε × κ(A) accuracy for a problem with condition number κ(A).

How do you detect and fix ill-conditioned systems?

Detect via SVD: if σmax / σmin > 106, the system is ill-conditioned. Fixes: Tikhonov regularization (add λI), preconditioning, higher precision arithmetic, QR or SVD instead of LU, and truncated SVD to discard near-zero singular values.

What are the most common numerical stability pitfalls in machine learning?

Log-sum-exp overflow (fix: subtract max), softmax overflow (fix: subtract max), log(0) in cross-entropy (fix: clamp probabilities), vanishing/exploding gradients (fix: normalization + gradient clipping), ill-conditioned covariance matrices (fix: regularization), and float16 gradient underflow (fix: loss scaling).

Related Tools

Built by Michael Lip. Try the ML3X Matrix Calculator for interactive step-by-step solutions.