Numerical Stability Guide — Floating Point Analyzer

Q: What is the condition number of a matrix and why does it matter?

The condition number of a matrix A measures how sensitive the solution of the linear system Ax = b is to small changes in A or b. It is defined as cond(A) = ||A|| * ||A-inverse||, or equivalently as the ratio of the largest singular value to the smallest singular value: cond(A) = sigma_max / sigma_min. A condition number near 1 means the matrix is well-conditioned and the solution is stable. A large condition number (say 10^6 or more) means the matrix is ill-conditioned: a tiny perturbation in the input can cause a massive change in the output. When solving Ax = b with an ill-conditioned matrix, you can lose approximately log10(cond(A)) digits of accuracy. For example, if cond(A) = 10^8 and you use double precision (about 16 digits), you may only have 8 correct digits in the solution. The condition number is the single most important indicator of whether a numerical computation will produce reliable results.

Q: What is catastrophic cancellation in floating point arithmetic?

Catastrophic cancellation occurs when two nearly equal floating point numbers are subtracted, causing a dramatic loss of significant digits. For example, if a = 1.234567890123456 and b = 1.234567890123400, both stored with about 16 digits of precision, then a - b = 0.000000000000056 = 5.6e-14. The result has only 2 significant digits even though the inputs had 16 each. This happens because the leading digits cancel, leaving only the least significant (and least accurate) digits. Catastrophic cancellation is one of the most common sources of numerical instability. It appears in the quadratic formula when b^2 is much larger than 4ac, in computing variance using the naive formula sum(x^2)/n - (sum(x)/n)^2, in finite difference approximations with too-small step sizes, and in evaluating functions near removable singularities. The fix is usually to reformulate the computation: use the stable quadratic formula, use Welford's algorithm for variance, or use analytically equivalent but numerically different expressions.

Q: What is machine epsilon and how does it relate to floating point precision?

Machine epsilon is the smallest number epsilon such that 1 + epsilon is distinguishable from 1 in floating point arithmetic. For IEEE 754 double precision (64-bit), machine epsilon is approximately 2.22e-16 (about 16 decimal digits of precision). For single precision (32-bit), it is approximately 1.19e-7 (about 7 decimal digits). Machine epsilon represents the relative rounding error: any real number x is represented as fl(x) = x(1 + delta) where |delta| <= epsilon. This means every floating point operation introduces a relative error of at most epsilon. Over many operations, these errors accumulate. In the best case (well-conditioned problems), the final error is proportional to n * epsilon where n is the number of operations. In the worst case (ill-conditioned problems), the error is amplified by the condition number. Machine epsilon sets the fundamental precision limit: no algorithm can achieve accuracy better than epsilon times the condition number, regardless of how clever the algorithm is.

Q: How do you detect and fix ill-conditioned systems?

Detection: Compute the condition number using SVD (sigma_max / sigma_min). A condition number above 10^6 is a warning; above 10^12 means the system is effectively singular in double precision. Alternatively, check if the determinant is very small relative to the matrix entries, but the condition number is more reliable. Other signs include: the solution changes dramatically with small perturbations, iterative solvers converge slowly or oscillate, and residuals are small but the solution is wrong. Fixes: (1) Regularization: add a small multiple of the identity matrix (Tikhonov regularization), which bounds the condition number at the cost of slight bias. (2) Preconditioning: multiply by a matrix that reduces the condition number before solving. (3) Use higher precision arithmetic (e.g., 128-bit or arbitrary precision). (4) Reformulate the problem: sometimes a different mathematical formulation of the same problem has a lower condition number. (5) Use QR or SVD instead of LU decomposition, as they are more numerically stable for ill-conditioned systems. (6) Truncated SVD: discard singular values below a threshold, effectively reducing the rank and avoiding amplification of noise.

Q: What are the most common numerical stability pitfalls in machine learning?

The most common pitfalls are: (1) Log-sum-exp overflow: computing log(sum(exp(x))) directly overflows when x values are large. The fix is the log-sum-exp trick: subtract max(x) before exponentiating. (2) Softmax overflow and underflow: same issue as log-sum-exp, solved by subtracting the max. (3) Cross-entropy with log(0): when a predicted probability is exactly 0, log(0) = -infinity. The fix is to clamp probabilities to [epsilon, 1-epsilon]. (4) Vanishing and exploding gradients: in deep networks, gradients can shrink to zero or grow to infinity across many layers. Fixes include batch normalization, residual connections, gradient clipping, and careful weight initialization (Xavier/He). (5) Covariance matrix inversion: sample covariance matrices are often ill-conditioned or singular. Use regularized covariance (shrinkage) or pseudoinverse. (6) Learning rate too high: causes numerical divergence in gradient descent. Use learning rate warmup, schedulers, or adaptive methods like Adam. (7) Float32 vs Float16: mixed-precision training uses float16 for speed but requires loss scaling to prevent gradient underflow.

Understand why floating point arithmetic causes errors, compute condition numbers for any matrix, visualize catastrophic cancellation, detect ill-conditioned systems, and compare single vs double precision. Essential for anyone writing numerical code, training neural networks, or solving linear systems.

1. Condition Number Calculator

Enter a square matrix to compute its condition number, singular values, and stability assessment. The condition number κ(A) = σ_max / σ_min tells you how many digits of accuracy you can lose when solving Ax = b.

Matrix (rows on lines, spaces between entries):

2. Floating Point Precision Demo

See how floating point arithmetic introduces errors. Enter an expression to compare the floating point result with the mathematically exact result.

Select a classic example:

3. Catastrophic Cancellation Detector

Enter two numbers that are close in value. The tool shows how many significant digits survive after subtraction, and visualizes the digit-by-digit cancellation.

Number a:

Number b:

4. Single vs Double Precision Comparison

Compare how the same computation performs in simulated single precision (float32, ~7 digits) vs double precision (float64, ~16 digits). See where single precision breaks down.

Select test:

5. Stability Tips Per Operation

Click any operation to see its stability profile, common pitfalls, and recommended practices.

Last updated: May 25, 2026

Why Numerical Stability Matters

Every number stored in a computer is an approximation. The real number line is continuous and infinite, but a 64-bit floating point number can only represent about 2⁶⁴ distinct values. The gap between representable numbers varies: near zero, values are packed tightly together (the smallest positive double is about 5 × 10^-324), but near large numbers, the gaps are enormous (near 10³⁰⁸, adjacent representable values differ by about 10²⁹²). This non-uniform spacing is the root cause of most numerical instability.

IEEE 754 double precision floating point uses 64 bits: 1 sign bit, 11 exponent bits, and 52 mantissa bits. This gives approximately 15-16 significant decimal digits of precision. The machine epsilon, approximately 2.22 × 10^-16, represents the smallest relative perturbation that can be detected: 1 + ε ≠ 1, but 1 + ε/2 = 1 in floating point. Every arithmetic operation (addition, subtraction, multiplication, division) introduces a relative error of at most ε. Over thousands or millions of operations, these tiny errors can accumulate to produce results that are completely wrong.

The Condition Number: Quantifying Sensitivity

The condition number of a problem measures the inherent sensitivity of the output to perturbations in the input. For a linear system Ax = b, the condition number κ(A) = ||A|| · ||A^-1|| quantifies the worst-case amplification of relative error: if the input b is perturbed by a relative amount δ, the output x can be perturbed by as much as κ(A) · δ. The condition number is a property of the mathematical problem, not the algorithm used to solve it. No algorithm can produce a result more accurate than the condition number allows.

For a matrix, the condition number computed using the 2-norm equals the ratio of the largest to smallest singular value: κ₂(A) = σ_max / σ_min. A well-conditioned matrix has a condition number near 1 (the identity matrix has κ = 1). An ill-conditioned matrix has a large condition number. The infamous Hilbert matrix, where H(i,j) = 1/(i+j-1), is a classic example: the 5×5 Hilbert matrix has a condition number of about 476,607, and the 10×10 Hilbert matrix has a condition number of about 1.6 × 10¹³, meaning you lose 13 of your 16 digits of accuracy when solving a linear system with it.

Catastrophic Cancellation: The Silent Killer

Subtraction of nearly equal numbers is the most dangerous basic operation in floating point arithmetic. When two numbers agree in their first k digits, subtraction eliminates those k digits, leaving only the remaining (16-k) digits. If the numbers agree in 14 of 16 digits, only 2 significant digits survive. This is called catastrophic cancellation, and it appears in many common formulas.

The quadratic formula x = (-b ± sqrt(b² - 4ac)) / (2a) suffers from catastrophic cancellation when b² is much larger than 4ac. In that case, sqrt(b² - 4ac) ≈ |b|, and one of the two roots involves subtracting two nearly equal numbers. The fix is to compute the root that avoids cancellation first, then use the identity x1 · x2 = c/a to get the other root. Similarly, the naive variance formula Var = E[X²] - (E[X])² is numerically catastrophic when the mean is large relative to the standard deviation, because E[X²] and (E[X])² are nearly equal large numbers. Welford's online algorithm avoids this entirely by computing the variance incrementally.

Stability of Matrix Operations

Different matrix decomposition algorithms have different stability properties. Gaussian elimination without pivoting is unstable: it can amplify rounding errors exponentially. Gaussian elimination with partial pivoting (LU with row permutations) is stable in practice, though pathological examples exist where it grows exponentially (such growth has never been observed in practice for random matrices). QR decomposition via Householder reflections is unconditionally backward stable: the computed Q and R satisfy (A + E) = QR where ||E|| is at most a small multiple of ε ||A||. SVD computed by the Golub-Kahan bidiagonalization is also backward stable.

For eigenvalue problems, the symmetric eigenvalue problem (A = A^T) is well-conditioned: each eigenvalue has a condition number equal to 1. The non-symmetric eigenvalue problem can be arbitrarily ill-conditioned: eigenvalues of non-symmetric matrices can be extremely sensitive to perturbations. The condition number of an eigenvalue λ is 1 / |y^Tx|, where x and y are the right and left eigenvectors. When x and y are nearly orthogonal (as happens for defective or near-defective matrices), the eigenvalue is extremely sensitive.

Numerical Stability in Machine Learning

Machine learning practitioners encounter numerical stability issues constantly, often without realizing it. The softmax function softmax(x)_i = exp(x_i) / sum(exp(x_j)) overflows when any x_i exceeds about 709 (the limit for double precision exp). The standard fix is to subtract max(x) from all components before exponentiating: this does not change the mathematical result but prevents overflow. The log-sum-exp function, log(sum(exp(x))), uses the same trick: LSE(x) = max(x) + log(sum(exp(x - max(x)))).

Cross-entropy loss, -sum(y log(p)), produces -infinity when any predicted probability p = 0. Clipping p to [ε, 1-ε] prevents this. In practice, frameworks like PyTorch combine softmax and cross-entropy into a single numerically stable operation (LogSoftmax + NLLLoss) that avoids computing intermediate probabilities. Gradient clipping (clamping gradient norms to a maximum value) prevents exploding gradients in deep networks and recurrent architectures. Batch normalization, layer normalization, and weight normalization all improve numerical stability by keeping activations and gradients in a reasonable range throughout training.

Best Practices for Numerically Stable Code

First, avoid subtracting nearly equal numbers whenever possible. Reformulate expressions to use addition, multiplication, or mathematically equivalent forms that avoid cancellation. Second, use compensated summation (Kahan summation) when accumulating many small numbers: it maintains a running compensation term that captures the rounding error of each addition. Third, always use pivoting in Gaussian elimination (partial pivoting at minimum, complete pivoting for maximum safety). Fourth, prefer QR or SVD over normal equations (A^TAx = A^Tb) for least-squares problems, because normal equations square the condition number. Fifth, compute condition numbers before trusting your results: if κ(A) × ε is close to 1, your answer has no reliable digits. Sixth, use appropriate precision: float64 for scientific computing, float32 or float16 with loss scaling for deep learning inference, and arbitrary precision libraries when double precision is insufficient.

Frequently Asked Questions

What is the condition number of a matrix and why does it matter?

The condition number κ(A) = σ_max / σ_min measures how much errors in the input are amplified in the output when solving Ax = b. A condition number near 1 is stable; κ = 10^k means you lose approximately k digits of accuracy. It is the single most important indicator of numerical reliability.

What is catastrophic cancellation in floating point arithmetic?

When subtracting two nearly equal numbers, the leading digits cancel and only the least significant (least accurate) digits remain. For example, 1.234567890123456 - 1.234567890123400 = 5.6e-14, which has only 2 significant digits despite 16-digit inputs. Fixes include reformulating expressions to avoid subtraction of similar values.

What is machine epsilon and how does it relate to floating point precision?

Machine epsilon is the smallest ε such that 1 + ε ≠ 1 in floating point. For float64: ε ≈ 2.22e-16 (~16 digits). For float32: ε ≈ 1.19e-7 (~7 digits). It sets the fundamental precision limit: no algorithm can beat ε × κ(A) accuracy for a problem with condition number κ(A).

How do you detect and fix ill-conditioned systems?

Detect via SVD: if σ_max / σ_min > 10⁶, the system is ill-conditioned. Fixes: Tikhonov regularization (add λI), preconditioning, higher precision arithmetic, QR or SVD instead of LU, and truncated SVD to discard near-zero singular values.

What are the most common numerical stability pitfalls in machine learning?

Log-sum-exp overflow (fix: subtract max), softmax overflow (fix: subtract max), log(0) in cross-entropy (fix: clamp probabilities), vanishing/exploding gradients (fix: normalization + gradient clipping), ill-conditioned covariance matrices (fix: regularization), and float16 gradient underflow (fix: loss scaling).

Related Tools

Matrix Multiplication Calculator — see step-by-step matrix products with dimension checking
Eigenvector Step Solver — compute eigenvalues and eigenvectors with stability analysis
Determinant Calculator — compute determinants via cofactor expansion and LU
Matrix Operations Reference — interactive cheatsheet of 20+ matrix operations

Built by Michael Lip. Try the ML3X Matrix Calculator for interactive step-by-step solutions.