Original Research

Matrix Operation Complexity — Time and Space for Every Common Operation

A complete reference of 24 matrix operations covering time complexity, space complexity, numerical stability, and when to use each one. From basic arithmetic to advanced decompositions.

By Michael Lip · Updated April 2026

Methodology

Complexity bounds are sourced from standard numerical linear algebra references (Golub & Van Loan, Trefethen & Bau) and verified against LAPACK documentation. Community relevance was assessed via StackOverflow engagement data from the StackExchange API v2.3 (e.g., "sparse matrix multiplication complexity" with 12.5K views). All complexities assume dense matrices of size n x n unless noted. Stability ratings: High = backward stable, Medium = conditionally stable, Low = known sensitivity to conditioning.

| # | Operation | Category | Time Complexity | Space | Stability | Notes |
|---|-----------|----------|-----------------|-------|-----------|-------|
| 1 | Addition / Subtraction | Arithmetic | O(n^2) | O(n^2) | High | Element-wise; trivially stable |
| 2 | Scalar Multiplication | Arithmetic | O(n^2) | O(1) in-place | High | Single pass over elements |
| 3 | Matrix-Vector Multiply | Arithmetic | O(n^2) | O(n) | High | BLAS Level 2 (GEMV) |
| 4 | Matrix-Matrix Multiply | Arithmetic | O(n^3) | O(n^2) | High | BLAS Level 3 (GEMM); Strassen O(n^2.807) |
| 5 | Transpose | Arithmetic | O(n^2) | O(1) in-place | High | Cache-oblivious algorithms for large n |
| 6 | Trace | Arithmetic | O(n) | O(1) | High | Sum of diagonal elements |
| 7 | Determinant (via LU) | Derived | O(2n^3/3) | O(n^2) | Medium | Product of LU diagonal; can overflow |
| 8 | Determinant (Cofactor) | Derived | O(n!) | O(n^2) | Low | Only for n <= 5; educational use |
| 9 | Matrix Inverse | Derived | O(n^3) | O(n^2) | Low | Avoid; use LU solve instead for Ax=b |
| 10 | LU Decomposition | Decomposition | O(2n^3/3) | O(n^2) | High* | With partial pivoting; LAPACK DGETRF |
| 11 | LU Solve (after factoring) | Solve | O(n^2) | O(n) | High | Forward + back substitution |
| 12 | Cholesky Decomposition | Decomposition | O(n^3/3) | O(n^2/2) | High | SPD matrices only; 2x faster than LU |
| 13 | QR Decomposition (Householder) | Decomposition | O(2mn^2 - 2n^3/3) | O(mn) | High | Backward stable; used for least squares |
| 14 | QR Decomposition (Gram-Schmidt) | Decomposition | O(2mn^2) | O(mn) | Low | Classical GS loses orthogonality; use modified GS |
| 15 | SVD (Full) | Decomposition | O(4m^2n + 8mn^2 + 9n^3) | O(mn) | High | Gold standard; LAPACK DGESVD; count assumes m >= n |
| 16 | Truncated SVD (rank-k) | Decomposition | O(mnk) | O((m+n)k) | High | Randomized algorithms; scikit-learn TruncatedSVD |
| 17 | Eigenvalues (Symmetric) | Eigenvalue | O(4n^3/3) | O(n^2) | High | QR algorithm; LAPACK DSYEV |
| 18 | Eigenvalues (General) | Eigenvalue | O(10n^3) | O(n^2) | Medium | QR iteration after Hessenberg reduction |
| 19 | Hessenberg Reduction | Decomposition | O(10n^3/3) | O(n^2) | High | Preprocessing step for eigenvalue computation |
| 20 | Schur Decomposition | Decomposition | O(10n^3) | O(n^2) | High | Upper triangular form; basis for eigenvalues |
| 21 | Moore-Penrose Pseudoinverse | Derived | O(mn^2) | O(mn) | High | Via SVD; for least-squares on rank-deficient systems |
| 22 | Sparse Matrix-Vector (CSR) | Sparse | O(nnz) | O(nnz + n) | High | nnz = number of non-zeros |
| 23 | Sparse Matrix-Matrix | Sparse | O(n * nnz) | O(nnz_result) | High | Output sparsity depends on fill-in |
| 24 | Kronecker Product | Arithmetic | O(m^2 * n^2) | O(m^2 * n^2) | High | Result is mn x mn; used in quantum computing |
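
These bounds are flop counts; the O(n^3) cost of GEMM (row 4) is easy to observe empirically. A minimal NumPy sketch, with timings that will vary by machine, BLAS build, and thread count:

```python
import time

import numpy as np

rng = np.random.default_rng(0)
sizes = [256, 512, 1024, 2048]
times = []
for n in sizes:
    A = rng.standard_normal((n, n))
    B = rng.standard_normal((n, n))
    t0 = time.perf_counter()
    C = A @ B  # dispatches to the installed BLAS (DGEMM)
    times.append(time.perf_counter() - t0)

# Estimate the exponent p in t ~ n^p between consecutive sizes;
# it should approach 3 once n is large enough to amortize overhead.
for i in range(1, len(sizes)):
    p = np.log(times[i] / times[i - 1]) / np.log(sizes[i] / sizes[i - 1])
    print(f"n {sizes[i - 1]} -> {sizes[i]}: empirical exponent ~ {p:.2f}")
```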

Practical Guidance

Never compute the inverse to solve Ax = b. LU factorization followed by forward-back substitution is both faster and more numerically stable. Explicit inversion costs roughly three times the flops of a single LU solve (about 2n^3 versus 2n^3/3), and forming x = A^-1 b is not backward stable: for ill-conditioned matrices its residual can be far larger than what a direct solve produces.
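
A minimal NumPy sketch makes the gap concrete; the row scaling used to worsen the conditioning is illustrative, and exact residuals will vary with the seed and BLAS build:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500
A = rng.standard_normal((n, n))
A *= np.logspace(0, 10, n)[:, None]  # scale rows to inflate the condition number
b = rng.standard_normal(n)

x_solve = np.linalg.solve(A, b)  # LU with partial pivoting (LAPACK gesv)
x_inv = np.linalg.inv(A) @ b     # explicit inverse, then a matvec

# Backward error: how well each candidate satisfies Ax = b
print("residual via solve:", np.linalg.norm(A @ x_solve - b))
print("residual via inv:  ", np.linalg.norm(A @ x_inv - b))
```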

Choose the right decomposition. For general linear systems, use LU with partial pivoting. For symmetric positive definite matrices (covariance, kernel, Gram), use Cholesky — it is 2x faster and inherently stable. For least-squares problems, use QR (Householder). For rank-revealing analysis or when you need the condition number, use SVD.
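
The same decision tree in SciPy terms (a sketch; the matrices are placeholders and every function shown is from scipy.linalg):

```python
import numpy as np
from scipy.linalg import (cho_factor, cho_solve, lu_factor, lu_solve,
                          qr, solve_triangular, svdvals)

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 200))
b = rng.standard_normal(200)

# General square system: LU with partial pivoting (LAPACK DGETRF)
lu, piv = lu_factor(A)
x = lu_solve((lu, piv), b)

# Symmetric positive definite, e.g. a Gram matrix: Cholesky
G = A.T @ A + 1e-6 * np.eye(200)
c, low = cho_factor(G)
y = cho_solve((c, low), b)

# Least squares via Householder QR: minimize ||M w - rhs||
M = rng.standard_normal((400, 200))
rhs = rng.standard_normal(400)
Q, R = qr(M, mode="economic")
w = solve_triangular(R, Q.T @ rhs)

# Rank and conditioning analysis: singular values
s = svdvals(A)
print("cond(A) ~", s[0] / s[-1])
```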

Strassen is rarely faster in practice. Despite its O(n^2.807) complexity, Strassen's algorithm has higher constants, worse numerical stability, and poor cache behavior compared to optimized BLAS implementations. The crossover point where Strassen wins is typically n > 1000-4000 depending on the hardware. Libraries like OpenBLAS and Intel MKL automatically select the best strategy.
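
For intuition about where the O(n^2.807) exponent comes from, here is a single level of Strassen's recursion: seven half-size products instead of eight. This is an illustrative sketch that assumes square matrices of even dimension; real implementations recurse, pad odd sizes, and fall back to GEMM below the crossover:

```python
import numpy as np

def strassen_one_level(A, B):
    """One level of Strassen's recursion (assumes square matrices, even n)."""
    h = A.shape[0] // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]

    # Seven half-size multiplications (a naive block split would need eight)
    M1 = (A11 + A22) @ (B11 + B22)
    M2 = (A21 + A22) @ B11
    M3 = A11 @ (B12 - B22)
    M4 = A22 @ (B21 - B11)
    M5 = (A11 + A12) @ B22
    M6 = (A21 - A11) @ (B11 + B12)
    M7 = (A12 - A22) @ (B21 + B22)

    C = np.empty_like(A)
    C[:h, :h] = M1 + M4 - M5 + M7
    C[:h, h:] = M3 + M5
    C[h:, :h] = M2 + M4
    C[h:, h:] = M1 - M2 + M3 + M6
    return C

A = np.random.default_rng(0).standard_normal((512, 512))
B = np.random.default_rng(1).standard_normal((512, 512))
print(np.allclose(strassen_one_level(A, B), A @ B))  # True, up to roundoff
```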

Sparse matrices change everything. When a matrix has fewer than ~10% non-zero entries, sparse formats (CSR, CSC, COO) dramatically reduce both storage and computation. A sparse matrix-vector product costs O(nnz) instead of O(n^2), which for a 10,000 x 10,000 matrix with 1% density means 100x fewer operations.
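
A small scipy.sparse sketch at exactly those 1% density numbers (sizes are illustrative; the dense equivalent would store 10^8 float64 entries, about 800 MB):

```python
import numpy as np
from scipy.sparse import random as sparse_random

n = 10_000
# CSR matrix at 1% density: roughly 10^6 stored non-zeros vs 10^8 dense entries
A = sparse_random(n, n, density=0.01, format="csr", random_state=1)
x = np.random.default_rng(1).standard_normal(n)

y = A @ x  # CSR mat-vec: O(nnz) work instead of O(n^2)
print("stored non-zeros:", A.nnz)
```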

Frequently Asked Questions

What is the time complexity of matrix multiplication?

Naive matrix multiplication of two n x n matrices is O(n^3). Strassen's algorithm reduces this to O(n^2.807), with a practical crossover typically between n = 1000 and n = 4000, depending on hardware. The best known theoretical exponent is about 2.37, but the algorithms that achieve it are impractical at any realistic size. Real-world performance depends more on cache optimization and BLAS library choice than on asymptotic complexity.

Why is computing the matrix inverse considered numerically unstable?

Matrix inversion amplifies floating-point errors in proportion to the condition number. For solving Ax = b, LU decomposition with forward-back substitution is both faster and more stable. Even with many right-hand sides, reusing the LU factors costs O(n^2) per solve, the same as multiplying by a precomputed inverse, so an explicit inverse is mainly justified when you need the entries of A^-1 themselves.
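
The amplification is easy to see on a notoriously ill-conditioned family, the Hilbert matrices (a sketch using scipy.linalg.hilbert; exact error figures are platform-dependent):

```python
import numpy as np
from scipy.linalg import hilbert

for n in (5, 8, 12):
    H = hilbert(n)            # cond(H) grows explosively with n
    x_true = np.ones(n)
    b = H @ x_true
    x = np.linalg.solve(H, b)
    rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
    print(f"n={n}: cond={np.linalg.cond(H):.1e}, relative error={rel_err:.1e}")
```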

What is the difference between LU, QR, and SVD decomposition complexity?

LU is O(2n^3/3): the fastest, used for square linear systems. Householder QR is O(2mn^2 - 2n^3/3): more stable, used for least-squares. Full SVD has the same O(mn^2) order as QR but a substantially larger constant, and in exchange reveals the complete geometric structure: singular values, rank, null space, and condition number. Choose based on your problem structure.
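
As a sketch of what the extra SVD cost buys, the snippet below pulls rank, condition number, and a null-space basis out of a deliberately rank-deficient matrix; the tolerance rule used here is one common convention, not the only one:

```python
import numpy as np

rng = np.random.default_rng(2)
# A 6 x 4 matrix of rank 2, built as a product of thin factors
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))

U, s, Vt = np.linalg.svd(A)
tol = s[0] * max(A.shape) * np.finfo(A.dtype).eps  # common rank tolerance
rank = int((s > tol).sum())

print("singular values:", s)
print("numerical rank:", rank)                        # 2
print("condition of rank-r part:", s[0] / s[rank - 1])
null_basis = Vt[rank:].T  # orthonormal basis for the null space of A
print("null-space basis shape:", null_basis.shape)    # (4, 2)
```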

When should I use Cholesky decomposition instead of LU?

Use Cholesky whenever your matrix is symmetric positive definite, which covers covariance matrices, kernel matrices, and normal equations. Cholesky is 2x faster than LU, uses half the storage, needs no pivoting, and the factorization it produces is unique. Failure is informative too: if Cholesky breaks down, your matrix is not positive definite.
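
A sketch of that pattern with scipy.linalg; solve_spd is a hypothetical helper name, and the try/except doubles as the positive-definiteness test described above:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def solve_spd(A, b):
    """Solve Ax = b for symmetric positive definite A via Cholesky."""
    try:
        c, low = cho_factor(A)  # ~n^3/3 flops, half the cost of LU
    except np.linalg.LinAlgError:
        raise ValueError("matrix is not positive definite") from None
    return cho_solve((c, low), b)

rng = np.random.default_rng(3)
X = rng.standard_normal((100, 20))
K = X.T @ X + 1e-8 * np.eye(20)  # a Gram matrix: SPD by construction
print(solve_spd(K, rng.standard_normal(20))[:3])
```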

How does sparse matrix multiplication complexity differ from dense?

For sparse matrices with nnz non-zero elements, matrix-vector multiplication drops from O(n^2) to O(nnz), and matrix-matrix from O(n^3) to O(n * nnz). For a matrix with 1% density, this is roughly a 100x speedup. The key trade-off is element access overhead in formats like CSR/CSC and potential fill-in during factorization.