Matrix Multiplication Calculator
Multiply two matrices with a full step-by-step breakdown of every dot product. Understand dimension rules, see intermediate calculations, and learn the math behind neural networks and linear transformations.
How Matrix Multiplication Works
Matrix multiplication combines two matrices into a single result by computing dot products between the rows of the first matrix and the columns of the second. Given matrix A with dimensions m × n and matrix B with dimensions n × p, the product C = A × B has dimensions m × p. Each entry in the result is computed as:
C[i][j] = ∑k=1..n A[i][k] · B[k][j]
This formula tells us that element C[i][j] is the dot product of row i of matrix A and column j of matrix B. For example, to compute C[1][2], you take the first row of A, the second column of B, multiply corresponding entries, and sum the results. The requirement that A has n columns and B has n rows is non-negotiable — without matching inner dimensions, the dot products are undefined.
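The formula translates directly into code. Here is a minimal pure-Python sketch (the function name `matmul` is ours, not part of any library) that computes each entry as the dot product of a row of A and a column of B:

```python
def matmul(A, B):
    """Multiply A (m x n) by B (n x p) via the dot-product formula."""
    m, n = len(A), len(A[0])
    if len(B) != n:
        raise ValueError("inner dimensions must match: A has %d columns, B has %d rows" % (n, len(B)))
    p = len(B[0])
    # C[i][j] = sum over k of A[i][k] * B[k][j]
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(p)]
            for i in range(m)]
```

The `ValueError` check enforces the inner-dimension rule described above before any dot product is attempted.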
Step-by-Step Example: 2×3 × 3×2
Consider A = [[1, 2, 3], [4, 5, 6]] and B = [[7, 8], [9, 10], [11, 12]]. Matrix A is 2×3 and B is 3×2, so the result C will be 2×2.
C[1][1] = 1·7 + 2·9 + 3·11 = 7 + 18 + 33 = 58
C[1][2] = 1·8 + 2·10 + 3·12 = 8 + 20 + 36 = 64
C[2][1] = 4·7 + 5·9 + 6·11 = 28 + 45 + 66 = 139
C[2][2] = 4·8 + 5·10 + 6·12 = 32 + 50 + 72 = 154
The final result is C = [[58, 64], [139, 154]]. Each entry required exactly three multiplications and two additions (matching the inner dimension of 3). You can verify this on the ML3X calculator by entering these matrices and selecting the A × B operation.
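You can also verify the worked example above with NumPy (assuming NumPy is installed), where the `@` operator performs matrix multiplication:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
B = np.array([[7, 8],
              [9, 10],
              [11, 12]])
C = A @ B  # each entry is a dot product of a row of A and a column of B
print(C)   # [[ 58  64]
           #  [139 154]]
```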
Applications in Machine Learning
Matrix multiplication is the fundamental operation in modern machine learning. In a feedforward neural network, each layer performs the computation y = Wx + b, where W is the weight matrix, x is the input vector (or batch matrix), and b is the bias. This single matrix multiplication transforms the input from one feature space to another. For a layer mapping 784 inputs (such as a flattened 28×28 image) to 128 hidden neurons, W is a 128×784 matrix, and the multiplication produces a 128-dimensional output vector.
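The layer computation above can be sketched in NumPy. The random weights here are illustrative placeholders, not trained values:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((128, 784))  # weight matrix: 128 neurons, 784 inputs each
b = np.zeros(128)                    # bias vector, one entry per neuron
x = rng.standard_normal(784)         # a flattened 28x28 input image

y = W @ x + b  # (128, 784) @ (784,) -> (128,): one layer's forward pass
```

The shapes follow the m × n times n × p rule: a 128×784 matrix times a 784-dimensional vector yields a 128-dimensional output.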
In transformer architectures, the attention mechanism computes Q·Kᵀ (queries times transposed keys) to determine attention weights, then multiplies by V (values) to produce the output. These matrix products are the computational bottleneck of modern large language models. Understanding matrix multiplication dimensions is essential for debugging shape errors in PyTorch and TensorFlow.
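A minimal single-head sketch of scaled dot-product attention, written from the standard formula softmax(Q·Kᵀ/√d)·V (this is an illustration, not a production implementation — real transformers add masking, multiple heads, and learned projections):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # (seq_q, seq_k) similarity logits
    # numerically stable row-wise softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V             # weighted combination of value rows
```

Both matrix products here must satisfy the inner-dimension rule: Q·Kᵀ requires queries and keys to share a feature dimension d, and the weights matrix (seq_q × seq_k) must match V's row count.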
The computational complexity of naive matrix multiplication is O(n³) for square n×n matrices. Each of the n² entries requires a dot product of length n. Strassen's algorithm reduces this to approximately O(n^2.81) by cleverly reducing the number of recursive multiplications from 8 to 7. GPU hardware is specifically optimized for this operation, with tensor cores performing mixed-precision matrix multiplications at extraordinary throughput — a single NVIDIA H100 can execute over 1,000 trillion FP8 multiply-accumulate operations per second.
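The n³ operation count is easy to see by instrumenting the triple-loop algorithm. This sketch (our own helper, written to count scalar multiplications) confirms that an n×n product performs exactly n³ of them:

```python
def matmul_count(A, B):
    """Triple-loop multiply of square matrices that counts scalar multiplications."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    mults = 0
    for i in range(n):          # one pass per output row
        for j in range(n):      # one pass per output column
            for k in range(n):  # length-n dot product for entry C[i][j]
                C[i][j] += A[i][k] * B[k][j]
                mults += 1
    return C, mults  # mults == n**3 for n x n inputs
```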
Beyond neural networks, matrix multiplication appears in computer graphics (combining transformation matrices for rotation, scaling, and translation), control theory (state-space models), economics (input-output models), and graph theory (counting walks in adjacency matrices). The number of walks of length k between vertices i and j in a graph — routes that may revisit vertices — equals the (i, j) entry of the adjacency matrix raised to the power k, which requires repeated matrix multiplication.
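The graph-theory claim is quick to check on a small example. For the three-vertex path graph 0 — 1 — 2, squaring the adjacency matrix counts length-2 walks:

```python
import numpy as np

# adjacency matrix of the path graph 0 - 1 - 2
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

A2 = A @ A  # entry (i, j) counts length-2 walks from vertex i to vertex j
```

Entry (0, 2) of A² is 1, matching the single walk 0 → 1 → 2, and entry (1, 1) is 2, matching the walks 1 → 0 → 1 and 1 → 2 → 1.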
Frequently Asked Questions
What are the dimension requirements for matrix multiplication?
To multiply matrix A (m × n) by matrix B (p × q), the number of columns in A must equal the number of rows in B, meaning n must equal p. The resulting matrix C will have dimensions m × q. For example, a 3×2 matrix can multiply a 2×4 matrix, producing a 3×4 result.
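This dimension rule is exactly the check a shape-validation helper would perform. A small sketch (the function name and error message are ours):

```python
def matmul_shape(a_shape, b_shape):
    """Return the result shape of multiplying (m, n) by (p, q), or raise."""
    m, n = a_shape
    p, q = b_shape
    if n != p:
        raise ValueError(
            f"cannot multiply {a_shape} by {b_shape}: inner dimensions {n} != {p}"
        )
    return (m, q)
```

For example, `matmul_shape((3, 2), (2, 4))` returns `(3, 4)`, mirroring the 3×2 times 2×4 example above.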
Is matrix multiplication commutative?
No. Matrix multiplication is not commutative: A × B does not generally equal B × A. In fact, even when A × B is defined, B × A may not be defined at all if the dimensions are incompatible. When both products exist, they usually produce different results. This is a fundamental difference from scalar multiplication.
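A two-line NumPy demonstration makes non-commutativity concrete — here both products are defined (the matrices are square), yet they differ:

```python
import numpy as np

A = np.array([[1, 1],
              [0, 1]])
B = np.array([[1, 0],
              [1, 1]])

AB = A @ B  # [[2, 1], [1, 1]]
BA = B @ A  # [[1, 1], [1, 2]]
```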
What is the computational complexity of matrix multiplication?
The naive algorithm for multiplying two n × n matrices runs in O(n³) time because each of the n² entries in the result requires a dot product of length n. Strassen's algorithm reduces this to approximately O(n^2.81). The best known theoretical bound is around O(n^2.37), though these faster algorithms are rarely used in practice for small matrices.
How is matrix multiplication used in machine learning?
Matrix multiplication is the core operation in neural networks. Each layer computes y = Wx + b, where W is a weight matrix, x is the input vector, and b is a bias vector. During training, gradients are also computed via matrix multiplication through backpropagation. Attention mechanisms in transformers rely on matrix products of query, key, and value matrices.
Can I multiply non-square matrices?
Yes, as long as the inner dimensions match. A matrix with m rows and n columns can multiply a matrix with n rows and p columns, producing an m × p result. For example, multiplying a 2×3 matrix by a 3×5 matrix gives a 2×5 matrix. Only the inner dimensions (the 3 in this case) must be equal.
Related Tools
- Determinant Calculator — compute determinants with cofactor expansion
- Inverse Matrix Calculator — find the inverse using Gauss-Jordan elimination
- Matrix Transpose Calculator — swap rows and columns
Built by Michael Lip. Try the ML3X Matrix Calculator for interactive step-by-step solutions.