Why least squares is used

An overdetermined system has more equations than unknowns, so an exact solution usually does not exist. Least squares instead finds the vector $x$ that minimizes the sum of squared residuals, $\|Ax - b\|^2$.

The standard formula is $x = (A^TA)^{-1}A^Tb$, built from the normal equations.
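As a rough illustration of the formula, the sketch below forms $A^TA$ and $A^Tb$ for a system with exactly two unknowns and solves the resulting 2x2 system by Cramer's rule. The function name `solve_normal_equations_2` and the sample data are hypothetical, chosen only for this example. The two-unknown restriction keeps the code free of dependencies; in practice a library routine such as `numpy.linalg.lstsq` is the more usual choice.

```python
def solve_normal_equations_2(A, b):
    """Least-squares solution for a system with exactly two unknowns.

    A is a list of rows [a1, a2]; b is the list of right-hand sides.
    Forms A^T A and A^T b, then solves the 2x2 system by Cramer's rule.
    """
    # Entries of the symmetric 2x2 matrix A^T A.
    s11 = sum(row[0] * row[0] for row in A)
    s12 = sum(row[0] * row[1] for row in A)
    s22 = sum(row[1] * row[1] for row in A)
    # Entries of the vector A^T b.
    t1 = sum(row[0] * bi for row, bi in zip(A, b))
    t2 = sum(row[1] * bi for row, bi in zip(A, b))
    det = s11 * s22 - s12 * s12
    # Simplification: real code would compare against a small tolerance.
    if det == 0:
        raise ValueError("A^T A is singular; no unique least-squares solution")
    x1 = (t1 * s22 - s12 * t2) / det
    x2 = (s11 * t2 - s12 * t1) / det
    return [x1, x2]


# Hypothetical data: three equations, two unknowns (a line fit through
# the points (0, 1), (1, 2), (2, 2) with unknown intercept and slope).
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
b = [1.0, 2.0, 2.0]
x = solve_normal_equations_2(A, b)
print(x)  # intercept and slope of the best-fit line
```

For this data the intercept comes out to 7/6 and the slope to 1/2, up to floating-point rounding.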

How to read the residuals

Each residual is the difference between an observed right-hand side and the fitted value from the best-fit vector. Smaller bars mean the model tracks that equation more closely.
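A minimal sketch of that computation, assuming the convention residual = observed minus fitted, might look like the following. The helper name `residuals`, the sample system, and the value of `x` (the least-squares solution for this particular data) are all illustrative assumptions.

```python
def residuals(A, b, x):
    """Return b_i - (A x)_i for every equation in the system."""
    # Fitted value of each equation: the dot product of its row with x.
    fitted = [sum(a * xj for a, xj in zip(row, x)) for row in A]
    return [bi - fi for bi, fi in zip(b, fitted)]


# Hypothetical system and its least-squares solution.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
b = [1.0, 2.0, 2.0]
x = [7.0 / 6.0, 0.5]

r = residuals(A, b, x)
# The quantity that least squares minimizes.
sum_of_squares = sum(ri * ri for ri in r)
print(r, sum_of_squares)
```

Here the second equation has the largest residual in magnitude (1/3, against -1/6 for the other two), so it is the one the model tracks least closely.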

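If the system becomes singular, that is, if $A^TA$ cannot be inverted because the columns of $A$ are linearly dependent, the least-squares solution is no longer unique. Change the coefficients so the equations are not all describing the same direction.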
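One way to see the singular case, sketched here for two unknowns, is to compute the determinant of $A^TA$: it is zero when every row of coefficients is a multiple of the same vector. The function name `gram_determinant_2` and both sample matrices are hypothetical.

```python
def gram_determinant_2(A):
    """Determinant of A^T A for a matrix A with exactly two columns."""
    s11 = sum(row[0] * row[0] for row in A)
    s12 = sum(row[0] * row[1] for row in A)
    s22 = sum(row[1] * row[1] for row in A)
    return s11 * s22 - s12 * s12


# Every row is a multiple of [1, 2], so all equations describe one direction.
singular_A = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
# Changing a single coefficient breaks the dependence.
fixed_A = [[1.0, 2.0], [2.0, 4.0], [3.0, 5.0]]

det_singular = gram_determinant_2(singular_A)
det_fixed = gram_determinant_2(fixed_A)
print(det_singular, det_fixed)
```

With floating-point coefficients that are not small integers, the determinant rarely comes out as exactly zero, so comparing it against a small tolerance is the safer check.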