This might be a very stupid question, but I do not seem to understand why I would multiply a matrix by its transpose. I am not a mathematician, but I am very interested in understanding the practical usage of equations:

Imagine I have the sales of three products, Apple, Orange and Pear, for the last 4 days in a matrix called A: $$ A= \begin{bmatrix} \text{Apple} & \text{Orange} & \text{Pear} \\ 10 & 2 & 5 \\ 5 & 3 & 10 \\ 4 & 3 & 2 \\ 5 & 10 & 5 \\ \end{bmatrix}$$

What will $AA^{\rm T}$ tell me?

I have seen this long answer link: Is a matrix multiplied with its transpose something special?, but I did not get it at all.

I see that a lot of equations use the product $AA^{\rm T}$ and I really hope that someone will give a very simple answer.

Boro Dega
  • The point discussed in the accepted answer of your link is that $AA^{T}$ is a symmetric matrix, i.e. it is a matrix $B$ such that $B^{T}=B$, and this kind of matrix has some very nice properties (e.g. they are what are called self-adjoint matrices in inner product spaces). In your particular example, it doesn't really tell you much. – user160738 Jun 17 '17 at 02:21
  • It doesn't have any particular meaning without context. – copper.hat Jun 17 '17 at 03:00
  • Not an answer but orthogonal matrices may be of interest. – Karl Jun 17 '17 at 07:56
  • Gilbert Strang discusses the pattern $A^T A$ (and $A^T C A$) a lot in his books, such as Introduction to Applied Math. – littleO May 30 '19 at 08:04

5 Answers


There are great answers by fellow members. I would like to visualize just this particular problem. Let's say there are $4$ companies, Company 1, Company 2, Company 3 and Company 4, and all of them sell three fruits: Apples, Oranges and Pears. Because the numbers are small, I will assume that they are the daily sales, in units sold, of each company.

Create the table for daily sales: $$\begin{bmatrix} &\text {Apples} & \text{Oranges}&\text{Pears} \\\text{Company 1}&10&2&5\\\text{Company 2} &5&3&10\\\text{Company 3} &4&3&2\\\text{Company 4} &5&10&5\\\end{bmatrix}$$

Just ignore the words and look at the numbers. The first row and first column are just for understanding. The numerical values of the table represent your matrix $A$. This table tells you the daily sales of each company for apples, oranges and pears.

$$A=\begin{bmatrix}10 & 2&5 \\5 & 3&10 \\4 & 3&2\\5 & 10&5 \end{bmatrix}$$

If we write the table the other way round, so that each row shows the sales of one particular fruit across all the companies, we get $$\begin{bmatrix} &\text {Company 1} & \text{Company 2}&\text{Company 3}&\text{Company 4} \\\text{Apples}&10&5&4&5\\\text{Oranges} &2&3&3&10\\\text{Pears} &5&10&2&5\\\end{bmatrix}$$

This can be written as: $$A^T=\begin{bmatrix}10 & 5&4&5 \\2 &3& 3&10 \\5 & 10&2&5\\ \end{bmatrix}$$

Now we put both tables side by side, $$\begin{bmatrix} &\text {Apples} & \text{Oranges}&\text{Pears} \\\text{Company 1}&10&2&5\\\text{Company 2} &5&3&10\\\text{Company 3} &4&3&2\\\text{Company 4} &5&10&5\\\end{bmatrix}\begin{bmatrix} &\text {Company 1} & \text{Company 2}&\text{Company 3}&\text{Company 4} \\\text{Apples}&10&5&4&5\\\text{Oranges} &2&3&3&10\\\text{Pears} &5&10&2&5\\\end{bmatrix}$$

If there happens to be a partnership between two companies, say Company 1 and Company 2, then what will be the total fruit sales? $$\text{Total fruit sales for the partnership} = \text{No. of total apples + No. of total oranges + No. of total pears}$$

Total fruit sales for the partnership = Company 1 Apples × Company 2 Apples + Company 1 Oranges × Company 2 Oranges + Company 1 Pears × Company 2 Pears $$\text{Total fruit sales for the partnership} = 10\times 5 + 2\times 3 + 5\times 10=106$$ So the total sales of fruits for the partnership of Company 1 and Company 2 is $106$. This is nothing but the entry in the first row and second column of the product $AA^T$.

$$AA^T=\begin{bmatrix}10 & 2&5 \\5 & 3&10 \\4 & 3&2\\5 & 10&5 \end{bmatrix}\begin{bmatrix}10 & 5&4&5 \\2 &3& 3&10 \\5 & 10&2&5\\ \end{bmatrix}=\begin{bmatrix}129&106&56&95 \\106&134&49&105 \\56&49&29&60\\ 95&105&60&150\end{bmatrix}$$
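One way to check a product like this is to compute every entry as the dot product of two rows of $A$. The following is a minimal sketch in plain Python; the helper names `dot` and `gram` are made up for illustration and are not part of the answer.

```python
# Daily sales table: one row per company, columns = apples, oranges, pears.
A = [
    [10, 2, 5],
    [5, 3, 10],
    [4, 3, 2],
    [5, 10, 5],
]

def dot(u, v):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(u, v))

def gram(rows):
    """Return rows * rows^T: entry (i, j) is the dot product of row i and row j."""
    return [[dot(r_i, r_j) for r_j in rows] for r_i in rows]

AAT = gram(A)
for row in AAT:
    print(row)
```

Because entry $(i, j)$ only ever involves rows $i$ and $j$ of $A$, a single entry can be recomputed by hand with `dot(A[i], A[j])` when a value looks suspicious.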

What does this product show? This product can be visualized as the total sales chart of each company as well as the total sales of mutual partnerships of companies. $$\begin{bmatrix} &\text {Company 1} & \text{Company 2}&\text{Company 3}&\text{Company 4} \\\text{Company 1}&129&106&56&95\\\text{Company 2} &106&134&49&105\\\text{Company 3} &56&49&29&60\\\text{Company 4} &95&105&60&150\\\end{bmatrix}$$

Crucial points to observe:

  1. The diagonal elements of the matrix $AA^T$ are the sums of squares of each company's individual sales. For example, the first diagonal element is the strength of sales of Company 1, and so on.

  2. Each non-diagonal element shows the total sales that would result from the partnership between two companies. For example, the element in the first row and second column of $AA^T$ is the total sales produced by the partnership between Company 1 and Company 2.

  3. The matrix $AA^T$ is symmetric, which can be visualized using the fact that the total sales due to the partnership of Company 1 and Company 2 is the same as that of Company 2 and Company 1.

  4. A useful insight from $AA^T$ is that you can check the diagonal elements: whichever is the largest tells you which company is strongest in sales. Another useful insight is that you can check whether a partnership with a particular company is beneficial or not. For example, Company 3 has the lowest individual sales, so it is beneficial for Company 3 to form a partnership with Company 4, because the total sales would be 60, which is more than double what Company 3 has on its own. So we can check which partnerships would be most beneficial.

  5. Diagonal elements: (A measure of) Individual strengths, Non Diagonal Elements: Partnership strengths.
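The observations in this list can be checked numerically. The sketch below, in plain Python, recomputes $AA^T$ from the sales table and tests points 1, 3 and 4; the variable names are illustrative only.

```python
A = [[10, 2, 5], [5, 3, 10], [4, 3, 2], [5, 10, 5]]
n = len(A)

# Entry (i, j) of A A^T is the dot product of row i and row j of A.
AAT = [[sum(x * y for x, y in zip(A[i], A[j])) for j in range(n)] for i in range(n)]

# Point 1: each diagonal entry is the sum of squares of one row of A.
diagonal = [AAT[i][i] for i in range(n)]
sums_of_squares = [sum(x * x for x in row) for row in A]

# Point 3: the product equals its own transpose.
is_symmetric = all(AAT[i][j] == AAT[j][i] for i in range(n) for j in range(n))

# Point 4: index of the largest diagonal entry (0-based, so 3 means Company 4).
strongest = max(range(n), key=lambda i: AAT[i][i])

print(diagonal, sums_of_squares, is_symmetric, strongest)
```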

Hope this helps...


In your case, $AA^T$ just sitting on a park bench doesn't tell you anything of great interest.

Hyprfrcb's answer talks about units. One of the elements has units of squared apples. Another has units of pear-oranges. Another has units of orange-pears! This by itself should be a red flag that values don't mean anything by themselves.

It can be a means to get to a least-squares solution if you were looking to model, say, how much of each fruit you'd expect to sell on a particular day. (This was one of the answers in the linked question.)

But by itself? It's just a fruit hybridization experiment gone terribly wrong.


Let's consider the matrix $A$ containing the values $a_{ij}$ of some variables $j=1\dots m$ at different times $i=1\dots n$, as in the OP example, but transposed.

If the variables are centered (each has had its mean subtracted), the matrix $\frac 1n A^TA$ is the estimator of the covariances $s_{j_1j_2}=\mathbb{E}(a_{\cdot j_1}a_{\cdot j_2}) \approx \frac 1n \sum_{i=1}^{n} a_{ij_1}a_{ij_2}$ for the set of random variables $a_{\cdot j}$, $j=1\dots m$.
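As a rough illustration of this estimator, the sketch below centers the columns of the fruit table from the question and forms the scaled product. It assumes that rows are observations and columns are variables, and it divides by the number of rows; both choices are assumptions of this sketch, and all names are illustrative.

```python
# Rows = observations (days), columns = variables (apples, oranges, pears).
A = [[10, 2, 5], [5, 3, 10], [4, 3, 2], [5, 10, 5]]
n = len(A)       # number of observations
m = len(A[0])    # number of variables

# Center each column by subtracting its mean.
means = [sum(row[j] for row in A) / n for j in range(m)]
C = [[row[j] - means[j] for j in range(m)] for row in A]

# cov[j1][j2] = (1/n) * sum_i C[i][j1] * C[i][j2], i.e. (1/n) * C^T C.
cov = [
    [sum(C[i][j1] * C[i][j2] for i in range(n)) / n for j2 in range(m)]
    for j1 in range(m)
]

for row in cov:
    print(row)
```

After centering, the entries of `C` can be negative or fractional, which is why the centered matrix no longer reads as a count of fruit.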

If the entries $a_{ij}$ of $A$ have units of $[a]$, then the entries of $AA^T$ will have units of $[a^2]$. This is consistent with the abovementioned.

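When solving the problem $Ax=B$ in the least-squares (LS) sense, the solution $x=(A^TA)^{-1}A^TB$ is the best estimator, provided that $A^TA$ (the covariance as defined above, up to scaling) is invertible, which requires the variables to vary enough that the columns of $A$ are linearly independent.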

  • Thanks, but this is not a dummy explanation. Your answer may perfectly make sense if I was a mathematician, which I am not. How would that relate to the apple, orange example? – Boro Dega Jun 17 '17 at 09:50
  • Unfortunately this is the only association I could come up with for the requested construction. And no, this does not relate to the example, in the sense that you cannot have -2.5 apples per day. Or can we? – Brethlosze Jun 17 '17 at 13:28

It can be a part of a bigger question (context needed, for example, linear regression as mentioned by lhf in the linked post).

Assume that the vector $Q$ of apple sales quantities and the vector $R$ of revenues on four days are: $$Q=\begin{pmatrix}10\\ 5\\ 4 \\ 8\end{pmatrix}; \ \ R=\begin{pmatrix}20\\ 10\\ 8\\ 16\end{pmatrix}$$ And now we want to find the linear revenue function $R=aQ+b$. Obviously, it is $R=2Q$: $$R=\begin{pmatrix}R_1\\ R_2\\ R_3\\ R_4\end{pmatrix}=2\begin{pmatrix}10\\ 5\\ 4\\ 8\end{pmatrix}=\begin{pmatrix}20\\ 10\\ 8\\ 16\end{pmatrix}$$

Keeping in mind that real data do not always fit a line perfectly, and for demonstration purposes, assume that we want to find the linear revenue function $y=b_0+b_1x$ using linear regression, where $y=R,x=Q$ and $b_0,b_1$ are the parameters to be found. Then the linear function can be written in matrix form as: $$Y=\begin{pmatrix}y_1\\ y_2\\ y_3\\ y_4\end{pmatrix}=\begin{pmatrix}1&x_1\\ 1&x_2\\ 1&x_3\\ 1&x_4\end{pmatrix}\begin{pmatrix}b_0\\ b_1\end{pmatrix}=Xb$$ Now we need to solve the matrix equation: $$Y=Xb \Rightarrow \\ X^TY=X^TXb \Rightarrow \\ b=(X^TX)^{-1}X^TY=\\ \left[\begin{pmatrix}1&1&1&1\\x_1&x_2&x_3&x_4\end{pmatrix}\begin{pmatrix}1&x_1\\ 1&x_2\\ 1&x_3\\ 1&x_4\end{pmatrix}\right]^{-1}\begin{pmatrix}1&1&1&1\\x_1&x_2&x_3&x_4\end{pmatrix}\begin{pmatrix}y_1\\ y_2\\ y_3\\ y_4\end{pmatrix}=\\ \left[\begin{pmatrix}1&1&1&1\\10&5&4&8\end{pmatrix}\begin{pmatrix}1&10\\ 1&5\\ 1&4\\ 1&8\end{pmatrix}\right]^{-1}\begin{pmatrix}1&1&1&1\\10&5&4&8\end{pmatrix}\begin{pmatrix}20\\ 10\\ 8\\ 16\end{pmatrix}=\\ \left[\begin{pmatrix}4&27\\27&205\end{pmatrix}\right]^{-1}\begin{pmatrix}54\\410\end{pmatrix}=\\ \frac{1}{91}\begin{pmatrix}205&-27\\ -27&4\end{pmatrix}\begin{pmatrix}54\\410\end{pmatrix}=\begin{pmatrix}0\\2\end{pmatrix}=\begin{pmatrix}b_0\\b_1\end{pmatrix}$$ as expected: $y=0+2x \iff R=2Q$.
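The same calculation can be reproduced in a few lines of plain Python. This is only a sketch of the $2\times 2$ normal equations for this particular data set, using exact fractions so that no rounding appears; the variable names (`sx`, `sxx`, and so on) are illustrative.

```python
from fractions import Fraction

Q = [10, 5, 4, 8]     # x values: apple quantities
R = [20, 10, 8, 16]   # y values: revenues

# Entries of X^T X and X^T Y, where X has a column of ones and a column of x values.
n = len(Q)
sx = sum(Q)
sxx = sum(x * x for x in Q)
sy = sum(R)
sxy = sum(x * y for x, y in zip(Q, R))

# Invert the 2x2 matrix [[n, sx], [sx, sxx]] by the adjugate formula.
det = n * sxx - sx * sx
b0 = Fraction(sxx * sy - sx * sxy, det)   # intercept
b1 = Fraction(n * sxy - sx * sy, det)     # slope

print(det, b0, b1)
```

For larger problems one would normally rely on a linear algebra library rather than inverting $X^TX$ by hand, but the small case makes the role of $X^TX$ visible.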


I've been trying to figure this out myself recently, and I think I understand it.

John is right: in your example, it doesn't make sense. But I'll attempt to explain it with a simpler example than what Brethlosze said... (bear with me and please tell me where I am wrong).

Let's say you are measuring something like acceleration. But your measurement of acceleration always has an error. Each time you measure acceleration $a$, you have an error of $\pm 0.1a$.

This error propagates as you continue your calculations. If you want to calculate velocity from that acceleration, then that acceleration error is going to propagate into the velocity calculations. If you continue on to calculate distance, then that acceleration error is also going to propagate into the distance calculation.
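To make the propagation concrete, here is a small sketch in plain Python. It assumes constant acceleration starting from rest and an exactly known time, which are simplifications added for illustration and not part of the description above; the function name `propagate_error` is hypothetical.

```python
def propagate_error(a, t, rel_err=0.1):
    """Carry a relative acceleration error through to velocity and distance.

    Assumes constant acceleration a from rest for time t, with t known exactly.
    """
    a_err = rel_err * abs(a)        # +/- 0.1 * a
    v = a * t                       # velocity
    d = 0.5 * a * t ** 2            # distance
    v_err = a_err * t               # the acceleration error scaled by t
    d_err = 0.5 * a_err * t ** 2    # the acceleration error scaled by t^2 / 2
    return {"v": v, "v_err": v_err, "d": d, "d_err": d_err}

result = propagate_error(a=2.0, t=3.0)
print(result)
```

Under these assumptions both derived quantities inherit the same 10% relative error, because each is just the acceleration multiplied by a known factor.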

Now this is a very simple example, but it shows the relationship between this measurement (and its error) and all the calculated values (state variables). Now imagine a more complicated scenario where you need more than one measurement to get the state variable that you are looking for. You will have an equation with two variables to show the relationship between the measurements and the state variable. What if you are looking for four state variables, each composed of two other state variables, which in turn take two, three, four or however many measurements to calculate? Now you can use a matrix to show the relationships between all these measurements and state variables.

So now, if we transpose the matrix and multiply it by the original matrix, look at how those equations in the matrix are being multiplied with all the other variables (and themselves). Try the math of a simple 2x2 times the transpose of the 2x2. This is the covariance. Wikipedia: "In probability theory and statistics, covariance is a measure of the joint variability of two random variables." To me, it's like the covariance matrix mixes all the variables and measurements in every way possible to show how all of them vary against and with each other.
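For reference, the 2x2 case worked out with generic entries $a, b, c, d$ (symbols chosen here only for illustration) looks like this:

$$A=\begin{pmatrix}a&b\\c&d\end{pmatrix},\qquad A^TA=\begin{pmatrix}a&c\\b&d\end{pmatrix}\begin{pmatrix}a&b\\c&d\end{pmatrix}=\begin{pmatrix}a^2+c^2&ab+cd\\ab+cd&b^2+d^2\end{pmatrix}$$

Each diagonal entry pairs a column with itself, and each off-diagonal entry pairs the two different columns, so every column meets every other column exactly once. As noted in an earlier answer, reading this as a covariance also requires the columns to be centered and the product to be divided by the number of rows.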

This is very important in inertial measurement units (IMUs). There are many error states (sometimes more than a dozen) in an IMU that are needed to calculate the state variables (position, velocity, body orientation, etc.). These error states and state variables are defined in a matrix. When you find the covariance of the matrix, you can define the joint variability between all these error states and state variables. This is important in order to know how these errors and measurements can affect the accuracy of your position, velocity, and body orientation. By knowing your covariance, you pretty much know how accurate your IMU is at determining your position, velocity, orientation, etc.