
I'm looking for an easily understandable interpretation of the transpose of a square matrix $A$: an intuitive visual demonstration of how $A^{T}$ relates to $A$. I want to be able to instantly visualize in my mind what I'm doing to the space when transposing the vectors of a matrix.

From experience, understanding linear algebra concepts in two dimensions is often enough to understand concepts in any higher dimension, so an explanation for two-dimensional spaces should be enough, I think.

All explanations I have found so far are not intuitive enough, as I want to be able to instantly imagine (and draw) what $A^{T}$ looks like given $A$. I'm not a mathematician, by the way.

Here is what I have found so far (but it is not intuitive enough for me):

  1. $(Ax)\cdot y=(Ax)^{T}y=x^{T}A^{T}y=x\cdot A^{T}y$

As far as I understand, the dot product is a projection (x onto y, or y onto x; both interpretations give the same result) followed by a scaling by the length of the other vector.

This would mean that mapping x into the space of $A$ and projecting y onto the result is the same as mapping y into the space of $A^{T}$ and then projecting the unmapped x onto $A^{T}y$.

So $A^{T}$ is the specific matrix $B$ such that $Ax\cdot y=x\cdot By$ for any pair of vectors $(x,y)$.
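If it helps, this identity is easy to check numerically. Here is a minimal NumPy sketch (the matrix and the vectors are arbitrary random examples, nothing special):

```python
# Check that (A x) . y equals x . (A^T y) for random 2D data.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(2, 2))
x = rng.normal(size=2)
y = rng.normal(size=2)

lhs = np.dot(A @ x, y)       # map x with A, then project onto y
rhs = np.dot(x, A.T @ y)     # map y with A^T, then project onto x
print(np.isclose(lhs, rhs))  # True
```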

This doesn't tell me instantly what $A^{T}$, drawn as vectors, would look like based on $A$ drawn as vectors.

  1. "reassigning dimensions"

This one is hard to explain so let me do this with a drawing:

[drawing: parallel projections]

This explanation is much more visual, but far too messy to do in my head instantly. There are also multiple ways I could have rotated and arranged the vectors around the result $A^{T}$, which is represented in the middle. Also, it doesn't feel like it makes me truly understand the transposing of matrices, especially in higher dimensions.

  3. some kind of weird rotation

Symmetric matrices can be decomposed into a rotation, a scaling $\Lambda$ along the eigenvectors, and a rotation back:

$A=R\Lambda R^{T}$

So in this specific case, the transpose is a rotation in the opposite direction of the original. I don't know how to generalize that to arbitrary matrices. I'm wildly guessing that if $A$ is not symmetric anymore, $R^{T}$ must also include some additional operations besides rotation.
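For the symmetric case, the decomposition itself is easy to verify numerically. A small sketch (the symmetric matrix below is an arbitrary example; `numpy.linalg.eigh` supplies the orthogonal $R$ and the eigenvalues for $\Lambda$):

```python
# For a symmetric A, eigh gives A = R Lam R^T with R orthogonal,
# so transposing only swaps the two rotations (here A^T = A anyway).
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])               # arbitrary symmetric example
w, R = np.linalg.eigh(A)                 # eigenvalues w, orthogonal eigenvector matrix R
Lam = np.diag(w)

print(np.allclose(A, R @ Lam @ R.T))     # A = R Lam R^T
print(np.allclose(R.T @ R, np.eye(2)))   # R^T is the inverse of R (rotation or reflection)
```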

Can anyone help me find a way to easily and instantly imagine/draw what $A^{T}$ looks like given $A$ in two-dimensional space? (In a way of understanding that generalizes to higher dimensions.)

Edit 1: While working on the problem I was curious to see what B in

$BA=A^{T}$

looks like. B would describe what needs to be done to A in order to geometrically transpose it. My temporary result looks interesting but I'm still trying to bring it to an interpretable form. If we assume the following indexing order

$$A= \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ \end{bmatrix} $$

and $det(A)\neq0$ then

$$B=\frac{1}{det(A)} \begin{bmatrix} a_{11} a_{22} - a_{21}^2 & a_{11} (a_{21} - a_{12}) \\ a_{22} (a_{12} - a_{21}) & a_{11} a_{22} - a_{12}^2 \\ \end{bmatrix} $$

What's visible at first sight is that the factor $\frac{1}{det(A)}$ rescales the result so that the area scale factor becomes exactly 1 (before applying the actual matrix).

B must also preserve the area as $det(A^{T})=det(A)$. It means that the matrix

$B'=\begin{bmatrix} a_{11} a_{22} - a_{21}^2 & a_{11} (a_{21} - a_{12}) \\ a_{22} (a_{12} - a_{21}) & a_{11} a_{22} - a_{12}^2 \\ \end{bmatrix}$

scales areas by a factor of $det(A)^{2}$ while transposing.
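These determinant claims, and the formula for $B$ itself, can be checked numerically. A minimal sketch (the invertible matrix $A$ is an arbitrary example):

```python
# Verify B A = A^T, det(B) = 1, and det(B') = det(A)^2 for an example matrix.
import numpy as np

a11, a12, a21, a22 = 2.0, 1.0, -1.0, 3.0
A = np.array([[a11, a12],
              [a21, a22]])
detA = np.linalg.det(A)

Bp = np.array([[a11*a22 - a21**2, a11*(a21 - a12)],
               [a22*(a12 - a21), a11*a22 - a12**2]])
B = Bp / detA

print(np.allclose(B @ A, A.T))                  # B A = A^T
print(np.isclose(np.linalg.det(B), 1.0))        # B preserves area
print(np.isclose(np.linalg.det(Bp), detA**2))   # B' scales area by det(A)^2
```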

Edit 2:

The same matrix can be written as

$B'=\begin{bmatrix} \begin{bmatrix} a_{11} & a_{21} \\ \end{bmatrix} \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & \begin{bmatrix} a_{11} & a_{21} \\ \end{bmatrix} \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ \begin{bmatrix} a_{12} & a_{22} \\ \end{bmatrix} \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & \begin{bmatrix} a_{12} & a_{22} \\ \end{bmatrix} \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ \end{bmatrix}$

Which is

$B'=\begin{bmatrix} a_{1}^{T} \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & a_{1}^{T} \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ a_{2}^{T} \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & a_{2}^{T} \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ \end{bmatrix}= \begin{bmatrix} a_{1}\cdot \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & a_{1}\cdot \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ a_{2}\cdot \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & a_{2}\cdot \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ \end{bmatrix}$

Here $a_{1}=\begin{bmatrix} a_{11} \\ a_{21} \\ \end{bmatrix}$ and $a_{2}=\begin{bmatrix} a_{12} \\ a_{22} \\ \end{bmatrix}$ denote the columns of $A$, so $a_{1}^{T}$ and $a_{2}^{T}$ are the rows of $A^{T}$.

I find the vectors $c_{1}=\begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix}$ and $c_{2}=\begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix}$ interesting. When I draw them, it looks like I only need to rotate each by 90 degrees, in different directions, to end up with the columns of the transpose.
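A quick numerical check of this observation (same kind of arbitrary example matrix as above; the check confirms that rotating $c_{1}$ counter-clockwise and $c_{2}$ clockwise by 90 degrees produces the two columns of $A^{T}$):

```python
# Rotating c1 by +90 degrees and c2 by -90 degrees gives the columns of A^T.
import numpy as np

a11, a12, a21, a22 = 2.0, 1.0, -1.0, 3.0
A = np.array([[a11, a12], [a21, a22]])

c1 = np.array([a22, -a21])
c2 = np.array([-a12, a11])

rot_ccw = np.array([[0.0, -1.0], [1.0, 0.0]])   # +90 degrees
rot_cw = rot_ccw.T                              # -90 degrees

print(np.allclose(rot_ccw @ c1, A.T[:, 1]))     # second column of A^T
print(np.allclose(rot_cw @ c2, A.T[:, 0]))      # first column of A^T
```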

Edit 3:

Maybe I'm fooling myself, but I think I'm getting closer. The matrix with columns $c_{1}$ and $c_{2}$,

$C= \begin{bmatrix} c_{1} & c_{2} \\ \end{bmatrix} = \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \\ \end{bmatrix}$

is related to $A^{-1}$ because:

$AC=\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ \end{bmatrix} \cdot \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \\ \end{bmatrix} = \begin{bmatrix} det(A) & 0 \\ 0 & det(A) \\ \end{bmatrix} =det(A) I$

So

$C=A^{-1}det(A)$
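In other words, $C$ is the adjugate of $A$. A quick numerical confirmation (arbitrary invertible example):

```python
# Check that A C = det(A) I and therefore C = det(A) A^{-1}.
import numpy as np

A = np.array([[2.0, 1.0], [-1.0, 3.0]])
a11, a12 = A[0]
a21, a22 = A[1]
C = np.array([[a22, -a12], [-a21, a11]])
detA = np.linalg.det(A)

print(np.allclose(A @ C, detA * np.eye(2)))      # A C = det(A) I
print(np.allclose(C, detA * np.linalg.inv(A)))   # C = det(A) A^{-1}
```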

$B'$ can also be written column-wise, like this:

$B'=\begin{bmatrix} A^{T}c_{1} & A^{T}c_{2} \\ \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \\ \end{bmatrix} \begin{bmatrix} a_{22} \\ -a_{21} \\ \end{bmatrix} & \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \\ \end{bmatrix} \begin{bmatrix} -a_{12} \\ a_{11} \\ \end{bmatrix} \\ \end{bmatrix}$

or row-wise, like this:

$B'=\begin{bmatrix} \begin{bmatrix} a_{11} & a_{21} \\ \end{bmatrix} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \\ \end{bmatrix} \\ \begin{bmatrix} a_{12} & a_{22} \\ \end{bmatrix} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \\ \end{bmatrix} \\ \end{bmatrix} = \begin{bmatrix} a_1^{T} \begin{bmatrix} c_{1} & c_{2} \\ \end{bmatrix} \\ a_2^{T} \begin{bmatrix} c_{1} & c_{2} \\ \end{bmatrix} \\ \end{bmatrix} = \begin{bmatrix} a_1^{T}C \\ a_2^{T}C \\ \end{bmatrix} = det(A) \begin{bmatrix} a_1^{T}A^{-1} \\ a_2^{T}A^{-1} \\ \end{bmatrix}$

Therefore for $BA=A^{T}$ we have

$B=\begin{bmatrix} a_1^{T}A^{-1} \\ a_2^{T}A^{-1} \\ \end{bmatrix}$
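A short numerical check of this row-wise formula (arbitrary invertible example; $a_{1}$ and $a_{2}$ are the columns of $A$, as above):

```python
# The i-th row of B is (i-th column of A)^T times A^{-1}, and B A = A^T.
import numpy as np

A = np.array([[2.0, 1.0], [-1.0, 3.0]])
Ainv = np.linalg.inv(A)

a1, a2 = A[:, 0], A[:, 1]                  # columns of A
B = np.vstack([a1 @ Ainv, a2 @ Ainv])      # rows a1^T A^{-1} and a2^T A^{-1}

print(np.allclose(B @ A, A.T))             # B A = A^T
print(np.allclose(B, A.T @ Ainv))          # equivalently, B = A^T A^{-1}
```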

Edit 4:

I think I will post my own answer soon. Going down the path of $A^{-1}$, I had the idea that one can exploit the symmetry of $AA^{T}$. Symmetry means that $AA^{T}$ decomposes more nicely:

$AA^{T} = R_{AA^{T}} \Lambda_{AA^{T}} (R^{-1})_{AA^{T}}$

Now if you left-multiply both sides by $A^{-1}$, you get

$A^{T} = A^{-1} R_{AA^{T}} \Lambda_{AA^{T}} (R^{-1})_{AA^{T}}$
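This step is easy to verify numerically before going further. A small sketch (arbitrary invertible example; `numpy.linalg.eigh` applied to the symmetric matrix $AA^{T}$ supplies $R_{AA^{T}}$ and $\Lambda_{AA^{T}}$, and $R^{-1}=R^{T}$ since $R$ is orthogonal):

```python
# Diagonalize the symmetric matrix A A^T and check A^T = A^{-1} R Lam R^{-1}.
import numpy as np

A = np.array([[2.0, 1.0], [-1.0, 3.0]])
w, R = np.linalg.eigh(A @ A.T)             # A A^T is symmetric, R is orthogonal
Lam = np.diag(w)

print(np.allclose(A @ A.T, R @ Lam @ R.T))                  # A A^T = R Lam R^{-1}
print(np.allclose(A.T, np.linalg.inv(A) @ R @ Lam @ R.T))   # A^T = A^{-1} R Lam R^{-1}
```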

When I work through an example with numbers, I can also see that in my example $R_{AA^{T}} = (R^{-1})_{AA^{T}}$.

$R_{AA^{T}}$ mirrors the space along the y-axis and then rotates by some angle $\alpha$. So my suspicion right now is:

$A^{T}=A^{-1} R_{AA^{T}} \Lambda_{AA^{T}} R_{AA^{T}}$

Now if I define

$R_{AA^{T}}^{'} = \begin{bmatrix} \cos \alpha & -\sin \alpha \\ \sin \alpha & \cos \alpha \\ \end{bmatrix}$

to get the mirroring out of the matrix $R_{AA^{T}}$ then I get

$A^{T}=A^{-1} R_{AA^{T}}^{'} \begin{bmatrix} -1 & 0 \\ 0 & 1 \\ \end{bmatrix} \Lambda_{AA^{T}} R_{AA^{T}}^{'} \begin{bmatrix} -1 & 0 \\ 0 & 1 \\ \end{bmatrix} $

So generally

$A^{T}=A^{-1} R_{\alpha} M_y \Lambda R_{\alpha} M_y$

with $M_y$ being the mirroring along the y-axis, $R_{\alpha}$ some counter-clockwise rotation by $\alpha$, and $\Lambda$ some scaling.

evolution
  • The most natural thing that I can think of has to do with dualizing objects, but that doesn't make much geometric sense. For your third idea, that might only make sense for orthogonal matrices. – Michael Burr Mar 19 '17 at 01:36
  • Instead of looking at the actions of the matrix and its transpose directly, the relationships of the fundamental subspaces might give you the geometrical insights you’re looking for: the column space of the transpose is the orthogonal complement of the null space and the null space of the transpose is the orthogonal complement of the column space. Recall, too, that the column space of a matrix is the image of the linear map that it represents. – amd Mar 19 '17 at 02:30
  • https://www.youtube.com/watch?v=XkY2DOUCWMU&list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab&index=5 – R.W Mar 19 '17 at 02:30
  • well, matrices themselves are not especially intuitive after all... – Masacroso Mar 19 '17 at 02:36
  • @amd: Your comment sounds insightful yet is still hard to understand. I need some time to digest what it means, so I'm not sure how to react yet. As for the videos you linked to, I already watched all of them weeks ago and they were pretty good, but none of the videos from that playlist addressed the transpose. – evolution Mar 19 '17 at 02:49
  • Following up on @amd's comment, Gilbert Strang does a good job of explaining and emphasizing this "four subspace" picture. – littleO Mar 19 '17 at 02:51
  • You could check out Strang's linear algebra textbooks such as Introduction to Linear Algebra. Also Strang has video lectures for his linear algebra class online. I don't have a link at the moment but you'll find them easily. – littleO Mar 19 '17 at 02:56
  • Found the lecture, will watch it and also contemplate amd's answer tomorrow morning (it's quite late in my country right now) https://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/video-lectures/lecture-10-the-four-fundamental-subspaces/ Till then, looking forward to new developments here :) – evolution Mar 19 '17 at 03:03
  • Here is one viewpoint to be aware of, although it's probably not what you're looking for. Matrices exist as a concise way to describe linear transformations, and usually it's best to think in terms of linear transformations rather than matrices. If $U$ and $V$ are finite dimensional vector spaces and $T:U\to V$ is a linear transformation, then it is somehow "natural" to define a linear transformation $T^*$ from the dual space $V^*$ to the dual space $U^*$ as follows: $T^*(z)(u) = z(T(u))$. The matrix representation of $T$ (using the dual basis) is the transpose of the matrix of $T$. – littleO Mar 19 '17 at 03:09
  • You want to "instantly visualize" something quite abstract... Have you considered the possibility that that is maybe a bit too much to ask for? – Mariano Suárez-Álvarez Mar 19 '17 at 08:59
  • Just because it's hard to figure out doesn't mean it's not worth knowing. – evolution Mar 20 '17 at 02:55
  • Unless the matrix has some particular purpose, it does not have to mean anything but a book-keeping operation switching places of whatever is stored in it: a permutation that happens to be its own inverse. But together with the way matrix multiplication is defined it does get some properties, and levap covers that quite nicely. – mathreadler Mar 21 '17 at 20:19
  • Possible duplicate of [What is the geometric interpretation of the transpose?](http://math.stackexchange.com/questions/37398/what-is-the-geometric-interpretation-of-the-transpose) – rych Mar 22 '17 at 06:11
  • Levap's answer is the best of what I had already found before asking here. I'm looking for something even more intuitive than that (not that I'm not thankful for his participation). I don't think it's a duplicate as rych suggests, and I still hope that someone (or I) will come up with something more creative and more intuitive. The vectors that I see in B from $BA=A^{T}$ have interesting properties; I also still want to dive deeper into the behavior of eigenvectors and null spaces. – evolution Mar 22 '17 at 22:49

2 Answers


One geometric description of $A^T$ can be obtained from the singular value decomposition (SVD); this will be similar to your third point. Any square matrix $A \in M_n(\mathbb{R})$ can be written as a product $A = S \Lambda R^T$ where $\Lambda$ is diagonal with non-negative entries and both $S,R$ are orthogonal matrices. The diagonal entries of $\Lambda$ are called the singular values of $A$, while the columns of $S$ and $R$ are called the left and right singular vectors of $A$ respectively, and they can be computed explicitly (or at least as explicitly as one can compute eigenvalues and eigenvectors). Using this decomposition, we can describe $A^T$ as

$$ A^T = (S\Lambda R^T)^T = R \Lambda S^T. $$

What does this mean geometrically? Assume for simplicity that $n = 2$ (or $n = 3$) and that $\det S = \det R = 1$ so $R,S$ are rotations. If $A$ is symmetric, we can write $A = R \Lambda R^T$ where $R$ is a rotation and $\Lambda$ is diagonal. Geometrically, this describes the action of $A$ as the composition of three operations:

  1. Perform the rotation $R^T$.
  2. Stretch each of the coordinate axes $e_i$ by a factor $\lambda_i$ (which is the $(i,i)$-entry of $\Lambda$).
  3. Finally, perform the rotation $R$ which is the inverse of the rotation $R^T$.

In other words, $A$ acts by rotating, stretching the standard basis vectors and then rotating back.

When $A$ is not symmetric, we can't have such a description, but the decomposition $A = S \Lambda R^T$ gives us the next best thing: it describes the action of $A$ as the composition of three operations:

  1. First, perform the rotation $R^T$.
  2. Stretch each of the coordinate axes $e_i$ by a factor $\sigma_i$ (which is the $(i,i)$-entry of $\Lambda$).
  3. Finally, perform a different rotation $S$ which is not necessarily the inverse of $R^T$.

Unlike the case when $A$ was symmetric, here $R \neq S$ so the action of $A$ is a rotation, followed by stretching and then by another rotation. The action of $A^T = R\Lambda S^T$ is then obtained by reversing the roles of $R,S$ while keeping the same stretch factors. Namely, $A$ rotates by $R^T$, stretches by $\Lambda$ and rotates by $S$ while $A^T$ rotates by $S^T$, stretches by $\Lambda$ and rotates by $R$.
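For readers who want to see this concretely, here is a small NumPy sketch (the matrix is an arbitrary example; in NumPy's `svd` output, `U` plays the role of $S$, `Vt` is $R^{T}$, and the singular values form $\Lambda$):

```python
# SVD check: A = S Lam R^T and A^T = R Lam S^T (same stretch, rotations swapped).
import numpy as np

A = np.array([[2.0, 1.0], [-1.0, 3.0]])
U, s, Vt = np.linalg.svd(A)                  # A = U diag(s) Vt
Lam = np.diag(s)

print(np.allclose(A, U @ Lam @ Vt))          # A   = S Lam R^T
print(np.allclose(A.T, Vt.T @ Lam @ U.T))    # A^T = R Lam S^T
```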

levap
  • Good explanation, although the unitary matrices don't have to be rotations in a geometrical sense; they could be combinations of rotations and reflections and maybe other things as well. The only thing we know about them is that $R^TR = I$, $S^TS=I$ – mathreadler Mar 19 '17 at 12:51
  • @mathreadler: Yeah, that's why I restricted the discussion to $n=2,3$ and $\det R = \det S = 1$. In this case the corresponding orthogonal matrices are really rotations as we imagine them. – levap Mar 19 '17 at 20:28
  • I'm not sure what S in the general case ends up being as it is not just a rotation, right? Or is it? If $S^{T}S=I$ then they are orthogonal but... hmm... – evolution Mar 20 '17 at 02:40
  • Yep, it's probably a rotation, as any orthogonal matrix is some kind of rotation (although it's not normalized and can still stretch, which makes it harder to imagine again). I also find it hard to see how $S^{T}$ and $R$ relate to each other – evolution Mar 20 '17 at 02:58
  • @evolution: An orthogonal matrix acts as an isometry (with respect to the standard notion of length and angle in $\mathbb{R}^n$) so it preserves lengths and angles of vectors - it can't stretch vectors. In general, an orthogonal matrix will look like some kind of rotation, followed possibly by a reflection. Regarding your second question, if you know one of $S,R$ and the stretch factors $\lambda$ (and $A$) then the other orthogonal matrix is uniquely determined. If $A$ is symmetric then $R=S$. – levap Mar 23 '17 at 00:53
  • I finally figured out what makes the difference between the symmetric and non-symmetric matrix. By exploiting the symmetry of $AA^{T}$ one can show that non-symmetric matrices decompose into $A^{-1}R \Lambda R^{T}$ (but now R and $\Lambda$ relate to $AA^{T}$ instead of directly to A). This pretty much explains the difference between R and S. S is like R in the symmetric case, but in addition it subsequently maps into $A^{-1}$ (so definitely not just a rotation in total). One could also say that $A^{-1}$ translates between the symmetric and non-symmetric case. – evolution Mar 23 '17 at 06:03
  • @evolution: That doesn't look right. If $A = A^{-1} R \Lambda R^T$ then $A^2$ should be symmetric and it is not symmetric in the general case. You can write $A = (A^{-1})^T R \Lambda^2 R^T$ (or $A^T = A^{-1} R \Lambda^2 R^T$) but this gives in my opinion no meaningful geometric interpretation. – levap Mar 23 '17 at 10:38
  • No, $AA^{T}$ is symmetric. $R$ and $\Lambda$ then relate to $AA^{T}$, but still, in this way you can decompose $AA^{T}$ like that. If that makes it clearer: $A^{T} = A^{-1} R_{AA^{T}} \Lambda_{AA^{T}} R_{AA^{T}} (R^{-1})_{AA^{T}}$ – evolution Mar 23 '17 at 13:32
  • I mean $A^{T} = A^{-1} R_{AA^{T}} \Lambda_{AA^{T}} (R^{-1})_{AA^{T}}$ I think the next challenge may be to relate A and $AA^{T}$ geometrically. But yes, ${A^{-1}}$ is not the only thing that changes (I think the way I said it was wrong/misleading), there is a new R and a new $\Lambda$ from decomposing $AA^{T}$ – evolution Mar 23 '17 at 13:41

When trying to grasp the relation between $A$, $A^T$ and $A^{-1}$, I created the attached plot (see the bottom of this answer).
For $A^T$ this reads:

  • $r_{U^T}$ is the rotation performed by $U^T$
  • $s_{\Sigma}$ is the scaling performed by $\Sigma$
  • $r_{V}$ is the rotation performed by $V$

The three axes show the SVD of the three incarnations of $A$.

  • A green line between two axes indicates equality.
  • A red line indicates an opposition.

In short, this says
"$A^T$ scales like $A$, but rotates like $A^{-1}$."
So, $A^T$ has more in common with $A^{-1}$ than it has in common with $A$.

Not all matrices have an inverse.
If the inverse does not exist, the plot can still be made, replacing $A^{-1}$ with $A^{\dagger}$ and $\Sigma^{-1}$ with $\Sigma^{\dagger}$.
$A^{\dagger}$ is the generalized inverse of $A$.
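The slogan above can also be checked numerically. A minimal sketch (arbitrary invertible example; with $A = U\Sigma V^{T}$ we have $A^{T} = V\Sigma U^{T}$ and $A^{-1} = V\Sigma^{-1}U^{T}$, i.e. the transpose keeps $A$'s stretch factors but uses the same pair of rotations, in the same order, as the inverse):

```python
# Compare the SVD factors of A, A^T and A^{-1}.
import numpy as np

A = np.array([[2.0, 1.0], [-1.0, 3.0]])
U, s, Vt = np.linalg.svd(A)                  # A = U Sigma V^T

print(np.allclose(A.T, Vt.T @ np.diag(s) @ U.T))                   # A^T    = V Sigma     U^T
print(np.allclose(np.linalg.inv(A), Vt.T @ np.diag(1 / s) @ U.T))  # A^{-1} = V Sigma^{-1} U^T
```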

Some more detail can be found at: www.heavisidesdinner.com

[plot: relation between the SVDs of $A$, $A^T$ and $A^{-1}$]