I have been re-studying linear algebra and have been curious about the real significance of the transpose of a matrix. It is easy to compute, but there must be a reason for doing it, right?

If a vector $X$ cannot be multiplied by the transformation matrix $A$, what is the geometric, real-world meaning of $A^T X$? I mean in terms of spans, bases, column spaces and practical use. For the sake of argument, say $A$ does some shearing. What would $A^T$ be doing to $X$?
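To make the shear case concrete, here is the kind of example I have in mind. The specific matrix is my own choice for illustration, not taken from any text:

$$
A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad
A^T = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \qquad
A \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x + y \\ y \end{pmatrix}, \qquad
A^T \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ x + y \end{pmatrix}.
$$

So in this one case $A$ shears along the horizontal axis and $A^T$ shears along the vertical axis, but I do not see what the general principle is.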

My second question is about left null spaces. I can see how the left null space is perpendicular to the column space, so together the two might give a complete picture of the geometry of the space that $A$ maps into.
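In case I have it wrong, this is how I understand the perpendicularity. If $y$ is in the left null space, then $A^T y = 0$, so for any vector $Ax$ in the column space:

$$
y^T (A x) = (A^T y)^T x = 0^T x = 0.
$$

That is, $y$ is orthogonal to every vector in the column space.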

Similarly, I am guessing the row space and the null space do the same for the domain.

My real question is: how did all this come up? I feel like there is a gap in how I understand these things. Maybe an explanation of how someone thought to define row spaces in the first place, and how that came to be tied to the transpose, would help.

Another interesting fact I am curious about is that even if $A$ is not invertible, as long as its columns are independent, $A^TA$ happens to be invertible. What does that mean for the range of $A^TA$? And how is it different from the range of $A^T$ or the domain of $A$?
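A small example of what I mean. Again, the matrix is just one I picked for illustration: it is $3 \times 2$, so it cannot be invertible, but its two columns are independent.

$$
A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
A^T A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad
\det(A^T A) = 2 \cdot 2 - 1 \cdot 1 = 3 \neq 0.
$$

So $A^TA$ is an invertible $2 \times 2$ matrix here, even though $A$ itself has no inverse.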