I'm trying to understand basic tensor analysis. I understand the basic concept that the valency of the tensor determines how it is transformed, but I am having trouble visualizing the difference between different valencies when it comes to higher order tensors.

I have this picture in my mind for the lower order tensors

$X^i = \left(\begin{array}{c} x^1 \\ x^2 \\ x^3\end{array}\right)$

$X_i = \left(\begin{array}{ccc} x_1 & x_2 & x_3\end{array}\right)$

$X^i_j = \left(\begin{array}{ccc} x^1_1 & x^1_2 & x^1_3 \\ x^2_1 & x^2_2 & x^2_3 \\ x^3_1 & x^3_2 & x^3_3\end{array} \right)$

For $X^{ij}$ and $X_{ij}$, both are represented by the same kind of 2D array, but their action on a vector isn't defined in the same way as with matrices.

What I am having trouble with is intuitively understanding the difference between $X^{ijk}$, $X_{k}^{ij}$, $X_{jk}^{i}$, and $X_{ijk}$ (other permutations of the valence $(2,1)$ and $(1,2)$ omitted for brevity).

ADDED: After reading the responses and their comments I came up with this new picture in my head for higher-order tensors.

Since I am somewhat comfortable with tensor products in quantum mechanics, I can draw a parallel with the specific tensor space I'm used to.

If we consider a rank-5 tensor with a valence of (2,3), then we can write it in bra-ket notation as

$ \langle \psi_i \mid \otimes \ \langle \psi_j \mid \otimes \ \langle \psi_k \mid \otimes \mid \psi_l \rangle \ \otimes \mid \psi_m \rangle = X_{ijk}^{lm} $

Now if we operate with this tensor on a rank-3 contravariant tensor, we are left with a constant (from the inner product) and a rank-2 contravariant tensor, an unmixed tensor product: $\begin{eqnarray}(\langle \psi_i \mid \otimes \ \langle \psi_j \mid \otimes \ \langle \psi_k \mid \otimes \mid \psi_l \rangle \ \otimes \mid \psi_m \rangle)(\mid \Psi_i \rangle \ \otimes \mid \Psi_j \rangle \ \otimes \mid \Psi_k \rangle) &=& c \mid \psi_l \rangle \ \otimes \mid \psi_m \rangle \\ &=& X_{ijk}^{lm}\Psi^{ijk} = cX'^{lm}\end{eqnarray}$

If we were to further operate with a rank-2 covariant tensor (from the right, per the convention that a covector and a vector facing each other form an implied inner product) we would simply get a number out.
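To make the index bookkeeping concrete, here is a minimal numpy sketch of this contraction pattern; the array names and the choice of dimension 3 are just for illustration, not part of the bra-ket setup above:

```python
import numpy as np

# X_{ijk}^{lm}: a valence-(2,3) tensor stored as a 5-index array, index order (i, j, k, l, m)
X = np.random.rand(3, 3, 3, 3, 3)
# Psi^{ijk}: a rank-3 contravariant tensor
Psi = np.random.rand(3, 3, 3)

# Contract the three covariant slots of X against Psi: X_{ijk}^{lm} Psi^{ijk}
# The result has two free upper indices (l, m), i.e. a rank-2 contravariant tensor.
Y = np.einsum('ijklm,ijk->lm', X, Psi)

# Contracting the remaining two upper indices against a rank-2 covariant tensor
# Phi_{lm} produces a single number, as described above.
Phi = np.random.rand(3, 3)
scalar = np.einsum('lm,lm->', Y, Phi)
print(Y.shape, scalar)   # (3, 3) and a scalar
```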

One thing I am confused about, though, is that in one of the answers to this question a point was made that we are taking tensor products of a vector space with itself (and possibly its dual). In the quantum mechanics picture (although I didn't rely on it in this example), however, we often take tensor products between different, often disjoint, subspaces of the enormous Hilbert space that describes the quantum mechanical universe. Does the tensor picture change in this case?

Any comments on my example would be appreciated.

    If $V$ and $W$ are different vector spaces, an element in the tensor product space $V^* \otimes W^*$ (for example) is a bilinear map which takes one vector from $V$ and one from $W$ and returns a scalar. You can still write mixed tensors like this in terms of components, if you introduce a basis $(e_1,\ldots,e_n)$ in $V$ and a basis $(f_1,\ldots,f_n)$ in $W$ and corresponding dual bases in $V^*$ and $W^*$. For example, $V \otimes W$ will have basis elements $e_i \otimes f_j$. Does that make things clearer? I don't know exactly what you mean by "tensor picture". – Hans Lundmark Oct 28 '10 at 21:59
  • Isn't thinking of $X^i$ as an e.g. 3x1 vector and $X_i$ as a corresponding 1x3 vector a bit misleading? Although it makes sense for calculating $X_iX^i$ using the matrix form, the important relation between them is $X^i=g^{ij}X_j$, where $g$ is the [metric tensor](https://en.m.wikipedia.org/wiki/Metric_tensor), right? So they're not generally transposes of each other; it depends on what calculation you're doing, I think (e.g. [this question on physics SE](http://physics.stackexchange.com/questions/76640/how-do-you-show-from-the-index-notation-that-the-change-of-frame-formula-for-a-m)) – binaryfunt May 22 '16 at 12:21
    if you represent $X^{ij}$ and $X_{ij}$ like [HERE](https://www.vttoth.com/CMS/physics-notes/139-on-tensors-and-their-matrix-representations) then you will be able to use matrix multiplication to operate on vector - you will also find there matrix representation of $X^i_{jk}$ – Kamil Kiełczewski Feb 19 '20 at 14:54

5 Answers


Since you asked for an intuitive way to understand covariance and contravariance, I think this will do.

First of all, remember that the reason for having covariant and contravariant tensors is that you want to represent the same thing in a different coordinate system. Such a new representation is achieved by a transformation using a set of partial derivatives. In tensor analysis, a good transformation is one that leaves invariant the quantity you are interested in.

For example, we consider the transformation from one coordinate system $x^1,...,x^{n}$ to another $x^{'1},...,x^{'n}$:

$x^{i}=f^{i}(x^{'1},x^{'2},...,x^{'n})$ where $f^{i}$ are certain functions.

Take a look at a couple of specific quantities. How do the coordinate differentials transform? The answer is:

$dx^{i}=\displaystyle \frac{\partial x^{i}}{\partial x^{'k}}dx^{'k}$

Every quantity which, under a transformation of coordinates, transforms like the coordinate differentials is called a contravariant tensor.

How do the derivatives of some scalar $\Phi$ transform?

$\displaystyle \frac{\partial \Phi}{\partial x^{i}}=\frac{\partial \Phi}{\partial x^{'k}}\frac{\partial x^{'k}}{\partial x^{i}}$

Every quantity which, under a coordinate transformation, transforms like the derivatives of a scalar is called a covariant tensor.

Accordingly, a reasonable generalization is having a quantity which transforms like the product of the components of two contravariant tensors, that is

$A^{ik}=\displaystyle \frac{\partial x^{i}}{\partial x^{'l}}\frac{\partial x^{k}}{\partial x^{'m}}A^{'lm}$

which is called a contravariant tensor of rank two. The same applies to covariant tensors of rank $n$ or mixed tensors of rank $n$.
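As a concrete check of these two transformation laws, here is a small numpy sketch; the matrix `J` below is just an arbitrary invertible matrix standing in for the Jacobian $\partial x^{i}/\partial x^{'k}$, so this is only an illustration of the index pattern:

```python
import numpy as np

n = 3
J = np.random.rand(n, n) + n * np.eye(n)   # J[i, k] plays the role of dx^i/dx'^k
Jinv = np.linalg.inv(J)                    # Jinv[k, i] plays the role of dx'^k/dx^i

dxp = np.random.rand(n)       # contravariant components dx'^k in the primed system
gradp = np.random.rand(n)     # covariant components dPhi/dx'^k in the primed system

# Contravariant law: dx^i = (dx^i/dx'^k) dx'^k
dx = np.einsum('ik,k->i', J, dxp)
# Covariant law: dPhi/dx^i = (dx'^k/dx^i) dPhi/dx'^k
grad = np.einsum('ki,k->i', Jinv, gradp)

# Contracting a covariant index with a contravariant one gives an invariant:
print(np.allclose(grad @ dx, gradp @ dxp))   # True
```

The last line is the point of the construction: the two opposite transformation behaviours cancel, leaving the contracted quantity unchanged.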

Having in mind the analogy to coordinate differentials and derivative of a scalar, take a look at this picture, which I think will help to make it clearer:

From Wikipedia:


The contravariant components of a vector are obtained by projecting onto the coordinate axes. The covariant components are obtained by projecting onto the normal lines to the coordinate hyperplanes.

Finally, you may want to read: Basis vectors

By the way, I don't recommend relying blindly on the picture given by matrices, especially when you are doing calculations.

Robert Smith
  • Thanks for the response. I wasn't implying I use the matrix picture for calculations, it was just my intuitive way to understand what a mixed rank-2 tensor represents. – crasic Oct 28 '10 at 19:24
  • Yes, I understood what you mean. It was just a side note. – Robert Smith Oct 28 '10 at 20:13
  • Late reply, but what determines whether $e^i$ points "outwards" or "inwards" from the surface? – AkariAkaori May 06 '17 at 22:51
  • @AkariAkaori Based on my understanding, the factor that determines the direction of a basis vector is the handedness of the set ${e^1,e^2,e^3}$. So you need to know the convention used in a given situation. Orthonormal bases use the right-hand rule by convention, but keep in mind that a basis in curvilinear coordinates doesn't need to be orthogonal and/or normal. – Robert Smith May 08 '17 at 19:55
    @RobertSmith Did you mean $dx^{i}=\displaystyle \frac{\partial x^{i}}{\partial x^{'k}}dx^{'k}$? is it a typo? –  Jan 23 '19 at 23:47
  • @Navaro Yes, that's a typo. It is fixed now. – Robert Smith Jan 24 '19 at 17:37

I prefer to think of them as maps instead of matrices. When you move to tensor bundles over manifolds, you won't have global coordinates, so it might be preferable to think this way.

So $x_i$ is a map which sends vectors to reals. Since it's a tensor, you're only concerned with how it acts on basis elements. It's nice to think of them in terms of dual bases: then $x_i(x^j)=\delta_{ij}$, which is defined as $1$ when $i=j$ and $0$ otherwise.

Similarly, $x^i$ is a map which sends covectors to reals, and is defined by $x^i(x_j)=\delta_{ij}$.

If you have more indices, then you're dealing with a tensor product $V^*\otimes\dotsb\otimes V^*\otimes V\otimes\dotsb\otimes V$, say with $n$ copies of the vector space and $m$ copies of the dual. An element of this vector space takes in $m$ vectors and gives you back $n$ vectors, again in a tensorial way. So, for example, $X_{ijk}$ is a trilinear map; $X^{ijk}$ is a trivector (an ordered triple of vectors up to linearity); $X_{ij}^k$ is a bilinear map taking two vectors to one vector; and so on.

It's worth thinking about these in terms of the tensors you've seen already. The dot product, for example, is your basic (0,2)-tensor. The cross product is a (1,2)-tensor. If you study Riemannian manifolds, it turns out you can use the metric to "raise and lower indices"; so the Riemannian curvature tensor, for example, is alternately defined as a (1,3)-tensor and a (0,4)-tensor, depending on the author's needs.
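To make the "raise and lower indices" remark concrete, here is a minimal numpy sketch; the metric and the $(0,4)$-tensor below are arbitrary arrays chosen only for illustration, not the actual Riemannian curvature tensor:

```python
import numpy as np

n = 3
A = np.random.rand(n, n)
g = A @ A.T + n * np.eye(n)      # a symmetric positive-definite "metric" g_{ij}
g_inv = np.linalg.inv(g)         # the inverse metric g^{ij}

T = np.random.rand(n, n, n, n)   # some (0,4)-tensor components T_{ijkl}

# Raise the first index with the inverse metric: T^i_{jkl} = g^{im} T_{mjkl}
T_raised = np.einsum('im,mjkl->ijkl', g_inv, T)

# Lowering it again with g_{im} recovers the original (0,4) components
T_back = np.einsum('im,mjkl->ijkl', g, T_raised)
print(np.allclose(T_back, T))    # True
```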

Paul VanKoughnett
  • I agree, except that you seem to have swapped upper and lower indices: $X^i$ usually denotes (the components of) a (1,0)-tensor, that is a vector, that is a map which sends _covectors_ to numbers. And so on. (But in the last paragraph you are back to standard usage again.) – Hans Lundmark Oct 28 '10 at 10:14
    Would it be fair to say that a (m,n) tensor $X_{*m}^{*n}$ takes n vectors to an m-tuple of vectors? What does this map do with covectors? – crasic Oct 28 '10 at 10:38
  • Bleh, you're right. Indices bug me. I'll fix it. – Paul VanKoughnett Oct 28 '10 at 10:38
  • An $(n,m)$ tensor takes $m$ vectors to an $n$-tuple of vectors. It would do the opposite on covectors, namely, take $n$ covectors to an $m$-tuple of covectors. – Paul VanKoughnett Oct 28 '10 at 10:41
  • On the other hand, your "dual basis" example suggests that what you wrote were not components but rather basis vectors and covectors, in which case the indices should indeed be that way. But if that's the case, then I would suggest different notation. A basis for $V$ could for example be denoted by $(e_1,\ldots,e_n)$, so that the vector often written as $X^i$ is really $X=\sum_{i=1}^n X^i e_i$. (It makes sense to write the numbers on the $e$'s as subscripts, since this goes well with the Einstein summation convention, "$X=X^i e_i$".) – Hans Lundmark Oct 28 '10 at 10:56
  • For the corresponding dual basis for $V^*$ we could write $(\theta^1,\ldots,\theta^n)$ (with superscripts). Then $\theta^i(e_j)=\delta^i_j$. – Hans Lundmark Oct 28 '10 at 10:56
  • @crasic: $X_k^{ij}$ was a mistake; Paul has corrected it now. – Hans Lundmark Oct 28 '10 at 11:24
  • By the way, the cross product is not an optimal example since it doesn't quite transform as a vector should. (See for example the "pseudovector" article on Wikipedia; this is a very ugly concept which has been introduced because people insist on doing everything in vector language instead of using the full exterior algebra where a vector times a vector is a _bivector_, but now I'm drifting away from the main topic...) – Hans Lundmark Oct 28 '10 at 11:24
    @crasic and @Paul: It's not correct to say that an $(n,m)$ tensor takes $m$ vectors to an $n$-tuple of vectors -- if you think of it this way, the output is actually an $n$-vector (an element of the $n$-fold tensor product of $V$ with itself), which can't be identified with an $n$-tuple. When $n$ is bigger than 1, in many ways it's easier to think of an $(m,n)$ tensor as taking $m$ vectors and $n$ covectors and yielding a number that depends linearly on each input separately. – Jack Lee Oct 28 '10 at 16:58
  • @Jack: That's sort of what I meant by saying "in a tensorial way," but you're right that it should be clearer. I'll fix it soon. – Paul VanKoughnett Oct 28 '10 at 17:32

The covariance or contravariance of certain quantities tells you how to transform them to keep the result invariant under the choice of coordinate system. You transform covariant quantities one way, while you do the inverse with the contravariant ones.

To describe a vector you need coordinates $v^j$ and basis vectors $\mathbf{e_j}$. So the linear combination of the two gives you the actual vector $v^j \mathbf{e_j}$.

But you are free to choose the basis, so in a different basis the same vector may be described as $w^j \mathbf{f_j}$.

So $v^j \mathbf{e_j} = w^j \mathbf{f_j}$

The basis vectors themselves can be expressed as linear combinations of the other basis:

$\mathbf{e_j} = A^k_j \mathbf{f_k}$.

Here $A$ is the basis transformation matrix. Let's have another matrix $B$, which is the inverse of $A$, so their product gives the identity matrix (Kronecker delta):

$B^l_j A^k_l = \delta^k_j$

Let's take $w^j \mathbf{f_j}$ and multiply it with the identity, nothing changes:

$w^j \delta^k_j \mathbf{f_k}$

Expand the delta as a product of the two matrices, nothing changes:

$w^j B^l_j A^k_l \mathbf{f_k}$

Parenthesize it like this and you can see something:

$\left( w^j B^l_j \right) \left( A^k_l \mathbf{f_k} \right)$

In the right bracket you get back $\mathbf{e_l}$, while in the left bracket there must be $v^l$.

You can see the basis vectors are transformed with $A$, while the coordinates are transformed with $B$. The basis vectors vary in one way, while the coordinates vary in exactly the opposite way. The basis vectors are covariant, the coordinates are contravariant.
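This whole argument is easy to verify numerically. The following small sketch uses an arbitrary basis $\mathbf{f_k}$ and an arbitrary invertible matrix $A$ (both just illustrative choices):

```python
import numpy as np

n = 3
F = np.random.rand(n, n) + n * np.eye(n)   # column k is the basis vector f_k
A = np.random.rand(n, n) + n * np.eye(n)   # basis change matrix A^k_j (invertible)
B = np.linalg.inv(A)                       # B^l_j, the inverse of A

E = F @ A                     # column j is e_j = A^k_j f_k (basis transforms with A)

w = np.random.rand(n)         # coordinates w^j in the f-basis
v = B @ w                     # coordinates transform the opposite way: v^l = B^l_j w^j

# The vector itself is the same in both descriptions: v^j e_j == w^j f_j
print(np.allclose(E @ v, F @ w))   # True
```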

Upper and lower indices just denote whether you need to use the basis change matrix or its inverse. So if you have a tensor, say ${F^{abc}}_{defg}$, then based on the index placement alone you already know how to transform it to a different coordinate system: ${F^{abc}}_{defg} B^h_a B^i_b B^j_c A^d_k A^e_l A^f_m A^g_n$.

Also, if you take care to always match the upper indices with the lower ones when multiplying, the result will be invariant and coordinate-system independent. This is an opportunity to self-check your work.

Index placement is also helpful for checking whether an object is really a tensor or just a symbol.

For example, the metric tensor $g_{ij}$ has two covariant indices, which means that in a different coordinate system it must look like this: $\tilde g_{ij} A^i_k A^j_l$.

And indeed: $g_{ij} = \mathbf{e_i} \cdot \mathbf{e_j} = \left( \mathbf{f_k} A^k_i \right) \cdot \left( \mathbf{f_l} A^l_j \right) = \left( \mathbf{f_k} \cdot \mathbf{f_l} \right) A^k_i A^l_j = \tilde{g}_{kl} A^k_i A^l_j $
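The same check can be done numerically, continuing the sketch above (again with arbitrary illustrative matrices):

```python
import numpy as np

n = 3
F = np.random.rand(n, n) + n * np.eye(n)   # columns f_k: some (non-orthonormal) basis
A = np.random.rand(n, n) + n * np.eye(n)   # basis change matrix A^k_j
E = F @ A                                  # columns e_j = A^k_j f_k

g = E.T @ E              # g_{ij} = e_i . e_j
g_tilde = F.T @ F        # g~_{kl} = f_k . f_l

# The covariant transformation law: g_{ij} = g~_{kl} A^k_i A^l_j
print(np.allclose(g, np.einsum('kl,ki,lj->ij', g_tilde, A, A)))   # True
```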

Similarly, you can check that the Christoffel symbols $\Gamma^m_{jk}$ aren't tensors, because they don't transform like that.

The covariant derivative $\nabla_k v^m = \partial_k v^m + v^j \Gamma^m_{jk}$, on the other hand, does transform like a tensor. But checking that would require more symbol manipulation.


(cross-posting a variation of my answer from this)

Both co-variant and contra-variant vectors are just vectors (or, more generally, order-1 tensors).

Furthermore, they are vectors which relate to the same underlying space (for example Euclidean space or, more generally, a manifold).

Furthermore, they relate to the same space in different but dual ways (as such they have different, but dual, transformation laws).

Contra-variant vectors are part of what is called the tangent space, which for a Euclidean space coincides with, or is isomorphic to, the space itself. Co-variant vectors are part of the dual of the tangent space, called the co-tangent space, which for a Euclidean space also coincides with, or is isomorphic to, the space itself.

These spaces (and their vectors) are dual in the algebraic sense, related through the inner product of the space. They are also isomorphic to each other regardless of whether they are isomorphic to the underlying manifold itself (this is what raising and lowering indices expresses).

A question is what these (associated) spaces represent, how they are related, and what the intuition behind their use is.

Historically, tensors and tensor analysis were initiated as a by-product of the theory of invariants. A way was needed to express quantities that remain invariant under a change of representation (or change of underlying basis). Thus tensors were used: tensors represent quantities which transform under a change of representation in such a way as to make various quantities expressed in terms of them invariant.

Note that the association of the terminology with co-variant/contra-variant indices is largely a convention; any consistent convention will do.

This also gives the (intuitive) relation between co-variant and contra-variant tensors (vectors). When the components of a co-variant vector transform in one way, for example by a scaling factor $s$, the components of the (associated) contra-variant vector will have to transform by the inverse scaling factor $1/s$ in order for invariant quantities (for example the inner product $a^i b_i$) to remain invariant.
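In the simplest special case the "transformation" is a pure rescaling of the basis, and the invariance can be seen in a couple of lines (a tiny illustrative sketch, with arbitrary components and scale factor):

```python
import numpy as np

a = np.random.rand(3)   # contravariant components a^i
b = np.random.rand(3)   # covariant components b_i
s = 2.5                 # rescaling of the basis

# Covariant components scale by s, contravariant by 1/s,
# so the contraction a^i b_i is unchanged.
print(np.allclose((a / s) @ (b * s), a @ b))   # True
```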

If one goes to higher-rank tensors (e.g. $g_{ij}$, $R^{i}_{jkl}$) one can see an additional dual association, as follows:

Let's say we have a covariant tensor (e.g. the metric) $g_{ij}$ and want the contra-variant version ($g^{ij}$). If one component of the metric (e.g. $g_{12}$) represents an area (say $A_{12}$), the dual tensor at those same indices ($g^{12}$) represents the dual area (say $A^{12}$), which is the area remaining after the area $A_{12}$ is subtracted from the whole area (say $A$).

Nikos M.

A vector $v$ can be mapped to a number (a value in $\mathbb{R}$), let's say 1. Linearly independent vectors $v_i$ map to values in independent copies of $\mathbb{R}$. We can specify $v$ with components, $v=v^i v_i$, in the $n$-dimensional vector space.

The contravariant components $v^i$ constitute a map $v_i \to v$. With $v$ represented as 1 (which is the map target and viewpoint by choice here), $v_i$ is the inverse of $v^i$.

  • Making the covariant parts $v_i$ (lower index) division's denominator/divisor (= vectors ∈ tangent space) ...

  • ... makes the contravariant parts $v^i$ (upper index) maps from $v_i$ to $v$ by multiplication factors (= 1-forms ∈ cotangent space, if one has no inner product or metric yet).

$v^i$ correspond to $dx^i$ and $v_i$ correspond to $\frac{\partial}{\partial x^i}$ in differential geometry, which fits with the above.

In coordinate transformations the target indices can be upper (contravariant nature) or lower (covariant nature), or both. The target index of each transformation matrix stays free, while its source index gets contracted with the corresponding index of the source tensor by Einstein summation. A lower index contracts with an upper index and vice versa (see Raising and lowering indices).