
When I first took linear algebra, we never learned about dual spaces. Today in lecture we discussed them and I understand what they are, but I don't really understand why we want to study them within linear algebra.

I was wondering if anyone knew a nice intuitive motivation for the study of dual spaces and whether or not they "show up" as often as other concepts in linear algebra? Is their usefulness something that just becomes more apparent as you learn more math and see them arise in different settings?


Edit

I understand that dual spaces show up in functional analysis and multilinear algebra, but I still don't really understand the intuition/motivation behind their definition in the standard topics covered in a linear algebra course. (Hopefully, this clarifies my question)

WWright

7 Answers


Let $V$ be a vector space (over any field, but we can take it to be $\mathbb R$ if you like, and for concreteness I will take the field to be $\mathbb R$ from now on; everything is just as interesting in that case). Certainly one of the interesting concepts in linear algebra is that of a hyperplane in $V$.

For example, if $V = \mathbb R^n$, then a hyperplane is just the solution set to an equation of the form $$a_1 x_1 + \cdots + a_n x_n = b,$$ for some $a_i$ not all zero and some $b$. Recall that solving such equations (or simultaneous sets of such equations) is one of the basic motivations for developing linear algebra.

Now remember that when a vector space is not given to you as $\mathbb R^n$, it doesn't normally have a canonical basis, so we don't have a canonical way to write its elements down via coordinates, and so we can't describe hyperplanes by explicit equations like above. (Or better, we can, but only after choosing coordinates, and this is not canonical.)

How can we canonically describe hyperplanes in $V$?

For this we need a conceptual interpretation of the above equation. And here linear functionals come to the rescue. More precisely, the map

$$\begin{align*} \ell: \mathbb{R}^n &\rightarrow \mathbb{R} \\ (x_1,\ldots,x_n) &\mapsto a_1 x_1 + \cdots + a_n x_n \end{align*}$$

is a linear functional on $\mathbb R^n$, and so the above equation for the hyperplane can be written as $$\ell(v) = b,$$ where $v = (x_1,\ldots,x_n).$

More generally, if $V$ is any vector space, and $\ell: V \to \mathbb R$ is any non-zero linear functional (i.e. non-zero element of the dual space), then for any $b \in \mathbb R,$ the set

$$\{v \, | \, \ell(v) = b\}$$

is a hyperplane in $V$, and all hyperplanes in $V$ arise this way.

So this gives a reasonable justification for introducing the elements of the dual space to $V$; they generalize the notion of linear equation in several variables from the case of $\mathbb R^n$ to the case of an arbitrary vector space.
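For instance, to see this in a space that does not come with preferred coordinates, take $V$ to be the space of polynomials of degree at most $2$ and let $\ell(p) = p(0)$ (evaluation at $0$), a linear functional on $V$. Then $$\{\, p \in V \mid \ell(p) = 1 \,\}$$ is a hyperplane in $V$ (the polynomials with constant term $1$), described without ever choosing a basis of $V$.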

Now you might ask: why do we make them a vector space themselves? Why do we want to add them to one another, or multiply them by scalars?

There are lots of reasons for this; here is one: Remember how important it is, when you solve systems of linear equations, to add equations together, or to multiply them by scalars (here I am referring to all the steps you typically make when performing Gaussian elimination on a collection of simultaneous linear equations)? Well, under the dictionary above between linear equations and linear functionals, these processes correspond precisely to adding together linear functionals, or multiplying them by scalars. If you ponder this for a bit, you can hopefully convince yourself that making the set of linear functionals a vector space is a pretty natural thing to do.
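To make this dictionary explicit with a tiny worked example: in $\mathbb{R}^2$, let $\ell_1(x,y) = x + 2y$ and $\ell_2(x,y) = 3x - y$. The elimination step "add the first equation to the second" in the system $$\ell_1(v) = 5, \qquad \ell_2(v) = 1$$ yields $4x + y = 6$, which is nothing other than $$(\ell_1 + \ell_2)(v) = 5 + 1,$$ the sum of the two functionals applied to $v$; scaling an equation by $c$ likewise corresponds to the functional $c\,\ell$.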

Summary: just as concrete vectors $(x_1,\ldots,x_n) \in \mathbb R^n$ are naturally generalized to elements of vector spaces, concrete linear expressions $a_1 x_1 + \ldots + a_n x_n$ in $x_1,\ldots, x_n$ are naturally generalized to linear functionals.

Matt E
  • So I imagine that the gist of your answer is the following: various "elimination" procedures for solving systems of equations are naturally happening in the dual space. Is there a similar way to think about double duals, which fits in with the canonical isomorphism? –  Nov 13 '10 at 13:08
  • @George: Dear George, I think that the easiest way to think about the double dual in the framework of my answer is that in the equation $a_1 x_1 + \cdots + a_n x_n = b,$ the roles of $(a_1,\ldots,a_n)$ and $(x_1,\ldots,x_n)$ are completely symmetrical, so that either collection of variables can be thought of as being dual to the other. – Matt E Nov 13 '10 at 13:35
  • This answer illuminated me. Thank you. – Marco Lecci Jan 31 '20 at 14:18

Since there is no answer giving the following point of view, I'll allow myself to resuscitate the post.

The dual is intuitively the space of "rulers" (or measurement instruments) of our vector space. Its elements measure vectors. This is what makes the dual space and its relatives so important in Differential Geometry, for instance. This immediately motivates the study of the dual space. For motivation in other areas, the other answers cover the ground well.

This also happens to explain some facts intuitively. For instance, the fact that there is no canonical isomorphism between a vector space and its dual can then be seen as a consequence of the fact that rulers need scaling, and there is no canonical way to choose one scaling for the space. However, if we were to measure the measuring instruments themselves, how could we proceed? Is there a canonical way to do so? Well, if we want to measure our measures, why not measure them by how they act on what they are supposed to measure? We need no bases for that. This justifies intuitively why there is a natural embedding of the space into its bidual. (Note, however, that this fails to justify why it is an isomorphism in the finite-dimensional case.)
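In symbols, the natural embedding described here is "evaluation at a vector": $$\begin{align*} \iota : V &\to V^{**} \\ v &\mapsto \operatorname{ev}_v, \qquad \operatorname{ev}_v(\ell) = \ell(v) \ \text{ for all } \ell \in V^*. \end{align*}$$ No basis appears anywhere in this formula, which is the precise sense in which the embedding is canonical; that it is injective (and, by a dimension count, an isomorphism when $\dim V < \infty$) still has to be checked separately, as noted above.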

Aloizio Macedo
  • That is the most interesting and useful explanation of duals I have come across; it's becoming clear why this applies to tensors. Thanks. – daven11 Jul 08 '18 at 11:41

There are some very beautiful and easily accessible applications of duality, adjointness, etc. in Rota's modern reformulation of the Umbral Calculus. You'll quickly gain an appreciation for the power of such duality once you see how easily this approach unifies hundreds of diverse special-function identities, and makes their derivation essentially trivial. For a nice introduction see Steven Roman's book "The Umbral Calculus".

Bill Dubuque
  • Anyone interested in this subject should read this short and elegant paper of Rota: http://www.jstor.org/stable/2312585 – Qiaochu Yuan Sep 01 '10 at 03:01
  • No doubt that's an elegant paper. But, alas, in that paper Rota doesn't explicitly emphasize the underlying duality and adjointness - which is brought to the fore in Roman's treatment. – Bill Dubuque Sep 01 '10 at 04:03
  • So are duality, adjointness, etc. important in functional analysis or representation theory or topology or real analysis? I am curious. – Noppawee Apichonpongpan Jan 09 '22 at 16:50

Absolutely yes to the last question. There are already many applications listed in the Wikipedia article. I will give a few more.

First of all, together with tensor products, dual spaces can be used to talk about linear transformations; the vector space $\text{Hom}(V, W)$ of linear transformations from a vector space $V$ to a vector space $W$ is canonically isomorphic to the tensor product $V^{\ast} \otimes W$. I assume you care about linear transformations, so you should care about tensor products and dual spaces as well. (A really important feature of this decomposition is that it is true in considerable generality; for example, it holds for representations of groups and is a natural way to prove the orthogonality relations for characters.)
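Concretely, the canonical map (stated here for finite-dimensional $V$) is determined on simple tensors by $$\begin{align*} V^{\ast} \otimes W &\to \operatorname{Hom}(V, W) \\ \phi \otimes w &\mapsto \big( v \mapsto \phi(v)\, w \big), \end{align*}$$ extended linearly: $\phi \otimes w$ becomes the rank-one map "pair with $\phi$, then scale $w$", and every linear map is a finite sum of such rank-one maps.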

Dual spaces also appear in geometry as the natural setting for certain objects. For example, a differentiable function $f : M \to \mathbb{R}$ where $M$ is a smooth manifold is an object that produces, for any point $p \in M$ and tangent vector $v \in T_p M$, a number, the directional derivative, in a linear way. In other words, a differentiable function defines an element of the dual to the tangent space (the cotangent space) at each point of the manifold.
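Spelled out, the element of the cotangent space $T_p^{\ast} M$ that $f$ defines at $p$ is its differential $$df_p : T_p M \to \mathbb{R}, \qquad df_p(v) = v(f) = \left.\frac{d}{dt}\right|_{t=0} f(\gamma(t))$$ for any curve $\gamma$ with $\gamma(0) = p$ and $\gamma'(0) = v$; linearity in $v$ is exactly what makes $df_p$ a linear functional on the tangent space.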

Qiaochu Yuan
  • Your terminology seems non-standard. The cotangent space is normally composed of the 1-forms, not the 0-forms. – yasmar Oct 18 '10 at 14:39
  • It has got me confused though. A one form is an element of $T^*_pM = \text{Hom}_\mathbb{R}(T_pM,\mathbb{R})$, but it seems like you've described the 0-forms this way too. What is the difference? – yasmar Oct 18 '10 at 14:51
  • Okay, I have got it (with the help of Wikipedia: http://en.wikipedia.org/wiki/Differential_form ). Your terminology is fine, it was my poor reading skills that were at fault. You said a differentiable function *defines* an element of the cotangent space (i.e., df), but I read *is*. Sorry. – yasmar Oct 18 '10 at 15:00
  • Is the object described in the third paragraph related to the gradient? – user123124 Jul 11 '18 at 11:18
  • @user1: yes, the differential form is a version of the gradient which is defined even in the absence of a Riemannian metric. It's a differential 1-form, whereas the gradient is a vector field, and you need something like a metric to pass between the two. – Qiaochu Yuan Jul 11 '18 at 18:36

Here is something to add to all the answers and discussions up to now.

I will explain with an example. Let us consider the space of polynomials of degree at most $n$ on the real interval $[0,1]$. We want to choose a basis, for example the monomials $\{1, x, x^2, \cdots, x^n \}$. Another basis could be the Bernoulli polynomials $\{B_0(x), B_1(x), \cdots , B_n(x)\}$.

The dual basis $\{B_i^*(x)\}$ satisfies the property:

\begin{equation} \langle B^*_i(x) , B_j(x) \rangle = \delta_{ij}. \end{equation} Given a polynomial $p(x)$ we would like to find its representation in terms of the basis $\{B_i\}$. We could orthogonalize by Gram-Schmidt, but this is laborious (well, maybe not so hard; look up the Legendre polynomials). Another way is to find the dual basis of $\{B_i\}$.

The dual basis of the Bernoulli polynomials is derived here

What is the advantage of using the dual basis?

Let us see:

  1. Functional Analysis

Given any polynomial $p(x)$, there exist coefficients $c_i$ such that $p(x)$ is a linear combination of the Bernoulli polynomials $B_j$. That is, \begin{eqnarray*} p(x) = \sum_i c_i B_i(x) = \sum_i \sum_j \delta_{ij} c_j B_i(x) = \sum_i \sum_j \left \langle B^*_i(y) \; , \;B_j(y) \right \rangle c_j B_i(x) = \sum_i \langle B_i^*(y) , \sum_j c_j B_j (y) \rangle B_i(x), \end{eqnarray*} that is,

\begin{eqnarray} p(x) = \sum_i c_i B_i(x) = \sum_i \left \langle B_i^*(y) \; , \; p(y) \right \rangle B_i(x), \quad \quad (1) \end{eqnarray} Since inner products are usually written as integrals, we write \begin{eqnarray*} \int_0^1 \sum_i B_i(x) B_i^*(y) p(y) dy = p(x) \end{eqnarray*} and then we can interpret the sum of the functionals $B^*_i$ weighted by the Bernoulli polynomials $B_i$ as \begin{eqnarray} \sum_i B_i(x) B_i^*(y) = \delta(x-y). \end{eqnarray}

The power of the dual basis is that we do not have to do anything like a Gram-Schmidt orthogonalization to get a nice representation of a function in terms of the basis. Note that the Bernoulli polynomials, like the monomials $\{1, x, \cdots, x^i, \cdots \}$, are not orthonormal. However, when representing $p(x)=\sum_i c_i B_i(x)$, the coefficients $c_i$ (the analogue of Fourier coefficients) are given by

\begin{eqnarray*} c_i = \left \langle B_i^*(y) \; , \; p(y) \right \rangle \end{eqnarray*} due to the fact that the vectors $B_i(x)$ in equation (1) are linearly independent. That is, the coordinates of the vector $p(x)$ in the basis $\{ B_i(x) \}$ are obtained by pairing the dual basis vectors $\{B^*_i\}$ with the vector $p(x)$.
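To see the mechanics numerically, here is a minimal sketch (my own illustration, not part of the original derivation; it uses the non-orthogonal monomial basis $\{1, x, x^2\}$ on $[0,1]$ in place of the Bernoulli polynomials, since the procedure is identical). The dual basis coefficients are the rows of the inverse Gram matrix, and the coordinates of $p$ are recovered by pairing with the dual basis, with no orthogonalization step:

```python
import numpy as np

# Inner product <f, g> = \int_0^1 f(x) g(x) dx on polynomials of degree < 3,
# stored as coefficient vectors in the (non-orthogonal) monomial basis {1, x, x^2}.
n = 3
G = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])  # Gram matrix <x^i, x^j>

# If the dual vector e_i^* is represented (via the inner product) as
# e_i^* = sum_k D[i, k] x^k, then <e_i^*, x^j> = delta_{ij} forces D @ G = I.
D = np.linalg.inv(G)
print(np.round(D @ G, 10))          # identity matrix: the duality relations hold

# Recover the coordinates of p(x) = 2 - 3x + 5x^2 as c_i = <e_i^*, p>, with no
# Gram-Schmidt step: compute the moments <x^i, p>, then pair with the dual basis.
p = np.array([2.0, -3.0, 5.0])      # coefficients of p in the monomial basis
moments = np.array([sum(p[j] / (i + j + 1) for j in range(n)) for i in range(n)])
print(np.round(D @ moments, 10))    # [ 2. -3.  5.]  -- the coordinates of p, recovered
```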

  2. Tensor Analysis:

The concept of the dual is closely related to the concept of a covariant tensor. A contravariant tensor can be seen as a vector and its covariant counterpart as the dual. If your basis is orthonormal, the representations in the original basis and in the dual basis coincide; Cartesian tensors do not need superscript indices. On the other hand, if your basis is not orthonormal, then the dual basis is not orthonormal either and the representations differ. Here you need to distinguish between covariant and contravariant. That is why you need notation such as $\delta_{ij}^{mn}$, contravariant of rank two and covariant of rank two. The inner product of a contravariant tensor with a covariant one (the regular inner product) produces an invariant. Without duality these concepts would not even exist.

Herman Jaramillo

To take the case in $ \mathbb{R}^n $

If $ A $ is the Transformation Matrix from the Natural Basis to an Arbitrary Standard Basis, then $ A^{-1} $ is the Map from the Natural Basis to the Dual Basis. That $ A^{-1} $ exists is guaranteed as long as the Standard Basis spans the same Vector Space as the Natural Basis and is Linearly Independent (which, by definition, it is).

Dual Basis <---> Natural Basis <---> Standard Basis

Let $ \vec{e}_{\alpha} = \vec{e}^{\ \alpha} $ be the Natural Basis

Let $ \vec{e}_{\widetilde{\alpha}} $ be the Standard Basis

Let $ \vec{e}^{\ \widetilde{\alpha}} $ be the Dual Basis

$ \vec{e}_{\widetilde{\alpha}} = \sum_{\alpha} A^{\alpha}_{\ \ \widetilde{\alpha}} \ \vec{e}_{\alpha} $

$ \vec{e}_{\widetilde{\alpha}} = \sum_{\alpha} \sum_{\widetilde{\beta}} A^{\alpha}_{\ \ \widetilde{\alpha}} \ A^{\alpha}_{\ \ \widetilde{\beta}} \vec{e}^{\ \widetilde{\beta}} $

Note that the two A's in the middle can be substituted with the Metric Tensor.

$ \vec{e}_{\widetilde{\alpha}} = \sum_{\widetilde{\beta}} g_{\widetilde{\alpha}\widetilde{\beta}} \vec{e}^{\ \widetilde{\beta}} $
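A quick numerical check of these relations (my own sketch, with an arbitrary invertible matrix $A$ in $\mathbb{R}^2$ chosen purely as an example):

```python
import numpy as np

# Columns of A hold the coordinates of the new ("standard") basis vectors
# in the natural basis of R^2 (an arbitrary invertible example).
A = np.array([[2.0, 1.0],
              [0.0, 1.0]])
A_inv = np.linalg.inv(A)

# Rows of A^{-1} are the dual basis covectors: applied to the basis vectors
# they give the Kronecker delta, (A^{-1} A)_{ij} = delta_{ij}.
print(np.round(A_inv @ A, 10))       # identity matrix

# Metric tensor g_{ab} = e_a . e_b = (A^T A)_{ab}; it lowers the index, i.e.
# e_a = sum_b g_{ab} e^b, the dual vectors e^b being the columns of (A^{-1})^T.
g = A.T @ A
print(np.allclose(A_inv.T @ g, A))   # True: matches the last equation above
```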

The "Engineer's Approach" to Dual Space would be:

1) Some Vector Components $ (x^{\alpha}) $ have the unit $[m]$

2) Some Vector Components $ (x_{\alpha}) $ have the unit $[1/m] \ $ (e.g. gradient)

3) When Multiplying Components with Basis Vectors, the resulting Vector lives in $ \mathbb{R}^n $ and has the unit $ [1] $

4) Therefore: we need two different Bases


I still have very little mathematical experience (mainly linear algebra and a bit of real analysis), so the only answer I can give you is that dual spaces are already useful in linear algebra itself.

LEMMA (dimension of the annihilator). Let $V$ be a vector space over the field $\mathbb{F}$ with $\operatorname{dim}V = n$ and let $W \subseteq V$ be a subspace. If $W^{0} = \{ \phi \in V^{*} \mid \phi_{|W} = 0 \}$ is the annihilator of $W$, then $\operatorname{dim}W + \operatorname{dim}W^{0} = n$.

Let $(w_1, \ldots, w_k)$ be a basis of $W$ and $(w_1,\ldots,w_k,v_{k+1},\ldots,v_n)$ a completion to a basis of $V$. If $(w_1^{*},\ldots,w_k^{*},v_{k+1}^{*},\ldots,v_n^{*})$ is the corresponding dual basis of $V^{*}$, then it is easy to prove by double inclusion that $W^{0}=\langle v_{k+1}^{*},\ldots,v_n^{*}\rangle$ and thus $\operatorname{dim}W^{0}= n-k$.
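For a concrete instance of the lemma: in $V = \mathbb{R}^3$ with $W = \operatorname{span}(e_1, e_2)$, a functional $\phi = a_1 e_1^{*} + a_2 e_2^{*} + a_3 e_3^{*}$ vanishes on $W$ exactly when $a_1 = a_2 = 0$, so $W^{0} = \langle e_3^{*} \rangle$ and indeed $\operatorname{dim}W + \operatorname{dim}W^{0} = 2 + 1 = 3$.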

THEOREM (dimension of the orthogonal space). Let $V$ be a vector space over the field $\mathbb{F}$ with $\operatorname{dim}V = n$, endowed with a nondegenerate scalar product $\langle \cdot , \cdot \rangle$. If $W \subseteq V$ is a subspace, then $\operatorname{dim}W + \operatorname{dim}W^{\bot} = n$.

First, notice that the statement is not obvious. In fact, if the scalar product is not positive-definite, $V = W \oplus W^{\bot}$ can fail (you could have $v \in W$ such that $\langle v, v\rangle = 0$ and therefore $W\cap W^{\bot}\neq \{ 0\}$). Now, by means of the scalar product, you can define a very nice isomorphism $\psi : V \to V^{*}$ such that $\psi (v) = \langle \cdot , v\rangle$ for all $v \in V$; you can think of it as the product by $v$. This is indeed an isomorphism: $\operatorname{dim}V = \operatorname{dim}V^{*}$ and $\operatorname{ker}\psi = 0$ because $\langle \cdot , \cdot \rangle$ is nondegenerate. Now observe that $\psi (W^{\bot}) = W^{0}$, and thus $\operatorname{dim}W^{\bot}=\operatorname{dim}W^{0}=n-\operatorname{dim}W$ by the lemma.
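To see concretely how the direct-sum decomposition can fail while the dimension count survives: on $\mathbb{R}^2$ take the nondegenerate form $\langle (x_1,x_2),(y_1,y_2)\rangle = x_1 y_1 - x_2 y_2$ and $W = \operatorname{span}((1,1))$. Since $\langle (1,1),(1,1)\rangle = 0$, one finds $W^{\bot} = W$, so $V \neq W \oplus W^{\bot}$, and yet $\operatorname{dim}W + \operatorname{dim}W^{\bot} = 1 + 1 = 2$ as the theorem predicts.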
