If $f(x)$ is the density function and $F(x)$ the distribution function of a random variable $X$, then I understand that the expectation of $X$ is often written as:

$$E(X) = \int x f(x) dx$$

where the bounds of integration are implicitly $-\infty$ and $\infty$. The idea of multiplying $x$ by the probability of $x$ and summing makes sense in the discrete case, and it's easy to see how it generalises to the continuous case. However, in his book *All of Statistics*, Larry Wasserman writes the expectation as follows:

$$E(X) = \int x \, dF(x)$$

I guess my calculus is a bit rusty, in that I'm not familiar with the idea of integrating with respect to a function of $x$ rather than with respect to $x$ itself.

- What does it mean to integrate over the distribution function?
- Is there an analogous process to repeated summing in the discrete case?
- Is there a visual analogy?
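For concreteness, here is a numerical sketch of the kind of "repeated summing" I have in mind, using a standard normal distribution as a hypothetical example: partition an interval, and weight a point in each subinterval by the probability $F(x_{i+1}) - F(x_i)$ that $X$ lands there. No density appears anywhere, only increments of $F$.

```python
import math

# Standard normal CDF, F(x) = P(X <= x), via the error function.
def F(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Approximate E(X) = "integral of x dF(x)" as a sum: partition
# [a, b] into n subintervals, and weight the midpoint of each
# subinterval by the probability F(x_{i+1}) - F(x_i) it carries.
a, b, n = -8.0, 8.0, 20000
h = (b - a) / n
approx = sum(
    (a + (i + 0.5) * h) * (F(a + (i + 1) * h) - F(a + i * h))
    for i in range(n)
)
# approx is close to 0, the mean of a standard normal
```

(The truncation to $[-8, 8]$ is harmless here because the normal tails beyond that carry negligible probability.)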

**UPDATE:**
I just found the following extract from Wasserman's book (p.47):

> The notation $\int x \, d F(x)$ deserves some comment. We use it merely as a convenient unifying notation so that we don't have to write $\sum_x x f(x)$ for discrete random variables and $\int x f(x) \, dx$ for continuous random variables, but you should be aware that $\int x \, d F(x)$ has a precise meaning that is discussed in a real analysis course.

Thus, I would be interested in any insights that could be shared on the question: **what is the precise meaning that would be discussed in a real analysis course?**
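In the meantime, here is a concrete (and deliberately non-rigorous) illustration of why the notation unifies the two cases, using a hypothetical fair six-sided die. When $F$ is a step function, a Riemann–Stieltjes-style sum $\sum x_i \, [F(x_{i+1}) - F(x_i)]$ only picks up the jumps of $F$, so it collapses to the discrete sum $\sum_x x f(x)$:

```python
# CDF of a fair six-sided die: a step function with jumps of 1/6
# at x = 1, 2, ..., 6 (a hypothetical discrete example).
def F(x):
    return sum(1 for k in range(1, 7) if k <= x) / 6.0

# The same Stieltjes-style sum as in the continuous case: weight
# x_i by the increment F(x_{i+1}) - F(x i). Between jumps the
# increment is 0, so only the subintervals containing a jump
# contribute, each with weight 1/6.
a, b, n = 0.0, 7.0, 14000
h = (b - a) / n
approx = sum(
    (a + i * h) * (F(a + (i + 1) * h) - F(a + i * h))
    for i in range(n)
)
# approx is close to 3.5 = (1 + 2 + ... + 6) / 6, the die's mean
```

My understanding is that the object behind this is the Riemann–Stieltjes integral (and, more generally, the Lebesgue–Stieltjes integral), where such sums converge as the partition is refined, but I would welcome a rigorous account.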