Originally, "differentials" and "derivatives" were intimately connected, with the derivative being *defined* as the ratio of the differential of the function to the differential of the variable (see my previous discussion on the Leibniz notation for the derivative). Differentials were simply "infinitesimal changes" in whatever, and the derivative of $y$ with respect to $x$ was the ratio of the infinitesimal change in $y$ to the infinitesimal change in $x$.

For integrals, "differentials" came in because, in Leibniz's way of thinking about them, integrals were the sums of infinitely many infinitesimally thin rectangles that lay below the graph of the function. Each rectangle would have height $y$ and base $dx$ (the infinitesimal change in $x$), so the area of the rectangle would be $y\,dx$ (height times base), and we would add them all up as $S\; y\,dx$ to get the total area (the integral sign was originally an elongated $S$, for "summa", or sum).

Infinitesimals, however, cause all sorts of headaches and problems. A lot of the reasoning about infinitesimals was, well, let's say not entirely rigorous (or logical); some differentials were dismissed as "utterly inconsequential", while others were taken into account. For example, the product rule would be argued by saying that the change in $fg$ is given by
$$(f+df)(g+dg) -fg = fdg + gdf + df\,dg,$$
and then ignoring $df\,dg$ as inconsequential, since it was made up of the product of two infinitesimals; but if infinitesimals that are really small can be ignored, why do we *not* ignore the infinitesimal change $dg$ in the first term? Well, you can wave your hands a lot and huff and puff, but in the end the argument essentially broke down into nonsense, or the problem was ignored because things worked out regardless (most of the time, anyway).

Anyway, there was a need for a more solid understanding of just what derivatives and differentials actually are, so that we can really reason about them; that's where limits came in. Derivatives are no longer ratios; instead, they are limits of ratios. Integrals are no longer infinite sums of infinitesimally thin rectangles; now they are limits of Riemann sums (each of which is finite, with no infinitesimals around), etc.
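To illustrate with the product rule argument from above (a sketch, assuming $f$ and $g$ are differentiable at $x$, and writing $\Delta f = f(x+\Delta x) - f(x)$ and $\Delta g = g(x+\Delta x) - g(x)$ for honest finite changes): the same algebra gives

$$\frac{(f+\Delta f)(g+\Delta g) - fg}{\Delta x} = f\,\frac{\Delta g}{\Delta x} + g\,\frac{\Delta f}{\Delta x} + \frac{\Delta f}{\Delta x}\,\Delta g.$$

As $\Delta x\to 0$, the first two terms tend to $fg' + gf'$, while the third tends to $f'\cdot 0 = 0$, because $\Delta g\to 0$ (a differentiable function is continuous). So nothing is "ignored": the cross term has limit zero, whereas the $\Delta g$ in the first term survives because it is divided by $\Delta x$.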

The notation is left over, though, because it is very useful notation and is very suggestive. In the integral case, for instance, the "dx" is no longer *really* a quantity or function being multiplied: it's best to think of it as the "closing parenthesis" that goes with the "opening parenthesis" of the integral (that is, you are integrating whatever is between the $\int$ and the $dx$, just like when you have $2(84+3)$, you are multiplying by $2$ whatever is between the $($ and the $)$ ). But it is very useful, because for example it helps you keep track of what changes need to be made when you do a change of variable. One can justify the change of variable without appealing at all to "differentials" (whatever they may be), but the *notation* just leads you through the necessary changes, so we treat them as if they were actual functions being multiplied by the integrand because they help keep us on the right track and keep us honest.
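As a small illustration of the notation doing the bookkeeping (an example of my own choosing, not tied to anything specific above): to compute $\int 2x\cos(x^2)\,dx$, set $u = x^2$, so that formally $du = 2x\,dx$, and

$$\int 2x\cos(x^2)\,dx = \int \cos(u)\,du = \sin(u) + C = \sin(x^2) + C.$$

The actual justification is the chain rule (differentiating $\sin(x^2)$ gives $2x\cos(x^2)$), but the "$du = 2x\,dx$" step tells you exactly which factor has to be present for the substitution to go through.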

But here is an ill-kept secret: we mathematicians tend to be lazy. If we've already come up with a valid argument for situation A, we don't want to have to come up with a new valid argument for situation B if we can just explain how to get from B to A, even if solving B directly would be easier than solving A (old joke: a mathematician and an engineer are subjects of a psychology experiment; first they are shown into a room where there is an empty bucket, a trashcan, and a faucet. The trashcan is on fire. Each of them first fills the bucket with water from the faucet, then dumps it on the trashcan and extinguishes the flames. Then the engineer is shown to another room, where there is again a faucet, a trashcan on fire, and a bucket, but this time the bucket is already filled with water; the engineer takes the bucket, empties it on the trashcan and puts out the fire. The mathematician, later, comes in, sees the situation, takes the bucket, and empties it *on the floor*, and then says "which reduces it to a previously solved problem.")

Where were we? Ah, yes. Having to translate all those informal manipulations that work so well and treat $dx$ and $dy$ as objects in and of themselves, into formal justifications that don't treat them that way is a real pain. It can be done, but it's a real pain. Instead, we want to come up with a way of justifying all those manipulations that will be valid always. One way of doing it is by actually giving them a *meaning* in terms of the new notions of derivatives. And that is what is done.

Basically, we want the "differential" of $y$ to be the infinitesimal change in $y$; this change will be closely approximated by the change along the tangent to $y$; the tangent at $a$ has slope $y'(a)$. But because we don't have infinitesimals, we have to say *how much* we've changed the argument. So we define "the differential in $y$ at $a$ when $x$ changes by $\Delta x$", $d(y,\Delta x)(a)$, as $d(y,\Delta x)(a) = y'(a)\Delta x$. This is exactly the change along the tangent, rather than along the graph of the function. If you take the limit of the quotient $\frac{d(y,\Delta x)}{\Delta x}$ as $\Delta x\to 0$, you just get $y'$. But we tend to think of $\Delta x$, in the limit as $\Delta x\to 0$, as being $dx$, so abuse of notation leads to "$dy = \frac{dy}{dx}\,dx$"; this is *suggestive*, but not quite true literally; instead, one can then show that arguments that treat differentials as functions tend to give the right answer under mild assumptions. Note that under this definition, you get $d(x,\Delta x) = 1\Delta x$, leading to $dx = dx$.
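To make the definition concrete, here is a minimal numerical sketch. The function names `differential` and `actual_change`, and the choice of $y = x^2$ at $a = 3$, are my own for illustration. It compares the change along the tangent, $y'(a)\Delta x$, with the change along the graph, $y(a+\Delta x) - y(a)$.

```python
def differential(y_prime, a, delta_x):
    """Change along the tangent at a: d(y, delta_x)(a) = y'(a) * delta_x."""
    return y_prime(a) * delta_x


def actual_change(y, a, delta_x):
    """Change along the graph of y when x goes from a to a + delta_x."""
    return y(a + delta_x) - y(a)


# Illustrative choice (an assumption, not from the text): y = x^2, so y' = 2x.
def y(x):
    return x ** 2


def y_prime(x):
    return 2 * x


a = 3.0
for delta_x in (1.0, 0.1, 0.01):
    dy = differential(y_prime, a, delta_x)
    change = actual_change(y, a, delta_x)
    # The gap between the two is what the tangent approximation misses.
    print(f"delta_x={delta_x}: differential={dy}, "
          f"actual change={change}, gap={change - dy}")
```

For this particular $y$, the gap is $(\Delta x)^2$ (up to floating-point rounding), so it shrinks faster than $\Delta x$ itself; that is the sense in which the differential "closely approximates" the actual change. Passing the constant function $1$ as `y_prime` returns `delta_x` itself, matching $d(x,\Delta x) = 1\Delta x$.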

Also, notice an interesting reversal: originally, differentials came first, and they were used to define the derivative as a ratio. Today, derivatives come first (defined as limits), and differentials are defined *in terms* of the derivatives.

What is the practical difference, though? You'll probably be disappointed to hear "not much". Except one thing: when your functions represent actual quantities, rather than just formal manipulation of symbols, the derivative and the differential **measure different things.** The derivative measures a **rate** of change, while the differential measures **the change itself.**

So the units of measurement are different: for example, if $y$ is distance and $x$ is time, then $\frac{dy}{dx}$ is measured in distance over time, i.e., velocity. But the differential $dy$ is measured in units of distance, because it represents the change in distance (and the difference/change between two distances is still a distance, not a velocity any more).

Why is it useful to have the distinction? Because sometimes you want to know how something *is changing*, and sometimes you want to know how much something *changed*. It's all nice and good to know the rate of inflation (change in prices over time), but you might sometimes want to know how much more the loaf of bread costs now (rather than the rate at which the price is changing). And because being able to manipulate derivatives as if they were quotients can be very useful when dealing with integrals, differential equations, etc., and differentials give us a way of making sure that these manipulations don't lead us astray (as they sometimes did in the days of infinitesimals).

I'm not sure if that answers your question or at least gives an indication of where the answers lie. I hope it does. *Added.* I see Qiaochu has pointed out that the distinction becomes much clearer once you go to higher dimensions/multivariable calculus, so the above may all be a waste. Still...

**Added.** As Qiaochu points out (and I mentioned in passing elsewhere), there *are* ways in which one can give *formal* definitions and meanings to infinitesimals, in which case we *can* define differentials as "infinitesimal changes" or "changes along infinitesimal differences", and then use them to define derivatives and integrals just like Leibniz did. The standard example of being able to do this is Robinson's non-standard analysis. Or, if one is willing to forgo looking at *all* kinds of functions and look only at some restricted type of functions, then one can also give infinitesimals, differentials, and derivatives substance/meaning which is much closer to their original conception.