I've a confession to make. I've been using PDFs and PMFs without actually knowing what they are. My understanding is that density equals area under the curve, but if I look at it that way, then it doesn't make sense to refer to the "mass" of a random variable in discrete distributions. How can I interpret this? Why do we use "mass" and "density" to describe these functions rather than something else?

P.S. Please feel free to edit the question to make it more understandable if you feel it is logically flawed as asked.

  • I'm not entirely sure I understand your question, but density does not equal area under the curve. If we take the area interpretation of probability, the density (i.e., the probability density function) is interpreted as a height. So the units aren't even the same. Maybe some of your confusion stems from that? – Mike Spivey Feb 23 '11 at 06:30
  • @Mike: Let me understand mass before going to density. Why do we call a point in the discrete distribution as mass? Why can't we just call it a point? – 0x0 Feb 23 '11 at 15:16

4 Answers


(This answer takes as its starting point the OP's question in the comments, "Let me understand mass before going to density. Why do we call a point in the discrete distribution as mass? Why can't we just call it a point?")

We could certainly call it a point. The utility of the term "probability mass function," though, is that it tells us something about how the function in the discrete setting relates to the function in the continuous setting because of the associations we already have with "mass" and "density." And I think to understand why we use these terms in the first place we have to start with what we call the density function. (In fact, I'm not sure we would even be using "probability mass" without the corresponding "probability density" function.)

Let's say we have some function $f(x)$ that we haven't named yet but we know that $\int_a^b f(x) dx$ yields the probability that we see an outcome between $a$ and $b$. What should we call $f(x)$? Well, what are its properties? Let's start with its units. We know that, in general, the units on a definite integral $\int_a^b f(x) dx$ are the units of $f(x)$ times the units of $dx$. In our setting, the integral gives a probability, and $dx$ has units of, say, length. So the units of $f(x)$ must be probability per unit length. This means that $f(x)$ must be telling us something about how much probability is concentrated per unit length near $x$; i.e., how dense the probability is near $x$. So it makes sense to call $f(x)$ a "probability density function." (In fact, one way to view $\int_a^b f(x) dx$ is that, if $f(x) \geq 0$, $f(x)$ is always a density function. From this point of view, height is area density, area is volume density, speed is distance density, etc. One of my colleagues uses an approach like this when he discusses applications of integration in second-semester calculus.)
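The units argument above can be checked numerically. The following is only a minimal sketch under assumed specifics that are not in the answer: $X$ is taken to be uniform on $[0, 0.5]$, and the helper names `uniform_density` and `integrate` are made up for illustration. The point is that the height of the density is 2, which cannot be a probability, while the integrals over intervals are probabilities.

```python
# Sketch: a density is "probability per unit length", so its height may exceed 1.
# Assumed example (not from the answer): X uniform on [0, 0.5].

def uniform_density(x, low=0.0, high=0.5):
    """Density of a uniform distribution on [low, high] (probability per unit length)."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


height = uniform_density(0.25)                     # a density value, not a probability
total = integrate(uniform_density, 0.0, 0.5)       # probability of the whole range
part = integrate(uniform_density, 0.1, 0.225)      # probability of a sub-interval

print("density at 0.25:", height)
print("P(0 <= X <= 0.5) is about", round(total, 6))
print("P(0.1 <= X <= 0.225) is about", round(part, 6))
```

Multiplying the height (probability per unit length) by the interval width (length) is what turns the density back into a probability.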

Now that we've named $f(x)$ a density function, what should we call the corresponding function in the discrete setting? It's not a density function; its units are probability rather than probability per unit length. So what is it? Well, when we say "density" without a qualifier we are normally talking about "mass density," and when we integrate a density function over an object we obtain the mass of that object. With this in mind, the relationship between the probability function in the continuous setting and the probability function in the discrete setting is exactly that of density to mass. So "probability mass function" is a natural term to grab to apply to the corresponding discrete function.
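The density-to-mass relationship can be illustrated by integrating a density over a few bins and treating the results as point masses. This is a rough sketch, not part of the original answer: the density $f(x) = 2x$ on $[0, 1]$, the bin edges, and the helper names `density` and `mass_between` are all assumptions chosen for illustration.

```python
# Sketch: integrating a density over a region gives the mass in that region.
# Assumed example (not from the answer): density f(x) = 2x on [0, 1].

def density(x):
    """Probability per unit length at x for the assumed density f(x) = 2x on [0, 1]."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def mass_between(a, b, steps=10_000):
    """Probability mass on [a, b], from a midpoint-rule integral of the density."""
    width = (b - a) / steps
    return sum(density(a + (i + 0.5) * width) for i in range(steps)) * width


# Cut [0, 1] into four bins; each bin's mass is a plain probability (no "per length").
edges = [0.0, 0.25, 0.5, 0.75, 1.0]
pmf = {(a, b): mass_between(a, b) for a, b in zip(edges, edges[1:])}

for (a, b), mass in pmf.items():
    print(f"[{a}, {b}]: mass {mass:.4f}")
print("total mass:", round(sum(pmf.values()), 6))
```

Each entry of `pmf` carries units of probability, while `density(x)` carries units of probability per unit length, which mirrors the distinction drawn above.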

Mike Spivey

Probability mass functions are used for discrete distributions. A probability mass function assigns a probability to each point in the sample space. By contrast, the integral of a probability density function gives the probability that a random variable falls within some interval.
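In symbols, one way to write the two cases is the following (the names $p$ for the mass function and $f$ for the density are assumptions, not notation used in this answer):

$$P(X = x) = p(x), \qquad \sum_x p(x) = 1 \quad \text{(discrete)}$$

$$P(a \leq X \leq b) = \int_a^b f(x)\,dx, \qquad \int_{-\infty}^{\infty} f(x)\,dx = 1 \quad \text{(continuous)}$$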

  • I understand this. My question is why do we use the words "mass" and "density" for this? What is the reason behind it? – 0x0 Feb 23 '11 at 00:48
  • @Sunil: Think of the discrete distribution as having a mass at each point, where the probability of that point is how much of the total mass is there. Then the continuous case is linear density, where the mass is spread over an interval. – Ross Millikan Feb 23 '11 at 01:44
  • I get it, but I was also interested in the history behind it, if anybody knows about it. There must be a reason to use this terminology, right? I could also call each point just a point and density just the area. What prevents us from doing that? – 0x0 Feb 23 '11 at 03:33
  • This cleared it to me very well! Thanks for your explanation. Even if I am 8 years late, it's still great! –  May 11 '19 at 16:58

The most basic difference between a probability mass function and a probability density function is that a probability mass function concentrates on a single point. For example, if we have to find the probability of getting the number 2, then our whole concentration is on 2, and hence we use a PMF. With a PDF, our concentration is on the interval in which the value lies, e.g. $-\infty \leq X \leq \infty$. Always remember that discrete and continuous are dependent on the Range.

Nebo Alex
  • Do you mind explaining what you mean by "always remember that discrete and continuous are dependent on the Range"? – K.M. May 21 '19 at 22:31
  • @K.M. "Always remember that discrete and continuous are dependent on the Range" means that if $f:S\rightarrow X$, where $X$ is finite or countably infinite, then $f$ is a discrete function. Similarly, if $f:S\rightarrow X$ where $X$ is uncountably infinite, then $f$ is not discrete. –  Oct 28 '19 at 19:32

If we have a probability distribution function $F_X(x)$, then its probability density function $f_X(x)$ is integrated over the limits $-\infty$ to $+\infty$, with $\int_{-\infty}^{+\infty} f_X(x)~dx = 1$; it is for continuous variables.

The probability mass function $f(x)$ is summed instead, with $\sum_x f(x) = 1$ (note that $\sum_x x f(x)$ is the mean, not the mass function); it is for discrete variables.
