Probability Distribution

The probability distribution describes the range of possible values that a random variable can attain and the probability that the value of the random variable is within any (measurable) subset of that range. A probability distribution identifies either the probability of each value of a random variable (when the variable is discrete), or the probability of the value falling within a particular interval (when the variable is continuous).
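The two cases can be sketched in a few lines of Python. This is only an illustration under assumed examples that are not in the original text: a fair six-sided die stands in for the discrete case and a standard normal variable for the continuous case. The helper names (die_pmf, prob_of_subset, standard_normal_cdf, prob_of_interval) are made up for this sketch.

```python
import math
from fractions import Fraction

# Discrete case (assumed example): a fair six-sided die gives each face
# its own probability.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}


def prob_of_subset(pmf, values):
    """Probability that a discrete variable lands in the given set of values."""
    return sum(pmf[v] for v in values if v in pmf)


def standard_normal_cdf(x):
    """P(X <= x) for a normal variable with mean 0 and standard deviation 1."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_of_interval(a, b):
    """Continuous case: probability that the variable falls between a and b."""
    return standard_normal_cdf(b) - standard_normal_cdf(a)


# Probability of rolling an even face.
print(prob_of_subset(die_pmf, {2, 4, 6}))
# Probability of lying within one standard deviation of the mean.
print(prob_of_interval(-1.0, 1.0))
```

For a continuous variable the probability of any single exact value is zero, which is why the sketch asks about an interval instead.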

[Figure: normal distribution]

Probability distributions are widely misunderstood. For example, many people seem to believe that most measurements follow a "Normal Distribution". This distribution is symmetrical, so its mean and median coincide, and the probability drops off in a well-defined way on either side of this central value. Its shape resembles a bell, which is why it is also known as the "Bell Curve".

This simplistic view leads to all sorts of specious reasoning. While the Normal distribution is a reasonable picture for some phenomena, it is clearly not appropriate in other situations.

The Normal distribution is typically seen where many independent random factors add up to produce a result, for example errors in measurement or the heights of people in a random population.
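A quick simulation can illustrate this. The sketch below makes assumptions of its own: each "factor" is a uniform random number between 0 and 1, thirty of them are added per result, and the function name simulate_sums is hypothetical. If the sums behave like a Normal distribution, the mean and median should nearly coincide and roughly 68% of the results should lie within one standard deviation of the mean.

```python
import random
import statistics


def simulate_sums(n_factors, n_samples, seed=0):
    """Return n_samples results, each the sum of n_factors independent factors."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(rng.random() for _ in range(n_factors)) for _ in range(n_samples)]


sums = simulate_sums(n_factors=30, n_samples=20000)
mean = statistics.mean(sums)
median = statistics.median(sums)
stdev = statistics.stdev(sums)

# Fraction of results within one standard deviation of the mean.
within_one_sd = sum(abs(s - mean) <= stdev for s in sums) / len(sums)

print(f"mean={mean:.3f} median={median:.3f} stdev={stdev:.3f}")
print(f"fraction within one standard deviation: {within_one_sd:.3f}")
```

Reducing n_factors to 1 or 2 makes the bell shape disappear, which suggests that it is the accumulation of many factors, not the individual factors themselves, that produces it.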

[Figure: geometric distribution]

But what if events combine in other ways? For example, what if we plot the number of coin tosses needed until we get a tail? In that case the distribution cannot possibly be symmetric: the count can never be less than one, but it has no upper limit.

Here the distribution has its highest probability at the smallest result and drops off steadily as the number gets larger. This Geometric distribution is a reasonable description when we count how many trials pass before an event first occurs; the rarer the event, the more slowly the probabilities fall away.
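The coin-toss experiment is easy to simulate. The sketch below assumes a fair coin, so the probability of needing exactly k tosses should be (1/2)^k; the function name tosses_until_tail and the trial count are choices made for this illustration only.

```python
import random
from collections import Counter


def tosses_until_tail(rng, p_tail=0.5):
    """Toss a coin until the first tail and return how many tosses it took."""
    count = 1
    while rng.random() >= p_tail:  # this toss was a head, so toss again
        count += 1
    return count


rng = random.Random(1)  # fixed seed so the run is repeatable
n_trials = 50000
counts = Counter(tosses_until_tail(rng) for _ in range(n_trials))

# Compare the observed frequencies with the geometric prediction (1/2)^k.
for k in range(1, 7):
    observed = counts[k] / n_trials
    expected = 0.5 ** k
    print(f"k={k}: observed={observed:.4f} expected={expected:.4f}")
```

Lowering p_tail models a rarer event: the peak stays at the first toss, but the tail of the distribution stretches out further.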

[Figure: Poisson distribution]

But the most commonly encountered situation is a compromise: enough factors to give us a range of values, but not enough to make the result symmetric. This gives us the "Poisson Distribution", which describes the number of independent events counted in a fixed interval when they occur at a constant average rate. Its shape depends on that average: when the average is small it resembles the Geometric, with its peak at or near zero, and as the average grows it approaches the Normal.

This skewed shape is encountered more widely, for example whenever the results have a minimum value (such as zero) but no maximum. The salaries of random people in a given population, for instance, will look more like this than like a bell curve.
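How the Poisson shape depends on its average can be checked numerically. The sketch below evaluates the standard Poisson formula P(k) = exp(-m) m^k / k! for a few averages m; the particular values (0.5, 3.5 and 50.5) and the helper names (poisson_pmf, mode, skewness) are assumptions chosen for illustration. A skewness near zero indicates a nearly symmetric, Normal-like shape.

```python
import math


def poisson_pmf(k, mean):
    """Poisson probability of exactly k events, computed via logs to avoid overflow."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def mode(mean, k_max=400):
    """Most probable count, found by scanning k = 0..k_max."""
    return max(range(k_max + 1), key=lambda k: poisson_pmf(k, mean))


def skewness(mean, k_max=400):
    """Skewness computed directly from the probabilities for k = 0..k_max."""
    ks = range(k_max + 1)
    probs = [poisson_pmf(k, mean) for k in ks]
    mu = sum(k * p for k, p in zip(ks, probs))
    var = sum((k - mu) ** 2 * p for k, p in zip(ks, probs))
    third = sum((k - mu) ** 3 * p for k, p in zip(ks, probs))
    return third / var ** 1.5


for m in (0.5, 3.5, 50.5):
    print(f"average={m}: peak at k={mode(m)}, skewness={skewness(m):.3f}")
```

With a small average the peak sits at zero and the skewness is large, much like the Geometric case; with a large average the peak moves away from zero and the skewness shrinks towards the symmetric Normal shape.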

