The Poisson distribution as an approximation of the binomial distribution

Introduction

When I was reading up on radioactive decay,

I found out that the probability of decay could be expressed in terms of a

discrete probability distribution. I had to look it up further because I had

just recently learned about probability distributions in math. What I found out

was intriguing; the number of atoms decaying in a specific interval of time is

actually a discrete variable. It also adheres to Poisson’s postulates, discussed

later, so it can be expressed in terms of the Poisson distribution.

However, what I noticed was that

some research papers and online resources would use Poisson distribution, and

some would use the binomial distribution. This intrigued me because if one could be used in place of the other, then the two must be closely related. This went against my previous understanding of the two, because I thought that since they have different formulas, they should be two very different concepts.

After consultation with my physics teacher,

I was told that probability distributions would come up in higher education

engineering courses. Being an aspiring engineer, I had to explore the differences

between binomial and Poisson distribution and understand why they can be used

side by side.

Poisson distribution

The Poisson distribution was first introduced in the 19th century by the mathematician Siméon Denis Poisson as a way to study the number of wrong court decisions over a period of time. In a more general sense, this distribution gives the probability of observing a given number of successes from independent Bernoulli trials in a fixed interval of time. A Bernoulli trial is a statistical experiment where there are only two possible outcomes, either success or failure. Graphically, it provides a discrete probability distribution function where the height of each point on the y-axis gives the numerical probability of a value of the discrete random variable $X$. It is typically denoted as:

$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}, \quad k = 0, 1, 2, \ldots$$

where $\lambda$ is the mean number of successes in the interval.

Discrete probability

The Poisson distribution works with discrete variables, so it is worth discussing what they are first. Discrete variables:

· Have a finite or countably infinite set of possible values

· Are obtained by counting (and are countable)

· Take values that are mutually exclusive

· Have gaps between their possible values rather than a continuous range of numbers

· Are represented by distinct, isolated points in a graph

One of the simplest statistical examples of a discrete variable is the number of heads obtained when a coin is tossed $n$ times. Suppose we have a fair coin; the probability of getting $k$ heads when the coin is tossed 2 times is:

| Number of heads | Explanation | Probability (fraction) | Probability (decimal) |
|---|---|---|---|
| 0 | TT | 1/4 | 0.25 |
| 1 | HT, TH | 1/2 | 0.50 |
| 2 | HH | 1/4 | 0.25 |

Because of the nature of the discrete

variable, its probability distribution can often just be expressed in a tabular

form.
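The table above can be reproduced by listing every equally likely sequence of tosses and counting the heads in each. The following is a minimal Python sketch of that idea; the function name `heads_distribution` is made up for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_tosses):
    """Return {number of heads: probability} for n_tosses tosses of a fair coin."""
    # Every sequence such as ('H', 'T') is equally likely for a fair coin.
    outcomes = list(product("HT", repeat=n_tosses))
    # Count how many sequences contain 0 heads, 1 head, 2 heads, ...
    counts = Counter(seq.count("H") for seq in outcomes)
    return {k: Fraction(c, len(outcomes)) for k, c in sorted(counts.items())}


for heads, prob in heads_distribution(2).items():
    print(heads, prob, float(prob))
```

Calling the function with a larger number of tosses gives the corresponding table for that case.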

Poisson’s postulates

There are several assumptions to be made

when using the Poisson distribution. When these assumptions are met, then the

Poisson discrete probability distribution can be used.

1. The probability of success is the same throughout the whole experiment

2. Successes occur independently of one another

3. The probability of a success happening over a small time period is essentially proportional to the length of that time period

4. The probability of more than one success in a small time period is essentially 0

5. The rate of success is only dependent on the length of the interval of time

6. Each trial in the experiment is a Bernoulli trial

One thing worth adding when comparing the probabilities of a random variable in real life and in calculations is that a large discrepancy between the two sets of data suggests that an external factor came into play when the statistical experiment was done.

Poisson distribution as a derivation from binomial distribution

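The binomial distribution has a more specific definition, which is the probability of having $k$ successful outcomes out of $n$ Bernoulli trials, each with probability of success $p$. For the probability that a discrete random variable $X$ takes the value $k$, it is denoted as:

$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0, 1, \ldots, n$$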

What I found out after my exploration amazed me; the Poisson distribution is actually just a derivation from the binomial distribution for when $n \to \infty$ and $p \to 0$. It is a mathematical limit of the binomial distribution. The Poisson distribution thus has to be derived from:

$$\lim_{n \to \infty} \binom{n}{k} p^k (1-p)^{n-k}$$

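To come up with a correct derivation, the calculations must adhere to Poisson’s postulates. Because the probability of success is identical throughout the whole experiment, this implies that in $n$ number of trials, the expected value is $np$, which is also the definition of the mean ($\lambda = np$).

Rearranging it:

$$p = \frac{\lambda}{n}$$

Substituting into the equation:

$$\lim_{n \to \infty} \frac{n!}{k!\,(n-k)!} \left(\frac{\lambda}{n}\right)^k \left(1 - \frac{\lambda}{n}\right)^{n-k}$$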

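Now the first two terms can be taken out and manipulated. It can be rewritten as:

$$\frac{n!}{k!\,(n-k)!} \left(\frac{\lambda}{n}\right)^k = \frac{\lambda^k}{k!} \cdot \frac{n(n-1)(n-2)\cdots(n-k+1)\,(n-k)!}{(n-k)!\; n^k}$$

Because both the numerator and denominator now have $(n-k)!$, they can be cancelled out, leaving the following:

$$\frac{\lambda^k}{k!} \cdot \frac{n(n-1)(n-2)\cdots(n-k+1)}{n^k}$$

From here, it can be noted that both the numerator and denominator of the second fraction have $k$ number of terms. As $n$ approaches infinity, each factor $(n-i)/n$ approaches 1, so the value of this whole fraction approaches 1. Therefore, apart from the constant $\lambda^k/k!$, it can be said that the value of the first two terms is 1.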

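The last term can be divided into two parts:

$$\left(1 - \frac{\lambda}{n}\right)^{n-k} = \left(1 - \frac{\lambda}{n}\right)^{n} \left(1 - \frac{\lambda}{n}\right)^{-k}$$

For the first part, an expression for Euler’s number can be used:

$$\lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{n} = e^{-\lambda}$$

For the second part, since $n$ is in the denominator and it is approaching infinity, the fraction $\lambda/n$ approaches 0, so the value of the second part approaches $1^{-k} = 1$.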
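The limiting steps in this derivation can also be checked numerically. The sketch below assumes the usual notation of $n$ trials, $k$ successes, and mean $\lambda$ (written `lam` in the code); the function names and the sample values `lam = 2.0` and `k = 3` are illustrative choices, not values taken from the derivation.

```python
import math


def falling_ratio(n, k):
    """n(n-1)...(n-k+1) / n^k, the fraction left after cancelling (n-k)!."""
    ratio = 1.0
    for i in range(k):
        ratio *= (n - i) / n  # each factor tends to 1 as n grows
    return ratio


def euler_part(n, lam):
    """(1 - lam/n)^n, which should tend to e^(-lam)."""
    return (1 - lam / n) ** n


def remainder_part(n, lam, k):
    """(1 - lam/n)^(-k), which should tend to 1."""
    return (1 - lam / n) ** (-k)


lam, k = 2.0, 3  # illustrative values only
print("target e^-lam =", math.exp(-lam))
for n in (10, 100, 10_000, 1_000_000):
    print(n, falling_ratio(n, k), euler_part(n, lam), remainder_part(n, lam, k))
```

As `n` grows, the second and fourth columns should drift towards 1 and the third towards the printed target.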

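Putting them all together, including the constants taken out earlier:

$$\lim_{n \to \infty} \binom{n}{k} p^k (1-p)^{n-k} = \frac{\lambda^k}{k!} \cdot 1 \cdot e^{-\lambda} \cdot 1$$

This simplifies to:

$$\frac{\lambda^k e^{-\lambda}}{k!}$$

Hence the Poisson distribution is denoted as:

$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$$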

In summary, the Poisson distribution is a limiting case of the binomial distribution where the number of trials approaches infinity and the probability of success approaches 0, while the mean stays fixed.
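To see this limit at work, the binomial probability can be computed for growing numbers of trials while the mean is held fixed. This is a rough sketch rather than a proof; the helper names `binomial_pmf` and `poisson_pmf` and the sample values `lam = 3.0` and `k = 2` are assumptions made for the illustration.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli trials."""
    # math.comb needs Python 3.8 or newer.
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """Probability of exactly k successes when the mean is lam."""
    return lam**k * math.exp(-lam) / math.factorial(k)


lam, k = 3.0, 2  # illustrative values only
print("Poisson:", poisson_pmf(k, lam))
for n in (10, 100, 1_000, 10_000):
    p = lam / n  # keep the mean n*p fixed while n grows and p shrinks
    print(n, binomial_pmf(k, n, p))
```

The binomial values should move steadily towards the Poisson value as the number of trials increases.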

Euler’s number in probability

What I find very interesting is the fact that Euler’s number suddenly popped out when deriving the Poisson distribution. It is clearly one of the most fascinating, important, and fundamental constants in mathematics. Unsurprisingly, it has its applications in probability theory and it relates directly to the Poisson distribution. For a large $n$, the probability of getting no successful outcomes in $n$ trials, each with probability of success $1/n$, is approximately $1/e$:

$$\lim_{n \to \infty} \left(1 - \frac{1}{n}\right)^n = \frac{1}{e} \approx 0.3679$$

This expression can actually be checked by substituting the parameters into both the Poisson and binomial distributions. Let us take a look at how that could work with an example:

A student is using a random number generator, which produces whole numbers from 1 to 100;

the teacher says that if after 100 tries the number 65 does not appear, the

student can go home. What is the probability that the student can go home?

We have to first find out the probability of success, which in this case is the number 65 appearing. The probability of it happening is 1 in 100.

So now,

$$p = \frac{1}{100}, \quad n = 100, \quad \lambda = np = 1$$

What we want to find is the probability when there are no successful outcomes, so $k = 0$.

$$P(X = 0) = \frac{1^0 e^{-1}}{0!} = \frac{1}{e} \approx 0.3679$$

This is the same value as the limit shown above. Even in other cases wherein $p = 1/n$, the probability should still be $1/e$. This is because as $n \to \infty$ and $p \to 0$, the value for $\lambda = np$ will be 1.

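We can also try this with the binomial distribution, where there are no limiting assumptions made.

$$P(X = 0) = \binom{100}{0} \left(\frac{1}{100}\right)^0 \left(1 - \frac{1}{100}\right)^{100}$$

Which is essentially:

$$P(X = 0) = 0.99^{100}$$

However, the actual probability is approximately 0.3660, which is a very close approximation to the reciprocal of Euler’s number, $1/e \approx 0.3679$. For even larger values of $n$, the probability should get closer and closer to $1/e$.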
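The two figures in this example are easy to check with a few lines of Python. This is only a sketch; the function name `no_success_probability` is invented here, and it assumes, as in the example, that the probability of success on each try is one over the number of tries.

```python
import math


def no_success_probability(n):
    """Binomial probability of zero successes in n tries when p = 1/n."""
    return (1 - 1 / n) ** n


print("1/e           :", math.exp(-1))
for n in (100, 1_000, 1_000_000):
    print(f"n = {n:<9}:", no_success_probability(n))
```

The first row of the loop corresponds to the student's 100 tries; the later rows show the value for larger numbers of tries.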

The result of this example justifies the use of the Poisson distribution as a substitute for the binomial distribution when $n$ reaches a very large number. It also shows just how important Euler’s number is in the limit theorem, and the credibility of its usage in the Poisson distribution.
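The random number generator example can also be tried out directly by simulation. The sketch below is an illustration under stated assumptions: the function name, the number of repetitions (10,000), and the fixed seed are all choices made here, and the estimate it prints will vary slightly if the seed is changed.

```python
import random


def simulate_go_home(repetitions=10_000, tries=100, target=65, seed=0):
    """Estimate the chance that `target` never appears in `tries` draws from 1..100."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    go_home = 0
    for _ in range(repetitions):
        # The student goes home only if none of the draws equals the target.
        if all(rng.randint(1, 100) != target for _ in range(tries)):
            go_home += 1
    return go_home / repetitions


estimate = simulate_go_home()
print("simulated estimate:", estimate)
print("binomial value    :", 0.99**100)
```

With enough repetitions the simulated estimate should settle near the binomial value, and therefore also near the Poisson value.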

Approximation of the Poisson distribution to the Binomial distribution

I also graphed these distributions so that they can be visualized. However, because I did it on my laptop’s default graphing software, there were some things that I had to change. I was not able to change the y and x axes’ variables. Due to this, I had to change some variables so that the equations would be able to be plotted. Because the probability is the dependent variable, $P$ was changed to $y$. The independent variable in the equations is the number of successful Bernoulli trials, so $k$ was changed to $x$.

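For the example above:

Binomial: red

Poisson: blue

Poisson:

$$y = \frac{1^x e^{-1}}{x!}$$

For the binomial distribution, I had to expand the first part of the equation because the software did not recognize it.

Binomial:

$$y = \frac{100!}{x!\,(100-x)!} \left(0.01\right)^x \left(0.99\right)^{100-x}$$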

As seen from the graph above, the graphs are almost identical. However, on closer inspection:

The y-intercepts are the values that were obtained earlier

It can now be clearly seen that there is

still some discrepancy between the two.
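The size of this discrepancy can be put into numbers by tabulating both distributions for the example (100 tries, probability of success 0.01). This is a minimal sketch; the helper names and the choice to show only the first six values are assumptions made for the illustration.

```python
import math


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n Bernoulli trials."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """Probability of exactly x successes when the mean is lam."""
    return lam**x * math.exp(-lam) / math.factorial(x)


n, p = 100, 0.01
lam = n * p  # mean of the matching Poisson distribution

rows = [(x, binomial_pmf(x, n, p), poisson_pmf(x, lam)) for x in range(6)]
for x, b, q in rows:
    print(f"x = {x}: binomial = {b:.5f}, Poisson = {q:.5f}, gap = {abs(b - q):.5f}")

largest_gap = max(abs(b - q) for _, b, q in rows)
print("largest gap:", largest_gap)
```

The gaps are small but not zero, which matches what the zoomed-in graph shows.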

Mean, variance, and mode