14 Introduction to Discrete Probability Distributions

Dr. Harmanpreet Singh Kapoor

    Learning Objectives

  • Introduction
  • Discrete Uniform Distribution
  • Bernoulli Distribution
  • Binomial Distribution
  • Poisson Distribution
  • Geometric Distribution
  • Summary
  • Suggested Readings

    1. Learning Objectives

 

The main objective of this module is to introduce some standard discrete probability distributions that are widely used in statistical theory. We also discuss their mathematical expressions, properties and applications, with examples.

    2. Introduction

 

The probability theory discussed in the earlier modules enables one to fit a mathematical function to observed data, in the form of the probability distribution function (PDF) of the random variable of a random experiment. If the random variable follows a particular probability distribution, it is easy to find its summary measures such as mean, variance, skewness and kurtosis. Such random variables can be modelled by a specified probability law, chosen either from the conditions of the experiment or from its results. When the outcomes are equally likely, or when there are only two possible outcomes, the probabilities are easy to find; when the number of outcomes is large, or when the event of interest is rare, they are harder to find. Problems of this type can be solved by using standard probability distributions.

 

In this module, discrete distributions are discussed along with their properties and the mathematical expression of their probability mass functions. Only the univariate case is considered; univariate refers to a function or equation that is defined for a single variable. Distributions for two or more random variables are also discussed in the literature, but we restrict ourselves in this module to one variable. The two types of random variable, discrete and continuous, were discussed in the module “Introduction to Random Variable and its Properties”. Here our aim is to introduce the distributions of discrete random variables. Some of the most commonly used discrete distributions are the discrete uniform, Bernoulli, Binomial, Poisson, Negative Binomial, multinomial, geometric and hypergeometric distributions.

 

The properties discussed for each of these discrete distributions are the expectation (mean), variance, skewness, kurtosis and the pmf. In the module “Introduction to Random Variable and its Properties” we discussed the mathematical formulation and properties of expectation, variance, skewness and kurtosis, as well as the properties of a probability distribution function. In this module, we again discuss moments, skewness and kurtosis, but now we derive them for specific discrete distributions.

 

3.   Discrete Uniform Distribution

 

The discrete uniform distribution is used when the outcomes of a random experiment are equally likely. For example: tossing a coin, rolling a die, drawing a card from a deck, etc.

 

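A random variable X is said to have a discrete uniform distribution if its p.m.f. is given as:

$$P(X = z) = \frac{1}{N}, \quad z = 1, 2, \ldots, N,$$

where N is the number of possible outcomes and z is the outcome of the random experiment.

Mean:

$$E(X) = \sum_{z=1}^{N} z \cdot \frac{1}{N} = \frac{N + 1}{2}.$$

Variance:

$$V(X) = E(X^2) - [E(X)]^2 = \frac{(N + 1)(2N + 1)}{6} - \frac{(N + 1)^2}{4} = \frac{N^2 - 1}{12}.$$
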

Question

What is the probability of getting the ace of diamonds when one card is selected from a standard deck of cards?

 

Answer

Total number of possible outcomes is 52.

Total number of aces in the deck is 4.

Number of aces of diamonds is 1.

Then the probability of getting the ace of diamonds is:

$$P(\text{ace of diamonds}) = \frac{1}{52}.$$

Since each of the 52 cards is equally likely to be selected, this is the pmf of a discrete uniform distribution with N = 52.

 

Question 2

Throwing a six-faced die is an example of a discrete uniform distribution. What is the probability of getting any one of the integers? Find the mean and variance.

 

Answer

Suppose a die has six faces and X is the random variable that represents the integer shown on the upper face; z is the outcome of the event.

 

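The probability of getting a given integer is:

$$P(X = z) = \frac{1}{6}, \quad z = 1, 2, \ldots, 6.$$

This probability is equal to 1/6 for every integer outcome of a six-faced die. The events of getting each integer on the face of the die are equally likely, and the discrete uniform distribution is applicable to equally likely events. So the pmf of the random variable X is that of the discrete uniform distribution with N = 6.

With N = 6, the mean is $E(X) = \frac{N + 1}{2} = \frac{7}{2} = 3.5$ and the variance is $V(X) = \frac{N^2 - 1}{12} = \frac{35}{12} \approx 2.917$.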
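The mean and variance of the die can also be checked directly from the definition of expectation. The following is a minimal Python sketch, assuming the outcomes are labelled 1 to N; the function name `discrete_uniform_summary` is our own illustration and does not come from the module.

```python
from fractions import Fraction

def discrete_uniform_summary(n_outcomes):
    """Return (pmf value, mean, variance) for X uniform on 1..n_outcomes."""
    outcomes = range(1, n_outcomes + 1)
    p = Fraction(1, n_outcomes)                        # P(X = z) = 1/N for every z
    mean = sum(z * p for z in outcomes)                # E(X) = sum of z * P(X = z)
    second_moment = sum(z * z * p for z in outcomes)   # E(X^2)
    variance = second_moment - mean ** 2               # V(X) = E(X^2) - [E(X)]^2
    return p, mean, variance

p, mean, variance = discrete_uniform_summary(6)
print(p, mean, variance)  # 1/6 7/2 35/12
```

Exact fractions are used instead of floating-point numbers so that the results can be compared with the closed forms (N + 1)/2 and (N^2 - 1)/12 without rounding error.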

 

 

 

 

    4. Bernoulli Distribution:

 

The Bernoulli distribution is applicable to random experiments that have only two outcomes, such as success and failure, head and tail, odd and even, profit and loss, etc. The two outcomes are labelled success and failure. For example, tossing a coin is a random experiment in which getting a head is a success and getting a tail is a failure; rolling a die is a random experiment in which getting an even number is a success and getting an odd number is a failure.

 

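Suppose p denotes the probability of success and q = 1 − p the probability of failure. A random variable X is said to have a Bernoulli distribution if its p.m.f. is given as:

$$P(X = z) = p^{z} q^{1 - z}, \quad z = 0, 1.$$

Mean:

$$E(X) = 0 \cdot q + 1 \cdot p = p.$$

Variance:

$$V(X) = E(X^2) - [E(X)]^2 = p - p^2 = pq.$$
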

Question 3

Find the probability that the sum of the integers is even when two dice are rolled. Find the mean and variance of the random variable.

 

Answer

Total number of possible outcomes in throwing two dice is 36.

Number of outcomes in which the sum of the faces is even is 18.

Number of outcomes in which the sum of the faces is odd is 18.

The probability that the sum is even is given as:

$$p = \frac{18}{36} = \frac{1}{2}.$$

Here there are two outcomes: the sum is even, or the sum is odd. An even sum is taken as the success of the experiment and an odd sum as the failure.

The random variable X takes the value 1 if the sum is even and 0 otherwise. Since the random experiment has two possible outcomes, success and failure, with probabilities p and q respectively, the pmf of X is the Bernoulli pmf.
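The counts used in this answer can be checked by direct enumeration. The sketch below is only an illustration in Python (the variable names are our own); it lists all 36 outcomes, counts the even sums, and then applies the standard Bernoulli results that the mean is p and the variance is pq.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes (first die, second die) of rolling two dice.
rolls = list(product(range(1, 7), repeat=2))

even = sum(1 for a, b in rolls if (a + b) % 2 == 0)  # successes: even sum
odd = len(rolls) - even                              # failures: odd sum

p = Fraction(even, len(rolls))   # probability of success
q = 1 - p                        # probability of failure

bernoulli_mean = p               # E(X) = p
bernoulli_variance = p * q       # V(X) = pq

print(len(rolls), even, odd, p, bernoulli_mean, bernoulli_variance)
# 36 18 18 1/2 1/2 1/4
```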

 

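The PMF of the random variable X is given as:

$$P(X = z) = \left(\frac{1}{2}\right)^{z} \left(\frac{1}{2}\right)^{1 - z} = \frac{1}{2}, \quad z = 0, 1.$$

Mean: $E(X) = p = \frac{1}{2}$.

Variance: $V(X) = pq = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$.
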

5. Binomial Distribution:

 

If a random experiment is performed repeatedly, each repetition is called a trial. The trials are called Bernoulli trials if the number of trials is finite, the trials are independent, and the probability of success is the same in every trial. The probability of success p is the probability that the event occurs in a Bernoulli trial, and q = 1 − p, the probability of failure, is the probability that it does not occur. The distribution of the number of successes in repeated Bernoulli trials is called the Binomial distribution. The Binomial distribution was developed by James Bernoulli in the year 1700.
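To illustrate how repeated Bernoulli trials lead to the Binomial distribution, the following Python sketch sums the probabilities of every sequence of n independent trials that contains exactly z successes, and compares the result with the standard closed form C(n, z) p^z q^(n − z). The values n = 5 and p = 0.3 and the function names are assumptions chosen only for illustration.

```python
from itertools import product
from math import comb

def binomial_pmf(z, n, p):
    """Closed form: C(n, z) * p^z * q^(n - z) with q = 1 - p."""
    return comb(n, z) * p**z * (1 - p) ** (n - z)

def binomial_pmf_by_enumeration(z, n, p):
    """Add up the probability of each sequence of n trials with exactly z successes."""
    total = 0.0
    for trials in product((0, 1), repeat=n):   # 1 = success, 0 = failure
        if sum(trials) == z:
            prob = 1.0
            for outcome in trials:             # independence: multiply trial probabilities
                prob *= p if outcome == 1 else (1 - p)
            total += prob
    return total

n, p = 5, 0.3   # illustrative values, not taken from the module
for z in range(n + 1):
    print(z, round(binomial_pmf(z, n, p), 6), round(binomial_pmf_by_enumeration(z, n, p), 6))
```

The enumeration grows as 2^n, so it is only practical for small n; it is shown here to make the link between the trials and the formula visible, not as a way to compute Binomial probabilities in practice.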

 

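A random variable X is said to follow a Binomial distribution with probability of success p, probability of failure q = 1 − p and n independent Bernoulli trials if its probability mass function is given as:

$$P(X = z) = \binom{n}{z} p^{z} q^{n - z}, \quad z = 0, 1, \ldots, n,$$

where z is the outcome of the event, that is, the number of successes in the n trials.

Mean: $E(X) = np$.

Variance: $V(X) = npq$.
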

6. Poisson Distribution

 

Some events of a random experiment occur at random points of time or space, and our interest lies only in the number of occurrences of the event, not in its non-occurrences. The occurrences arise from an indefinitely large number of trials. Examples are the number of car accidents in a particular time period, the number of buses passing a crossing per minute during the festive season, the number of faulty products in a lot of 100, etc. Problems of this type can be solved by using the Poisson distribution. The Poisson distribution was developed by the French mathematician Simeon Denis Poisson in 1837. It is used to count the occurrences of a rare event in a random experiment.

 

The Poisson distribution is a limiting case of the Binomial distribution in which the number of trials is indefinitely large and the probability of success in each trial is very small.

 

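That is, $n \to \infty$ and $p \to 0$.

The product of n and p must remain a constant. Let $\lambda$ be that constant; then

$$np = \lambda;$$

The probability of success is given as:

$$p = \frac{\lambda}{n}.$$

Substituting this in the Binomial pmf and letting $n \to \infty$ gives the pmf of the Poisson distribution:

$$P(X = z) = \lim_{n \to \infty} \binom{n}{z} \left(\frac{\lambda}{n}\right)^{z} \left(1 - \frac{\lambda}{n}\right)^{n - z} = \frac{e^{-\lambda} \lambda^{z}}{z!}, \quad z = 0, 1, 2, \ldots$$

Mean:

$$E(X) = \lambda.$$

Variance:

$$V(X) = \lambda.$$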
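The limiting argument can be observed numerically. The sketch below, given only as an illustration, keeps the product np fixed at a constant (called `lam` in the code, an assumed value of 2) and compares the Binomial probability of z = 3 successes with the Poisson probability as n grows. The function names and the chosen values are our own.

```python
from math import comb, exp, factorial

def binomial_pmf(z, n, p):
    """Binomial probability of z successes in n trials."""
    return comb(n, z) * p**z * (1 - p) ** (n - z)

def poisson_pmf(z, lam):
    """Poisson probability of z occurrences with mean lam."""
    return exp(-lam) * lam**z / factorial(z)

lam, z = 2.0, 3                     # illustrative values, not from the module
target = poisson_pmf(z, lam)

errors = []
for n in (10, 100, 1000, 10000):
    p = lam / n                     # keep n * p equal to lam while n grows
    b = binomial_pmf(z, n, p)
    errors.append(abs(b - target))
    print(n, round(b, 6), round(target, 6))
```

The gap between the two columns shrinks roughly in proportion to 1/n, which is why the Poisson distribution is a convenient approximation when n is large and p is small.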

 

Remarks:

  1. If X is a random variable and $\lambda$ is a constant, the statement that X follows a Poisson distribution with parameter $\lambda$ is denoted as:
    $X \sim P(\lambda)$.
  2. If X and Y are two independent Poisson variables with parameters $\lambda$ and $\mu$ respectively, then
    $X \sim P(\lambda)$, $Y \sim P(\mu)$, $X + Y \sim P(\lambda + \mu)$. This property is known as the additive property of the Poisson distribution. It can be extended to n independent Poisson variables.
  3. The mean and variance of the Poisson distribution are equal; both are $\lambda$.
  4. If X and Y are two independent Poisson variables, the conditional distribution of X given X + Y is a Binomial distribution.
  5. The Poisson distribution is a limiting case of the Binomial distribution.
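Remarks 2 and 4 can be checked numerically for particular values. The following Python sketch is an illustration under assumed parameters 5 and 7 and an assumed total of 8. For remark 4 it uses the standard result, not spelled out in the remark itself, that the conditional Binomial distribution has n equal to the given total and success probability equal to the first parameter divided by the sum of the two parameters.

```python
from math import comb, exp, factorial

def poisson_pmf(z, lam):
    """Poisson probability of z occurrences with mean lam."""
    return exp(-lam) * lam**z / factorial(z)

lam, mu = 5.0, 7.0   # assumed parameters of X and Y
n = 8                # assumed value of the total X + Y

# Remark 2: P(X + Y = n) from the convolution of the two pmfs
# should match the Poisson pmf with parameter lam + mu.
convolution = sum(poisson_pmf(k, lam) * poisson_pmf(n - k, mu) for k in range(n + 1))
direct = poisson_pmf(n, lam + mu)

# Remark 4: P(X = k | X + Y = n) should match a Binomial(n, lam / (lam + mu)) pmf.
share = lam / (lam + mu)
conditional = [poisson_pmf(k, lam) * poisson_pmf(n - k, mu) / convolution for k in range(n + 1)]
binomial = [comb(n, k) * share**k * (1 - share) ** (n - k) for k in range(n + 1)]

print(round(convolution, 6), round(direct, 6))
print([round(c, 4) for c in conditional])
print([round(b, 4) for b in binomial])
```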

    Question 7

A bulb manufacturing company knows that 7% of its bulbs are defective. If it sells bulbs in boxes of 500 and guarantees that not more than 50 bulbs in a box will be defective, what is the probability that a box will fail to meet the guaranteed quality?

 

Answer

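Given that:

n = 500

p is the probability of success, that is, the probability that a bulb is defective.

p = 7% = 0.07

The mean number of defective bulbs is $\lambda = np = 35$.

Let X be the random variable for the number of defective bulbs in a box and z its outcome. By the probability law of the Poisson distribution, the probability of z defective bulbs in a box is:

$$P(X = z) = \frac{e^{-35} \, 35^{z}}{z!}, \quad z = 0, 1, 2, \ldots$$

A box meets the guarantee when $X \le 50$:

$$P(X \le 50) = \sum_{z=0}^{50} \frac{e^{-35} \, 35^{z}}{z!} = 0.993466.$$

A box fails to meet the guarantee when more than 50 bulbs are defective:

$$P(X > 50) = 1 - P(X \le 50) = 1 - 0.993466 = 0.006534.$$

Hence the probability that a box will fail to meet the guaranteed quality is 0.006534.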
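The sum over 51 terms is tedious by hand, so a short computation helps. The sketch below is an illustration in Python using only the standard library; the function names are our own. It computes both P(X ≤ 50), the probability that a box meets the guarantee, and its complement P(X > 50), the probability that a box fails to meet it.

```python
from math import exp, factorial

def poisson_pmf(z, lam):
    """Poisson probability of exactly z occurrences with mean lam."""
    return exp(-lam) * lam**z / factorial(z)

def poisson_cdf(k, lam):
    """P(X <= k) obtained by adding the pmf from 0 to k."""
    return sum(poisson_pmf(z, lam) for z in range(k + 1))

n, p = 500, 0.07
lam = n * p                       # mean number of defective bulbs per box

p_meets = poisson_cdf(50, lam)    # at most 50 defective bulbs
p_fails = 1 - p_meets             # more than 50 defective bulbs

print(round(p_meets, 4), round(p_fails, 4))
```

Since the guarantee is broken only when X exceeds 50, the failure probability is the small complement and not the cumulative sum itself.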

 

Question 8

A data analyst completes 4 data sets in a day on average. Find the probability that the analyst completes 6 data sets on the next day.

 

Answer

This question is an example of the Poisson distribution. The given values are:

$\lambda = 4$, the average number of data sets completed per day.

$z = 6$, the required number of data sets to be completed the next day.

Let X be the random variable and z its outcome. By the probability law of the Poisson distribution, the probability that the required number of data sets is completed the next day is:

$$P(X = 6) = \frac{e^{-4} \, 4^{6}}{6!} = 0.104.$$

Hence, the probability of completing 6 data sets the next day is 0.104.
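The arithmetic of this answer can be reproduced in a few lines. This is a minimal Python sketch with names of our own choosing, evaluating the Poisson pmf with mean 4 at the value 6.

```python
from math import exp, factorial

def poisson_pmf(z, lam):
    """Poisson probability of exactly z occurrences with mean lam."""
    return exp(-lam) * lam**z / factorial(z)

average_per_day = 4   # mean number of data sets completed per day
required = 6          # number of data sets to be completed the next day

probability = poisson_pmf(required, average_per_day)
print(round(probability, 3))  # 0.104
```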

 

Question 9

Suppose X and Y are independent Poisson random variables with parameters $\lambda = 5$ and $\mu = 7$ respectively. Find the probability that $X + Y \le 8$.

 

Answer

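Given that X and Y have Poisson distributions.

$X \sim P(\lambda)$; $Y \sim P(\mu)$.

Here $\lambda = 5$; $\mu = 7$.

Then $X \sim P(5)$; $Y \sim P(7)$.

By using the additive property of the Poisson distribution, we have

$X + Y \sim P(\lambda + \mu)$;

$X + Y \sim P(12)$.

The required probability is given as:

$$P(X + Y \le 8) = \sum_{z=0}^{8} \frac{e^{-12} \, 12^{z}}{z!} = 0.155028.$$

Hence the probability that the sum of X and Y is at most 8 is 0.155028.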
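The cumulative probability in this answer can be checked with a short computation. The sketch below is illustrative Python with names of our own; it adds the Poisson pmf with mean 5 + 7 = 12 over the values 0 to 8.

```python
from math import exp, factorial

def poisson_pmf(z, lam):
    """Poisson probability of exactly z occurrences with mean lam."""
    return exp(-lam) * lam**z / factorial(z)

def poisson_cdf(k, lam):
    """P(X <= k) obtained by adding the pmf from 0 to k."""
    return sum(poisson_pmf(z, lam) for z in range(k + 1))

combined_mean = 5 + 7                       # additive property: means add up
probability = poisson_cdf(8, combined_mean)
print(round(probability, 4))  # 0.155
```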

 

6. Geometric Distribution

 

As in the Binomial and Poisson distributions, we have independent trials with a constant probability of success p, and hence a constant probability of failure q = 1 − p. Suppose one wants to calculate the probability of a given number of failures preceding the first success; this can be calculated by using the geometric distribution.

 

The geometric distribution is used to find the probability of the number of failures preceding the first success.

 

A random variable X is said to have a geometric distribution with parameter p if its probability mass function (pmf) is given as:

$$P(X = z) = q^{z} p; \quad z = 0, 1, \ldots; \quad q = 1 - p$$

This function is known as the pmf of the geometric distribution.

Mean:

$$E(X) = \frac{q}{p}.$$

Variance:

$$V(X) = \frac{q}{p^{2}}.$$


Properties:

  1. The variance of the geometric distribution is always greater than its mean.
  2. If X and Y are two independent random variables having the same geometric distribution, then the conditional distribution of X given X + Y is uniform.
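Both properties can be checked numerically. The sketch below is an illustration in Python, assuming the standard results that the mean is q/p and the variance is q/p^2 for the pmf in which X counts failures before the first success. The parameter values and function names are our own.

```python
def geometric_pmf(z, p):
    """P(X = z) = q^z * p, where X counts failures before the first success."""
    return (1 - p) ** z * p

def geometric_mean(p):
    return (1 - p) / p          # q / p

def geometric_variance(p):
    return (1 - p) / p**2       # q / p^2

# Property 1: the variance exceeds the mean for any 0 < p < 1.
for p in (0.1, 0.4, 0.9):
    print(p, geometric_mean(p), geometric_variance(p))

# Property 2: for X and Y independent with the same p, given X + Y = n,
# X is equally likely to be any of 0, 1, ..., n.
p, n = 0.4, 6                   # illustrative values
joint = [geometric_pmf(k, p) * geometric_pmf(n - k, p) for k in range(n + 1)]
conditional = [j / sum(joint) for j in joint]
print([round(c, 4) for c in conditional])
```

Property 1 follows because the variance is the mean divided by p, and p is less than 1. Property 2 follows because each product q^k p · q^(n − k) p equals p^2 q^n, which does not depend on k.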

Question 10

A representative from the Indian Premier League marketing division randomly selects people on a street until he finds a person who attended the last home IPL game.

  1. Find the probability that the marketing representative must select 5 people before he finds one who attended the last home IPL game.
  2. Find the probability that the marketing representative must select more than 8 people before he finds one who attended the last home IPL game.
  3. How many people should we expect the marketing representative to select before he finds one who attended the last home IPL game, and what is the variance?

     Answer

Let X be a random variable having a geometric distribution with probability of success p = 0.40.

The probability mass function is given as:

$$P(X = z) = q^{z} p; \quad z = 0, 1, \ldots \text{ and } q = 1 - p.$$

1. Probability that the marketing representative must select 5 people before he finds one who attended the last home IPL game:

$$P(X = 5) = (0.60)^{5} \times (0.40);$$

$$P(X = 5) = 0.031104.$$

 

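2. Probability that the marketing representative must select more than 8 people before he finds one who attended the last home IPL game:

$$P(X > 8) = 1 - P(X \le 8);$$

$$P(X > 8) = 1 - \sum_{z=0}^{8} (0.60)^{z} (0.40) = (0.60)^{9} = 0.010078.$$

3. The expected number of people selected before he finds one who attended the last home IPL game, and the variance:

$$E(X) = \frac{q}{p} = \frac{0.60}{0.40} = 1.5;$$

$$V(X) = \frac{q}{p^{2}} = \frac{0.60}{(0.40)^{2}} = 3.75.$$

One observes that the variance is greater than the mean: the mean is 1.5 and the variance is 3.75 for the random variable X.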
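The three parts of this answer can be reproduced with a short computation. The following Python sketch is an illustration with names of our own, using p = 0.40 and the pmf in which X counts the failures before the first success.

```python
def geometric_pmf(z, p):
    """P(X = z) = q^z * p, where X counts failures before the first success."""
    return (1 - p) ** z * p

p = 0.40
q = 1 - p

# Part 1: exactly 5 failures before the first success.
p_exactly_5 = geometric_pmf(5, p)

# Part 2: more than 8 failures, via the complement of P(X <= 8).
p_at_most_8 = sum(geometric_pmf(z, p) for z in range(9))
p_more_than_8 = 1 - p_at_most_8

# Part 3: mean q/p and variance q/p^2.
mean = q / p
variance = q / p**2

print(round(p_exactly_5, 6), round(p_more_than_8, 6), round(mean, 2), round(variance, 2))
# 0.031104 0.010078 1.5 3.75
```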

 

7. Summary

 

In this module, we discussed some popular discrete distributions: the discrete uniform, Bernoulli, Binomial, Poisson and geometric distributions. These are among the most widely used distributions in the literature as well as in applications. We also discussed their mathematical expressions and properties such as mean and variance, and how to choose a particular distribution for a problem.

 

8. Suggested Readings

 

Agresti, A. and B. Finlay, Statistical Methods for the Social Sciences, 3rd Edition, Prentice Hall, 1997.

 

Daniel, W. W. and C. L. Cross, Biostatistics: A Foundation for Analysis in the Health Sciences, 10th Edition, John Wiley & Sons, 2013.

 

Hogg, R. V., J. McKean and A. Craig, Introduction to Mathematical Statistics, Macmillan Pub. Co. Inc., 1978.

 

Meyer, P. L., Introductory Probability and Statistical Applications, Oxford & IBH Pub, 1975.

 

Stephens, L. J., Schaum’s Series Outline: Beginning Statistics, 2nd Edition, McGraw Hill, 2006.

 

Triola, M. F., Elementary Statistics, 13th Edition, Pearson, 2017.

 

Weiss, N. A., Introductory Statistics, 10th Edition, Pearson, 2017.


 

One can refer to the following links for further understanding of the statistics terms.

 

http://biostat.mc.vanderbilt.edu/wiki/pub/Main/ClinStat/glossary.pdf

 

http://www.stats.gla.ac.uk/steps/glossary/alphabet.html

 

http://www.reading.ac.uk/ssc/resources/Docs/Statistical_Glossary.pdf

 

https://stats.oecd.org/glossary/

 

http://www.statsoft.com/Textbook/Statistics-Glossary

 

https://www.stat.berkeley.edu/~stark/SticiGui/Text/gloss.htm

 

https://stats.oecd.org/glossary/alpha.asp?Let=A