15 Introduction to Continuous Probability Distributions
Dr. Harmanpreet Singh Kapoor
Learning Objectives
- Introduction
- Normal Distribution
- Log-Normal Distribution
- Rectangular Distribution
- Beta Distribution of Type-I
- Beta Distribution of Type-II
- Exponential Distribution
- Gamma Distribution
- Cauchy Distribution
- Summary
- Suggested Readings
1. Learning Objectives
The main objective of this module is to introduce some widely used continuous distributions. We discuss the different continuous distributions together with the properties that help to differentiate between them. The mathematical and graphical representations of the different types of continuous distributions are presented, and some questions and answers are included for better understanding.
2. Introduction
We know that there are two types of random variables in probability: discrete random variables and continuous random variables. We also know that a probability distribution is a mathematical expression that attaches a specified probability to the values of a random variable according to a specific law. If the random variable is discrete, its probability distribution is described by a probability mass function (P.M.F.). The distribution of a discrete random variable follows specific laws and properties and is generally termed a discrete probability distribution. Some discrete distributions are characterized by features such as every outcome being equally likely (discrete uniform distribution) or there being only two outcomes (Bernoulli distribution). A detailed discussion of the discrete distributions is given in the module “Introduction to Discrete Probability Distributions”.
On the other hand, if the random variable is continuous, its probability distribution is described by a probability density function (P.D.F.). The distribution of a continuous random variable follows specific laws and properties and is generally termed a continuous probability distribution. In this module we discuss some important continuous distributions, among them the uniform, normal, gamma, beta, exponential and Cauchy distributions.
Note: The probability that a continuous random variable takes any specific value is zero, so probabilities for continuous random variables are always calculated over intervals.
In this module, we not only discuss the continuous distributions but also differentiate them by their features and properties through graphical and mathematical representation, so that one can understand how to apply a particular distribution to a particular case.
In the next section, we discuss the most widely used continuous distribution, the normal distribution, which has the most practical applications in various fields.
We then discuss the other continuous distributions one by one so that one can easily differentiate them through their mathematical properties.
3. Normal Distribution
The normal distribution is considered the leading model among all continuous probability distributions and an important part of probability theory and statistical analysis because of its features and characteristics.
The normal distribution was developed through the work of De Moivre, Laplace and Gauss, which is why it is also known as the Gaussian (or Laplace-Gauss) distribution. A random variable that follows the normal distribution is said to be normally distributed. By the central limit theorem, the distribution of the mean of a large sample is approximately normal for many underlying distributions; a sample size greater than 30 is a common rule of thumb. For this reason the normal distribution is widely used in the social sciences, for example to extract information from the responses of a large number of respondents to a questionnaire.
The normal distribution is also used in linear and non-linear regression modeling, where the error terms are assumed to be normally distributed. It plays a vital role in inferential statistics: it is the most widely used distribution in hypothesis testing, in estimation and in finding confidence intervals for parameters, and it has wide application in parametric tests. Investment analysis, market analysis, risk analysis, biostatistics and environmental sciences are other areas where the normal distribution is widely used.
Mathematical Expression:
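Suppose $Y$ is a random variable that has a normal distribution with parameters $\mu$ and $\sigma^2$ (the mean and variance respectively).
Then the PDF of $Y$ is given as:
$$f(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(y-\mu)^2}{2\sigma^2}\right), \quad -\infty < y < \infty,\ -\infty < \mu < \infty,\ \sigma > 0.$$
The mathematical notation for a random variable $Y$ that follows the normal distribution is $Y \sim N(\mu, \sigma^2)$.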
Standard Normal Distribution:
The standard normal distribution is a special case of the normal distribution with mean 0 and variance 1. The probability that a normal random variable $Y$ with mean $\mu$ and variance $\sigma^2$ lies in an interval $(a, b)$, i.e. $P(a < Y < b)$, cannot be calculated in an exact manner because the integral of the PDF has no closed-form expression.
To find this probability, we adopt the following method:
(i) first transform the normal variable to a standard normal variable by subtracting the mean from the normal random variable and dividing by the standard deviation;
(ii) then read the probability for this standard normal random variable from the standard normal table given in the literature.
Mathematical expression:
$$Z = \frac{Y - \mu}{\sigma} \sim N(0, 1), \qquad P(a < Y < b) = P\left(\frac{a-\mu}{\sigma} < Z < \frac{b-\mu}{\sigma}\right).$$
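The two steps above can be sketched in Python. This is only an illustration: the helper names `phi` and `normal_interval_prob` and the numerical values are assumptions made for the example, and the error function from the standard library stands in for the printed table.

```python
import math

def phi(z: float) -> float:
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_interval_prob(a: float, b: float, mu: float, sigma: float) -> float:
    """P(a < Y < b) for Y ~ N(mu, sigma^2), obtained by standardizing."""
    z_a = (a - mu) / sigma  # step (i): subtract the mean, divide by the s.d.
    z_b = (b - mu) / sigma
    return phi(z_b) - phi(z_a)  # step (ii): "look up" both standardized limits

# Illustrative values only: Y ~ N(50, 10^2) and the interval (40, 60),
# which standardizes to -1 < Z < 1.
print(round(normal_interval_prob(40, 60, mu=50, sigma=10), 4))  # 0.6827
```

The printed value is the familiar probability of falling within one standard deviation of the mean.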
Properties:
i. The normal distribution is a limiting case of the binomial distribution when the number of trials is very large ($n \to \infty$) and neither the probability of success $p$ nor the probability of failure $q = 1 - p$ is very small.
ii. The normal distribution is a limiting case of the Poisson distribution when the average number of outcomes is very large (i.e. $\lambda \to \infty$).
iii. The normal curve is bell-shaped and symmetric about $\mu$. Also, Mean = Median = Mode.
Figure 1
iv. Changing the mean does not change the shape of the normal curve; it simply shifts the position of the curve.
v. Changing the variance changes the peak of the curve: a smaller variance makes the curve taller and narrower, and a larger variance makes it flatter and wider.
vi. All odd-order central moments are zero. The even-order central moments are given by the relation:
$$\mu_{2n} = 1 \cdot 3 \cdot 5 \cdots (2n-1)\,\sigma^{2n}, \quad n = 1, 2, \ldots$$
vii. The sum and the difference of two independent standard normal variables are also normal, which follows from the uniqueness property of the moment generating function. If $U$ and $V$ are two independent standard normal variables and $S = U + V$, $D = U - V$, then $S$ and $D$ both follow the normal distribution with mean zero and variance 2, that is, $S, D \sim N(0, 2)$.
More generally, suppose $Y_1, Y_2, \ldots, Y_n$ are independent normal random variables with $Y_i \sim N(\mu_i, \sigma_i^2)$. Then any linear combination of them is also normal, i.e.
$$\sum_{i=1}^{n} a_i Y_i \sim N\left(\sum_{i=1}^{n} a_i \mu_i,\ \sum_{i=1}^{n} a_i^2 \sigma_i^2\right).$$
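Properties vi and vii lend themselves to a quick numerical check. The sketch below is a minimal illustration, assuming independent normal variables; the function names `even_central_moment` and `linear_combination` are hypothetical and chosen only for this example.

```python
import math

def even_central_moment(n: int, sigma: float) -> float:
    """Central moment of order 2n: 1*3*5*...*(2n-1) * sigma^(2n)."""
    odd_product = math.prod(range(1, 2 * n, 2))  # 1 * 3 * ... * (2n - 1)
    return odd_product * sigma ** (2 * n)

def linear_combination(coeffs, means, variances):
    """Mean and variance of sum(a_i * Y_i) for independent normal Y_i."""
    mean = sum(a * m for a, m in zip(coeffs, means))
    var = sum(a * a * v for a, v in zip(coeffs, variances))
    return mean, var

print(even_central_moment(1, 2.0))  # second central moment, sigma^2 = 4.0
print(even_central_moment(2, 2.0))  # fourth central moment, 3 * sigma^4 = 48.0
print(linear_combination([1, 1], [0, 0], [1, 1]))   # S = U + V  -> (0, 2)
print(linear_combination([1, -1], [0, 0], [1, 1]))  # D = U - V  -> (0, 2)
```

Note that the variance of the difference is also 2, because the coefficient enters the variance as a square.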
Question 1:
Suppose 100 environmental studies students study at the Central University, and the probability that any one student requires a statistics reference book from the University library on a working day is 0.07. How many copies of the book should be kept in the University library so that the probability is greater than 0.60 that no student who requires a copy has to come back disappointed?
Answer
Given that: n = 100, p = 0.07.
This is a case of the binomial distribution, but the sample size is large, so the number of students $Y$ who require the book is treated as approximately normal with mean $np$ and variance $npq$, where $q = 1 - p$. We have already discussed that the normal distribution is a limiting case of the binomial distribution.
$\mu = np = 100 \times 0.07 = 7$
$\sigma^2 = npq = 100 \times 0.07 \times 0.93 = 6.51, \quad \sigma = 2.551$
$P(Y \le 40) = 1 - P(Y \ge 40) = 1 - 0.0001 = 0.9999.$
Similarly for
$P(0 \le Y \le 30) = ?$
$\Rightarrow P(-3 \le Z \le 2) = P(-3 \le Z \le 0) + P(0 \le Z \le 2)$
$\Rightarrow P(0 \le Z \le 3) + P(0 \le Z \le 2) = 0.49865 + 0.4772 = 0.97585.$
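Question 1 asks for the smallest number of copies $k$ such that the probability of demand not exceeding $k$ is greater than 0.60. A minimal sketch of that search follows, assuming the normal approximation with the mean and variance computed in the answer and, for comparison, the exact binomial distribution. The helper names are hypothetical, and no continuity correction is applied.

```python
import math

def phi(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact P(Y <= k) for Y ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def smallest_stock(cdf, target: float, upper: int) -> int:
    """Smallest k in 0..upper with cdf(k) > target."""
    for k in range(upper + 1):
        if cdf(k) > target:
            return k
    raise ValueError("target probability not reached")

n, p = 100, 0.07
mu = n * p                          # 7
sigma = math.sqrt(n * p * (1 - p))  # about 2.551

# Normal approximation: P(Y <= k) is taken as Phi((k - mu) / sigma).
k_normal = smallest_stock(lambda k: phi((k - mu) / sigma), 0.60, n)
# Exact binomial probability, for comparison.
k_exact = smallest_stock(lambda k: binomial_cdf(k, n, p), 0.60, n)
print(k_normal, k_exact)
```

Under these assumptions both searches return 8, so keeping 8 copies would meet the stated requirement.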
4. Log-Normal Distribution
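A random variable $Y$ is said to have a log-normal distribution if $\log Y$ is normally distributed.
Let $X = \log Y \sim N(\mu, \sigma^2)$ and $Y > 0$.
The PDF of the log-normal distribution of the random variable $Y$ is given as:
$$f(y) = \frac{1}{y\,\sigma\sqrt{2\pi}} \exp\left(-\frac{(\log y - \mu)^2}{2\sigma^2}\right), \quad y > 0.$$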
The volume of gas in a petroleum reserve, amounts of rainfall, the size distribution of raindrops and milk production by cows can all be modeled with a log-normal distribution. The log-normal distribution is also known as the Galton distribution and the Cobb-Douglas distribution.
= 0.65542 − 0.61791 = 0.03751.
Hence the probability value is 0.03751.
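A log-normal probability reduces to a standard normal one by taking logarithms of the interval limits. The sketch below illustrates this with hypothetical parameters ($\mu = 0$, $\sigma = 1$) and limits chosen so that the standardized values are 0.3 and 0.4, which are the arguments corresponding to the table values 0.61791 and 0.65542. The parameters and function names are assumptions for illustration, not values stated in this module.

```python
import math

def phi(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def lognormal_cdf(y: float, mu: float, sigma: float) -> float:
    """P(Y <= y) when log Y ~ N(mu, sigma^2); zero for y <= 0."""
    if y <= 0:
        return 0.0
    return phi((math.log(y) - mu) / sigma)

def lognormal_pdf(y: float, mu: float, sigma: float) -> float:
    """Density of Y when log Y ~ N(mu, sigma^2)."""
    if y <= 0:
        return 0.0
    z = (math.log(y) - mu) / sigma
    return math.exp(-0.5 * z * z) / (y * sigma * math.sqrt(2.0 * math.pi))

mu, sigma = 0.0, 1.0                   # hypothetical parameters
a, b = math.exp(0.3), math.exp(0.4)    # limits that standardize to 0.3 and 0.4
prob = lognormal_cdf(b, mu, sigma) - lognormal_cdf(a, mu, sigma)
print(round(prob, 5))  # Phi(0.4) - Phi(0.3) = 0.03751
```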
5. Rectangular Distribution
The rectangular distribution, or uniform distribution, is applicable to both discrete and continuous random variables. In this distribution, all the values that the random variable can take in a given interval have an equal probability of occurrence. The PMF of the discrete uniform distribution is a constant that depends on the number of integer values the variable can take. The probability density function of the continuous uniform distribution is also a constant, which depends on the length of the given interval. The main applications of the uniform distribution are random number generation and the p-value, the quantity used to decide between acceptance and rejection of the null hypothesis. To generate random numbers from any continuous distribution, first generate random numbers from the uniform distribution on the interval [0,1]. Then equate the CDF of the target distribution to each generated uniform number and solve for the required random number.
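The random number generation recipe described above is usually called inverse transform sampling. As a minimal sketch, the example below applies it to the exponential distribution, whose CDF $1 - \exp(-\theta y)$ appears later in this module and can be inverted by hand. The rate 0.1, the seed and the function name are assumptions chosen for illustration.

```python
import math
import random

def sample_exponential(theta: float, rng: random.Random) -> float:
    """Draw one value by solving 1 - exp(-theta * y) = u for y."""
    u = rng.random()                    # u is uniform on [0, 1)
    return -math.log(1.0 - u) / theta   # inverse of the exponential CDF

rng = random.Random(0)  # fixed seed so that the run is repeatable
draws = [sample_exponential(0.1, rng) for _ in range(100_000)]

# The sample mean should be close to the theoretical mean 1 / theta = 10.
print(sum(draws) / len(draws))
```

The same pattern works for any distribution whose CDF can be inverted, either in closed form or numerically.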
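Suppose a random variable $Y$ has the rectangular distribution over the interval $(m, n)$, i.e.
$Y \sim U(m, n)$ or $Y \sim R(m, n)$.
The PDF of the rectangular distribution is given as:
$$f(y) = \begin{cases} \dfrac{1}{n - m}, & m < y < n \\ 0, & \text{otherwise.} \end{cases}$$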
Question 5
The weight gained by a person in winter is uniformly distributed from 0 to 50 lbs. Find the probability that a person will gain between 17 and 30 lbs. in winter.
Answer
First find the height of the density (the value on the y-axis). The variable ranges over an interval of length 50, and the area under a probability density is always 1, so the random variable follows the uniform distribution with constant density 1/50.
Next find the width of the region whose probability is required: between 17 and 30 the width is 30 − 17 = 13.
Hence the probability that a person gains between 17 and 30 lbs. in winter is:
Probability = 13 × 1/50 = 0.26
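The width-times-height reasoning in this answer can be written as a small function. This is a sketch only, and the function name `uniform_interval_prob` is a hypothetical choice for the example.

```python
def uniform_interval_prob(a: float, b: float, m: float, n: float) -> float:
    """P(a < Y < b) for Y uniform on (m, n): overlap width times 1 / (n - m)."""
    if n <= m:
        raise ValueError("the interval (m, n) must have positive length")
    lo, hi = max(a, m), min(b, n)   # clip the query interval to (m, n)
    width = max(hi - lo, 0.0)       # zero if there is no overlap
    return width / (n - m)

# Weight gain uniform on (0, 50); probability of gaining between 17 and 30 lbs.
print(uniform_interval_prob(17, 30, 0, 50))  # 0.26
```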
6. Beta Distribution of Type I
A beta random variable of type I represents a family of probabilities and is suited to outcomes that are proportions. For example: how likely is it that the current Prime Minister of India will win the next election? Some of us think that he will win, some think that he will not, and some are unsure. The beta distribution is used for problems of this type. The beta distribution of type I is also known as the basic beta distribution or the beta distribution of the first kind. It has many applications, such as Bayesian hypothesis testing, the rule of succession and task duration modeling.
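A random variable $Y$ is said to have a beta distribution of the first kind with parameters $m$ and $n$, i.e. $Y \sim \beta_1(m, n)$.
The PDF of the beta distribution of the first kind is given as:
$$f(y) = \frac{1}{B(m, n)}\, y^{m-1} (1 - y)^{n-1}, \quad 0 < y < 1,\ m > 0,\ n > 0,$$
where $B(m, n) = \Gamma(m)\Gamma(n)/\Gamma(m+n)$ is the beta function.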
7. Beta Distribution of Second Kind or Beta Distribution of Type II:
The beta distribution of the second kind is also known as the beta prime distribution and the inverted beta distribution. It is the distribution of the odds associated with a beta-distributed random variable: if $X$ has a beta distribution of the first kind, then $X/(1-X)$ has a beta distribution of the second kind. Like the basic beta distribution, the beta distribution of the second kind is used to model random probabilities and proportions.
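A random variable $Y$ is said to have a beta distribution of the second kind with parameters $m$ and $n$, i.e. $Y \sim \beta_2(m, n)$.
The PDF of the beta distribution of the second kind is given as:
$$f(y) = \frac{1}{B(m, n)} \cdot \frac{y^{m-1}}{(1 + y)^{m+n}}, \quad y > 0,\ m > 0,\ n > 0.$$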
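The two beta densities and the odds relationship between them can be checked numerically. The sketch below assumes the standard forms of both densities, with shape parameters written as `m` and `n` (these names, the helper functions and the values 2 and 3 are assumptions for illustration), and uses a simple midpoint rule instead of a library integrator.

```python
import math

def beta_fn(m: float, n: float) -> float:
    """Beta function B(m, n) = Gamma(m) * Gamma(n) / Gamma(m + n)."""
    return math.gamma(m) * math.gamma(n) / math.gamma(m + n)

def beta1_pdf(y: float, m: float, n: float) -> float:
    """Density of the beta distribution of the first kind on (0, 1)."""
    if not 0.0 < y < 1.0:
        return 0.0
    return y ** (m - 1) * (1 - y) ** (n - 1) / beta_fn(m, n)

def beta2_pdf(y: float, m: float, n: float) -> float:
    """Density of the beta distribution of the second kind on (0, infinity)."""
    if y <= 0.0:
        return 0.0
    return y ** (m - 1) / ((1 + y) ** (m + n) * beta_fn(m, n))

def midpoint_integral(f, lo: float, hi: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

m, n = 2.0, 3.0  # illustrative shape parameters
total = midpoint_integral(lambda y: beta1_pdf(y, m, n), 0.0, 1.0)

# Odds link: P(X <= x) for the first kind equals P(Y <= x / (1 - x))
# for the second kind with the same parameters.
x = 0.4
cdf_first = midpoint_integral(lambda y: beta1_pdf(y, m, n), 0.0, x)
cdf_second = midpoint_integral(lambda y: beta2_pdf(y, m, n), 0.0, x / (1 - x))
print(total, cdf_first, cdf_second)
```

For these parameters the first density is $12y(1-y)^2$, so the exact value of both probabilities is 0.5248, which the two numerical integrals should reproduce closely.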
8. Exponential Distribution:
The exponential probability distribution describes the time between events in a Poisson point process, that is, a process in which events occur continuously and independently at a constant average rate. It is also known as the memoryless distribution and is a particular case of the gamma distribution. The exponential distribution should not be confused with the exponential family of distributions. It has wide application in reliability, stochastic processes, etc. It is the only continuous distribution with a constant failure rate, which gives it great importance in reliability, and it is used to model the inter-arrival time in a homogeneous Poisson process. The exponential distribution is also known as the negative exponential distribution.
Suppose $Y$ is a random variable that follows the exponential distribution with parameter $\theta$.
The PDF of the exponential distribution is given as:
$$f(y) = \theta \exp(-\theta y), \quad y \ge 0,\ \theta > 0$$
The CDF is given as:
$$F_Y(y) = 1 - \exp(-\theta y).$$
Question 7
On average, a certain machine part lasts for ten years. The length of time the machine part lasts is exponentially distributed. What is the probability that a machine part lasts more than 8 years?
Answer
Let $Y$ be the length of time, in years, that a machine part lasts. The mean is 10 years, so the rate parameter is $1/10 = 0.1$, and $P(Y > 8) = 1 - F_Y(8) = \exp(-0.1 \times 8) = \exp(-0.8) \approx 0.4493$.
The probability that a machine part lasts more than 8 years is approximately 0.4493.
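The calculation for this question, together with the memoryless property mentioned earlier in the section, can be sketched as follows. The sketch reads the question as asking for the probability of lasting more than 8 years and takes the rate as the reciprocal of the mean lifetime; the function names and the values of `s` and `t` are assumptions for illustration.

```python
import math

def exponential_cdf(y: float, theta: float) -> float:
    """F(y) = 1 - exp(-theta * y) for y >= 0, and 0 otherwise."""
    return 1.0 - math.exp(-theta * y) if y >= 0 else 0.0

def exponential_survival(y: float, theta: float) -> float:
    """P(Y > y) = 1 - F(y)."""
    return 1.0 - exponential_cdf(y, theta)

theta = 1 / 10  # mean lifetime of 10 years gives rate 0.1 per year
p_more_than_8 = exponential_survival(8, theta)
print(round(p_more_than_8, 4))  # 0.4493

# Memoryless property: P(Y > s + t | Y > s) equals P(Y > t).
s, t = 5.0, 8.0
conditional = exponential_survival(s + t, theta) / exponential_survival(s, theta)
print(math.isclose(conditional, p_more_than_8))
```

A value such as 2.22, which is $\exp(+0.8)$, cannot be a probability; the exponent must carry the negative sign.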
9. Gamma Distribution:
The gamma distribution is a widely used distribution, like the normal and exponential distributions, and it is also related to the beta distribution. In practice it arises in processes where one measures the waiting time for more than one arrival in a Poisson process. A random variable $Y$ may follow a gamma distribution with one or two parameters. In the two-parameter form with shape $\lambda > 0$ and rate $a > 0$, the PDF is $f(y) = \frac{a^{\lambda}}{\Gamma(\lambda)}\, e^{-ay}\, y^{\lambda - 1}$ for $y > 0$, and the one-parameter form is the case $a = 1$.
Relationship with the Chi-Square Distribution
The gamma distribution is related to the chi-square distribution: for specific values of its parameters the gamma distribution becomes a chi-square distribution. With shape $n/2$ and rate $1/2$, it is the chi-square distribution with $n$ degrees of freedom.
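The special cases of the gamma distribution can be verified pointwise. The sketch below assumes the shape and rate parameterization of the gamma density; the function names and the evaluation point are hypothetical choices. It compares the gamma density with the exponential density (shape 1) and with the chi-square density (shape equal to half the degrees of freedom, rate 1/2).

```python
import math

def gamma_pdf(y: float, shape: float, rate: float = 1.0) -> float:
    """Gamma density with the given shape and rate; rate = 1 is the one-parameter form."""
    if y <= 0:
        return 0.0
    return rate**shape * y ** (shape - 1) * math.exp(-rate * y) / math.gamma(shape)

def exponential_pdf(y: float, theta: float) -> float:
    """Exponential density with rate theta."""
    return theta * math.exp(-theta * y) if y >= 0 else 0.0

def chi_square_pdf(y: float, dof: int) -> float:
    """Chi-square density with dof degrees of freedom."""
    if y <= 0:
        return 0.0
    half = dof / 2
    return y ** (half - 1) * math.exp(-y / 2) / (2**half * math.gamma(half))

y = 3.0  # arbitrary evaluation point
# Shape 1 reduces the gamma density to the exponential density.
print(gamma_pdf(y, shape=1.0, rate=0.1), exponential_pdf(y, 0.1))
# Shape dof / 2 with rate 1/2 gives the chi-square density.
print(gamma_pdf(y, shape=2.5, rate=0.5), chi_square_pdf(y, 5))
```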
10. Cauchy Distribution
The Cauchy distribution is a continuous probability distribution named after the French mathematician and physicist Augustin Cauchy. It is also known as the Lorentz or Cauchy-Lorentz distribution. Its mean and variance are not defined, and its moment generating function (MGF) does not exist. The mode and median of the Cauchy distribution are well defined. The Cauchy distribution is encountered in work on exponential growth.
Suppose $Y$ is a random variable. Then $Y$ is said to follow the Cauchy distribution with scale parameter $s$ and location parameter $t$ if its PDF is given as:
$$f(y) = \frac{s}{\pi\left[s^2 + (y - t)^2\right]}, \quad -\infty < y < \infty,\ s > 0.$$
The curve of the standard Cauchy distribution resembles the bell-shaped curve of the standard normal distribution, but it has heavier tails.
Properties:
i. Mean, variance, coefficient of variation and kurtosis are undefined.
ii. The mode and the median both equal the location parameter $t$.
iii. The distribution is symmetric about $t$; the moment-based coefficient of skewness is undefined.
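Although its moments are undefined, the Cauchy distribution has simple closed forms for the density and the CDF, so its median and symmetry are easy to check. The sketch below assumes the usual location and scale form with location `t` and scale `s`; the function names and numerical values are illustrative assumptions.

```python
import math

def cauchy_pdf(y: float, t: float = 0.0, s: float = 1.0) -> float:
    """Cauchy density with location t and scale s."""
    return s / (math.pi * (s * s + (y - t) ** 2))

def cauchy_cdf(y: float, t: float = 0.0, s: float = 1.0) -> float:
    """Cauchy CDF: 1/2 + arctan((y - t) / s) / pi."""
    return 0.5 + math.atan((y - t) / s) / math.pi

t, s = 2.0, 3.0  # hypothetical location and scale

# The median is the location parameter: half the probability lies below t.
print(cauchy_cdf(t, t, s))  # 0.5

# Symmetry about t: the density matches at equal distances on either side.
print(math.isclose(cauchy_pdf(t + 1.0, t, s), cauchy_pdf(t - 1.0, t, s)))

# The interval (t - s, t + s) carries half of the probability.
print(cauchy_cdf(t + s, t, s) - cauchy_cdf(t - s, t, s))
```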
11. Summary
In this module, we discussed some very popular continuous distributions: the uniform, normal, log-normal, gamma, exponential, beta of the first kind, beta of the second kind and Cauchy distributions. These are among the most widely used distributions in the literature as well as in applications. We also discussed their mathematical expressions and properties such as the mean and variance, and showed through questions and answers how to evaluate probabilities for these distributions.
12. Suggested Readings
Agresti, A. and B. Finlay, Statistical Methods for the Social Sciences, 3rd Edition, Prentice Hall, 1997.
Daniel, W. W. and C. L. Cross, Biostatistics: A Foundation for Analysis in the Health Sciences, 10th Edition, John Wiley & Sons, 2013.
Hogg, R. V., J. McKean and A. Craig, Introduction to Mathematical Statistics, Macmillan Pub. Co. Inc., 1978.
Meyer, P. L., Introductory Probability and Statistical Applications, Oxford & IBH Pub, 1975.
Stephens, L. J., Schaum’s Series Outline: Beginning Statistics, 2nd Edition, McGraw Hill, 2006.
Triola, M. F., Elementary Statistics, 13th Edition, Pearson, 2017.
Weiss, N. A., Introductory Statistics, 10th Edition, Pearson, 2017.
One can refer to the following links for further understanding of the statistics terms.
http://biostat.mc.vanderbilt.edu/wiki/pub/Main/ClinStat/glossary.pdf
http://www.stats.gla.ac.uk/steps/glossary/alphabet.html
http://www.reading.ac.uk/ssc/resources/Docs/Statistical_Glossary.pdf
https://stats.oecd.org/glossary/
http://www.statsoft.com/Textbook/Statistics-Glossary
https://www.stat.berkeley.edu/~stark/SticiGui/Text/gloss.htm
https://stats.oecd.org/glossary/alpha.asp?Let=A