15 Continuous distribution – Normal distribution : Normal curve

Dr. Nidhi Handa

 

1.      Introduction : Continuous Probability Distribution

 

2.      Normal Distribution

 

3.      Normal Probability Distribution

 

4.      Table of Normal Distribution

 

5.      Summary.

 

 

1 CONTINUOUS PROBABILITY DISTRIBUTION

 

We defined a continuous random variable as a random variable whose values are not countable. A continuous random variable can assume any value over an interval or intervals. Because the number of values contained in any interval is infinite, the possible number of values that a continuous random variable can assume is also infinite. Moreover, we cannot count these values. The life of a battery, heights of persons, time taken to complete an examination, amount of milk in a gallon, weights of babies, and prices of houses are all examples of continuous random variables. Note that although money can be counted, usually all variables involving money are considered to be continuous random variables. This is so because a variable involving money often has a very large number of outcomes.

 

Suppose there are 5000 female students enrolled at a university and x is the continuous random variable that represents the heights of these female students. Table 1 lists the frequency and relative frequency distributions of x.

 

Table 1 : Frequency and Relative Frequency Distributions of Heights of Female Students

Height of a Female Student (in inches) x    Frequency f    Relative Frequency
60 to less than 61                          90             .018
61 to less than 62                          170            .034
62 to less than 63                          460            .092
63 to less than 64                          750            .150
64 to less than 65                          970            .194
65 to less than 66                          760            .152
66 to less than 67                          640            .128
67 to less than 68                          440            .088
68 to less than 69                          320            .064
69 to less than 70                          220            .044
70 to less than 71                          180            .036
                                            N = 5000       Sum = 1.0

The relative frequencies listed in Table 1 can be used as approximate probabilities of respective classes.

 

Figure 1 displays the histogram and polygon for the relative frequency distribution of Table 1. Figure 2 shows the smoothed polygon for the data of Table 1. The smoothed polygon is an approximation of the probability distribution curve of the continuous random variable x. Note that each class in Table 1 has a width equal to 1 inch.

Fig. 1 : Histogram and polygon for Table 1

Fig. 2 Probability distribution curve for heights.

If the width of the classes is more than 1 unit, we first obtain the relative frequency densities and then graph these relative frequency densities to obtain the distribution curve. The relative frequency density of a class is obtained by dividing the relative frequency of that class by the class width. The relative frequency densities are calculated to make the sum of the areas of all rectangles in the histogram equal to 1.0. The probability distribution curve of a continuous random variable is also called its probability density function.
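The calculation of relative frequency densities can be sketched in a few lines of Python. This is only an illustration: the frequencies are those of Table 1, while the function name and the regrouping into wider classes (60 to 62, 62 to 64, 64 to 66, 66 to 68, 68 to 71) are assumptions made here to show a case where the class width is more than 1 unit.

```python
def relative_frequency_densities(frequencies, widths):
    """Return (relative frequencies, relative frequency densities).

    density of a class = relative frequency of the class / class width
    """
    total = sum(frequencies)
    rel = [f / total for f in frequencies]
    dens = [r / w for r, w in zip(rel, widths)]
    return rel, dens


# Frequencies from Table 1; every class is 1 inch wide.
table1_freq = [90, 170, 460, 750, 970, 760, 640, 440, 320, 220, 180]
table1_widths = [1.0] * len(table1_freq)
rel_freq, density = relative_frequency_densities(table1_freq, table1_widths)

# Total area of the histogram rectangles (height * width for each class).
total_area = sum(d * w for d, w in zip(density, table1_widths))

# Hypothetical regrouping of the same data into wider classes
# (widths 2, 2, 2, 2 and 3 inches); this grouping is not in the module.
grouped_freq = [90 + 170, 460 + 750, 970 + 760, 640 + 440, 320 + 220 + 180]
grouped_widths = [2.0, 2.0, 2.0, 2.0, 3.0]
grouped_rel, grouped_density = relative_frequency_densities(grouped_freq, grouped_widths)
grouped_area = sum(d * w for d, w in zip(grouped_density, grouped_widths))

print("Total area, 1-inch classes:", round(total_area, 6))
print("Total area, wider classes: ", round(grouped_area, 6))
```

In both groupings the rectangle areas add up to 1.0, which is the reason for dividing by the class width before drawing the histogram.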

 

The probability distribution of a continuous random variable possesses the following two characteristics.

 

1.      The probability that x assumes a value in any interval lies in the range 0 to 1.

 

2.      The total probability of all the (mutually exclusive) intervals within which x can assume a value is 1.0.

 

The first characteristic states that the area under the probability distribution curve of a continuous random variable between any two points is between 0 and 1, as shown in Figure 3. The second characteristic indicates that the total area under the probability distribution curve of a continuous random variable is always 1.0 or 100%, as shown in Figure 4.

Fig. 3 Area under a curve between two points.

Fig. 4 Total area under a probability distribution curve.

 

Fig. 5 Area under the curve as probability.

 

The probability that a continuous random variable x assumes a value within a certain interval is given by the area under the curve between two limits of the interval, as shown in Figure 5. The shaded area under the curve from a to b in this figure gives the probability that x falls in the interval a to b. That is,

 

P(a ≤ x ≤ b) = Area under the curve from a to b

 

Note that the interval a ≤ x ≤ b states that x is greater than or equal to a but less than or equal to b.

 

Reconsider the example on the heights of all female students at a university. The probability that the height of a randomly selected female student from this university lies in the interval 65 to 68 inches is given by the area under the distribution curve of the heights of all female students from x = 65 to x = 68, as shown in Figure 6. This probability is written as

 

P(65 ≤ x ≤ 68)

 

which states that x is greater than or equal to 65 but less than or equal to 68.

Fig.6 Probability that x lies in the interval 65 to 68.

 

For a continuous probability distribution, the probability is always calculated for an interval. For example, in Figure 6, the interval representing the shaded area is from 65 to 68. Consequently, the shaded area in that figure gives the probability for the interval 65 ≤ x ≤ 68.

 

The probability that a continuous random variable x assumes a single value is always zero. This is so because the area of a line, which represents a single point, is zero. For example, if x is the height of a randomly selected female student from that university, then the probability that this student is exactly 67 inches tall is zero. That is,

 

P(x = 67) = 0

 

This probability is shown in Figure 7. Similarly, the probability for x to assume any other single value is zero.

Fig. 7 Probability of a single value of x is zero

 

 

In general, if a and b are two of the values that x can assume, then,

 

P(x = a) = 0 and P(x = b) = 0

 

From this we can deduce that for a continuous random variable

 

P(a ≤ x ≤ b) = P(a < x < b)

 

In other words, the probability that x assumes a value in the interval a to b is the same whether or not the values a and b are included in the interval. For the example on the heights of female students, the probability that a randomly selected female student is between 65 and 68 inches tall is the same as the probability that this female student is 65 to 68 inches tall. This is shown in Figure 8.

Fig. 8 Probability “from 65 to 68” and “between 65 and 68”

Note that the interval “between 65 and 68” represents “65 < x < 68” and it does not include 65 and 68. On the other hand, the interval “from 65 to 68” represents “65 ≤ x ≤ 68” and it does include 65 and 68. However, as mentioned previously, in the case of a continuous random variable both of these intervals contain the same probability or area under the curve.
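The idea of probability as area can be illustrated with the data of Table 1. The sketch below is an approximation only: it treats the histogram of relative frequencies as the distribution curve, and the function name is a hypothetical helper introduced here, not part of the module.

```python
# Relative frequencies from Table 1, keyed by the lower limit of each 1-inch class.
relative_frequency = {
    60: 0.018, 61: 0.034, 62: 0.092, 63: 0.150, 64: 0.194, 65: 0.152,
    66: 0.128, 67: 0.088, 68: 0.064, 69: 0.044, 70: 0.036,
}


def approx_probability(a, b):
    """Approximate P(a <= x <= b) for whole-inch limits a < b by adding the
    areas of the histogram rectangles (each 1 inch wide) between a and b."""
    return sum(rf for lower, rf in relative_frequency.items() if a <= lower < b)


p_65_to_68 = approx_probability(65, 68)
print(f"P(65 <= x <= 68) is approximately {p_65_to_68:.3f}")

# An interval starting at 67 inches that shrinks towards a single point:
# area = height * width, where the height is the relative frequency
# density of the class "67 to less than 68" (class width 1 inch).
height_at_67 = relative_frequency[67] / 1.0
shrinking_areas = [height_at_67 * width for width in (1.0, 0.1, 0.001, 0.0)]
print(shrinking_areas)
```

As the width of the interval goes to zero the area goes to zero as well, which is why P(x = 67) = 0 and why including or excluding the end points 65 and 68 does not change the probability.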

 

2 THE NORMAL DISTRIBUTION

 

The normal distribution is one of the many probability distributions that a continuous random variable can possess. The normal distribution is the most important and most widely used of all the probability distributions. A large number of phenomena in the real world are normally distributed either exactly or approximately. The continuous random variables representing the heights and weights of people, scores on an examination, weights of packages (e.g., cereal boxes, boxes of cookies), amount of milk in a gallon, life of an item (such as a light bulb or a television set), and the time taken to complete a certain job have all been observed to have an (approximate) normal distribution.

 

The normal probability distribution or the normal curve is given by a bell-shaped (symmetric) curve. Such a curve is shown in Figure 9. It has a mean of μ and a standard deviation of σ. A continuous random variable x that has a normal distribution is called a normal random variable. Note that not all bell-shaped curves represent a normal distribution curve. Only a specific kind of bell-shaped curve represents a normal curve.

Fig. 9 Normal distribution with mean μ and standard deviation σ

NORMAL PROBABILITY DISTRIBUTION

 

A normal probability distribution, when plotted, gives a bell-shaped curve such that

 

1. The total area under the curve is 1.0.

2.  The curve is symmetric about the mean.

3.  The two tails of the curve extend indefinitely.

 

A normal distribution possesses the following three characteristics.

 

1.    The total area under a normal distribution curve is 1.0 or 100%, as shown in Figure 10.

Fig. 10 Total area under a normal curve.

 

2.  A normal distribution curve is symmetric about the mean, as shown in Figure 11. Consequently, ½ of the total area under a normal distribution curve lies on the left side of the mean and ½ lies on the right side of the mean.

Fig 11 A normal curve is symmetric about the mean.

3.  The tails of a normal distribution curve extend indefinitely in both directions without touching or crossing the horizontal axis. Although a normal distribution curve never meets the horizontal axis, beyond the points represented by μ − 3σ and μ + 3σ it becomes so close to this axis that the area under the curve beyond these points in both directions can be taken as virtually zero. These areas are shown in Figure 12.

Fig 12
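The three characteristics can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the standard relation between the cumulative normal area and the error function (available as math.erf in Python), and the values chosen for the mean and standard deviation are hypothetical.

```python
import math


def normal_cdf(x, mu, sigma):
    """Area under the normal curve to the left of x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def area_between(a, b, mu, sigma):
    """Area under the normal curve from a to b."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)


# Hypothetical parameters, chosen only for illustration.
mu, sigma = 65.0, 2.5

# Characteristic 2: half of the total area lies to the left of the mean.
area_left_of_mean = normal_cdf(mu, mu, sigma)

# Characteristic 3: almost all of the area lies within 3 standard
# deviations of the mean, so the two tails together hold very little.
within_three_sigma = area_between(mu - 3 * sigma, mu + 3 * sigma, mu, sigma)
beyond_three_sigma = 1.0 - within_three_sigma

print("Area left of the mean:   ", area_left_of_mean)
print("Area within 3 std. dev.: ", round(within_three_sigma, 4))
print("Area in the two tails:   ", round(beyond_three_sigma, 4))
```

The area in the two tails beyond three standard deviations is less than 0.003 whatever values of the mean and standard deviation are used, which is the sense in which it can be taken as virtually zero.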

 

The mean μ and the standard deviation σ are the parameters of the normal distribution. Given the values of these two parameters, we can find the area under a normal distribution curve for any interval. Remember, there is not just one normal distribution curve but rather a family of normal distribution curves. Each different set of values of μ and σ gives a different normal distribution. The value of μ determines the center of a normal distribution on the horizontal axis and the value of σ gives the spread of the normal distribution curve. The three normal distribution curves drawn in Figure 13 have the same mean but different standard deviations. By contrast, the three normal distribution curves in Figure 14 have different means but the same standard deviation.

Fig. 13 Three normal distribution curves with the same mean but different standard deviations.

Fig. 14 Three normal distribution curves with different means but the same standard deviation.
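The effect of the two parameters on the curve can also be seen numerically. The following sketch assumes the standard density formula of the normal distribution; the particular means and standard deviations are hypothetical values, not those used in Figures 13 and 14.

```python
import math


def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)


# Same mean, different standard deviations (the situation of Figure 13).
mu = 50.0
sigmas = [5.0, 10.0, 16.0]
peak_heights = [normal_pdf(mu, mu, s) for s in sigmas]

# Different means, same standard deviation (the situation of Figure 14).
means = [20.0, 40.0, 60.0]
sigma = 5.0
peaks_same_sigma = [normal_pdf(m, m, sigma) for m in means]

# Symmetry about the mean: equal heights at equal distances on either side.
left_height = normal_pdf(mu - 3.0, mu, 5.0)
right_height = normal_pdf(mu + 3.0, mu, 5.0)

print("Peak heights for sigma = 5, 10, 16:", [round(h, 4) for h in peak_heights])
print("Peak heights for mean = 20, 40, 60:", [round(h, 4) for h in peaks_same_sigma])
```

A larger standard deviation gives a lower and wider curve, while changing the mean only moves the curve along the horizontal axis without changing its shape.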

 

Like the binomial and Poisson probability distributions, the normal probability distribution can also be expressed by a mathematical equation. However, we will not use this equation to find the area under a normal distribution curve. Instead, we will use tabular values.
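For reference, the equation in question is the standard density formula of the normal distribution. It is given here as a well-known result and is not quoted from this module; μ and σ denote the mean and the standard deviation, and f(x) is the height of the curve at x.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,
       e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}},
\qquad -\infty < x < \infty
```

Areas under this curve cannot be written in terms of elementary functions, which is one reason why tabulated values are used in practice.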

 

TABLE The entries in the table give the areas under the standard normal curve from 0 to …

z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0 .0000 .0040 .0080 .0120 .0160 .0199 .0239 .0279 .0319 .0359
0.1 .0398 .0438 .0478 .0517 .0557 .0596 .0636 .0675 .0714 .0753
0.2 .0793 .0832 .0871 .0910 .0948 .0987 .1026 .1064 .1103 .1141
0.3 .1179 .1217 .1255 .1293 .1331 .1368 .1406 .1443 .1480 .1517
0.4 .1554 .1591 .1628 .1664 .1700 .1736 .1772 .1808 .1844 .1879
0.5 .1915 .1950 .1985 .2019 .2054 .2088 .2123 .2157 .2190 .2224
0.6 .2257 .2291 .2324 .2357 .2389 .2422 .2454 .2486 .2517 .2549
0.7 .2580 .2611 .2642 .2673 .2704 .2734 .2764 .2794 .2823 .2852
0.8 .2881 .2910 .2939 .2967 .2995 .3023 .3051 .3078 .3106 .3133
0.9 .3159 .3186 .3212 .3238 .3264 .3289 .3315 .3340 .3365 .3389
1.0 .3413 .3438 .3461 .3485 .3508 .3531 .3554 .3577 .3599 .3621
1.1 .3643 .3665 .3686 .3708 .3729 .3749 .3770 .3790 .3810 .3830
1.2 .3849 .3869 .3888 .3907 .3925 .3944 .3962 .3980 .3997 .4015
1.3 .4032 .4049 .4066 .4082 .4099 .4115 .4131 .4147 .4162 .4177
1.4 .4192 .4207 .4222 .4236 .4251 .4265 .4279 .4292 .4306 .4319
1.5 .4332 .4345 .4357 .4370 .4382 .4394 .4406 .4418 .4429 .4441
1.6 .4452 .4463 .4474 .4484 .4495 .4505 .4515 .4525 .4535 .4545
1.7 .4554 .4564 .4573 .4582 .4591 .4599 .4608 .4616 .4625 .4633
1.8 .4641 .4649 .4656 .4664 .4671 .4678 .4686 .4693 .4699 .4706
1.9 .4713 .4719 .4726 .4732 .4738 .4744 .4750 .4756 .4761 .4767
2.0 .4772 .4778 .4783 .4788 .4793 .4798 .4803 .4808 .4812 .4817
2.1 .4821 .4826 .4830 .4834 .4838 .4842 .4846 .4850 .4854 .4857
2.2 .4861 .4864 .4868 .4871 .4875 .4878 .4881 .4884 .4887 .4890
2.3 .4893 .4896 .4898 .4901 .4904 .4906 .4909 .4911 .4913 .4916
2.4 .4918 .4920 .4922 .4925 .4927 .4929 .4931 .4932 .4934 .4936
2.5 .4938 .4940 .4941 .4943 .4945 .4946 .4948 .4949 .4951 .4952
2.6 .4953 .4955 .4956 .4957 .4959 .4960 .4961 .4962 .4963 .4964
2.7 .4965 .4966 .4967 .4968 .4969 .4970 .4971 .4972 .4973 .4974
2.8 .4974 .4975 .4976 .4977 .4977 .4978 .4979 .4979 .4980 .4981
2.9 .4981 .4982 .4982 .4983 .4984 .4984 .4985 .4985 .4986 .4986
3.0 .4987 .4987 .4987 .4988 .4988 .4989 .4989 .4989 .4990 .4990
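The tabulated areas can be reproduced with a short program, which is also a convenient way to check individual entries. This is a minimal sketch, assuming that the area from 0 to z under the standard normal curve equals half the error function evaluated at z divided by the square root of 2; the function names are hypothetical helpers introduced here.

```python
import math


def area_0_to_z(z):
    """Area under the standard normal curve between 0 and z (z >= 0)."""
    return 0.5 * math.erf(z / math.sqrt(2.0))


def table_row(row_start):
    """The ten 4-decimal entries for z = row_start + .00, .01, ..., .09."""
    return [round(area_0_to_z(row_start + col / 100.0), 4) for col in range(10)]


# Print a few rows in the same layout as the table.
for row_start in (0.0, 1.0, 2.0, 3.0):
    entries = " ".join(f"{a:.4f}" for a in table_row(row_start))
    print(f"{row_start:.1f} {entries}")

# Individual look-ups.
area_1_00 = round(area_0_to_z(1.00), 4)  # row 1.0, column .00
area_1_96 = round(area_0_to_z(1.96), 4)  # row 1.9, column .06

# By symmetry, the area from z = -1.50 to z = 1.50 is twice the area from 0 to 1.50.
symmetric_area = 2 * area_0_to_z(1.50)
```

A computed value that falls very close to a rounding boundary may differ from a printed table by one unit in the fourth decimal place, depending on the rounding convention used when the table was prepared.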

 

 

Summary

 

A continuous random variable can possess one of many probability distributions. In this module we have discussed the normal probability distribution. This distribution is also an approximation to the binomial distribution. The possible values that a continuous random variable can assume are infinite and uncountable. A continuous random variable can be defined as a variable that can assume any value in one or more intervals. The characteristics of the probability distribution of a continuous random variable play an important role. The characteristics of the normal probability distribution are also discussed. The mean and the standard deviation are the parameters of the normal distribution. Normal curves with different values of the standard deviation have also been discussed here.