18 Sampling and Sampling Distributions: Sampling Distribution of X bar
Dr. Uday Khanna
1. Learning Outcome
2. Introduction
3. Central limit theorem
4. Sampling distribution of a sample mean
5. Characteristics of the sampling distribution of mean
6. Sampling distribution of proportion
7. Mean & standard deviation of distribution of proportion
Learning outcomes:
After completing this module the students will be able to:
1. Understand the sampling distribution of X bar
2. Understand the implications of the central limit theorem
3. Learn about sampling from a normal population
4. Understand the sampling distribution of a proportion
1. INTRODUCTION
The sampling distribution of the sample mean, denoted by $\bar{X}$, is a concept needed at every stage of a study, from collecting the sample, through analysis, to drawing meaningful interpretations from it. It underlies introductory statistical inference, which includes the normal distribution, confidence intervals and hypothesis testing. Proper analysis and interpretation of a sample statistic requires knowledge of its distribution. If repeated random samples are chosen from the same population, the value of the sample mean, denoted $\bar{x}$, will vary from sample to sample. The sampling distribution of $\bar{X}$ is the distribution of these sample mean values over a large number of samples.
[Figure: Process of Inferential Statistics]
2. Central limit theorem
Suppose we take numerous simple random samples of a given size n from a normal distribution with population mean μ and standard deviation σ, and compute the mean of each sample. Some of these sample means will be less than μ and some will be greater, and together they form the sampling distribution. If we plot the sample means in a histogram, we will see that they are normally distributed, with a mean approximately equal to the population mean μ and a standard deviation approximately equal to σ/√n, which is smaller than the population standard deviation σ.
If the original population has a normal distribution with mean μ and standard deviation σ, then plotting the sample means of simple random samples (SRS), each containing n observations, will produce a sampling distribution that also follows a normal distribution. If the original population does not have a normal distribution, but each SRS has a large n (most texts suggest n > 30), then plotting the sample means will produce a sampling distribution that is approximately normal. This result is called the Central Limit Theorem.
The central limit theorem states that if a large enough sample is taken (typically n > 30), then the sampling distribution of $\bar{X}$ is approximately a normal distribution with a mean of μ and a standard deviation of σ/√n. Therefore $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.
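The central limit theorem can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the exponential population (mean 1, standard deviation 1), the sample size of 40, the 5000 repetitions and the function name `simulate_sample_means` are all choices made for this example and do not come from the module.

```python
import random
import statistics


def simulate_sample_means(n, num_samples, seed=0):
    """Draw `num_samples` samples of size `n` from an exponential population
    (mean 1, standard deviation 1) and return the list of sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]


n = 40  # assumed sample size, larger than 30
means = simulate_sample_means(n, num_samples=5000)

mean_of_means = statistics.fmean(means)  # should be close to mu = 1
sd_of_means = statistics.pstdev(means)   # should be close to sigma / sqrt(n)
expected_sd = 1.0 / n ** 0.5

print(f"mean of sample means: {mean_of_means:.4f}")
print(f"sd of sample means:   {sd_of_means:.4f}")
print(f"sigma / sqrt(n):      {expected_sd:.4f}")
```

Although the exponential population is strongly skewed, the simulated sample means should centre on μ with a spread near σ/√n; a histogram of `means` would look roughly bell-shaped.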
3. Sampling Distribution of a Sample Mean
The sample mean $\bar{x}$ is a statistic whose value is the average of sample data drawn from a population.
For random samples of size n taken from a given population, the random variable $\bar{X}$ takes as its values these sample means, the $\bar{x}$'s. Like any random variable, $\bar{X}$ has a probability distribution associated with it, characterized by its shape, mean and standard deviation. The probability distribution obtained by plotting the sample means, the $\bar{x}$'s, is the sampling distribution of the mean $\bar{X}$.
The sampling distribution of $\bar{X}$ depends on the:
i. distribution of the original population (e.g., normal, skewed, uniform, symmetric)
ii. sample size n
iii. method of sample selection
3.1 Characteristics of the Sampling Distribution of a mean
When all possible samples of size n are selected from a normal population whose mean is μ and standard deviation is σ, the sampling distribution of the mean has the following three characteristics:
1. The sampling distribution of the mean is a normal distribution, regardless of the sample size n.
2. The mean of the sampling distribution of the mean, $\mu_{\bar{X}}$, is equal to the mean of the population, μ: $\mu_{\bar{X}} = \mu$.
3. The standard error of the mean (the standard deviation of the sampling distribution of the mean), $\sigma_{\bar{X}}$, is equal to the standard deviation of the population, σ, divided by the square root of the sample size, n:
$\sigma_{\bar{X}} = \sigma/\sqrt{n}$
Since in practice we usually do not know μ or σ, we estimate them by $\bar{x}$ and s respectively, where s is the standard deviation of the sample. The standard error σ/√n is then estimated by s/√n, which is known as the (estimated) standard error of the mean, labelled SE($\bar{x}$).
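As a rough illustration of estimating the standard error from a single sample, the sketch below uses Python's standard `statistics` module. The eight distances in `sample` are made-up values chosen only for this example.

```python
import math
import statistics

# Hypothetical sample of daily travel distances in km (made-up values).
sample = [42, 55, 61, 48, 70, 53, 66, 59]

n = len(sample)
x_bar = statistics.mean(sample)  # sample mean, the estimate of mu
s = statistics.stdev(sample)     # sample standard deviation (n - 1 denominator)
se = s / math.sqrt(n)            # estimated standard error of the mean

print(f"n = {n}, x_bar = {x_bar}, s = {s:.3f}, SE = {se:.3f}")
```

Because the standard error divides s by √n, it is always smaller than s for n > 1 and shrinks as the sample grows.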
Consider a population of 5 working people who are all neighbours in Gurgaon. They are asked to list the number of kilometres they travel daily for work.
Now we can take a sample of two people (n = 2) and find the mean $\bar{x}$ to estimate μ.
For example, one sample is Rakesh = 50 km and Aman = 80 km, which gives $\bar{x}$ = 65 km; another sample is Rakesh and Suresh, and so on.
List all possible samples of two people and calculate the mean $\bar{x}$ for each sample.
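The listing exercise can be sketched in a few lines of Python. Only the values for Rakesh (50 km) and Aman (80 km) come from the text; Suresh's distance and the two remaining neighbours ("Person D", "Person E") are hypothetical placeholders, since the original data are not given here. The sketch also assumes sampling without replacement with order ignored, which gives 10 possible samples.

```python
from itertools import combinations
from statistics import fmean

population = {
    "Rakesh": 50,    # from the text
    "Aman": 80,      # from the text
    "Suresh": 60,    # hypothetical value
    "Person D": 40,  # hypothetical person and value
    "Person E": 70,  # hypothetical person and value
}

mu = fmean(population.values())  # population mean

# Every possible sample of two people and the mean of each sample.
sample_means = {
    pair: fmean(population[name] for name in pair)
    for pair in combinations(population, 2)
}

for pair, x_bar in sample_means.items():
    print(pair, x_bar)

mean_of_sample_means = fmean(sample_means.values())
print("population mean:", mu)
print("mean of all sample means:", mean_of_sample_means)
```

With these assumed values, the mean of all the sample means equals the population mean, which illustrates that the sampling distribution of the mean is centred on μ.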
4. Formulas
The sample standard deviation formula is:
$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
where,
s = sample standard deviation
$\sum$ = sum of…
x = each score in the sample
$\bar{x}$ = sample mean
n = number of scores in sample.
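A minimal sketch of the sample standard deviation, assuming the usual n − 1 denominator, is shown below and checked against `statistics.stdev` from the standard library. The helper name `sample_std` and the five scores are made up for this illustration.

```python
import math
import statistics


def sample_std(values):
    """Sample standard deviation: sqrt(sum((x - x_bar)^2) / (n - 1))."""
    n = len(values)
    x_bar = sum(values) / n
    squared_deviations = sum((x - x_bar) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


scores = [4, 8, 6, 5, 7]  # hypothetical scores
s = sample_std(scores)

print(f"manual s = {s:.4f}")
print(f"statistics.stdev = {statistics.stdev(scores):.4f}")
```

For these scores the mean is 6 and the squared deviations sum to 10, so s is the square root of 10/4.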
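The population standard deviation formula is:
$\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$
where,
σ = population standard deviation
μ = population mean
N = number of scores in the population.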
The critical values of Z for commonly used confidence coefficients are:
Confidence coefficient | 50% | 68.27% | 90% | 95% | 95.45% | 99% | 99.73% |
Z | 0.6745 | 1.00 | 1.645 | 1.96 | 2.00 | 2.58 | 3.00 |
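The Z values in the table can be reproduced from the standard normal distribution. The sketch below assumes two-sided intervals, so that a confidence coefficient c corresponds to the (1 + c)/2 quantile; the helper name `z_for_confidence` is chosen for this example. It uses `statistics.NormalDist`, available in Python 3.8 and later.

```python
from statistics import NormalDist


def z_for_confidence(confidence):
    """Two-sided critical value z with P(-z < Z < z) = confidence."""
    return NormalDist().inv_cdf((1 + confidence) / 2)


for level in (0.50, 0.6827, 0.90, 0.95, 0.9545, 0.99, 0.9973):
    print(f"{level:.2%}  z = {z_for_confidence(level):.4f}")
```

Rounded to the precision used in the table, the computed values should agree with the tabulated ones.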
5. Sample Proportion
A sample proportion is obtained when a random sample of n objects is taken from a population: if x of the objects have a certain characteristic, then the sample proportion p is p = x/n. For example, 100 people are asked if they are non-vegetarian. If 40 people respond "yes", then the sample proportion is p = 40/100 = 0.40.
5.1 Sampling Distribution of a Proportion
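The sampling distribution of a proportion is the distribution of the sample proportion over all possible samples of a given size from the population. For example, instead of polling a single sample of 100 people to ask if they are non-vegetarian, imagine polling many different random samples of 100 people each and recording the proportion in each sample; the distribution of those proportions is the sampling distribution.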
5.2 Mean of Sampling Distribution of the Proportion
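The sampling distribution of the proportion is a special case of the sampling distribution of the mean, in which each observation is recorded as 1 if it has the characteristic and 0 otherwise. It is related to the binomial distribution: the count x of objects with the characteristic in a sample of size n follows a binomial distribution with parameters n and p, where p here denotes the population proportion, so the mean of the sampling distribution of the proportion x/n is equal to p.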
5.3 Standard Deviation of Sampling Distribution of the Proportion
If a random sample of n observations is taken from a binomial population with parameter p, the sampling distribution of the proportion (i.e. the distribution of the sample proportion over all possible samples of size n taken from the population) will have a standard deviation of:
Standard deviation of the sample proportion = $\sigma_p = \sqrt{pq/n}$, where q = 1 − p.
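The mean and standard deviation of the sample proportion can be checked by simulation. In the sketch below, the population proportion p = 0.4, the sample size n = 100, the 5000 repetitions and the function name `simulate_sample_proportions` are assumptions made for illustration (the 0.4 simply echoes the 40/100 survey example).

```python
import math
import random
import statistics


def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Return the sample proportion from each of `num_samples` random samples
    of size `n`, where each observation is a "yes" with probability `p`."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.random() < p for _ in range(n)) / n
        for _ in range(num_samples)
    ]


p, n = 0.4, 100  # assumed population proportion and sample size
proportions = simulate_sample_proportions(p, n, num_samples=5000)

mean_p = statistics.fmean(proportions)       # should be close to p
sd_p = statistics.pstdev(proportions)        # should be close to sqrt(pq/n)
theoretical_sd = math.sqrt(p * (1 - p) / n)

print(f"mean of sample proportions: {mean_p:.4f}")
print(f"sd of sample proportions:   {sd_p:.4f}")
print(f"sqrt(pq/n):                 {theoretical_sd:.4f}")
```

The simulated mean and standard deviation should land near p and √(pq/n) respectively, with small differences due to the finite number of simulated samples.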
6. Summary
To summarize:
1.) The sampling distribution is a theoretical distribution of a sample statistic.
2.) There is a different sampling distribution for each sample statistic.
3.) Each sampling distribution is characterized by parameters, two of which are the mean μ and the standard deviation σ of the sampling distribution. The latter is called the standard error.
4.) The sampling distribution of the mean is a special case of the sampling distribution.
5.) The Central Limit Theorem relates the parameters of the sampling distribution of the mean to the population model and is very important in statistical thinking.