40 Test of Significance II – Large Sample

K. Ramya

epgp books

 

 

 

1 Introduction

 

This module discusses tests of significance for large samples. Before moving to large samples, the idea of a test of significance is reviewed in brief.

 

A test of significance is performed after framing the hypotheses (tentative statements), at a chosen level of significance such as 1%, 5% or 10%. The level of significance (denoted α or alpha) is the probability of rejecting a null hypothesis that is actually true, i.e. the chance of making a wrong decision of this kind (a Type I error). The choice of significance level rests with the researcher and has to be made cautiously, based on the subject area of the research. For example, in clinical research a 1% level of significance is advisable, as high accuracy is needed. On the other hand, research in social science can use a significance level of 5% or even 10%, as the results will not lead to fatal consequences even if they are wrong.

 

Each significance level has a corresponding confidence level, which represents the researcher's confidence in reporting the decision: confidence level = 100% - significance level. A significance level of 1% corresponds to a confidence level of 99%, 5% to 95%, and 10% to 90%.

Statistical significance is assessed by examining the test's p-value.

 

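At the 5% level of significance, the p-value is treated as follows: if p ≤ 0.05, the result is statistically significant and the null hypothesis is rejected; if p > 0.05, the result is not statistically significant and the null hypothesis is not rejected.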
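The p-value decision rule can be illustrated with a minimal Python sketch. It assumes a two-tailed test whose statistic follows the standard normal distribution, and the common convention of rejecting H0 when p ≤ α. The function names `two_tailed_p_value` and `decide` are illustrative and do not come from the module.

```python
import math

def two_tailed_p_value(z):
    """Two-tailed p-value for a standard normal test statistic z."""
    # P(|Z| >= |z|) = erfc(|z| / sqrt(2)) for a standard normal Z
    return math.erfc(abs(z) / math.sqrt(2))

def decide(p_value, alpha=0.05):
    """Assumed rule: reject H0 when the p-value is at most alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

# A statistic of 1.96 sits almost exactly on the 5% boundary
print(round(two_tailed_p_value(1.96), 4))
print(decide(two_tailed_p_value(2.5)))   # far in the tail
print(decide(two_tailed_p_value(1.0)))   # well inside the acceptance region
```

Changing `alpha` to 0.01 or 0.10 gives the rule for the other significance levels mentioned above.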

 

1.2 Large Sample Test

 

A sample is conventionally regarded as large when its size is greater than 30. The Z-test is generally applied to large samples. It is a statistical test whose test statistic is approximately normally distributed under the null hypothesis. Because of the central limit theorem, many test statistics are approximately normally distributed for large samples.
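The role of the central limit theorem can be seen in a small simulation. The sketch below is only an illustration under assumed settings: a skewed exponential population with mean 1, samples of size 50, and 2000 repetitions. The helper name `simulate_sample_means` is hypothetical.

```python
import random
import statistics

def simulate_sample_means(n, replicates, seed=0):
    """Draw `replicates` samples of size n from an exponential population
    with mean 1 and return the list of sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(replicates)
    ]

means = simulate_sample_means(n=50, replicates=2000)
# Theory: the sample means centre on 1 with spread close to 1/sqrt(50)
print(round(statistics.fmean(means), 3), round(statistics.stdev(means), 3))
```

Although the population is far from normal, the sample means cluster around the population mean with a spread close to σ/√n, which is what justifies the normal approximation behind the Z-test.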

 

Conditions to be fulfilled for Z-tests

  • the data are drawn using simple random sample (SRS) from the population of interest
  • the sample size is large (n > 30) and the population is at least 10 times as large as the sample, and/or
  • the population standard deviation (or variance) is known.

Note: If the population standard deviation is unknown, the sample SD may be used in its place when the sample is large.

 

The following tests are discussed in large sample tests.

 

(i) Test of significance for proportion

(ii) Test of significance for difference between two proportions

(iii) Test of significance for mean

(iv) Test of significance for difference between two means.

 

1.2.1 Large Sample Test of Hypothesis – Step by Step Procedure

 

Identify the null hypothesis (specific claim to be tested) H0 : µ = µ0

Identify the alternative hypothesis that must be true when the original claim is false.

One-tailed test Ha: µ > µ0 or Ha: µ < µ0

Two-tailed test Ha: µ ≠ µ0

 

Once the null and alternative hypotheses have been formulated for a particular claim, the next step is to compute a test statistic. For large samples the appropriate significance test is the Z-test, so calculate the Z test statistic.

Select the significance level α based on the seriousness of a Type I error. The values 0.05 and 0.01 are very common.

 

Determine the critical values and the critical region. Draw a graph and include the test statistic, critical value(s), and critical (rejection) region.

 

Reject H0 if the test statistic is in the critical region. Fail to reject H0 if the test statistic is not in the critical region.

 

Restate this decision in simple, non-technical terms.
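The critical-value steps above can be sketched in Python using the standard library's `statistics.NormalDist`. This is a minimal illustration, not a prescribed implementation: the function names and the `alternative` labels ("two-sided", "greater", "less") are assumptions made for the example.

```python
from statistics import NormalDist

def critical_value(alpha=0.05, two_tailed=True):
    """Positive critical value of the standard normal distribution."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

def z_decision(z, alpha=0.05, alternative="two-sided"):
    """Reject H0 when the test statistic falls in the critical region."""
    if alternative == "two-sided":      # Ha: mu != mu0
        reject = abs(z) > critical_value(alpha, two_tailed=True)
    elif alternative == "greater":      # Ha: mu > mu0
        reject = z > critical_value(alpha, two_tailed=False)
    elif alternative == "less":         # Ha: mu < mu0
        reject = z < -critical_value(alpha, two_tailed=False)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return "reject H0" if reject else "fail to reject H0"

print(round(critical_value(0.05), 2), round(critical_value(0.01), 2))
print(z_decision(2.3), "|", z_decision(-1.7), "|", z_decision(-1.7, alternative="less"))
```

The same statistic can lead to different decisions under one-tailed and two-tailed alternatives, which is why the alternative hypothesis has to be fixed before the test statistic is examined.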

 

1.3 Test of Significance for Proportion:

 

A hypothesis test of a proportion can be done when the following conditions are met:

  • Sample follows Simple Random Sampling (SRS)
  • Each sample point can result in just two possible outcomes. (Success or Failure)
  • The sample includes at least 10 successes and 10 failures.
  • The population size is at least 20 times as big as the sample size.
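The count-based conditions listed above are easy to check before running the test. The sketch below is an illustrative helper (the name `proportion_test_conditions` and the optional `population_size` argument are assumptions); random sampling itself cannot be verified from the counts and is left to the study design.

```python
def proportion_test_conditions(successes, n, population_size=None):
    """Return a dict of the checkable conditions and whether each holds."""
    failures = n - successes
    checks = {
        "at least 10 successes": successes >= 10,
        "at least 10 failures": failures >= 10,
    }
    # The population-size condition can only be checked when N is known
    if population_size is not None:
        checks["population at least 20 times the sample"] = (
            population_size >= 20 * n
        )
    return checks

# 192 successes out of 250 observations, population size not stated
print(proportion_test_conditions(192, 250))
```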

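Here we test whether the sample proportion is consistent with the hypothesized population proportion.

The sampling distribution of the sample proportion p is approximately normal with mean P and standard deviation $\sqrt{PQ/n}$, where P is the hypothesized population proportion, Q = 1 - P and n is the sample size. The test statistic is

$$Z = \frac{p - P}{\sqrt{PQ/n}}$$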

Example 1: ABC bus services claims that its buses reach their destination on time 90% of the time. A consumer group surveyed passengers and found that only 192 out of 250 passengers reported that the bus reached on time. Is the claim of the bus services valid? Explain.

 

H0: Proportion of buses reaching on time is equal to 0.90

 

H1: Proportion of buses reaching on time is not equal to 0.90

 

Solution:

 

We are given n = 250

p = observed proportion in the sample = 192/250 = 0.768

P = claimed proportion in the population = 0.90

So, in the sample, only 76.8% of the buses reached on time, which is well below the claimed 90%. Let us check this statistically using the Z-test, as the sample is large.

$$Z = \frac{0.768 - 0.90}{\sqrt{0.90 \times 0.10 / 250}} \approx -6.96$$

Result: Since the calculated value |Z0| = 6.96 is greater than the critical value Ze = 1.96, we reject our null hypothesis at the 5% level of significance and conclude that the claim of the bus services is invalid; the sample indicates that fewer than 90% of the buses reach their destination on time.
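The calculation for this example can be reproduced with a short Python sketch. It assumes the usual one-sample proportion statistic, with the standard error computed from the hypothesized proportion, and a two-tailed critical value of 1.96 at the 5% level. The function name `one_proportion_z` is illustrative.

```python
import math

def one_proportion_z(successes, n, p0):
    """Z statistic for H0: population proportion equals p0."""
    p_hat = successes / n                           # observed sample proportion
    standard_error = math.sqrt(p0 * (1 - p0) / n)   # computed under H0
    return (p_hat - p0) / standard_error

z = one_proportion_z(successes=192, n=250, p0=0.90)
reject = abs(z) > 1.96   # two-tailed test at the 5% level
print(round(z, 2), reject)
```

The statistic comes out at about -6.96, far beyond the critical value, so the null hypothesis is rejected.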

 

Example 2: In a random sample of 800 toys from a large toy shop 240 toys were for playing purpose. Can it be said that toys to be kept in the showcases and toys for playing purpose are in the ratio 5:3 in the population? Use 5% level of significance.

 

H0: Showcase toys and playing toys in the population are in the ratio 5:3, i.e. the proportion of playing toys is 3/8 = 0.375.

H1: Showcase toys and playing toys in the population are not in the ratio 5:3.

 

Solution:

 

We are given n = 800

p = observed proportion of playing toys in the sample = 240/800 = 0.30

P = proportion of playing toys in the population = 3/8 = 0.375

$$Z = \frac{0.30 - 0.375}{\sqrt{0.375 \times 0.625 / 800}} \approx -4.38$$

Result: Since the calculated value |Z0| = 4.38 is greater than the critical value Ze = 1.96, we reject our null hypothesis at the 5% level of significance and conclude that showcase toys and playing toys in the population are not in the ratio 5:3.

 

Example 3: In a sample of 200 parts manufactured by a factory, the number of defective parts was found to be 15. The company, however, claimed that only 5% of their product is defective. Is the claim tenable?

 

H0: The proportion of defective parts is 5% (P = 0.05), i.e. the claim of the company is acceptable.

H1: The proportion of defective parts is not 5% (P ≠ 0.05), i.e. the claim of the company is not acceptable.

Solution:

We are given n = 200

p = proportion of defectives in the sample = 15/200 = 0.075

P = proportion of defectives in the population = 0.05

$$Z = \frac{0.075 - 0.05}{\sqrt{0.05 \times 0.95 / 200}} \approx 1.62$$

Result: Since the calculated value |Z0| = 1.62 is less than the critical value Ze = 1.96, we fail to reject our null hypothesis at the 5% level of significance and conclude that the company's claim is tenable.
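The same example can also be read through its p-value. The sketch below assumes a two-tailed test of the hypothesis that the population proportion of defectives equals 0.05; the function name `one_proportion_p_value` is illustrative.

```python
import math

def one_proportion_p_value(successes, n, p0):
    """Return (z, two-tailed p-value) for H0: population proportion = p0."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    # Two-tailed area beyond |z| under the standard normal curve
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

z, p_value = one_proportion_p_value(successes=15, n=200, p0=0.05)
print(round(z, 2), round(p_value, 3), p_value > 0.05)
```

The p-value is a little above 0.10, which is larger than 0.05, so the data do not give enough evidence against a 5% defective rate at the 5% level of significance.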

 

1.4 Test of significance for difference between two proportions:

 

This test compares two sample proportions, each from a different group.

 

This test answers the following research questions. Are the two groups the same? Are they different?

 

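Formula:

$$Z = \frac{p_1 - p_2}{\sqrt{PQ\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \qquad P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}, \qquad Q = 1 - P$$

Where,

p1 = Sample proportion 1

p2 = Sample proportion 2

n1 = Number of observations of sample 1

n2 = Number of observations of sample 2

P = Pooled estimate of the common proportion under H0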

 

Example 1: In a referendum submitted to the ‘student body’ at a university, 920 men and 450 women voted. 530 of the men and 310 of the women voted ‘yes’. Does this indicate a significant difference of the opinion on the matter between men and women students?

 

H0: There is no significant difference of the opinion on the matter between men and women students

 

H1: There is significant difference of the opinion on the matter between men and women students

 

Number of men who voted (n1) = 920

Number of women who voted (n2) = 450

Number of men who voted "yes" (x1) = 530

Number of women who voted "yes" (x2) = 310

p1 = Proportion of men who voted "yes" = 530/920 = 0.576

p2 = Proportion of women who voted "yes" = 310/450 = 0.689

 

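Test Statistic:

Pooled proportion P = (530 + 310)/(920 + 450) = 0.613, Q = 1 - P = 0.387

$$Z = \frac{0.576 - 0.689}{\sqrt{0.613 \times 0.387 \left(\frac{1}{920} + \frac{1}{450}\right)}} \approx -4.03$$

Result: Since |Z0| = 4.03 > Ze = 1.96, we reject our null hypothesis at the 5% level of significance and say that the data indicate a significant difference of opinion on the matter between men and women students.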
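The referendum example can be checked numerically with the sketch below. The pooled standard error is an assumption of this sketch (a common choice when H0 states that the two proportions are equal); an unpooled standard error would give a slightly different value. The function name `two_proportion_z` is illustrative.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for H0: the two population proportions are equal."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)   # common proportion estimated under H0
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error

# 530 of 920 men and 310 of 450 women voted "yes"
z = two_proportion_z(x1=530, n1=920, x2=310, n2=450)
print(round(z, 2), abs(z) > 1.96)
```

With these counts the statistic is roughly -4.03; its absolute value exceeds 1.96, so the difference between the two groups is significant at the 5% level.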

 

Example 2: In a certain city, 325 men in a sample of 1000 are found to be self employed. In another city, the number of self employed is 1375 in a random sample of 2000. Does this indicate that there is a greater proportion of self employed in the second city than in the first?

H0: There is no significant difference between the proportions of self employed men in the 2 cities

H1: There is a significant difference between the proportions of self employed men in the 2 cities

 

Sample size in the first city (n1) = 1000

Sample size in the second city (n2) = 2000

Number of self employed in the first sample (x1) = 325

Number of self employed in the second sample (x2) = 1375

p1 = Proportion of self employed men in the first city = 325/1000 = 0.325

p2 = Proportion of self employed men in the second city = 1375/2000 = 0.6875

 

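On the face of it, the proportion of self employed differs between the two cities. Let us check it statistically.

Pooled proportion P = (325 + 1375)/(1000 + 2000) = 0.567, Q = 1 - P = 0.433

$$Z = \frac{0.325 - 0.6875}{\sqrt{0.567 \times 0.433 \left(\frac{1}{1000} + \frac{1}{2000}\right)}} \approx -18.89$$

Result: Since |Z0| = 18.89 > Ze = 1.96, we reject the null hypothesis at the 5% level of significance and say that there is a significant difference between the two population proportions, the sample proportion being higher in the second city.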
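Because the question in this example asks whether the proportion is greater in the second city, it can also be treated as a one-tailed test. The sketch below is an illustration under that assumption (Ha: P1 < P2), again using a pooled standard error; the names are hypothetical and `statistics.NormalDist` supplies the lower-tail critical value.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Pooled Z statistic for H0: P1 = P2."""
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / standard_error

z = two_proportion_z(x1=325, n1=1000, x2=1375, n2=2000)
# Lower-tail critical value for a one-tailed test at the 5% level
critical = NormalDist().inv_cdf(0.05)
print(round(z, 2), round(critical, 3), z < critical)
```

The statistic lies far below the lower-tail critical value of about -1.645, so the one-tailed test also rejects H0 in favour of a higher proportion in the second city.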

 

1.5 Test of significance for mean:

 

This test is applied to test the significance of single sample mean with population mean.

 

Formula:

$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

Where,

x̄ = Sample mean

µ = Hypothetical mean of the population

σ = Standard deviation

n = Total number of observations

Note: If the standard deviation of the population is not known, we can use the standard deviation of the sample, represented by small 's'.

 

Example 1: The mean lifetime of 200 fluorescent light bulbs produced by a company is computed to be 3140 hours with a standard deviation of 240 hours. Check if there is any difference between population and sample mean using a 5% level of significance.

 

H0: There is no significant difference between the sample mean and the population mean

 

H1: There is significant difference between the sample mean and the population mean

 

Population mean (µ) = 3200 hrs

Sample mean (x̄) = 3140 hrs

Standard deviation of sample (s) = 240 hrs

Total number of observations (n) = 200

The sample mean is lower than the population mean. Let us check whether this difference is statistically significant.

$$Z = \frac{3140 - 3200}{240 / \sqrt{200}} = \frac{-60}{16.97} \approx -3.54$$

Result: Since |Z0| = 3.54 > Ze = 1.96, we reject our null hypothesis at the 5% level of significance and say that there is a significant difference between the sample mean and the population mean.
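The light bulb example can be verified with a short sketch. It assumes the usual large-sample statistic in which the sample SD stands in for the unknown population SD; the function name `one_sample_mean_z` is illustrative.

```python
import math

def one_sample_mean_z(sample_mean, mu0, sd, n):
    """Z statistic for H0: population mean equals mu0."""
    standard_error = sd / math.sqrt(n)   # SD of the sample mean
    return (sample_mean - mu0) / standard_error

z = one_sample_mean_z(sample_mean=3140, mu0=3200, sd=240, n=200)
print(round(z, 2), abs(z) > 1.96)
```

The statistic is about -3.54, so the difference between the sample mean and the population mean is significant at the 5% level.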

 

Example 2: The mean productive time for individuals who have not attended any special training is 30 hours with standard deviation 8. A training company is interested in finding out if the special training has any effect on the productive time. The company chooses a random sample of 200 individuals who have attended the special training and determines the mean productive time for those individuals was 28.5 hours. Was the special training effective?

 

H0: µ is equal to 30

H1: µ is not equal to 30

$$Z = \frac{28.5 - 30}{8 / \sqrt{200}}$$

Z = -1.5/0.5657 = -2.65

Result: Since |Z0| = 2.65 > Ze = 1.96, we reject our null hypothesis at the 5% level of significance and conclude that the special training had a significant effect on the productive time.

 

1.6 Test of significance for difference between two means:

 

It is much more common for a researcher to be interested in the difference between means than in the specific values of the means themselves. This test covers how to test for differences between means from two separate groups of subjects.

 

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$

Where,

x̄1 = Mean of sample 1

x̄2 = Mean of sample 2

σ1 = Standard deviation of population 1

σ2 = Standard deviation of population 2

n1 = Number of observations of sample 1

n2 = Number of observations of sample 2

 

Example 1: A test of the breaking strengths of two different types of cables was conducted using samples of n1 = n2 = 100 pieces of each type of cable.

Do the data provide sufficient evidence to indicate a difference between the mean breaking strengths of the two cables? Use 0.05 level of significance.

 

H0: There is no significant difference in the mean breaking strengths of two cables

 

H1: There is a significant difference in the mean breaking strengths of two cables

 

Mean of sample 1 (x̄1) = 1425

Mean of sample 2 (x̄2) = 1405

Standard deviation 1 (σ1) = 40

Standard deviation 2 (σ2) = 30

Number of observations of sample 1 (n1) = 100

Number of observations of sample 2 (n2) = 100

The two sample means differ by 20 units. Let us check whether this difference is statistically significant.

$$Z = \frac{1425 - 1405}{\sqrt{\frac{40^2}{100} + \frac{30^2}{100}}} = \frac{20}{5} = 4$$

Inference: Since |Z0| = 4 > Ze = 1.96, we reject H0 at the 0.05 level of significance, i.e. there is a significant difference in the mean breaking strengths of the two cables.
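The cable example can be reproduced with the following sketch. It assumes two independent large samples with separate standard deviations; the function name `two_sample_mean_z` is illustrative.

```python
import math

def two_sample_mean_z(mean1, sd1, n1, mean2, sd2, n2):
    """Z statistic for H0: the two population means are equal."""
    # Standard error of the difference between two independent sample means
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

z = two_sample_mean_z(mean1=1425, sd1=40, n1=100, mean2=1405, sd2=30, n2=100)
print(z, abs(z) > 1.96)
```

Here the standard error is exactly 5, so the statistic is 20/5 = 4, which lies in the critical region at the 0.05 level.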

 

Example 2: The means of two large samples of 500 and 1000 items are 47.5 cm and 48.0 cm respectively. Can the samples be regarded as drawn from the same population with standard deviation 2.5 cm? Test at 5% level of significance.

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

Where,

x̄1 = Mean of sample 1

x̄2 = Mean of sample 2

σ = Standard deviation of population

n1 = Number of observations of sample 1

n2 = Number of observations of sample 2

 

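H0: The samples have been drawn from the same population (µ1 = µ2)

H1: The samples have not been drawn from the same population (µ1 ≠ µ2)

$$Z = \frac{47.5 - 48.0}{2.5 \sqrt{\frac{1}{500} + \frac{1}{1000}}} = \frac{-0.5}{0.1369} \approx -3.65$$

Inference: Since |Z0| = 3.65 > Ze = 1.96, we reject H0 at the 5% level of significance and conclude that the samples have not come from the same population.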
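When both samples are assumed to come from a population with one common standard deviation, the statistic simplifies. The sketch below illustrates this case with the figures of the example (sample sizes 500 and 1000, means 47.5 and 48.0, SD 2.5); the function name `two_sample_mean_common_sd_z` is hypothetical.

```python
import math

def two_sample_mean_common_sd_z(mean1, n1, mean2, n2, sd):
    """Z statistic for two sample means under a common population SD."""
    standard_error = sd * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / standard_error

z = two_sample_mean_common_sd_z(mean1=47.5, n1=500, mean2=48.0, n2=1000, sd=2.5)
print(round(z, 2), abs(z) > 1.96)
```

The statistic is about -3.65; since its absolute value exceeds 1.96, the hypothesis of a common population is rejected at the 5% level.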

 

7 Conclusion

 

The Z-test is used when the population is normally distributed and its SD is known, and when the sample is large, i.e. n > 30.

 

We discussed the following tests for significance for large sample.

 

(i) Test of significance for proportion

(ii) Test of significance for difference between two proportions

(iii) Test of significance for mean

(iv) Test of significance for difference between two means.

 

The Z-test differs from the t-test, which follows the t distribution. As the sample size becomes larger, the t-test approaches the Z-test.

The t-test and the Z-test are similar in calculation, except for the distribution used and slight differences in the formula. In SPSS, however, this distinction is not visible: for large samples the same menu and test are used, and the results are reported as t statistics instead of z statistics.

Therefore, researchers need not be confused by the results of SPSS. For large samples, the t statistics reported by SPSS can be read as Z-test results.

 


 
