29 Analysis of Variance and Experimental Design: Two-Way ANOVA

Prof. Pankaj Madan

 

Learning Objectives:

 

After the completion of this module the student will understand:

  • An introduction to analysis of variance: two-way ANOVA
  • The difference between one-way and two-way ANOVA
  • Blocking in two-way ANOVA
  • The general ANOVA table for two-way classification

 

1.  Introduction

 

In statistics, the two-way analysis of variance is an extension of the one-way ANOVA that examines the influence of two different categorical independent variables on one continuous dependent variable. In two-way analysis of variance, two criteria (or factors) are used to analyse the differences among more than two population means.

 

The two-way analysis of variance can be used to:

  • Explore one criterion (or factor) of interest while partitioning the sample data so as to remove the unaccountable variation and arrive at a true conclusion.
  • Investigate two criteria (factors) of interest for testing the difference between sample means.
  • Consider any interaction between the two variables.

 

2.  Difference between one-way and two-way ANOVA

 

| Basis for comparison | One-way ANOVA | Two-way ANOVA |
| --- | --- | --- |
| Meaning | A hypothesis test used to test the equality of three or more population means simultaneously using variance | A statistical technique in which the interaction between the factors influencing the dependent variable can be studied |
| Independent variables | One | Two |
| Compares | Three or more levels of one factor | The effect of multiple levels of two factors |
| Number of observations | Need not be the same in each group | Needs to be equal in each group |
| Design of experiments | Needs to satisfy only two principles | All three principles need to be satisfied |

3.  Blocking in two-way ANOVA

 

In two-way analysis of variance we introduce another term, the ‘blocking variable’, to remove the undesirable accountable variation. A blocking variable is a variable that the researcher wants to control but that is not the treatment (sample/population) variable of interest. The term ‘blocking’ has an agricultural origin and refers to a block of land. The block of land might make some difference in a study of the growth pattern of different seed varieties. R. A. Fisher designated several different plots of land as blocks, which he controlled as a second variable. Each seed variety was planted on each of the blocks. The main aim of his study was to compare the seed varieties (the independent variable); he wanted only to control for the differences between the plots of land (the blocking variable).
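The effect of blocking on the error term can be illustrated with a short Python sketch. It uses the whiteness readings from the detergent exercise (Q.2) later in this module, treating detergents as the treatment and water temperature as the block. The variable names are illustrative assumptions, and the "without blocking" figure is simply what a one-way analysis of the same nine readings would treat as error.

```python
# Whiteness readings from the detergent exercise (Q.2) in this module:
# rows = water temperature (blocks), columns = detergents A, B, C.
data = [
    [57, 55, 67],   # cold water
    [49, 52, 68],   # warm water
    [54, 46, 58],   # hot water
]
r, c = len(data), len(data[0])
n = r * c
grand_total = sum(map(sum, data))
cf = grand_total ** 2 / n                       # correction factor T^2 / n

col_totals = [sum(row[j] for row in data) for j in range(c)]
row_totals = [sum(row) for row in data]

sst = sum(x * x for row in data for x in row) - cf
sstr = sum(t * t for t in col_totals) / r - cf  # between detergents
ssr = sum(t * t for t in row_totals) / c - cf   # between temperatures (blocks)

# Ignoring the blocks: everything that is not treatment counts as error.
sse_unblocked = sst - sstr
mse_unblocked = sse_unblocked / (n - c)

# Using the blocks: the row (block) variation is taken out of the error.
sse_blocked = sst - sstr - ssr
mse_blocked = sse_blocked / ((c - 1) * (r - 1))

print(f"error SS without blocking: {sse_unblocked:.2f}")
print(f"error SS with blocking:    {sse_blocked:.2f}")
```

The block sum of squares is exactly the amount by which the error sum of squares shrinks, which is the sense in which blocking removes accountable variation from the error term.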

 

 

4.  General ANOVA table for two-way classification

| Source of variation | Sum of squares | Degrees of freedom | Mean square | Test statistic |
| --- | --- | --- | --- | --- |
| Between columns | SSTR | c-1 | MSTR = SSTR/(c-1) | Fpopulation/treatment = MSTR/MSE |
| Between rows | SSR | r-1 | MSR = SSR/(r-1) | Fblocks = MSR/MSE |
| Residual error | SSE | (c-1)(r-1) | MSE = SSE/((c-1)(r-1)) | |
| Total | SST | n-1 | | |

 

SSTR = Sum of squares between columns

SSR = Sum of squares between rows

SSE = Sum of squares due to error

SST = Total sum of squares

MSTR = Mean square (columns)

MSR = Mean square (rows)

MSE = Mean square (error)

 

As stated above, the total variation consists of three parts: (i) variation between columns, SSTR; (ii) variation between rows, SSR; and (iii) actual variation due to random error, SSE. That is,

 

SST = SSTR + (SSR + SSE)
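The partition can also be written out with the computational formulas that the worked exercises in this module follow. The notation is an assumption introduced here for illustration: $x_{ij}$ is the observation in row $i$ and column $j$, $T_{i\cdot}$ and $T_{\cdot j}$ are the row and column totals, $T$ is the grand total, and $n = cr$ is the number of observations.

$$
\begin{aligned}
CF &= \frac{T^2}{n} \\
SSTR &= \sum_{j=1}^{c} \frac{T_{\cdot j}^2}{r} - CF \\
SSR &= \sum_{i=1}^{r} \frac{T_{i\cdot}^2}{c} - CF \\
SST &= \sum_{i=1}^{r} \sum_{j=1}^{c} x_{ij}^2 - CF \\
SSE &= SST - (SSTR + SSR)
\end{aligned}
$$

Each column total is divided by $r$ because a column contains $r$ observations, and each row total by $c$ for the same reason.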

 

The degrees of freedom associated with SST are cr-1 (that is, n-1 with n = cr observations), where c and r are the numbers of columns and rows, respectively.

 

Degrees of freedom between columns = c-1

 

Degrees of freedom between rows = r-1

 

Degrees of freedom for residual error = (c-1)(r-1) = N-r-c+1, where N = cr is the total number of observations

The test statistic F for analysis of variance is given by

 

Fpopulation/treatment = MSTR/MSE if MSTR > MSE, or MSE/MSTR if MSE > MSTR

Fblocks = MSR/MSE if MSR > MSE, or MSE/MSR if MSE > MSR

 

Decision rule

 

If Fcal < Ftable, accept the null hypothesis H0; otherwise, reject H0.
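The table and the decision rule can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names `two_way_anova` and `decide` and the small data set are hypothetical, the table value must be supplied by the reader from an F table, and the F ratios are always formed as MSTR/MSE and MSR/MSE as in the ANOVA table (the numerator and denominator are not swapped when MSE is larger).

```python
def two_way_anova(table):
    """Two-way ANOVA with one observation per cell.

    `table` is a list of rows (blocks); each row holds one observation
    per column (treatment).
    """
    r = len(table)                 # number of rows (blocks)
    c = len(table[0])              # number of columns (treatments)
    n = r * c
    total = sum(sum(row) for row in table)
    cf = total ** 2 / n            # correction factor T^2 / n

    col_totals = [sum(row[j] for row in table) for j in range(c)]
    row_totals = [sum(row) for row in table]

    sstr = sum(t ** 2 for t in col_totals) / r - cf
    ssr = sum(t ** 2 for t in row_totals) / c - cf
    sst = sum(x ** 2 for row in table for x in row) - cf
    sse = sst - (sstr + ssr)

    mstr = sstr / (c - 1)
    msr = ssr / (r - 1)
    mse = sse / ((c - 1) * (r - 1))
    return {
        "SSTR": sstr, "SSR": ssr, "SSE": sse, "SST": sst,
        "MSTR": mstr, "MSR": msr, "MSE": mse,
        "F_treatment": mstr / mse, "F_blocks": msr / mse,
    }


def decide(f_cal, f_table):
    """Decision rule: accept H0 only when F_cal < F_table."""
    return "accept H0" if f_cal < f_table else "reject H0"


# Hypothetical 3 x 3 data, chosen only so that the arithmetic is easy to follow.
result = two_way_anova([[4, 6, 8], [5, 9, 7], [3, 6, 6]])
f_table = 6.94   # F table value for df (2, 4) at alpha = 0.05
print(result["F_treatment"], decide(result["F_treatment"], f_table))  # 9.0 reject H0
print(result["F_blocks"], decide(result["F_blocks"], f_table))        # 3.0 accept H0
```

For this data SSTR = 18, SSR = 6 and SSE = 4, so MSE = 1 and the two F ratios are 9 and 3.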

 

 

5.  Self-Check Exercise with solutions

 

Q.1. The following table gives the number of refrigerators sold by 4 salesmen in three months May, June and July:

Is there a significant difference in the sales made by the four salesmen? Is there a significant difference in the sales made during different months?

 

Solution

 

Let us take the null hypothesis that there is no significant difference between the sales made by the four salesmen or between the sales made during the different months. The given data are coded by subtracting 40 from each observation. Calculations for the two-criteria (month and salesman) analysis of variance are shown in the table:

 

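The coded column totals are 15, 12, 18 and 3, so the grand total is T = 48 over n = 12 observations, and the correction factor is

$$CF = \frac{T^2}{n} = \frac{(48)^2}{12} = 192$$

SSTR = Sum of squares between salesmen (columns)

$$SSTR = \left\{\frac{(15)^2}{3} + \frac{(12)^2}{3} + \frac{(18)^2}{3} + \frac{(3)^2}{3}\right\} - 192 = (75 + 48 + 108 + 3) - 192 = 42$$

SSR = Sum of squares between months (rows)

$$SSR = \left\{\frac{(17)^2}{4} + \frac{(29)^2}{4} + \frac{(2)^2}{4}\right\} - 192 = (72.25 + 210.25 + 1) - 192 = 91.5$$

SST = Total sum of squares

$$SST = \left(\sum x_1^2 + \sum x_2^2 + \sum x_3^2 + \sum x_4^2\right) - CF = (137 + 80 + 164 + 27) - 192 = 216$$

$$SSE = SST - (SSTR + SSR) = 216 - (42 + 91.5) = 82.5$$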

 

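The total degrees of freedom are df = n-1 = 12-1 = 11, so

dfc = c-1 = 4-1 = 3

dfr = r-1 = 3-1 = 2

dfe (error) = (c-1)(r-1) = 3×2 = 6

Thus

$$MSTR = \frac{SSTR}{c-1} = \frac{42}{3} = 14$$

$$MSR = \frac{SSR}{r-1} = \frac{91.5}{2} = 45.75$$

$$MSE = \frac{SSE}{(c-1)(r-1)} = \frac{82.5}{6} = 13.75$$

$$F_{\text{population/treatment}} = \frac{MSTR}{MSE} = \frac{14}{13.75} = 1.018, \qquad F_{\text{block}} = \frac{MSR}{MSE} = \frac{45.75}{13.75} = 3.327$$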

 

(a) The table value of F is 4.76 for df1 = 3, df2 = 6 and α = 0.05. Since the calculated value Fpopulation/treatment = 1.018 is less than the table value, the null hypothesis is accepted. Hence we conclude that the sales made by the four salesmen do not differ significantly.

 

(b) The table value of F is 5.14 for df1 = 2, df2 = 6 and α = 0.05. Since the calculated value Fblock = 3.327 is less than the table value, the null hypothesis is accepted. Hence we conclude that the sales made during the different months do not differ significantly.
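The arithmetic of this solution can be checked with a short Python sketch. It works only from the coded totals and sums of squares quoted in the solution (observations minus 40). The variable names are illustrative assumptions.

```python
# Coded totals quoted in the solution (each observation minus 40).
col_totals = [15, 12, 18, 3]        # one total per salesman
row_totals = [17, 29, 2]            # one total per month
col_sq_sums = [137, 80, 164, 27]    # sum of squared coded values per salesman

c, r = len(col_totals), len(row_totals)
n = c * r
T = sum(col_totals)
cf = T ** 2 / n                      # 48^2 / 12 = 192

sstr = sum(t ** 2 / r for t in col_totals) - cf
ssr = sum(t ** 2 / c for t in row_totals) - cf
sst = sum(col_sq_sums) - cf
sse = sst - (sstr + ssr)

mstr = sstr / (c - 1)
msr = ssr / (r - 1)
mse = sse / ((c - 1) * (r - 1))

print(sstr, ssr, sst, sse)                          # 42.0 91.5 216.0 82.5
print(round(mstr / mse, 3), round(msr / mse, 3))    # 1.018 3.327
```

Both F ratios fall below their table values, which agrees with conclusions (a) and (b).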

 

Q.2. To study the performance of three detergents at three different water temperatures, the following ‘whiteness’ readings were obtained with specially designed equipment:

| Water temperature | Detergent A | Detergent B | Detergent C |
| --- | --- | --- | --- |
| Cold water | 57 | 55 | 67 |
| Warm water | 49 | 52 | 68 |
| Hot water | 54 | 46 | 58 |

Perform a two-way analysis of variance, using the 5 percent level of significance.

 

Solution

 

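Let us take the null hypothesis that there is no significant difference in the performance of the three detergents and no significant difference due to water temperature. The data are coded by subtracting 50 from each observation. The data in coded form are given in the table:

| Water temperature | Detergent A | Detergent B | Detergent C | Row total |
| --- | --- | --- | --- | --- |
| Cold water | 7 | 5 | 17 | 29 |
| Warm water | -1 | 2 | 18 | 19 |
| Hot water | 4 | -4 | 8 | 8 |
| Column total | 10 | 3 | 43 | 56 |

T = Sum of all observations in the three samples of detergent = 56

$$CF = \text{Correction factor} = \frac{T^2}{n} = \frac{(56)^2}{9} = 348.44$$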

SSTR = Sum of squares between detergents (columns)

$$SSTR = \left\{\frac{(10)^2}{3} + \frac{(3)^2}{3} + \frac{(43)^2}{3}\right\} - CF = (33.33 + 3 + 616.33) - 348.44 = 304.22$$

SSR = Sum of squares between water temperatures (rows)

$$SSR = \left\{\frac{(29)^2}{3} + \frac{(19)^2}{3} + \frac{(8)^2}{3}\right\} - CF = (280.33 + 120.33 + 21.33) - 348.44 = 73.55$$

SST = Total sum of squares

$$SST = \left(\sum x_1^2 + \sum x_2^2 + \sum x_3^2\right) - CF = (66 + 45 + 677) - 348.44 = 439.56$$

$$SSE = SST - (SSTR + SSR) = 439.56 - (304.22 + 73.55) = 61.79$$

Here SSE = sum of squares due to error, SST = total sum of squares, SSTR = sum of squares between columns, and SSR = sum of squares between rows.

$$MSTR = \frac{SSTR}{c-1} = \frac{304.22}{2} = 152.11$$

$$MSR = \frac{SSR}{r-1} = \frac{73.55}{2} = 36.775$$

$$MSE = \frac{SSE}{(c-1)(r-1)} = \frac{61.79}{4} = 15.447$$

MSTR = mean square (columns), MSR = mean square (rows), MSE = mean square (error)

 

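Two-way ANOVA table

| Source of variation | Sum of squares | Degrees of freedom | Mean square | F |
| --- | --- | --- | --- | --- |
| Between columns (detergents) | 304.22 | 2 | 152.11 | 9.847 |
| Between rows (water temperatures) | 73.55 | 2 | 36.775 | 2.380 |
| Residual error | 61.79 | 4 | 15.447 | |
| Total | 439.56 | 8 | | |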

 

(a) Since the calculated value Fpopulation/treatment = 9.847 at df1 = 2, df2 = 4 and α = 0.05 is greater than the table value F = 6.94, the null hypothesis is rejected. Hence we conclude that there is a significant difference between the performances of the three detergents.

(b) Since the calculated value Fblock = 2.380 at df1 = 2, df2 = 4 and α = 0.05 is less than the table value F = 6.94, the null hypothesis is accepted. Hence we conclude that water temperature does not make a significant difference in the performance of the detergents.
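A short Python sketch can reproduce this exercise from the raw readings and also show why coding the data is harmless: subtracting a constant from every observation leaves all the sums of squares unchanged. The helper name `sums_of_squares` is a hypothetical choice for illustration, and the table value 6.94 is the one quoted in the solution.

```python
def sums_of_squares(table):
    """Return (SSTR, SSR, SSE, SST) for a rows-by-columns table
    with one observation per cell."""
    r, c = len(table), len(table[0])
    cf = sum(map(sum, table)) ** 2 / (r * c)
    sstr = sum(sum(row[j] for row in table) ** 2 for j in range(c)) / r - cf
    ssr = sum(sum(row) ** 2 for row in table) / c - cf
    sst = sum(x * x for row in table for x in row) - cf
    return sstr, ssr, sst - sstr - ssr, sst


raw = [
    [57, 55, 67],   # cold water
    [49, 52, 68],   # warm water
    [54, 46, 58],   # hot water
]
coded = [[x - 50 for x in row] for row in raw]   # coding used in the solution

sstr, ssr, sse, sst = sums_of_squares(coded)
r, c = len(raw), len(raw[0])
mse = sse / ((c - 1) * (r - 1))
f_detergent = (sstr / (c - 1)) / mse
f_temperature = (ssr / (r - 1)) / mse

F_TABLE = 6.94   # table value for df (2, 4) at alpha = 0.05
print(f"F detergents  = {f_detergent:.2f}, reject H0: {f_detergent >= F_TABLE}")
print(f"F temperature = {f_temperature:.2f}, reject H0: {f_temperature >= F_TABLE}")

# Coding does not change any sum of squares (up to floating-point error).
same = all(abs(a - b) < 1e-6
           for a, b in zip(sums_of_squares(raw), sums_of_squares(coded)))
print("raw and coded data agree:", same)
```

Carried at full precision, the ratios come out at about 9.85 and 2.38. The hand calculation rounds intermediate values, which accounts for the small difference from 9.847 and 2.380. The conclusions are the same.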

 

SUMMARY

 

In this module, we have understood the difference between one-way ANOVA and two-way ANOVA. We also saw that ‘blocking’, or a ‘blocking variable’, is used in the analysis of variance technique to remove the undesirable accountable variation. We also learnt that in two-way ANOVA two variables or attributes are used to test the hypotheses, and the steps of the two-way ANOVA test were explained in detail, especially the comparison between the Fcal and Ftable values. Fpopulation/treatment is calculated by dividing MSTR (the mean square of columns) by MSE (the mean square error), while Fblock is calculated by dividing MSR (the mean square of rows) by MSE. The mean square error (MSE) is the error sum of squares divided by its degrees of freedom, (c-1)(r-1), and it serves as the denominator of both F ratios.

 

Learn More:

  1. https://www.google.co.in/#q=two+way+analysis+of+variance
  2. http://keydifferences.com/difference-between-one-way-and-two-way-anova.html
  3. Sharma, J. K. (2014). Business Statistics, 2nd ed. S. Chand & Company, New Delhi.
  4. Chandel, S. R. S. (2006). A Handbook of Agricultural Statistics. Anchal Prakashan Mandir, Kanpur.