44 Application of Software in Statistical Analysis II – SPSS

  1. Introduction

SPSS (Statistical Package for the Social Sciences) is user-friendly statistical software. SPSS programmes are Java based. SPSS can be used for everything from basic statistical analysis, such as charts, graphs and descriptive statistics, to advanced econometric models such as simultaneous equation models and structural equation models (SPSS with AMOS). In this module, we discuss selected statistical analyses which are frequently used in the social sciences.

 

Objective

 

At the end of this module, you will know how to enter data for various statistical analyses in the SPSS data sheet and how to perform statistical techniques such as frequency distribution, descriptive statistics, correlation, regression, discriminant analysis, the different types of t test, the chi square test, one way ANOVA and two way ANOVA. Let us discuss these techniques one by one.

 

Univariate Frequency Distribution

 

Find out the frequency distribution of the following:

Objective:

 

To find univariate frequency for the data given.

 

Procedure:

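  • Open SPSS and enter the data in the data sheet.
  • Go to variable view and give the variable name as age.
  • Go to Analyze and click Descriptive Statistics.
  • Click Frequencies, move age to the variable list and click OK. The frequency table will be displayed in the output file.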
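
For readers who want to check the frequency table by hand, the same count can be sketched outside SPSS. The Python snippet below is only an illustration: the ages are hypothetical (the module's data table is not reproduced here) and `frequency_table` is a helper name chosen for this sketch, not an SPSS command.

```python
from collections import Counter

# Hypothetical ages, used only for illustration.
ages = [25, 30, 25, 40, 30, 25, 35, 40, 30, 25]


def frequency_table(values):
    """Return (value, frequency, percent) rows sorted by value."""
    counts = Counter(values)
    n = len(values)
    return [(value, count, 100.0 * count / n)
            for value, count in sorted(counts.items())]


table = frequency_table(ages)
print("Age  Frequency  Percent")
for value, count, percent in table:
    print(f"{value:>3}  {count:>9}  {percent:>7.1f}")
```

Each row corresponds to one line of the Frequencies table: the value, how often it occurs, and its share of all cases.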
  1. Bi Variate Frequency Distribution

Construct bi variate frequency distribution for the following data

 

1 represents Primary education; 2 represents Secondary education

3 represents Higher Secondary education; 4 represents College education

5 represents Professional courses

 

Objective:

 

To find out the bi-variate frequency distribution for the given data.

 

Procedure:

  • Open SPSS and go to the data sheet.
  • Go to variable view and enter the variable names age and education in the Name column, one in each row.
  • Enter the data on age in one column and on education in another column of the data sheet.
  • In variable view, click the corner of the Values cell in the education row and give the value labels 1 = Primary Education, 2 = Secondary Education, 3 = Higher Secondary Education, 4 = College Education and 5 = Professional Education.
  • Go to Analyze and click Descriptive Statistics.
  • Select Crosstabs, put age in the row box and education in the column box, click Cells, select the observed counts and the percentages, click Continue and OK.
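
The cross tabulation produced by these steps simply counts how many cases fall in each combination of age and education. A minimal Python sketch of that counting is given below; the records are hypothetical, the education codes follow the value labels above, and `crosstab` is a name made up for this illustration.

```python
from collections import Counter

# Hypothetical (age group, education code) pairs; codes 1 to 5 follow the value labels.
records = [("20-30", 1), ("20-30", 2), ("30-40", 2),
           ("30-40", 4), ("20-30", 2), ("40-50", 5)]


def crosstab(pairs):
    """Return row labels, column labels and the matrix of cell counts."""
    cell_counts = Counter(pairs)
    rows = sorted({row for row, _ in pairs})
    cols = sorted({col for _, col in pairs})
    counts = [[cell_counts[(row, col)] for col in cols] for row in rows]
    return rows, cols, counts


rows, cols, counts = crosstab(records)
print("age \\ education", cols)
for row, row_counts in zip(rows, counts):
    print(row, row_counts)
```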
  1. Measures of Central Values and Dispersion

Calculate measures of central tendency and dispersion for the following data (ungrouped data) in SPSS.

Objectives:

 

To calculate measures of central value and dispersion for the given data.

 

Procedure:

  • Enter the data on temperature in the SPSS worksheet and enter the variable name temperature in variable view.
  • Go to Analyze, click Reports and click Case Summaries. The Summarize Cases dialogue box will open.
  • Put temperature in the variable list, click Statistics, select all the descriptive statistics, click Continue and OK. The output will be displayed.
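
The main statistics in this output can be reproduced with a few lines of Python, which may help in reading the SPSS table. This is a sketch under stated assumptions: the temperature values are hypothetical, `describe` is a helper name invented here, and the variance and standard deviation use the sample (n - 1) formula.

```python
import statistics

# Hypothetical body temperatures, used only for illustration.
temperatures = [98.2, 97.9, 98.6, 98.4, 98.0, 98.6, 97.8]


def describe(values):
    """Measures of central value and dispersion for ungrouped data."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance, divisor n - 1
        "std_dev": statistics.stdev(values),      # square root of the sample variance
    }


for name, value in describe(temperatures).items():
    print(f"{name:>8}: {value}")
```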
  1. Measures of Central Values and Dispersion

 

Suppose we want to calculate descriptive statistics for one variable within the categories of another variable. In that case, we can use the Compare Means option.

 

For example, we have data on sugar level for various age groups. Persons in the same age group may have different sugar levels in the survey responses. The data are distributed as follows.

The following is the procedure to calculate descriptive statistics for this type of data.

 

Procedure:

  • Enter the data on age and sugar level in the SPSS worksheet.
  • Enter the variable names age and sugar level in variable view.
  • Go to Analyze, click Compare Means and click Means. A dialogue box will open. In the dialogue box, put sugar level in the dependent list and age in the independent list.
  • Go to Options, select mean, median, geometric mean, harmonic mean, range, standard deviation, variance, skewness and kurtosis, click Continue and OK.
  • The output will appear in the output file.
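
What Compare Means does here is split the sugar level values by age and summarise each group separately. The Python sketch below shows the idea for a few of the statistics; the records are hypothetical and `group_summary` is a name chosen for this example.

```python
import statistics
from collections import defaultdict

# Hypothetical (age, sugar level) records, used only for illustration.
records = [(30, 110), (30, 120), (40, 130), (40, 150),
           (40, 140), (50, 160), (50, 180)]


def group_summary(pairs):
    """Summarise the second variable separately for each value of the first."""
    groups = defaultdict(list)
    for group, value in pairs:
        groups[group].append(value)
    summary = {}
    for group, values in sorted(groups.items()):
        summary[group] = {
            "n": len(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            "range": max(values) - min(values),
            # The sample standard deviation needs at least two cases.
            "std_dev": statistics.stdev(values) if len(values) > 1 else float("nan"),
        }
    return summary


summary = group_summary(records)
for age, stats in summary.items():
    print(age, stats)
```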
  1. Simple – Correlation

Simple correlation measures the relationship between two variables only. Estimate the simple correlation coefficient for the following data on age and sugar level.

 

Objective:

To calculate simple correlation co-efficient for the given data on age and sugar level.

Procedure:

  • Enter the data on age and sugar level in the SPSS data sheet. Give the variable names in variable view.
  • Go to Analyze → select Correlate, then Bivariate.
  • A dialogue box will open. In the dialogue box, select sugar level and age and put them in the variables list.
  • Select Pearson and click OK. The output will be displayed in the output file.
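
The Pearson coefficient reported in the output is the sum of cross products of deviations divided by the square root of the product of the two sums of squared deviations. A minimal Python sketch of this formula follows; the age and sugar level values are hypothetical and `pearson_r` is a helper written for this illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data, used only for illustration.
age = [30, 35, 40, 45, 50, 55]
sugar_level = [110, 118, 125, 140, 138, 155]
print(round(pearson_r(age, sugar_level), 3))
```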
  1. Multiple – Correlation

The multiple correlation coefficient measures the relationship between one variable and a set of two or more other variables taken together. Estimate the multiple correlation coefficient for the data given below in SPSS.

Procedure:

  • Enter the data on sugar level, age and rice in daily food in SPSS worksheet.
  • Go to the variable view, enter the variable name in the name column.
  • Go to Analyze, select Regression, select Linear.
  • A dialogue box will open. In the box, put sugar level in the dependent cell. Put age and rice in daily food in the independent cell.
  • Go to Statistics, select Part and partial correlations, R squared change, Model fit and Estimates, click Continue and click OK.
  • The output will be displayed in the output file of SPSS.

In the output, R in the model summary is the multiple R, or multiple correlation coefficient. It measures the strength of the relationship between sugar level and the two independent variables, age and rice in daily food, taken together.
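
With exactly two independent variables, the multiple R can also be obtained from the three simple correlation coefficients. The Python sketch below uses that textbook formula rather than SPSS itself; the data are hypothetical and the function names are chosen for this example only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def multiple_r(y, x1, x2):
    """Multiple correlation of y with x1 and x2 (two predictors only)."""
    r_y1, r_y2, r_12 = pearson_r(y, x1), pearson_r(y, x2), pearson_r(x1, x2)
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)


# Hypothetical data: sugar level, age and rice in daily food.
sugar_level = [110, 118, 125, 140, 138, 155]
age = [30, 35, 40, 45, 50, 55]
rice = [200, 250, 220, 300, 260, 320]
print(round(multiple_r(sugar_level, age, rice), 3))
```

The formula breaks down when the two independent variables are perfectly correlated with each other, because the denominator becomes zero.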

  1. Partial – Correlation

Purpose

To find out the correlation between two variables when the other variables are kept constant. Calculate the partial correlation coefficient for the following data.
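
The first-order partial correlation between two variables, holding a third constant, can be written in terms of the three simple correlations. The Python sketch below applies that formula; it is an illustration under the assumption of a single control variable, and `partial_r` is a name chosen here, not an SPSS function.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """Partial correlation of x and y with z held constant.

    Inputs are the three simple (Pearson) correlation coefficients.
    """
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical simple correlations: x = sugar level, y = age, z = rice in daily food.
print(round(partial_r(0.5, 0.5, 0.5), 4))
```

When the control variable is uncorrelated with both of the other variables, the partial correlation equals the simple correlation.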

 

  1. Regression Analysis

Purpose

To estimate the cause and effect relationship between the variables when the dependent variable is quantitative.

Estimate the regression equation

Y = β0 + β1 X1 + β2 X2 + u

for the following data on sugar level, age and rice in daily food

 

Procedure:

  • Enter the data on sugar level, age and rice in daily food in the SPSS worksheet.
  • Go to variable view and enter the variable names in the Name cells.
  • Go to Analyze → select Regression, select Linear.
  • A dialogue box will open. In the dialogue box, put sugar level in the dependent variable cell, put age and rice in daily food in the independent variable cell and click OK.

The output will be displayed in the output file.
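
The coefficients that appear in the output are ordinary least squares estimates. For two independent variables they can be computed directly from sums of squares and cross products, as the Python sketch below shows. The data are hypothetical and `ols_two_predictors` is a helper name invented for this illustration.

```python
def ols_two_predictors(y, x1, x2):
    """Least squares estimates (b0, b1, b2) for y = b0 + b1*x1 + b2*x2 + u."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    # Sums of squares and cross products of deviations from the means.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2  # zero if x1 and x2 are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


# Hypothetical data: sugar level (Y), age (X1) and rice in daily food (X2).
sugar_level = [110, 118, 125, 140, 138, 155]
age = [30, 35, 40, 45, 50, 55]
rice = [200, 250, 220, 300, 260, 320]
b0, b1, b2 = ols_two_predictors(sugar_level, age, rice)
print(f"Y = {b0:.3f} + {b1:.3f} X1 + {b2:.3f} X2")
```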

 

  1. Discriminant Analysis

Purpose

To identify the variables which discriminate between two or more groups. Estimate the discriminant equation for the following data.

Note: 1 = Sugar patient; 2 = Heart patient

Objectives

To estimate discriminant equation for the given data using SPSS

Procedure

  1. Enter the data on family income, age and patient category in the data sheet, then go to variable view and give the variable names.
  2. Go to Analyze, click Classify, then Discriminant. A dialogue box will open.
  3. In the dialogue box, put patient type in the grouping variable cell. Click Define Range and give the minimum value as 1 and the maximum value as 2.
  4. In the independent list, put family income and age as independent variables.
  5. Click Statistics, select Means, Univariate ANOVAs, Fisher's and Unstandardized function coefficients and so on.
  6. Click Continue and OK. The output will be displayed in the output file.
  7. Calculate the relative contribution of each variable to the total discriminant score. To calculate the relative contribution, first open an Excel sheet and enter the group I and group II means of family income and age.
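
To show what a two-group discriminant function looks like, here is a small Python sketch of Fisher's linear discriminant with two variables. It is an illustration under several assumptions: the data are hypothetical, equal prior probabilities are assumed for the cut-off, and the relative contribution is computed as coefficient times difference in group means, which is one common way of doing the step 7 calculation but may not be the exact method intended in this module. The coefficients are proportional to, but scaled differently from, the unstandardized coefficients that SPSS prints.

```python
def mean_vector(rows):
    """Mean of each of the two variables."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def within_ss(rows, m):
    """Sums of squares and cross products around the group mean."""
    s11 = sum((r[0] - m[0]) ** 2 for r in rows)
    s22 = sum((r[1] - m[1]) ** 2 for r in rows)
    s12 = sum((r[0] - m[0]) * (r[1] - m[1]) for r in rows)
    return s11, s22, s12


def fisher_discriminant(group1, group2):
    """Return coefficients, constant and relative contribution shares."""
    m1, m2 = mean_vector(group1), mean_vector(group2)
    a11, a22, a12 = within_ss(group1, m1)
    b11, b22, b12 = within_ss(group2, m2)
    df = len(group1) + len(group2) - 2
    # Pooled within-group covariance matrix [[s11, s12], [s12, s22]].
    s11, s22, s12 = (a11 + b11) / df, (a22 + b22) / df, (a12 + b12) / df
    det = s11 * s22 - s12 ** 2
    d = [m1[0] - m2[0], m1[1] - m2[1]]
    # Coefficients = inverse of pooled covariance times difference in means.
    w = [(s22 * d[0] - s12 * d[1]) / det, (s11 * d[1] - s12 * d[0]) / det]
    # Constant chosen so that the score is zero midway between the group means.
    constant = -sum(w[j] * (m1[j] + m2[j]) / 2 for j in range(2))
    contributions = [w[j] * d[j] for j in range(2)]
    shares = [c / sum(contributions) for c in contributions]
    return w, constant, shares


def score(w, constant, row):
    """Discriminant score: positive suggests group 1, negative group 2."""
    return constant + w[0] * row[0] + w[1] * row[1]


# Hypothetical (family income, age) data; 1 = sugar patient, 2 = heart patient.
sugar = [(20, 50), (25, 55), (22, 60), (30, 52)]
heart = [(40, 40), (45, 48), (50, 42), (42, 45)]
w, constant, shares = fisher_discriminant(sugar, heart)
print("coefficients:", w, "constant:", constant)
print("relative contribution of income and age:", shares)
```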
  1. Chi Square Test

The chi square test is used to find out the association between categorical variables. It is a non-parametric test.

Find out the association between loan defaulting and income by using the chi square test in SPSS.

  • Enter the data on monthly income, occupation and education in SPSS data sheet.
  • Go to variable view and give the variable names in the Name column.
  • Click Analyze in the main menu, click Descriptive Statistics and click Crosstabs. A dialogue box will open.
  • In the dialogue box, put one variable in the row box and the other in the column box, click Statistics, select Chi-square, click Continue and OK. The output will appear in the output file.
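
The Pearson chi square value in the output compares the observed cell counts with the counts expected if the two variables were independent. A minimal Python sketch of that calculation follows; the 2 x 2 table is hypothetical and `chi_square` is a helper written for this illustration.

```python
def chi_square(table):
    """Pearson chi square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical counts: rows = low and high income, columns = defaulter and non-defaulter.
observed = [[10, 20],
            [20, 10]]
statistic, df = chi_square(observed)
print(round(statistic, 4), df)
```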
  1. One Sample T Test

Purpose

The one sample t test is used to compare the sample mean with the population mean. Calculate the one sample t test for the following data on human body temperature from 9 samples.

  • Enter the data on the temperature of the samples in the SPSS worksheet.
  • Click Analyze, click Compare Means and click One-Sample T Test.
  • A dialogue box will open. In the dialogue box, click temperature and put it in the test variable cell.
  • Enter the test value as 98, which is the population mean.
  • Click OK. The output will be displayed in the output file.

The estimated t value was not statistically significant, which indicates that there was no significant difference between the sample mean temperature and the standard temperature.
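
The t value in the output is the difference between the sample mean and the test value divided by the standard error of the mean. The Python sketch below shows the calculation; the nine temperatures are hypothetical and `one_sample_t` is a name chosen for this example.

```python
import math
import statistics


def one_sample_t(sample, test_value):
    """Return the t statistic and degrees of freedom for a one sample t test."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t = (statistics.mean(sample) - test_value) / standard_error
    return t, n - 1


# Hypothetical body temperatures of 9 persons; 98 is the test value used above.
temperature = [97.5, 98.0, 98.6, 98.2, 97.8, 98.4, 97.6, 98.1, 97.9]
t, df = one_sample_t(temperature, 98)
print(round(t, 3), df)
```

The p value is then read from the t distribution with n - 1 degrees of freedom, which SPSS reports in the significance column.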

 

Independent Sample T Test

 

Independent sample t test is used to compare the sample mean of two independent groups. For example, we can compare the mean blood pressure of two age groups.

 

Calculate the independent sample t test for the following data on the blood pressure of two age groups.

Procedure

  1. Enter the data on blood pressure for the two age groups in one column of the SPSS worksheet.
  2. In a second column, give the dummy value 1 to persons aged between 15 and 40 years and 2 to persons aged 40 years and above. Name this dummy variable group dummy.
  3. Click Analyze, click Compare Means and click Independent-Samples T Test.
  4. A dialogue box will open. In the dialogue box, click blood pressure and put it in the test variable cell.
  5. Put group dummy in the grouping variable cell, click Define Groups, enter 1 for group 1 and 2 for group 2 and click Continue.
  6. Click OK. The output will be displayed in the output file.
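
The t value in the "equal variances assumed" row of the output uses a pooled estimate of the variance of the two groups. The Python sketch below reproduces that calculation; the blood pressure values are hypothetical and `independent_t` is a helper name invented for this illustration.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = ((n_a - 1) * statistics.variance(group_a)
                       + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, n_a + n_b - 2


# Hypothetical blood pressure readings for group dummy 1 and group dummy 2.
group_1 = [118, 122, 120, 125, 119]
group_2 = [130, 128, 135, 132, 138]
t, df = independent_t(group_1, group_2)
print(round(t, 3), df)
```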
  1. Paired Sample T Test

Purpose:

The paired sample t test is used to compare the means of the same sample group under two different situations.

 

Perform paired sample t test using SPSS for the following data on Blood pressure before treatment and after treatment.

 

Procedure:

  1. Open the SPSS data sheet. Enter the data on blood pressure before treatment and after treatment in separate columns.
  2. Click Analyze in the main menu, click Compare Means and click Paired-Samples T Test. Select the variables blood pressure before treatment and blood pressure after treatment and put them in the paired variables cell.
  3. Click OK. The output will be displayed in the output file.

The estimated t value of 3.658 is statistically significant at the one per cent level, indicating that there is a significant difference in blood pressure before and after treatment.
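
A paired t value is a one sample t test applied to the differences between the two measurements of each person. The Python sketch below illustrates this with hypothetical readings (not the data behind the t value reported above); `paired_t` is a name chosen for this example.

```python
import math
import statistics


def paired_t(before, after):
    """Return the t statistic and degrees of freedom for paired observations."""
    differences = [b - a for b, a in zip(before, after)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / standard_error, n - 1


# Hypothetical blood pressure before and after treatment for the same six persons.
before_treatment = [150, 160, 145, 155, 165, 158]
after_treatment = [140, 152, 144, 150, 155, 149]
t, df = paired_t(before_treatment, after_treatment)
print(round(t, 3), df)
```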

  1. One Way ANOVA

Purpose

 

Analysis of variance is used to compare the means of more than two groups. In one way ANOVA, the dependent variable is classified based on only one factor.

 

We can find out the effect of different chemical processes on cloth weight using one way ANOVA.

Procedure:

  1. Enter the data on cloth weight under the different processes in one column.
  2. In a second column, enter the dummy values 1 for the first process, 2 for the second process and 3 for the third process. Name this variable group dummy.
  3. Go to Analyze, click Compare Means and click One-Way ANOVA. A dialogue box will open. In the dialogue box, put cloth weight in the dependent list and put group dummy in ‘factor’. Click OK.

The output will be displayed in the output file of SPSS.

The estimated F value is not statistically significant (p = 0.396), indicating that there is no significant difference in the mean weight of cloth across the different processes.
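
The F value in the ANOVA table is the ratio of the between-group mean square to the within-group mean square. A minimal Python sketch of the calculation is given below; the cloth weights are hypothetical (not the data behind the p value reported above) and `one_way_anova` is a helper written for this illustration.

```python
def one_way_anova(groups):
    """Return F, between-group df and within-group df for a list of groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]
    ss_between = sum(len(group) * (mean - grand_mean) ** 2
                     for group, mean in zip(groups, group_means))
    ss_within = sum((x - mean) ** 2
                    for group, mean in zip(groups, group_means) for x in group)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical cloth weights under three chemical processes.
process_1 = [10.2, 10.5, 9.8, 10.1]
process_2 = [10.4, 10.0, 10.6, 10.3]
process_3 = [9.9, 10.2, 10.1, 10.5]
f, df_between, df_within = one_way_anova([process_1, process_2, process_3])
print(round(f, 3), df_between, df_within)
```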

 

17. Two Way ANOVA

 

Purpose

 

Analysis of variance is used to compare the means of more than two groups. In two way ANOVA, the dependent variable is classified based on two factors.

 

Calculate two way ANOVA for the following data on the household income of different age groups.

Note: In age, 1 indicates 30-40 years of age, 2 indicates 40-60 years of age and 3 indicates 60 years of age and above. In education, 1 indicates non-professional degree holders and 2 indicates professional degree holders.

 

Procedure

  1. Enter the data on the income of the different age groups in one column. In a second column, give the dummy value 1 for the age group of 30-40 years, 2 for the age group of 40-60 years and 3 for the age group of 60 years and above. In a third column, give 1 for non-professional degree holders and 2 for professional degree holders.
  2. Go to Analyze, click General Linear Model and click Univariate.
  3. A dialogue box will open. Put the variable income in the dependent variable list. Put age and education in the fixed factor list. Click OK.
  4. The output will be displayed in the output file of SPSS.

Result: Two way ANOVA shows that mean income of the individuals differed significantly across age and education.
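
For a balanced design, where every combination of age group and education has the same number of cases, the sums of squares in the two way ANOVA table can be computed by hand. The Python sketch below does this under that assumption; the incomes are hypothetical and `two_way_anova` is a helper name invented here. With unequal cell sizes SPSS uses a different (Type III) calculation, which this simple sketch does not reproduce.

```python
def mean(values):
    return sum(values) / len(values)


def two_way_anova(cells):
    """Sums of squares and F ratios for a balanced, fully crossed design.

    cells maps (factor A level, factor B level) to a list of observations.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # observations per cell
    if len(cells) != len(a_levels) * len(b_levels) or any(len(v) != n for v in cells.values()):
        raise ValueError("a balanced, fully crossed design is required")
    grand = mean([x for v in cells.values() for x in v])
    a_mean = {a: mean([x for (ai, _), v in cells.items() if ai == a for x in v]) for a in a_levels}
    b_mean = {b: mean([x for (_, bi), v in cells.items() if bi == b for x in v]) for b in b_levels}
    cell_mean = {key: mean(v) for key, v in cells.items()}
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum((cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
                    for a in a_levels for b in b_levels)
    ss_within = sum((x - cell_mean[key]) ** 2 for key, v in cells.items() for x in v)
    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    df_within = len(a_levels) * len(b_levels) * (n - 1)
    ms_within = ss_within / df_within
    return {
        "ss_a": ss_a, "ss_b": ss_b, "ss_ab": ss_ab, "ss_within": ss_within,
        "f_a": (ss_a / df_a) / ms_within,
        "f_b": (ss_b / df_b) / ms_within,
        "f_ab": (ss_ab / (df_a * df_b)) / ms_within,
    }


# Hypothetical incomes (in thousands): keys are (age code, education code).
income = {
    (1, 1): [14, 16], (1, 2): [19, 21],
    (2, 1): [24, 26], (2, 2): [29, 31],
    (3, 1): [34, 36], (3, 2): [39, 41],
}
result = two_way_anova(income)
for name, value in result.items():
    print(f"{name:>9}: {value:.3f}")
```

Here `f_a` and `f_b` correspond to the F values for age and education in the output, and `f_ab` to the age by education interaction.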

  1. CONCLUSION

To summarise, we have seen how to use SPSS software to carry out various statistical techniques such as charts, univariate and bivariate frequency distributions, measures of central value, measures of dispersion, correlation, regression, the chi square test, discriminant analysis and ANOVA. All of the above analyses can be carried out in SPSS version 16, and even in earlier versions of SPSS. Factor analysis, logit and probit analysis, growth models and so on are omitted as they are beyond the scope of this module. Structural equation modelling can also be performed with an advanced version of SPSS along with AMOS.

 

All of these statistical analyses can be performed easily with SPSS, but the logic of the analysis and the interpretation of the results are more important. Statistical or econometric software cannot supply these. Hence software cannot compete with the human mind and knowledge.
