23 Factor Analysis using R

Sumitra Purkayastha

 

1 Introduction

 

Factor analysis is a technique used to reduce a large number of variables to a smaller number of factors. It extracts the maximum common variance from all the variables and combines it into a common score. This score can then serve as an index of all the variables in further analysis.

 

Factor analysis can be considered an extension of principal component analysis. Both can be viewed as attempts to approximate the covariance matrix. However, the approximation based on the factor analysis model is more elaborate: it separates the variance that the variables share from the variance specific to each variable.
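To make the comparison concrete, the two approximations can be written side by side. The notation here is an assumption, since the text does not fix symbols: $\Sigma$ is the $p \times p$ covariance matrix with eigenvalue-eigenvector pairs $(\lambda_i, e_i)$, $L$ is the $p \times m$ matrix of factor loadings, $F$ is the vector of $m$ common factors, and $\Psi$ is the diagonal matrix of specific variances, as in the standard orthogonal factor model.

```latex
% Principal components: keep the first m terms of the spectral decomposition
\Sigma = \sum_{i=1}^{p} \lambda_i e_i e_i^{\top}
       \approx \sum_{i=1}^{m} \lambda_i e_i e_i^{\top}, \qquad m < p

% Orthogonal factor model: common part plus a diagonal specific part
X - \mu = L F + \varepsilon, \qquad
\operatorname{Cov}(F) = I_m, \quad
\operatorname{Cov}(\varepsilon) = \Psi = \operatorname{diag}(\psi_1, \dots, \psi_p), \quad
\operatorname{Cov}(F, \varepsilon) = 0

\Sigma = L L^{\top} + \Psi
```

The extra diagonal term $\Psi$ is what makes the factor model the more elaborate of the two: the off-diagonal entries of $\Sigma$ are explained by $L L^{\top}$ alone, while each diagonal entry also receives its own specific variance $\psi_i$.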

 

We first determine how many factors are required to best explain the data. To do so, we perform an eigenanalysis of the sample correlation matrix from first principles and find the cumulative proportion of variability explained by the factors.
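A minimal sketch of this eigenanalysis in base R follows. The data are simulated stand-ins (eight variables driven by two latent factors), not the data set analysed in this chapter, and the names `X`, `R`, `eig`, `prop`, `cum_prop` and `m` are chosen here for illustration only. The eigenvalue-greater-than-one cut-off is one common heuristic for choosing the number of factors, not necessarily the rule used in the original analysis.

```r
# Simulated stand-in data (an assumption, not the chapter's data):
# variables V1-V4 load on one latent factor, V5-V8 on another.
set.seed(1)
n  <- 200
f1 <- rnorm(n)
f2 <- rnorm(n)
X <- cbind(
  sapply(1:4, function(j) 0.9 * f1 + rnorm(n, sd = 0.4)),
  sapply(1:4, function(j) 0.9 * f2 + rnorm(n, sd = 0.4))
)
colnames(X) <- paste0("V", 1:8)

# Eigenanalysis of the sample correlation matrix
R   <- cor(X)
eig <- eigen(R, symmetric = TRUE)

# For a correlation matrix the eigenvalues sum to the number of variables,
# so each eigenvalue divided by that sum is a proportion of total variability.
prop     <- eig$values / sum(eig$values)
cum_prop <- cumsum(prop)

print(round(data.frame(eigenvalue = eig$values,
                       proportion = prop,
                       cumulative = cum_prop), 3))

# One common heuristic: retain factors whose eigenvalue exceeds 1
m <- sum(eig$values > 1)
```

Reading down the `cumulative` column shows how much of the total variability the first one, two, three, ... factors account for; a scree plot of `eig$values` gives the same information graphically.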

 

 

 

 

Interpretation:

 

The score on “Endurance” (Factor 1) is higher for the 4th individual than for the 6th. The data for these two individuals bear this out: the 4th individual takes less time than the 6th to complete the longer races, i.e., the 5k, 10k, and marathon.

 

Likewise, the score on “Strength” (Factor 2) is higher for the 6th individual than for the 4th. The data for these two individuals show that the 6th individual takes less time than the 4th to complete the shorter races, i.e., the 100m, 200m, etc.
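A comparison of this kind can be reproduced along the following lines. This is a sketch under stated assumptions: the data are simulated (four "short" and four "long" variables, each group driven by its own latent factor) rather than the race data discussed above, so the numbers it prints are unrelated to the results in the text. `factanal()` is the maximum likelihood routine in base R; the choices of varimax rotation and regression scores are illustrative.

```r
# Simulated stand-in data (an assumption, not the race data in the text)
set.seed(42)
n   <- 200
f_a <- rnorm(n)   # latent factor behind the "short" variables
f_b <- rnorm(n)   # latent factor behind the "long" variables
X <- cbind(
  sapply(1:4, function(j) 0.9 * f_a + rnorm(n, sd = 0.4)),
  sapply(1:4, function(j) 0.9 * f_b + rnorm(n, sd = 0.4))
)
colnames(X) <- c(paste0("short", 1:4), paste0("long", 1:4))

# Maximum likelihood factor analysis with two factors, varimax rotation,
# and factor scores estimated by the regression method
fa <- factanal(X, factors = 2, rotation = "varimax", scores = "regression")

# Loadings show which variables each factor represents
print(fa$loadings, cutoff = 0.3)

# Factor scores of the 4th and 6th individuals, side by side
print(fa$scores[c(4, 6), ])
```

Before reading a higher score as "better", check the sign of the loadings: the orientation of a factor is arbitrary, so whether a large score corresponds to shorter or longer times depends on how the variables load on that factor.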

 

SUMMARY

  • In R, we can perform factor analysis using built-in functions.
  • If the data are available in raw form, we can estimate the model by either the principal component method or the maximum likelihood method, and we can also estimate the factor scores.
  • If the data are not available in raw form but we have the correlation or dispersion matrix, we can only go for estimation using the principal component method and estimate the factor scores from first principles.
  • If the variables are highly correlated among themselves, factor analysis achieves dimension reduction; when the correlations are low, the dimension reduction is not significant.
