Basic Exploratory Factor Analysis (EFA) Example in R
The data file “job_placement” first needs to be read into the R session.
dat <- read.csv("../../data/job_placement.csv", header = FALSE)
Because the data file has no header row of column (variable) names, the variable names need to be specified.
colnames(dat) <- c("id", "wjcalc", "wjspl", "wratspl", "wratcalc", "waiscalc", "waisspl", "edlevel", "newschl", "suspend", "expelled", "haveld", "female", "age")
In the original data file the missing values are coded as “99999”. These values need to be recoded to NA so that R recognizes them as missing.
dat[dat == 99999] <- NA
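The recoding step can be tried on a small made-up data frame before it is applied to the real file. The sketch below uses a hypothetical toy data frame (not the job_placement data). It also shows an alternative that assumes 99999 appears only as a missing-data code: passing na.strings = "99999" to read.csv() so that the values are read in as NA directly.

```r
# Toy data standing in for the real file (values are made up for illustration)
toy <- data.frame(id = 1:3,
                  wjcalc = c(12, 99999, 15),
                  wjspl = c(99999, 30, 28))

# Same recoding as above: every cell equal to 99999 becomes NA
toy[toy == 99999] <- NA
toy

# Alternative: treat "99999" as missing while the file is being read
csv_text <- "1,12,99999\n2,99999,30\n3,15,28"
toy2 <- read.csv(text = csv_text, header = FALSE, na.strings = "99999",
                 col.names = c("id", "wjcalc", "wjspl"))
toy2
```

Both approaches mark the same cells as missing. Note that the first one recodes every column, so it would also change an id equal to 99999.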
Next, the variables to be used in the EFA need to be put into a separate data frame, which is the one used in the analysis. The dat[,2:7] command makes a data frame using all rows, but only columns 2-7 (wjcalc through waisspl) from the “dat” data frame.
dat1 <- dat[,2:7]
The last part of the data manipulation is to remove the cases with missing values from the analysis data frame. This is something that Mplus does automatically.
dat1 <- na.omit(dat1)
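Because listwise deletion can discard a sizeable share of the sample, it is worth checking how many cases are dropped. A minimal sketch on a hypothetical toy data frame follows; the same calls would apply to dat1 before na.omit() is run.

```r
# Toy data frame with some missing values (made up for illustration)
toy <- data.frame(wjcalc = c(12, NA, 15, 20),
                  wjspl = c(NA, 30, 28, 25))

n_before <- nrow(toy)
n_complete <- sum(complete.cases(toy))   # rows with no missing values
toy_clean <- na.omit(toy)                # listwise deletion

n_before - nrow(toy_clean)               # number of cases removed
colSums(is.na(toy))                      # missing values per variable
```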
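Now the EFA can be run, first with 1 factor and then with 2 factors extracted. The factanal() function fits the model by maximum likelihood; with a single factor the rotation argument has no effect.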
output1 <- factanal(dat1, 1, rotation = "varimax")
output1
##
## Call:
## factanal(x = dat1, factors = 1, rotation = "varimax")
##
## Uniquenesses:
##   wjcalc    wjspl  wratspl wratcalc waiscalc  waisspl
##    0.728    0.093    0.108    0.695    0.749    0.116
##
## Loadings:
##          Factor1
## wjcalc   0.522
## wjspl    0.953
## wratspl  0.945
## wratcalc 0.552
## waiscalc 0.501
## waisspl  0.940
##
##                Factor1
## SS loadings      3.511
## Proportion Var   0.585
##
## Test of the hypothesis that 1 factor is sufficient.
## The chi square statistic is 461.38 on 9 degrees of freedom.
## The p-value is 1.06e-93
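The pieces of this output are related. With a single factor, each uniqueness is one minus the squared loading, SS loadings is the sum of the squared loadings, and Proportion Var is that sum divided by the number of variables. The sketch below checks this by hand using the rounded loadings printed above, so the results agree with the output only up to rounding.

```r
# Loadings copied from the one-factor output above (rounded to 3 decimals)
loadings1 <- c(wjcalc = 0.522, wjspl = 0.953, wratspl = 0.945,
               wratcalc = 0.552, waiscalc = 0.501, waisspl = 0.940)

communality <- loadings1^2                        # variance explained by the factor
uniqueness  <- 1 - communality                    # compare with the Uniquenesses row
ss_loadings <- sum(communality)                   # compare with SS loadings
prop_var    <- ss_loadings / length(loadings1)    # compare with Proportion Var

round(uniqueness, 3)
round(c(ss_loadings = ss_loadings, prop_var = prop_var), 3)
```

With the fitted object, the unrounded values are available as output1$loadings and output1$uniquenesses.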
output2 <- factanal(dat1, 2, rotation = "varimax")
output2
##
## Call:
## factanal(x = dat1, factors = 2, rotation = "varimax")
##
## Uniquenesses:
##   wjcalc    wjspl  wratspl wratcalc waiscalc  waisspl
##    0.184    0.089    0.107    0.096    0.477    0.112
##
## Loadings:
##          Factor1 Factor2
## wjcalc   0.230   0.873
## wjspl    0.907   0.298
## wratspl  0.894   0.306
## wratcalc 0.248   0.918
## waiscalc 0.281   0.667
## waisspl  0.896   0.293
##
##                Factor1 Factor2
## SS loadings      2.617   2.318
## Proportion Var   0.436   0.386
## Cumulative Var   0.436   0.823
##
## Test of the hypothesis that 2 factors are sufficient.
## The chi square statistic is 3.8 on 4 degrees of freedom.
## The p-value is 0.434
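The two likelihood ratio tests can be reproduced from the printed statistics. For p observed variables and k factors, the test has ((p - k)^2 - (p + k)) / 2 degrees of freedom, and the p-value is the upper tail of the chi-square distribution. The sketch below uses a hypothetical helper, efa_df(), and the rounded statistics from the output, so the p-values match only approximately.

```r
# Degrees of freedom for an EFA with p variables and k factors
# (efa_df is a helper defined here for illustration, not part of base R)
efa_df <- function(p, k) ((p - k)^2 - (p + k)) / 2

df1 <- efa_df(6, 1)   # one-factor model
df2 <- efa_df(6, 2)   # two-factor model

# Upper-tail chi-square probabilities for the printed test statistics
p1 <- pchisq(461.38, df = df1, lower.tail = FALSE)
p2 <- pchisq(3.8, df = df2, lower.tail = FALSE)

c(df1 = df1, df2 = df2)
signif(c(p1 = p1, p2 = p2), 3)
```

The one-factor model is rejected (p far below .05), while the two-factor model is not (p = 0.434), which supports retaining two factors. In the rotated two-factor solution, the variables whose names end in "spl" (wjspl, wratspl, waisspl) load mainly on Factor1 and those ending in "calc" (wjcalc, wratcalc, waiscalc) load mainly on Factor2. On the fitted objects, the same quantities are stored as output2$dof, output2$STATISTIC, and output2$PVAL.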