Multiple Group Confirmatory Factor Analysis (CFA) Example in R
Load the lavaan package at the beginning of the session
library(lavaan)
## This is lavaan 0.5-23.1097
## lavaan is BETA software! Please report any bugs.
The data file is read in, columns are named, and missing values are specified.
dat <- read.csv("../../data/job_placement.csv", header = FALSE)
colnames(dat) <- c("id", "wjcalc", "wjspl", "wratspl", "wratcalc",
"waiscalc", "waisspl", "edlevel", "newschl",
"suspend", "expelled", "haveld", "female", "age")
dat[dat == 99999] <- NA
Now the CFA model that is to be tested needs to be specified as an R object. In lavaan the “=~” operator is the exact same as the BY operator in Mplus. In the statements below the variables on the right of the “=~” are indicators of the variable to the left of the “=~”. Also notice that there are “+” signs that separate the variables on the right side of the equation. The entire model statement needs to be wrapped in quotation marks.
ConfigModel <- "MATH =~ wratcalc + wjcalc + waiscalc
SPELL =~ wratspl + wjspl + waisspl"
ConfigOutput <- cfa(model = ConfigModel, data = dat, group = "female",
std.lv = TRUE, missing = "fiml")
summary(ConfigOutput)
## lavaan (0.5-23.1097) converged normally after 118 iterations
##
## Number of observations per group
## 1 101
## 0 221
##
## Number of missing patterns per group
## 1 4
## 0 2
##
## Estimator ML
## Minimum Function Test Statistic 19.215
## Degrees of freedom 16
## P-value (Chi-square) 0.258
##
## Chi-square for each group:
##
## 1 11.555
## 0 7.660
##
## Parameter Estimates:
##
## Information Observed
## Standard Errors Standard
##
##
## Group 1 [1]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|)
## MATH =~
## wratcalc 6.545 0.496 13.187 0.000
## wjcalc 4.215 0.366 11.530 0.000
## waiscalc 2.290 0.276 8.306 0.000
## SPELL =~
## wratspl 7.010 0.547 12.817 0.000
## wjspl 6.833 0.518 13.182 0.000
## waisspl 6.638 0.520 12.763 0.000
##
## Covariances:
## Estimate Std.Err z-value P(>|z|)
## MATH ~~
## SPELL 0.634 0.064 9.914 0.000
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|)
## .wratcalc 38.267 0.666 57.464 0.000
## .wjcalc 23.603 0.465 50.781 0.000
## .waiscalc 10.266 0.316 32.528 0.000
## .wratspl 37.171 0.735 50.590 0.000
## .wjspl 42.337 0.705 60.073 0.000
## .waisspl 38.030 0.697 54.579 0.000
## MATH 0.000
## SPELL 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## .wratcalc 1.957 1.621 1.208 0.227
## .wjcalc 3.960 0.848 4.671 0.000
## .waiscalc 4.765 0.714 6.678 0.000
## .wratspl 5.307 1.105 4.804 0.000
## .wjspl 3.474 0.910 3.818 0.000
## .waisspl 4.900 1.007 4.865 0.000
## MATH 1.000
## SPELL 1.000
##
##
## Group 2 [0]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|)
## MATH =~
## wratcalc 5.764 0.330 17.492 0.000
## wjcalc 4.123 0.244 16.926 0.000
## waiscalc 2.446 0.200 12.212 0.000
## SPELL =~
## wratspl 6.278 0.337 18.646 0.000
## wjspl 6.788 0.359 18.929 0.000
## waisspl 6.177 0.335 18.441 0.000
##
## Covariances:
## Estimate Std.Err z-value P(>|z|)
## MATH ~~
## SPELL 0.520 0.053 9.762 0.000
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|)
## .wratcalc 39.222 0.417 93.962 0.000
## .wjcalc 23.910 0.305 78.420 0.000
## .waiscalc 11.367 0.226 50.320 0.000
## .wratspl 36.172 0.448 80.762 0.000
## .wjspl 41.371 0.480 86.157 0.000
## .waisspl 36.767 0.444 82.899 0.000
## MATH 0.000
## SPELL 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## .wratcalc 5.280 1.232 4.285 0.000
## .wjcalc 3.549 0.664 5.342 0.000
## .waiscalc 5.269 0.557 9.463 0.000
## .wratspl 4.915 0.726 6.772 0.000
## .wjspl 4.881 0.795 6.138 0.000
## .waisspl 5.285 0.736 7.176 0.000
## MATH 1.000
## SPELL 1.000
metricModel <- "MATH =~ wratcalc + wjcalc + waiscalc
SPELL =~ wratspl + wjspl + waisspl
MATH ~~ c(1, NA)*MATH
SPELL ~~ c(1, NA)*SPELL"
metricOutput <- cfa(model = metricModel, data = dat, std.lv = TRUE,
group = "female", group.equal = "loadings",
missing = "fiml")
summary(metricOutput)
## lavaan (0.5-23.1097) converged normally after 94 iterations
##
## Number of observations per group
## 1 101
## 0 221
##
## Number of missing patterns per group
## 1 4
## 0 2
##
## Estimator ML
## Minimum Function Test Statistic 25.871
## Degrees of freedom 20
## P-value (Chi-square) 0.170
##
## Chi-square for each group:
##
## 1 15.581
## 0 10.290
##
## Parameter Estimates:
##
## Information Observed
## Standard Errors Standard
##
##
## Group 1 [1]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|)
## MATH =~
## wratclc (.p1.) 6.359 0.490 12.972 0.000
## wjcalc (.p2.) 4.354 0.342 12.729 0.000
## waisclc (.p3.) 2.502 0.229 10.909 0.000
## SPELL =~
## wratspl (.p4.) 6.751 0.508 13.295 0.000
## wjspl (.p5.) 7.020 0.516 13.592 0.000
## waisspl (.p6.) 6.568 0.493 13.326 0.000
##
## Covariances:
## Estimate Std.Err z-value P(>|z|)
## MATH ~~
## SPELL 0.641 0.063 10.164 0.000
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|)
## .wratcalc 38.267 0.656 58.319 0.000
## .wjcalc 23.602 0.474 49.832 0.000
## .waiscalc 10.263 0.330 31.085 0.000
## .wratspl 37.172 0.713 52.136 0.000
## .wjspl 42.337 0.721 58.719 0.000
## .waisspl 38.031 0.691 55.052 0.000
## MATH 0.000
## SPELL 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## MATH 1.000
## SPELL 1.000
## .wratcalc 3.047 1.413 2.157 0.031
## .wjcalc 3.603 0.778 4.632 0.000
## .waiscalc 4.700 0.714 6.586 0.000
## .wratspl 5.686 1.123 5.061 0.000
## .wjspl 3.230 0.929 3.478 0.001
## .waisspl 4.995 1.023 4.883 0.000
##
##
## Group 2 [0]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|)
## MATH =~
## wratclc (.p1.) 6.359 0.490 12.972 0.000
## wjcalc (.p2.) 4.354 0.342 12.729 0.000
## waisclc (.p3.) 2.502 0.229 10.909 0.000
## SPELL =~
## wratspl (.p4.) 6.751 0.508 13.295 0.000
## wjspl (.p5.) 7.020 0.516 13.592 0.000
## waisspl (.p6.) 6.568 0.493 13.326 0.000
##
## Covariances:
## Estimate Std.Err z-value P(>|z|)
## MATH ~~
## SPELL 0.456 0.089 5.103 0.000
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|)
## .wratcalc 39.222 0.422 92.880 0.000
## .wjcalc 23.910 0.302 79.192 0.000
## .waiscalc 11.367 0.221 51.462 0.000
## .wratspl 36.172 0.454 79.673 0.000
## .wjspl 41.371 0.472 87.637 0.000
## .waisspl 36.767 0.446 82.522 0.000
## MATH 0.000
## SPELL 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## MATH 0.860 0.156 5.501 0.000
## SPELL 0.895 0.158 5.676 0.000
## .wratcalc 4.619 1.205 3.834 0.000
## .wjcalc 3.835 0.654 5.864 0.000
## .waiscalc 5.370 0.564 9.525 0.000
## .wratspl 4.764 0.721 6.607 0.000
## .wjspl 5.154 0.790 6.526 0.000
## .waisspl 5.233 0.732 7.153 0.000
scalarModel <- "MATH =~ wratcalc + wjcalc + waiscalc
SPELL =~ wratspl + wjspl + waisspl
MATH ~~ c(1, NA)*MATH
SPELL ~~ c(1, NA)*SPELL
MATH ~ c(0, NA)*1
SPELL ~ c(0, NA)*1 "
scalarOutput <- cfa(model = scalarModel, data = dat, std.lv = TRUE,
group = "female",
group.equal = c("loadings", "intercepts"),
missing = "fiml")
summary(scalarOutput)
## lavaan (0.5-23.1097) converged normally after 95 iterations
##
## Number of observations per group
## 1 101
## 0 221
##
## Number of missing patterns per group
## 1 4
## 0 2
##
## Estimator ML
## Minimum Function Test Statistic 36.362
## Degrees of freedom 24
## P-value (Chi-square) 0.051
##
## Chi-square for each group:
##
## 1 22.546
## 0 13.815
##
## Parameter Estimates:
##
## Information Observed
## Standard Errors Standard
##
##
## Group 1 [1]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|)
## MATH =~
## wratclc (.p1.) 6.381 0.491 12.994 0.000
## wjcalc (.p2.) 4.340 0.342 12.699 0.000
## waisclc (.p3.) 2.529 0.232 10.895 0.000
## SPELL =~
## wratspl (.p4.) 6.751 0.508 13.299 0.000
## wjspl (.p5.) 7.015 0.516 13.592 0.000
## waisspl (.p6.) 6.578 0.494 13.329 0.000
##
## Covariances:
## Estimate Std.Err z-value P(>|z|)
## MATH ~~
## SPELL 0.640 0.063 10.128 0.000
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|)
## MATH 0.000
## SPELL 0.000
## .wratclc (.18.) 38.273 0.653 58.628 0.000
## .wjcalc (.19.) 23.374 0.454 51.465 0.000
## .waisclc (.20.) 10.751 0.289 37.239 0.000
## .wratspl (.21.) 37.211 0.692 53.771 0.000
## .wjspl (.22.) 42.411 0.713 59.443 0.000
## .waisspl (.23.) 37.869 0.675 56.144 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## MATH 1.000
## SPELL 1.000
## .wratcalc 2.841 1.424 1.995 0.046
## .wjcalc 3.743 0.793 4.723 0.000
## .waiscalc 4.977 0.766 6.496 0.000
## .wratspl 5.681 1.123 5.056 0.000
## .wjspl 3.237 0.930 3.481 0.000
## .waisspl 5.035 1.032 4.880 0.000
##
##
## Group 2 [0]:
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|)
## MATH =~
## wratclc (.p1.) 6.381 0.491 12.994 0.000
## wjcalc (.p2.) 4.340 0.342 12.699 0.000
## waisclc (.p3.) 2.529 0.232 10.895 0.000
## SPELL =~
## wratspl (.p4.) 6.751 0.508 13.299 0.000
## wjspl (.p5.) 7.015 0.516 13.592 0.000
## waisspl (.p6.) 6.578 0.494 13.329 0.000
##
## Covariances:
## Estimate Std.Err z-value P(>|z|)
## MATH ~~
## SPELL 0.454 0.089 5.104 0.000
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|)
## MATH 0.148 0.121 1.220 0.222
## SPELL -0.156 0.121 -1.293 0.196
## .wratclc (.18.) 38.273 0.653 58.628 0.000
## .wjcalc (.19.) 23.374 0.454 51.465 0.000
## .waisclc (.20.) 10.751 0.289 37.239 0.000
## .wratspl (.21.) 37.211 0.692 53.771 0.000
## .wjspl (.22.) 42.411 0.713 59.443 0.000
## .waisspl (.23.) 37.869 0.675 56.144 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## MATH 0.857 0.156 5.502 0.000
## SPELL 0.894 0.158 5.676 0.000
## .wratcalc 4.518 1.204 3.753 0.000
## .wjcalc 3.911 0.655 5.970 0.000
## .waiscalc 5.431 0.576 9.435 0.000
## .wratspl 4.762 0.721 6.603 0.000
## .wjspl 5.173 0.791 6.541 0.000
## .waisspl 5.235 0.734 7.134 0.000
Now the models can be compared via chi-square difference testing.
anova(ConfigOutput, metricOutput)
## Chi Square Difference Test
##
## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
## ConfigOutput 16 10301 10444 19.215
## metricOutput 20 10300 10428 25.871 6.6558 4 0.1552
Here the metric model does not fit significantly worse than the configural model, so metric invariance is supported.
Now we can test for scalar invariance.
anova(metricOutput, scalarOutput)
## Chi Square Difference Test
##
## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
## metricOutput 20 10300 10428 25.871
## scalarOutput 24 10302 10415 36.361 10.491 4 0.03293 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Here the scalar model fits significantly worse, so scalar invariance is not supported.