Ben Kite, Center for Research Methods and Data Analysis, University of Kansas <bakite@ku.edu>
Chong Xing, Center for Research Methods and Data Analysis, University of Kansas <cxing@ku.edu>
Paul Johnson, Center for Research Methods and Data Analysis, University of Kansas <cxing@ku.edu>
Please visit http://crmda.ku.edu/guides
Keywords: Structural Equation Modeling, R, lavaan
2019 January 18
Abstract
This guide outlines how to fit a structural equation model that has both measurement and structural components. In other words, latent variables are measured by observed indicators and then used as outcomes in a regression model.
Load the lavaan package at the beginning of the session.
library(lavaan)
This is lavaan 0.6-3
lavaan is BETA software! Please report any bugs.
The data file is read in, columns are named, and missing values are specified.
## The raw file has no header row, so column names are assigned by hand
dat <- read.csv("../../data/job_placement.csv", header = FALSE)
colnames(dat) <- c("id", "wjcalc", "wjspl", "wratspl",
                   "wratcalc", "waiscalc", "waisspl",
                   "edlevel", "newschl", "suspend",
                   "expelled", "haveld", "gender", "age")
## 99999 is the missing-value code; recode it to NA
dat[dat == 99999] <- NA
dat$gender <- factor(dat$gender, labels = c("Male", "Female"))
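The recoding step can be checked on a small made-up data frame before it is trusted on the real file. The sketch below is illustrative only: the `toy` data frame and its values are hypothetical and are not taken from job_placement.csv. Only the 99999 missing-value code comes from the code above.

```r
## Hypothetical toy data; 99999 plays the role of the missing-value code
toy <- data.frame(id = 1:4,
                  wjcalc = c(12, 99999, 15, 9),
                  age = c(17, 18, 99999, 99999))

## Replace every cell equal to the missing-value code with NA
toy[toy == 99999] <- NA

## Count missing values per column to confirm the recode worked
n_missing <- colSums(is.na(toy))
print(n_missing)
```

Running the same `colSums(is.na(...))` check on `dat` would show how many values are missing on each variable before the model is fitted.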
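This builds the SEM model. The =~ lines define the two latent variables from their indicators, the ~ lines regress each latent variable on the seven observed predictors, and the ~~ line lets the residuals of the two latent variables covary.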
SEMModel <-
    ' MATH =~ wratcalc + wjcalc + waiscalc ## measurement model for MATH
      SPELL =~ wratspl + wjspl + waisspl ## measurement model for SPELL
      MATH ~ edlevel + newschl + suspend + expelled + haveld + gender + age ## MATH as an outcome
      SPELL ~ edlevel + newschl + suspend + expelled + haveld + gender + age ## SPELL as an outcome
      MATH ~~ SPELL ## residual covariance between MATH and SPELL
    '
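In equation form, the syntax above corresponds roughly to the two-part model below. The notation is a conventional SEM notation assumed here for illustration and is not used elsewhere in this guide: y is an indicator, x a predictor, η a latent variable, and k(j) denotes the latent variable on which indicator j loads.

```latex
\begin{aligned}
y_{ij} &= \nu_j + \lambda_j \, \eta_{i,k(j)} + \varepsilon_{ij}, & j &= 1, \dots, 6 \\
\eta_{ik} &= \sum_{p=1}^{7} \gamma_{kp} \, x_{ip} + \zeta_{ik}, & k &\in \{\mathrm{MATH}, \mathrm{SPELL}\} \\
\operatorname{Var}(\zeta_{ik}) &= 1, & \operatorname{Cov}(\zeta_{i,\mathrm{MATH}}, \zeta_{i,\mathrm{SPELL}}) &= \psi
\end{aligned}
```

The first line is the measurement part (the =~ statements), the second is the structural part (the ~ statements), and ψ is the parameter requested by MATH ~~ SPELL. The constraint Var(ζ) = 1 is not part of the syntax itself; it applies when the model is fitted with std.lv = TRUE.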
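Here the model is fitted with sem() and a summary is requested. The argument std.lv = TRUE identifies each latent variable by fixing its (residual) variance to 1 rather than fixing a loading, missing = "fiml" uses full information maximum likelihood for incomplete cases, and mimic = "Mplus" asks lavaan to follow Mplus conventions where the two programs differ.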
output <- sem(model = SEMModel, data = dat, std.lv = TRUE,
              missing = "fiml", mimic = "Mplus")
Warning in lav_data_full(data = data, group = group, cluster = cluster, : lavaan WARNING: 9 cases were deleted due to missing values in
exogenous variable(s), while fixed.x = TRUE.
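The warning above arises because, with fixed.x = TRUE, lavaan conditions on the exogenous predictors and drops cases that have missing values on them. One commonly used alternative is to set fixed.x = FALSE, so that the predictors are treated as random variables and enter the FIML likelihood. It is sketched here as an assumption and is not part of this guide's analysis, and it adds a distributional assumption for the predictors. The example uses lavaan's built-in HolzingerSwineford1939 data with artificially deleted ageyr values, not the job placement data; `hs` and `toyModel` are names made up for the illustration.

```r
library(lavaan)

## Built-in example data; blank out a few values of an exogenous predictor
hs <- HolzingerSwineford1939
hs$ageyr[1:10] <- NA

## Hypothetical one-factor model with a single observed predictor
toyModel <- '
    visual =~ x1 + x2 + x3
    visual ~ ageyr
'

## fixed.x = TRUE: cases with missing ageyr are dropped (with a warning)
fit_fixed <- suppressWarnings(
    sem(toyModel, data = hs, std.lv = TRUE,
        missing = "fiml", fixed.x = TRUE))

## fixed.x = FALSE: ageyr is modeled, so those cases are retained
fit_random <- suppressWarnings(
    sem(toyModel, data = hs, std.lv = TRUE,
        missing = "fiml", fixed.x = FALSE))

## Compare the number of cases actually used by each fit
n_fixed <- lavInspect(fit_fixed, "nobs")
n_random <- lavInspect(fit_random, "nobs")
print(c(fixed.x.TRUE = n_fixed, fixed.x.FALSE = n_random))
```

Whether the extra assumption is acceptable depends on the predictors. Binary predictors such as gender are a common reason to keep fixed.x = TRUE and accept the loss of a few cases.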
summary(output, standardized = TRUE, fit.measures = TRUE)
lavaan 0.6-3 ended normally after 82 iterations
  Optimization method                           NLMINB
  Number of free parameters                         33
                                                  Used       Total
  Number of observations                           313         322
  Number of missing patterns                         4
  Estimator                                         ML
  Model Fit Test Statistic                      49.780
  Degrees of freedom                                36
  P-value (Chi-square)                           0.063
Model test baseline model:
  Minimum Function Test Statistic             2035.834
  Degrees of freedom                                57
  P-value                                        0.000
User model versus baseline model:
  Comparative Fit Index (CFI)                    0.993
  Tucker-Lewis Index (TLI)                       0.989
Loglikelihood and Information Criteria:
  Loglikelihood user model (H0)              -4909.694
  Loglikelihood unrestricted model (H1)      -4884.805
  Number of free parameters                         33
  Akaike (AIC)                                9885.389
  Bayesian (BIC)                             10009.014
  Sample-size adjusted Bayesian (BIC)         9904.348
Root Mean Square Error of Approximation:
  RMSEA                                          0.035
  90 Percent Confidence Interval                 0.000       0.057
  P-value RMSEA <= 0.05                          0.859
Standardized Root Mean Square Residual:
  SRMR                                           0.024
Parameter Estimates:
  Information                                 Observed
  Observed information based on                Hessian
  Standard Errors                             Standard
Latent Variables:
                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  MATH =~
    wratcalc           5.250    0.248   21.203    0.000    6.104    0.950
    wjcalc             3.593    0.180   19.977    0.000    4.177    0.905
    waiscalc           2.114    0.145   14.564    0.000    2.458    0.732
  SPELL =~
    wratspl            5.726    0.259   22.148    0.000    6.566    0.947
    wjspl              5.943    0.265   22.403    0.000    6.815    0.952
    waisspl            5.539    0.251   22.025    0.000    6.352    0.944
Regressions:
                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  MATH ~
    edlevel            0.328    0.056    5.885    0.000    0.282    0.342
    newschl            0.102    0.124    0.824    0.410    0.088    0.044
    suspend           -0.329    0.129   -2.546    0.011   -0.283   -0.141
    expelled          -0.100    0.182   -0.549    0.583   -0.086   -0.030
    haveld            -0.285    0.164   -1.736    0.083   -0.245   -0.089
    gender            -0.144    0.131   -1.098    0.272   -0.124   -0.058
    age                0.118    0.035    3.403    0.001    0.101    0.193
  SPELL ~
    edlevel            0.204    0.053    3.831    0.000    0.178    0.216
    newschl           -0.037    0.121   -0.305    0.760   -0.032   -0.016
    suspend           -0.019    0.125   -0.155    0.877   -0.017   -0.008
    expelled          -0.444    0.179   -2.477    0.013   -0.387   -0.134
    haveld            -1.158    0.167   -6.917    0.000   -1.010   -0.367
    gender             0.147    0.129    1.144    0.253    0.128    0.060
    age                0.061    0.034    1.815    0.069    0.053    0.101
Covariances:
                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
 .MATH ~~
   .SPELL              0.512    0.046   11.178    0.000    0.512    0.512
Intercepts:
                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .wratcalc           9.475    4.005    2.366    0.018    9.475    1.475
   .wjcalc             3.687    2.816    1.309    0.190    3.687    0.798
   .waiscalc          -0.836    1.738   -0.481    0.630   -0.836   -0.249
   .wratspl           17.106    4.283    3.994    0.000   17.106    2.466
   .wjspl             21.581    4.443    4.857    0.000   21.581    3.016
   .waisspl           18.474    4.143    4.459    0.000   18.474    2.744
   .MATH               0.000                               0.000    0.000
   .SPELL              0.000                               0.000    0.000
Variances:
                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .wratcalc           4.034    0.966    4.175    0.000    4.034    0.098
   .wjcalc             3.876    0.528    7.335    0.000    3.876    0.182
   .waiscalc           5.221    0.457   11.426    0.000    5.221    0.464
   .wratspl            4.985    0.612    8.151    0.000    4.985    0.104
   .wjspl              4.751    0.626    7.587    0.000    4.751    0.093
   .waisspl            4.960    0.591    8.385    0.000    4.960    0.109
   .MATH               1.000                               0.740    0.740
   .SPELL              1.000                               0.760    0.760
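Several numbers in the summary can be reproduced by hand, which is a useful check on how the output is read. The sketch below assumes the usual counting for a model with a mean structure and fixed exogenous predictors, and it copies values from the output above. The counting rule is an assumption about how lavaan tallies moments under fixed.x = TRUE, although it does reproduce both reported degrees of freedom. All variable names are introduced here for illustration only.

```r
## Counts taken from the model: 6 indicators, 7 observed predictors
n_ind <- 6
n_pred <- 7

## Moments being fitted: indicator intercepts, indicator (co)variances,
## and the slopes of each indicator on each predictor
n_moments <- n_ind + n_ind * (n_ind + 1) / 2 + n_ind * n_pred

## Free parameters: loadings, latent regressions, residual covariance,
## indicator intercepts, indicator residual variances
n_free <- 6 + 14 + 1 + 6 + 6

## Degrees of freedom of the user model
df_model <- n_moments - n_free
print(df_model)

## The baseline model estimates only intercepts and variances
df_baseline <- n_moments - (n_ind + n_ind)
print(df_baseline)

## R-squared for each latent variable is 1 minus the standardized
## residual variance (Std.all column of the Variances table)
r2_latent <- 1 - c(MATH = 0.740, SPELL = 0.760)
print(r2_latent)
```

Read this way, the seven predictors account for roughly a quarter of the variance in each latent variable.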
R version 3.5.1 (2018-07-02)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lavaan_0.6-3 stationery_0.98.5.7
loaded via a namespace (and not attached):
[1] Rcpp_1.0.0 digest_0.6.18 MASS_7.3-51.1 plyr_1.8.4
[5] xtable_1.8-3 magrittr_1.5 stats4_3.5.1 evaluate_0.12
[9] zip_1.0.0 stringi_1.2.4 pbivnorm_0.6.0 openxlsx_4.1.0
[13] rmarkdown_1.11 tools_3.5.1 stringr_1.3.1 foreign_0.8-71
[17] kutils_1.59 yaml_2.2.0 xfun_0.4 compiler_3.5.1
[21] mnormt_1.5-5 htmltools_0.3.6 knitr_1.21
Available under Creative Commons license 3.0