Ben A. Kite, CRMDA, University of Kansas <bakite@ku.edu>
Paul E. Johnson, CRMDA, University of Kansas <pauljohn@ku.edu>
Chong Xing, CRMDA, University of Kansas <cxing@ku.edu>
CRMDA Guide #44. Please visit http://crmda.ku.edu/guides for updates.
Keywords:
Structural Equation Modeling (SEM), Chi-Square Difference test, Categorical Data. Weighted Least Squares (WLSMV), Mplus DIFFTEST
2017 December 29
Abstract
In the Mplus statistical software suite (Muthén & Muthén, 1998-2017), the \(T_3\) method (Asparouhov & Muthén, 2010) is used to conduct nested model comparisons (“difftest”) with data that is not normally distributed. The Mplus \(T_3\) method is widely used for comparison of models fitted with Likert-type ordinal variables, and yet it is not widely understood. This technical report describes the anatomy of the Mplus \(T_3\) DIFFTEST procedure. We offer an R (http://r-project.org) function crmda_difftest
which will import Mplus output files and reconstitute matrices needed to calculate \(T_3\). We provide an exhaustive collection of Mplus examples, along with their output, to demonstrate the fact that the \(T_3\) estimates from Mplus are reproduced exactly.
DIFFTEST
Filecrmda_difftest
FunctionDIFFTEST
CalculationsThe MplusDIFFTEST
is a popular procedure (a chi-square difference based test) for comparing nested structural equation models. It is available for models estimated by the “MLMV”, “WLSMV”, and “ULSMV” procedures in Mplus. A model including a larger number of parameters is estimated, and then a second model is estimated with some parameters that are restricted (for example, to be equal to each other, or equal to 0). Then the two models are compared using a statistic, referred to as \(T_3\), that is supposed to be distributed as a \(\chi^2\) random variable. When the test comparing models indicates that a difference is not statistically significant, or the equal-fit hypothesis is not rejected, researchers generally prefer the simpler model, the one which estimates fewer parameters.
In Mplus, each input file can be used to estimate only one model. To communicate the results of estimation from the first to the second model, the Mplus SAVEDATA
command creates a text file that is a single column of numbers that must be sorted into various vectors and matrices. This file, commonly referred to as a DIFFTEST
file, is a snapshot summary of the internal state of the model. This file is then put to use by a DIFFTEST
statement in the input file for the second model.
There are two elements in the process that seem to be rather like a “black box”. First, the format of the DIFFTEST
file itself is not formally documented. It is simply a column of numbers. It is not apparent to the user how these values can be used to reconstruct the matrices needed to calculate the difference of models test. Second, the details required to calculate \(T_3\) are elusive. There are a number of elements in the calculation, especially for multi-group models, that are not publicly documented.
As we illustrate in this report, it is possible to unbox both of these elements. We parse the DIFFTEST
files from two models and from them it is possible to recalculate the value of \(T_3\) reported by Mplus. To the best of our knowledge, our R function, dubbed the crmda_difftest
, provides values of the \(T_3\) statistic that matches the Mplus DIFFTEST
results perfectly (Mplus reports values to the third decimal place).
The following sections first describe the two-step procedure in Mplus for conducting \(T_3\) DIFFTEST
. Second, calculation steps in crmda_difftest
for reproducing \(T_3\) statistics are outlined - from importing a DIFFTEST
file, to separating the values into a collection of matrices, and to using these matrices for \(T_3\) calculation. The last section reports a comprehensive series of models which demonstrate that the crmda_difftest
procedures reproduce/match the values reported by Mplus.
Applications of crmda_difftest
to 36 model specifications are included to demonstrate the exact match with Mplus results. These examples employ data from the Human Behavior in School-Aged Children (HBSC) project (Iannotti, 2005-2006). The Mplus input (inp
), output (out
), and difftest (dif
) files for all 36 models are included in the folder with this report. The user should find a number of files.
44.difftest.Rmd
. The R markdown file with which the HTML is prepared.difftest_functions.R
. Includes the main function crmda_difftest
and several auxiliary functions.m*.inp
, along with their associated output and snapshot DIFFTEST
files.The \(T_3\) (second order) chi-square difference correction has been implemented in Mplus since Version 6 (Asparouhov & Muthén, 2010). It was proposed as a replacement of the procedure suggested by Satorra (1999, 2000), which we refer to as \(T_2\) (Asparouhov & Muthén, 2006). One criticism of \(T_2\) is that the degrees of freedom for the test is a floating point number (which is typically rounded to the nearest integer). The \(T_2\) method is still available in Mplus when Satterthwaite = ON
is specified under the ANALYSIS
command in Mplus model syntax. Instead of involving a \(df\) approximation, the degrees of freedom for a \(T_3\) statistic is an integer, the difference in the number of parameters in two models.
The calculation of the \(T_3\) statistic is somewhat complicated. One must first calculate a summary of the difference between the two models. The difference between models can be referred to as \(T.chisq\). This value is not in fact distributed as a \(\chi^2\) statistic, an issue which has been widely discussed in the structural equation modeling (SEM) literature.
To remedy that fact, Asparouhov and Muthén (2010) propose a linear transformation of \(T.chisq\). To perform the transformation, a ‘scale factor’, referred to as \(a\), and a ‘shift parameter’, \(b\), are needed. The \(T_3\) statistic/transformation is computed with the following equation:
\[ T_3 = T.chisq \times a + b. \tag{1} \]
Asparouhov and Muthén suggest that this shifting and scaling procedure produces a test statistic that is more desirable than the previously proposed methods (e.g., \(T_2\)), given the statistical property of \(T_3\) is more similar to a \(\chi^2\) statistic than the other methods. The importance of this claim is that researchers who rely on other methods may mistakenly detect differences between models.
When estimating structural equation models in Mplus, the estimators known as “MLMV”, “WLSMV”, and “ULSMV” include a feature to create a model summary snapshot file (the so-called DIFFTEST file). The syntax of the SAVEDATA command is:
SAVEDATA:
DIFFTEST = m0.dif;
This saves a model snapshot that we have chosen to name m0.dif
(for clarity). A second model is then estimated, and the input file (say, m1.inp
) includes an ANALYSIS
stanza that requests a comparison against m0.dif
by a DIFFTEST
statement:
ANALYSIS:
ESTIMATOR = WLSMV;
DIFFTEST = m0.dif
In this nested model comparison, following Asparouhov and Muthén ( 2006) terminology, the first model is called the parent (or the \(H_1\)) model. The dif
file for that model (m0.dif
) is necessary to calculate the DIFFTEST
which compares the parent against the second, nested model (\(H_0\)). In order to calculate the \(T_3\) estimate with R, we need to save the dif
snapshot files for both \(H_1\) and \(H_0\) models.
DIFFTEST
FileAn Mplus DIFFTEST
file contains a single column vector. It combines, without punctuation or labels, the values of 4 scalars, \(T\), \(g\), \(s\) and \(p\), along with consecutive sets of values that must be separated and rearranged into 3 matrices, \(\mathbf{\Delta}_i\), \(\mathbf{P}_i\), and \(\mathbf{V}_i\).
\(T\) is the minimized value of the fit function; this matches the value reported for the last iteration in the TECH5 output.
The number of groups (\(g\)).
The number of sample statistics (\(s\)).
The number of model parameters (\(p\)).
\(\mathbf{\Delta}_i\), an \(s \cdot g \times p\) matrix to be reconstructed from the next \(s \cdot g \cdot p\) elements in the dif
file (there is one stanza for each group to be constructed).
\(\mathbf{P}_i\), a symmetric \(p \times p\) matrix, reconstructed from the next \(p(p + 1)/2\) elements in the dif
file.
\(\mathbf{V}_i\), a symmetric \(p \times p\) matrix, reconstructed from the final \(p(p + 1)/2\) elements in the dif
file.
Because the DIFFTEST
file provides the one long column of numbers, some trial and error was necessary to separate the values and place them into \(\mathbf{\Delta}_i\), \(\mathbf{P}_i\) and \(\mathbf{V}_i\). The individual values are placed into the matrices row wise. The matrix \(\mathbf{\Delta}_i\) has separate stanzas for each of \(g\) groups. Because \(\mathbf{P}\) and \(\mathbf{V}\) are symmetric matrices, the dif
file provides only \(p(p+1)/2\) distinct values for each one. The matrix is reconstructed by filling in a lower triangle row-wise (e.g., insert \(\mathbf{P}[1,1]\), then \(\mathbf{P}[2,1]\), \(\mathbf{P}[2,2]\), \(\mathbf{P}[3,1]\) and so forth). The fact that these are symmetric is relied upon to complete their reconstruction.
T.chisq is computed by first taking the difference between the \(T\) values (the minimized fit function),
\[ T_\Delta = T_{nested} - T_{parent}, \tag{2} \]
and then rescaling \(T_\Delta\) by multiplying \(N*2\), where N is the total sample size:
\[ T.chisq = T_\Delta*N*2. \tag{3} \]
The value of \(T.chisq\) cannot be reconstructed from the output of the two fitted models. In particular, T.chisq is not based on the model \(\chi^2\) values that are reported in Mplus output. When a model is estimated with “WLSMV”, the chi-square value reported is a mean and variance adjusted version of \(T\). In the \(T_3\) test, T.chisq incorporates unscaled values of \(T_{nested}\) and \(T_{parent}\).
The calculation of \(T_3\) will require the construction of several separate matrices that will be used to derive the scaling and shifting parameters. Here we are concerned with \(\mathbf{H}\). This matrix is mentioned in Asparouhov and Muthén (2006), where the details of its calculation are left as an exercise for the reader.
For that reason, we do not know how this is calculated in Mplus, but the following will prove to be correct:
\[ H = (\Delta'_1\Delta_1)^{-1}\Delta_1'\Delta_0. \tag{4} \]
This method is suggested by Yves Rosseel in the source code of the R package lavaan
(Rosseel, 2017). Note that to obtain the two matrices, \(\mathbf{\Delta}_0\) and \(\mathbf{\Delta}_1\), it is necessary parse both of the model dif
files.
\(\mathbf{M}\) is an ingredient in the equations for the scaling factor and shift parameter.
As shown by Asparouhov and Muthen (2006), \(\mathbf{M}\) can be calculated as:
\[ M = (P_1 - P_1(H(H'P_1H)^{-1}H'P_1)V_1 \tag{5} \]
using the matrices constructed previously.
The scaling factor \(a\) is computed with the following formula:
\[ a = \sqrt{\frac{D}{Tr(M*M)}}. \tag{6} \]
Here \(D\) is the degrees of freedom, calculated as the difference between the numbers of parameters estimated in the two models.
The shift parameter \(b\) is computed as
\[ b = D - \sqrt{\frac{D*Tr(M)^2}{Tr(M*M)}}. \tag{7} \]
In these formulas, \(Tr\) is the trace operator, the sum of the elements along the main diagonal.
Once \(a\) and \(b\) have been computed, they are used in Equation 1 along with \(T.chisq\) to compute the \(T_3\) chi-square difference statistic. Asparouhov and Muthén (2010) suggest that the \(T_3\) statistic follows a chi-square distribution with \(D\) degrees of freedom.
crmda_difftest
FunctionThe crmda_difftest
function takes the file names of the Mplus output files for two nested models as input. For example:
crmda_difftest("parent.out", "nested.out")
We use functions provided by the MplusAutomation package (Hallquist and Wiley, 2017) for R to parse the Mplus output files.
The function detects the location of the companion dif
files, parent.dif
and nested.dif
. In our examples, these are available in the same folder as the output files. The dif
files are parsed by an auxiliary function named diffReader
. If there are multiple groups, the user must also specify the total sample size as a third argument to crmda_difftest
. The function returns the \(T_3\) value calculated by the aforementioned formulas, along with the estimate of \(T_3\) retrieved from the Mplus output file.
DIFFTEST
CalculationsThe crmda_difftest
function imports the dif
files and reproduces estimates of the \(\chi^2\) test that match the \(T_3\) estimates reported in Mplus (Asparouhov and Muthén, 2010). This section shows the results obtained from the crmda_difftest
function match the values reported by the Mplus DIFFTEST procedure on a variety of models.
A subset of the HBSC data (Health Behavior in School-Aged Children; Iannotti, 2005-2006) was used to obtain the demonstration results. The HBSC data contains responses from students in grades 6, 7, 8, 9, and 10. The subset data being used here contains responses to ordinal measures (five-point Likert-type) on self-reported depression level, physical well-being, bullying experiences (got bullied by others and bullied others), and alcohol use. A total of 25 nested model comparison scenarios were devised. The Mplus estimator “WLSMV” was used for all of the model estimations.
The following models were fitted to the HBSC data.
Model Number | Model Description | Output File |
---|---|---|
Model 00 | Two-Factor CFA, Two Indicators per Factor, Correlation Free | m00-2_indicators.out |
Model 01 | Two-Factor CFA, Two Indicators per Factor, Correlation Fixed Zero | m01-2_indicators.out |
Model 02 | Two-Factor CFA, Three Indicators per Factor, Correlation Free | m02-3_indicators.out |
Model 03 | Two-Factor CFA, Three Indicators per Factor, Correlation Fixed Zero | m03-3_indicators.out |
Model 04 | Two-Factor CFA, Five/Six Indicators per Factor, Correlation Free | m04-5_6_indicators.out |
Model 05 | Two-Factor CFA, Five/Six Indicators per Factor, Correlation Fixed Zero | m05-5_6_indicators.out |
Model 06 | Three-Factor CFA, Five/Six/Nine Indicators per Factor, All Correlations Free | m06-3_factors.out |
Model 07 | Three-Factor CFA, Five/Six/Nine Indicators per Factor, One Correlation Fixed Zero | m07-3_factors.out |
Model 08 | Three-Factor CFA, Five/Six/Nine Indicators per Factor, Two Correlations Equate | m08-3_factors.out |
Model 09 | Structural, Two IVs Predict One DV, IVs Correlation Free | m09-structural.out |
Model 10 | Structural, Two IVs Predict One DV, IVs Correlation Fixed Zero | m10-structural.out |
Model 11 | Structural, Two IVs Predict One DV, IVs Correlation Free, Two Regressive Paths Equate | m11-structural.out |
Model 12 | Structural, Two IVs Predict One DV, IVs Correlation Fixed to Zero, Two Regressive Paths Equate | m12-structural.out |
Model 13 | Structural, Two IVs Predict One DV, IVs Correlation and Two Regressive Paths Equate | m13-structural.out |
Model 14 | Indirect, One IV, One DV, One Mediator, All (a, b, c) Paths Free, Bootstrapping Not Allowed | m14-indirect.out |
Model 15 | Indirect, One IV, One DV, One Mediator, a and b Paths Free, c (IV-DV) Path Fixed Zero, Bootstrapping Not Allowed | m15-indirect.out |
Model 16 | Interaction, Two IVs, One DV, One Latent Interaction Term, .dif Not Allowed to Save | m16-interaction.out |
Model 17 | One Factor Two Group (6th and 7th Grade) Configural Model | m17-1_factor-2_groups-config.out |
Model 18 | One Factor Two Group Weak Model | m18-1_factor-2_groups-weak.out |
Model 19 | One Factor Two Group Strong Model | m19-1_factor-2_groups-strong.out |
Model 20 | One Factor Two Group (6th and 7th Grade) Stict Model | m20-1_factor-2_groups-strict.out |
Model 21 | One Factor Three Group (6th, 7th, and 8th Grade) Configural Model | m21-1_factor-3_groups-config.out |
Model 22 | One Factor Three Group (6th, 7th, and 8th Grade) Weak Model | m22-1_factor-3_groups-weak.out |
Model 23 | One Factor Three Group (6th, 7th, and 8th Grade) Strong Model | m23-1_factor-3_groups-strong.out |
Model 24 | One Factor Three Group (6th, 7th, and 8th Grade) Strict Model | m24-1_factor-3_groups-strict.out |
Model 25 | One Factor Four Group (6th, 7th, 8th, and 9th Grade) Configural Model | m25-1_factor-4_groups-config.out |
Model 26 | One Factor Four Group (6th, 7th, 8th, and 9th Grade) Weak Model | m26-1_factor-4_groups-weak.out |
Model 27 | One Factor Four Group (6th, 7th, 8th, and 9th Grade) Strong Model | m27-1_factor-4_groups-strong.out |
Model 28 | One Factor Four Group (6th, 7th, 8th, and 9th Grade) Strict Model | m28-1_factor-4_groups-strict.out |
Model 29 | One Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Configural Model | m29-1_factor-5_groups-config.out |
Model 30 | One Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Weak Model | m30-1_factor-5_groups-weak.out |
Model 31 | One Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Strong Model | m31-1_factor-5_groups-strong.out |
Model 32 | One Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Strict Model | m32-1_factor-5_groups-strict.out |
Model 33 | Six Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Configural Model | m33-5_factors-5_groups-config.out |
Model 34 | Six Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Weak Model | m34-5_factors-5_groups-weak.out |
Model 35 | Five Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Strong Model | m35-5_factors-5_groups-strong.out |
Model 36 | Five Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Strict Model | m36-5_factors-5_groups-strict.out |
In the following table, we summarize the \(T_3\) values calculated with crmda_difftest
, based on the dif
files from Mplus, and compare against the values of the \(T_3\) estimates reported in the Mplus output files. These results show that across a variety of model comparison scenarios the crmda_difftest
function reproduces the test statistic provided by Mplus exactly to three decimal places.
Comparison | crmda_difftest_T3 | Mplus_DIFFTEST_T3 | df | p |
---|---|---|---|---|
Model 00 vs. 01 | 145.270 | 145.27 | 1.000 | 0.000 |
Model 02 vs. 03 | 239.494 | 239.494 | 1.000 | 0.000 |
Model 04 vs. 05 | 440.748 | 440.748 | 1.000 | 0.000 |
Model 06 vs. 07 | 63.328 | 63.328 | 1.000 | 0.000 |
Model 06 vs. 08 | 25.534 | 25.534 | 1.000 | 0.000 |
Model 09 vs. 10 | 717.946 | 717.946 | 1.000 | 0.000 |
Model 09 vs. 11 | 68.182 | 68.182 | 1.000 | 0.000 |
Model 10 vs. 12 | 67.541 | 67.541 | 1.000 | 0.000 |
Model 11 vs. 13 | 166.535 | 166.535 | 1.000 | 0.000 |
Model 14 vs. 15 | 0.786 | 0.786 | 1.000 | 0.375 |
Model 17 vs. 18 | 2.143 | 2.143 | 5.000 | 0.829 |
Model 18 vs. 19 | 21.789 | 21.789 | 17.000 | 0.193 |
Model 19 vs. 20 | 34.030 | 34.03 | 6.000 | 0.000 |
Model 21 vs. 22 | 5.803 | 5.803 | 5.000 | 0.326 |
Model 22 vs. 23 | 40.958 | 40.958 | 17.000 | 0.001 |
Model 23 vs. 24 | 51.198 | 51.198 | 6.000 | 0.000 |
Model 25 vs. 26 | 11.685 | 11.685 | 5.000 | 0.039 |
Model 26 vs. 27 | 60.603 | 60.603 | 17.000 | 0.000 |
Model 27 vs. 28 | 101.549 | 101.549 | 6.000 | 0.000 |
Model 29 vs. 30 | 26.145 | 26.145 | 5.000 | 0.000 |
Model 30 vs. 31 | 79.822 | 79.822 | 17.000 | 0.000 |
Model 31 vs. 32 | 148.519 | 148.519 | 6.000 | 0.000 |
Model 33 vs. 34 | 157.958 | 157.958 | 32.000 | 0.000 |
Model 34 vs. 35 | 360.917 | 360.917 | 106.000 | 0.000 |
Model 35 vs. 36 | 367.814 | 367.814 | 37.000 | 0.000 |
We have outlined the procedures necessary to import and parse the Mplus DIFFTEST
files and to calculate a difference of fit test known as \(T_3\). While some elements in the calculations are based on our best guess about calculations in Mplus, we believe that the combination of elements is correct. The difference in our estimates and the values reported by Mplus is smaller than \(10^{-4}\).
The larger project from which this report is drawn will compare the hypothesis tests based on \(T_2\), \(T_3\), and others. With the completion of this phase, we are one step closer to the final goal. The next phase in this project is to create R functions that provide replacements for \(\mathbf{\Delta}_i\), \(\mathbf{P}_i\), \(\mathbf{V}_i\).
R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 17.10
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] methods stats graphics grDevices utils datasets base
other attached packages:
[1] psych_1.7.8 MplusAutomation_0.7 xtable_1.8-2
[4] kutils_1.31 crmda_0.49
loaded via a namespace (and not attached):
[1] Rcpp_0.12.12 knitr_1.17 magrittr_1.5
[4] mnormt_1.5-5 lattice_0.20-35 pbivnorm_0.6.0
[7] quadprog_1.5-5 stringr_1.2.0 plyr_1.8.4
[10] tools_3.4.3 parallel_3.4.3 grid_3.4.3
[13] nlme_3.1-131 texreg_1.36.23 coda_0.19-1
[16] htmltools_0.3.6 yaml_2.1.14 digest_0.6.12
[19] rprojroot_1.2 lavaan_0.5-23.1097 codetools_0.2-15
[22] evaluate_0.10.1 gsubfn_0.6-6 rmarkdown_1.6
[25] openxlsx_4.0.17 stringi_1.1.5 pander_0.6.1
[28] compiler_3.4.3 backports_1.1.0 boot_1.3-20
[31] stats4_3.4.3 foreign_0.8-69 proto_1.0.0
Available under Created Commons license 3.0