1 Motivation

The MplusDIFFTEST is a popular procedure (a chi-square difference based test) for comparing nested structural equation models. It is available for models estimated by the “MLMV”, “WLSMV”, and “ULSMV” procedures in Mplus. A model including a larger number of parameters is estimated, and then a second model is estimated with some parameters that are restricted (for example, to be equal to each other, or equal to 0). Then the two models are compared using a statistic, referred to as \(T_3\), that is supposed to be distributed as a \(\chi^2\) random variable. When the test comparing models indicates that a difference is not statistically significant, or the equal-fit hypothesis is not rejected, researchers generally prefer the simpler model, the one which estimates fewer parameters.

In Mplus, each input file can be used to estimate only one model. To communicate the results of estimation from the first to the second model, the Mplus SAVEDATA command creates a text file that is a single column of numbers that must be sorted into various vectors and matrices. This file, commonly referred to as a DIFFTEST file, is a snapshot summary of the internal state of the model. This file is then put to use by a DIFFTEST statement in the input file for the second model.

There are two elements in the process that seem to be rather like a “black box”. First, the format of the DIFFTEST file itself is not formally documented. It is simply a column of numbers. It is not apparent to the user how these values can be used to reconstruct the matrices needed to calculate the difference of models test. Second, the details required to calculate \(T_3\) are elusive. There are a number of elements in the calculation, especially for multi-group models, that are not publicly documented.

As we illustrate in this report, it is possible to unbox both of these elements. We parse the DIFFTEST files from two models and from them it is possible to recalculate the value of \(T_3\) reported by Mplus. To the best of our knowledge, our R function, dubbed the crmda_difftest, provides values of the \(T_3\) statistic that matches the Mplus DIFFTEST results perfectly (Mplus reports values to the third decimal place).

The following sections first describe the two-step procedure in Mplus for conducting \(T_3\) DIFFTEST. Second, calculation steps in crmda_difftest for reproducing \(T_3\) statistics are outlined - from importing a DIFFTEST file, to separating the values into a collection of matrices, and to using these matrices for \(T_3\) calculation. The last section reports a comprehensive series of models which demonstrate that the crmda_difftest procedures reproduce/match the values reported by Mplus.

Applications of crmda_difftest to 36 model specifications are included to demonstrate the exact match with Mplus results. These examples employ data from the Human Behavior in School-Aged Children (HBSC) project (Iannotti, 2005-2006). The Mplus input (inp), output (out), and difftest (dif) files for all 36 models are included in the folder with this report. The user should find a number of files.

  1. 44.difftest.Rmd. The R markdown file with which the HTML is prepared.
  2. difftest_functions.R. Includes the main function crmda_difftest and several auxiliary functions.
  3. A plethora of Mplus input files named m*.inp, along with their associated output and snapshot DIFFTEST files.

2 The \(T_3\) Method

The \(T_3\) (second order) chi-square difference correction has been implemented in Mplus since Version 6 (Asparouhov & Muthén, 2010). It was proposed as a replacement of the procedure suggested by Satorra (1999, 2000), which we refer to as \(T_2\) (Asparouhov & Muthén, 2006). One criticism of \(T_2\) is that the degrees of freedom for the test is a floating point number (which is typically rounded to the nearest integer). The \(T_2\) method is still available in Mplus when Satterthwaite = ON is specified under the ANALYSIS command in Mplus model syntax. Instead of involving a \(df\) approximation, the degrees of freedom for a \(T_3\) statistic is an integer, the difference in the number of parameters in two models.

The calculation of the \(T_3\) statistic is somewhat complicated. One must first calculate a summary of the difference between the two models. The difference between models can be referred to as \(T.chisq\). This value is not in fact distributed as a \(\chi^2\) statistic, an issue which has been widely discussed in the structural equation modeling (SEM) literature.

To remedy that fact, Asparouhov and Muthén (2010) propose a linear transformation of \(T.chisq\). To perform the transformation, a ‘scale factor’, referred to as \(a\), and a ‘shift parameter’, \(b\), are needed. The \(T_3\) statistic/transformation is computed with the following equation:

\[ T_3 = T.chisq \times a + b. \tag{1} \]

Asparouhov and Muthén suggest that this shifting and scaling procedure produces a test statistic that is more desirable than the previously proposed methods (e.g., \(T_2\)), given the statistical property of \(T_3\) is more similar to a \(\chi^2\) statistic than the other methods. The importance of this claim is that researchers who rely on other methods may mistakenly detect differences between models.

2.1 The two step estimation procedure

When estimating structural equation models in Mplus, the estimators known as “MLMV”, “WLSMV”, and “ULSMV” include a feature to create a model summary snapshot file (the so-called DIFFTEST file). The syntax of the SAVEDATA command is:

    DIFFTEST = m0.dif;

This saves a model snapshot that we have chosen to name m0.dif (for clarity). A second model is then estimated, and the input file (say, m1.inp) includes an ANALYSIS stanza that requests a comparison against m0.dif by a DIFFTEST statement:

    DIFFTEST = m0.dif

In this nested model comparison, following Asparouhov and Muthén ( 2006) terminology, the first model is called the parent (or the \(H_1\)) model. The dif file for that model (m0.dif) is necessary to calculate the DIFFTEST which compares the parent against the second, nested model (\(H_0\)). In order to calculate the \(T_3\) estimate with R, we need to save the dif snapshot files for both \(H_1\) and \(H_0\) models.

3 Calculating \(T_3\)

3.1 Step 1 - Extracting model information from an Mplus DIFFTEST File

An Mplus DIFFTEST file contains a single column vector. It combines, without punctuation or labels, the values of 4 scalars, \(T\), \(g\), \(s\) and \(p\), along with consecutive sets of values that must be separated and rearranged into 3 matrices, \(\mathbf{\Delta}_i\), \(\mathbf{P}_i\), and \(\mathbf{V}_i\).

  1. \(T\) is the minimized value of the fit function; this matches the value reported for the last iteration in the TECH5 output.

  2. The number of groups (\(g\)).

  3. The number of sample statistics (\(s\)).

  4. The number of model parameters (\(p\)).

  5. \(\mathbf{\Delta}_i\), an \(s \cdot g \times p\) matrix to be reconstructed from the next \(s \cdot g \cdot p\) elements in the dif file (there is one stanza for each group to be constructed).

  6. \(\mathbf{P}_i\), a symmetric \(p \times p\) matrix, reconstructed from the next \(p(p + 1)/2\) elements in the dif file.

  7. \(\mathbf{V}_i\), a symmetric \(p \times p\) matrix, reconstructed from the final \(p(p + 1)/2\) elements in the dif file.

Because the DIFFTEST file provides the one long column of numbers, some trial and error was necessary to separate the values and place them into \(\mathbf{\Delta}_i\), \(\mathbf{P}_i\) and \(\mathbf{V}_i\). The individual values are placed into the matrices row wise. The matrix \(\mathbf{\Delta}_i\) has separate stanzas for each of \(g\) groups. Because \(\mathbf{P}\) and \(\mathbf{V}\) are symmetric matrices, the dif file provides only \(p(p+1)/2\) distinct values for each one. The matrix is reconstructed by filling in a lower triangle row-wise (e.g., insert \(\mathbf{P}[1,1]\), then \(\mathbf{P}[2,1]\), \(\mathbf{P}[2,2]\), \(\mathbf{P}[3,1]\) and so forth). The fact that these are symmetric is relied upon to complete their reconstruction.

3.2 Step 2 - Calculating \(T.chisq\)

T.chisq is computed by first taking the difference between the \(T\) values (the minimized fit function),

\[ T_\Delta = T_{nested} - T_{parent}, \tag{2} \]

and then rescaling \(T_\Delta\) by multiplying \(N*2\), where N is the total sample size:

\[ T.chisq = T_\Delta*N*2. \tag{3} \]

The value of \(T.chisq\) cannot be reconstructed from the output of the two fitted models. In particular, T.chisq is not based on the model \(\chi^2\) values that are reported in Mplus output. When a model is estimated with “WLSMV”, the chi-square value reported is a mean and variance adjusted version of \(T\). In the \(T_3\) test, T.chisq incorporates unscaled values of \(T_{nested}\) and \(T_{parent}\).

3.3 Step 3 - Calculating the \(\mathbf{H}\) Matrix

The calculation of \(T_3\) will require the construction of several separate matrices that will be used to derive the scaling and shifting parameters. Here we are concerned with \(\mathbf{H}\). This matrix is mentioned in Asparouhov and Muthén (2006), where the details of its calculation are left as an exercise for the reader.

For that reason, we do not know how this is calculated in Mplus, but the following will prove to be correct:

\[ H = (\Delta'_1\Delta_1)^{-1}\Delta_1'\Delta_0. \tag{4} \]

This method is suggested by Yves Rosseel in the source code of the R package lavaan (Rosseel, 2017). Note that to obtain the two matrices, \(\mathbf{\Delta}_0\) and \(\mathbf{\Delta}_1\), it is necessary parse both of the model dif files.

3.4 Step 4 - Calculating The \(\mathbf{M}\) Matrix

\(\mathbf{M}\) is an ingredient in the equations for the scaling factor and shift parameter.

As shown by Asparouhov and Muthen (2006), \(\mathbf{M}\) can be calculated as:

\[ M = (P_1 - P_1(H(H'P_1H)^{-1}H'P_1)V_1 \tag{5} \]

using the matrices constructed previously.

3.5 Step 5 - Calculating the Scaling Factor and the Shift Parameter

The scaling factor \(a\) is computed with the following formula:

\[ a = \sqrt{\frac{D}{Tr(M*M)}}. \tag{6} \]

Here \(D\) is the degrees of freedom, calculated as the difference between the numbers of parameters estimated in the two models.

The shift parameter \(b\) is computed as

\[ b = D - \sqrt{\frac{D*Tr(M)^2}{Tr(M*M)}}. \tag{7} \]

In these formulas, \(Tr\) is the trace operator, the sum of the elements along the main diagonal.

3.6 Step 6 - Compute the Scaled \(T_3\) Statistic

Once \(a\) and \(b\) have been computed, they are used in Equation 1 along with \(T.chisq\) to compute the \(T_3\) chi-square difference statistic. Asparouhov and Muthén (2010) suggest that the \(T_3\) statistic follows a chi-square distribution with \(D\) degrees of freedom.

3.7 Using the crmda_difftest Function

The crmda_difftest function takes the file names of the Mplus output files for two nested models as input. For example:

We use functions provided by the MplusAutomation package (Hallquist and Wiley, 2017) for R to parse the Mplus output files.

The function detects the location of the companion dif files, parent.dif and nested.dif. In our examples, these are available in the same folder as the output files. The dif files are parsed by an auxiliary function named diffReader. If there are multiple groups, the user must also specify the total sample size as a third argument to crmda_difftest. The function returns the \(T_3\) value calculated by the aforementioned formulas, along with the estimate of \(T_3\) retrieved from the Mplus output file.

4 Matching Mplus DIFFTEST Calculations

The crmda_difftest function imports the dif files and reproduces estimates of the \(\chi^2\) test that match the \(T_3\) estimates reported in Mplus (Asparouhov and Muthén, 2010). This section shows the results obtained from the crmda_difftest function match the values reported by the Mplus DIFFTEST procedure on a variety of models.

4.1 The demonstration data

A subset of the HBSC data (Health Behavior in School-Aged Children; Iannotti, 2005-2006) was used to obtain the demonstration results. The HBSC data contains responses from students in grades 6, 7, 8, 9, and 10. The subset data being used here contains responses to ordinal measures (five-point Likert-type) on self-reported depression level, physical well-being, bullying experiences (got bullied by others and bullied others), and alcohol use. A total of 25 nested model comparison scenarios were devised. The Mplus estimator “WLSMV” was used for all of the model estimations.

4.2 Example models

The following models were fitted to the HBSC data.

Model Number Model Description Output File
Model 00 Two-Factor CFA, Two Indicators per Factor, Correlation Free m00-2_indicators.out
Model 01 Two-Factor CFA, Two Indicators per Factor, Correlation Fixed Zero m01-2_indicators.out
Model 02 Two-Factor CFA, Three Indicators per Factor, Correlation Free m02-3_indicators.out
Model 03 Two-Factor CFA, Three Indicators per Factor, Correlation Fixed Zero m03-3_indicators.out
Model 04 Two-Factor CFA, Five/Six Indicators per Factor, Correlation Free m04-5_6_indicators.out
Model 05 Two-Factor CFA, Five/Six Indicators per Factor, Correlation Fixed Zero m05-5_6_indicators.out
Model 06 Three-Factor CFA, Five/Six/Nine Indicators per Factor, All Correlations Free m06-3_factors.out
Model 07 Three-Factor CFA, Five/Six/Nine Indicators per Factor, One Correlation Fixed Zero m07-3_factors.out
Model 08 Three-Factor CFA, Five/Six/Nine Indicators per Factor, Two Correlations Equate m08-3_factors.out
Model 09 Structural, Two IVs Predict One DV, IVs Correlation Free m09-structural.out
Model 10 Structural, Two IVs Predict One DV, IVs Correlation Fixed Zero m10-structural.out
Model 11 Structural, Two IVs Predict One DV, IVs Correlation Free, Two Regressive Paths Equate m11-structural.out
Model 12 Structural, Two IVs Predict One DV, IVs Correlation Fixed to Zero, Two Regressive Paths Equate m12-structural.out
Model 13 Structural, Two IVs Predict One DV, IVs Correlation and Two Regressive Paths Equate m13-structural.out
Model 14 Indirect, One IV, One DV, One Mediator, All (a, b, c) Paths Free, Bootstrapping Not Allowed m14-indirect.out
Model 15 Indirect, One IV, One DV, One Mediator, a and b Paths Free, c (IV-DV) Path Fixed Zero, Bootstrapping Not Allowed m15-indirect.out
Model 16 Interaction, Two IVs, One DV, One Latent Interaction Term, .dif Not Allowed to Save m16-interaction.out
Model 17 One Factor Two Group (6th and 7th Grade) Configural Model m17-1_factor-2_groups-config.out
Model 18 One Factor Two Group Weak Model m18-1_factor-2_groups-weak.out
Model 19 One Factor Two Group Strong Model m19-1_factor-2_groups-strong.out
Model 20 One Factor Two Group (6th and 7th Grade) Stict Model m20-1_factor-2_groups-strict.out
Model 21 One Factor Three Group (6th, 7th, and 8th Grade) Configural Model m21-1_factor-3_groups-config.out
Model 22 One Factor Three Group (6th, 7th, and 8th Grade) Weak Model m22-1_factor-3_groups-weak.out
Model 23 One Factor Three Group (6th, 7th, and 8th Grade) Strong Model m23-1_factor-3_groups-strong.out
Model 24 One Factor Three Group (6th, 7th, and 8th Grade) Strict Model m24-1_factor-3_groups-strict.out
Model 25 One Factor Four Group (6th, 7th, 8th, and 9th Grade) Configural Model m25-1_factor-4_groups-config.out
Model 26 One Factor Four Group (6th, 7th, 8th, and 9th Grade) Weak Model m26-1_factor-4_groups-weak.out
Model 27 One Factor Four Group (6th, 7th, 8th, and 9th Grade) Strong Model m27-1_factor-4_groups-strong.out
Model 28 One Factor Four Group (6th, 7th, 8th, and 9th Grade) Strict Model m28-1_factor-4_groups-strict.out
Model 29 One Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Configural Model m29-1_factor-5_groups-config.out
Model 30 One Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Weak Model m30-1_factor-5_groups-weak.out
Model 31 One Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Strong Model m31-1_factor-5_groups-strong.out
Model 32 One Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Strict Model m32-1_factor-5_groups-strict.out
Model 33 Six Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Configural Model m33-5_factors-5_groups-config.out
Model 34 Six Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Weak Model m34-5_factors-5_groups-weak.out
Model 35 Five Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Strong Model m35-5_factors-5_groups-strong.out
Model 36 Five Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Strict Model m36-5_factors-5_groups-strict.out

4.3 Nested model comparisons

In the following table, we summarize the \(T_3\) values calculated with crmda_difftest, based on the dif files from Mplus, and compare against the values of the \(T_3\) estimates reported in the Mplus output files. These results show that across a variety of model comparison scenarios the crmda_difftest function reproduces the test statistic provided by Mplus exactly to three decimal places.

Comparison crmda_difftest_T3 Mplus_DIFFTEST_T3 df p
Model 00 vs. 01 145.270 145.27 1.000 0.000
Model 02 vs. 03 239.494 239.494 1.000 0.000
Model 04 vs. 05 440.748 440.748 1.000 0.000
Model 06 vs. 07 63.328 63.328 1.000 0.000
Model 06 vs. 08 25.534 25.534 1.000 0.000
Model 09 vs. 10 717.946 717.946 1.000 0.000
Model 09 vs. 11 68.182 68.182 1.000 0.000
Model 10 vs. 12 67.541 67.541 1.000 0.000
Model 11 vs. 13 166.535 166.535 1.000 0.000
Model 14 vs. 15 0.786 0.786 1.000 0.375
Model 17 vs. 18 2.143 2.143 5.000 0.829
Model 18 vs. 19 21.789 21.789 17.000 0.193
Model 19 vs. 20 34.030 34.03 6.000 0.000
Model 21 vs. 22 5.803 5.803 5.000 0.326
Model 22 vs. 23 40.958 40.958 17.000 0.001
Model 23 vs. 24 51.198 51.198 6.000 0.000
Model 25 vs. 26 11.685 11.685 5.000 0.039
Model 26 vs. 27 60.603 60.603 17.000 0.000
Model 27 vs. 28 101.549 101.549 6.000 0.000
Model 29 vs. 30 26.145 26.145 5.000 0.000
Model 30 vs. 31 79.822 79.822 17.000 0.000
Model 31 vs. 32 148.519 148.519 6.000 0.000
Model 33 vs. 34 157.958 157.958 32.000 0.000
Model 34 vs. 35 360.917 360.917 106.000 0.000
Model 35 vs. 36 367.814 367.814 37.000 0.000

5 Summary

We have outlined the procedures necessary to import and parse the Mplus DIFFTEST files and to calculate a difference of fit test known as \(T_3\). While some elements in the calculations are based on our best guess about calculations in Mplus, we believe that the combination of elements is correct. The difference in our estimates and the values reported by Mplus is smaller than \(10^{-4}\).

The larger project from which this report is drawn will compare the hypothesis tests based on \(T_2\), \(T_3\), and others. With the completion of this phase, we are one step closer to the final goal. The next phase in this project is to create R functions that provide replacements for \(\mathbf{\Delta}_i\), \(\mathbf{P}_i\), \(\mathbf{V}_i\).

5.1 R Session Info

R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.3

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] psych_1.7.8         MplusAutomation_0.7 xtable_1.8-2       
[4] crmda_0.49          kutils_1.34        

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.15       knitr_1.18         magrittr_1.5      
 [4] mnormt_1.5-5       pbivnorm_0.6.0     lattice_0.20-35   
 [7] quadprog_1.5-5     stringr_1.2.0      plyr_1.8.4        
[10] tools_3.4.3        parallel_3.4.3     grid_3.4.3        
[13] nlme_3.1-131       texreg_1.36.23     coda_0.19-1       
[16] htmltools_0.3.6    yaml_2.1.16        rprojroot_1.3-2   
[19] digest_0.6.15      lavaan_0.5-23.1097 codetools_0.2-15  
[22] evaluate_0.10.1    gsubfn_0.6-6       rmarkdown_1.8     
[25] openxlsx_4.0.17    stringi_1.1.6      pander_0.6.1      
[28] compiler_3.4.3     backports_1.1.2    boot_1.3-20       
[31] stats4_3.4.3       foreign_0.8-69     proto_1.0.0       

Available under Created Commons license 3.0 CC BY