Abstract

In the Mplus statistical software suite (Muthén & Muthén, 1998-2017), the \(T_3\) method (Asparouhov & Muthén, 2010) is used to conduct nested model comparisons (“difftest”) with data that is not normally distributed. The Mplus \(T_3\) method is widely used for comparison of models fitted with Likert-type ordinal variables, and yet it is not widely understood. This technical report describes the anatomy of the Mplus \(T_3\) DIFFTEST procedure. We offer an R (http://r-project.org) function crmda_difftest which will import Mplus output files and reconstitute matrices needed to calculate \(T_3\). We provide an exhaustive collection of Mplus examples, along with their output, to demonstrate the fact that the \(T_3\) estimates from Mplus are reproduced exactly.

1 Motivation

The MplusDIFFTEST is a popular procedure (a chi-square difference based test) for comparing nested structural equation models. It is available for models estimated by the “MLMV”, “WLSMV”, and “ULSMV” procedures in Mplus. A model including a larger number of parameters is estimated, and then a second model is estimated with some parameters that are restricted (for example, to be equal to each other, or equal to 0). Then the two models are compared using a statistic, referred to as \(T_3\), that is supposed to be distributed as a \(\chi^2\) random variable. When the test comparing models indicates that a difference is not statistically significant, or the equal-fit hypothesis is not rejected, researchers generally prefer the simpler model, the one which estimates fewer parameters.

In Mplus, each input file can be used to estimate only one model. To communicate the results of estimation from the first to the second model, the Mplus SAVEDATA command creates a text file that is a single column of numbers that must be sorted into various vectors and matrices. This file, commonly referred to as a DIFFTEST file, is a snapshot summary of the internal state of the model. This file is then put to use by a DIFFTEST statement in the input file for the second model.

There are two elements in the process that seem to be rather like a “black box”. First, the format of the DIFFTEST file itself is not formally documented. It is simply a column of numbers. It is not apparent to the user how these values can be used to reconstruct the matrices needed to calculate the difference of models test. Second, the details required to calculate \(T_3\) are elusive. There are a number of elements in the calculation, especially for multi-group models, that are not publicly documented.

As we illustrate in this report, it is possible to unbox both of these elements. We parse the DIFFTEST files from two models and from them it is possible to recalculate the value of \(T_3\) reported by Mplus. To the best of our knowledge, our R function, dubbed the crmda_difftest, provides values of the \(T_3\) statistic that matches the Mplus DIFFTEST results perfectly (Mplus reports values to the third decimal place).

The following sections first describe the two-step procedure in Mplus for conducting \(T_3\) DIFFTEST. Second, calculation steps in crmda_difftest for reproducing \(T_3\) statistics are outlined - from importing a DIFFTEST file, to separating the values into a collection of matrices, and to using these matrices for \(T_3\) calculation. The last section reports a comprehensive series of models which demonstrate that the crmda_difftest procedures reproduce/match the values reported by Mplus.

Applications of crmda_difftest to 36 model specifications are included to demonstrate the exact match with Mplus results. These examples employ data from the Human Behavior in School-Aged Children (HBSC) project (Iannotti, 2005-2006). The Mplus input (inp), output (out), and difftest (dif) files for all 36 models are included in the folder with this report. The user should find a number of files.

44.difftest.Rmd. The R markdown file with which the HTML is prepared.
difftest_functions.R. Includes the main function crmda_difftest and several auxiliary functions.
A plethora of Mplus input files named m*.inp, along with their associated output and snapshot DIFFTEST files.

2 The \(T_3\) Method

The \(T_3\) (second order) chi-square difference correction has been implemented in Mplus since Version 6 (Asparouhov & Muthén, 2010). It was proposed as a replacement of the procedure suggested by Satorra (1999, 2000), which we refer to as \(T_2\) (Asparouhov & Muthén, 2006). One criticism of \(T_2\) is that the degrees of freedom for the test is a floating point number (which is typically rounded to the nearest integer). The \(T_2\) method is still available in Mplus when Satterthwaite = ON is specified under the ANALYSIS command in Mplus model syntax. Instead of involving a \(df\) approximation, the degrees of freedom for a \(T_3\) statistic is an integer, the difference in the number of parameters in two models.

The calculation of the \(T_3\) statistic is somewhat complicated. One must first calculate a summary of the difference between the two models. The difference between models can be referred to as \(T.chisq\). This value is not in fact distributed as a \(\chi^2\) statistic, an issue which has been widely discussed in the structural equation modeling (SEM) literature.

To remedy that fact, Asparouhov and Muthén (2010) propose a linear transformation of \(T.chisq\). To perform the transformation, a ‘scale factor’, referred to as \(a\), and a ‘shift parameter’, \(b\), are needed. The \(T_3\) statistic/transformation is computed with the following equation:

\[ T_3 = T.chisq \times a + b. \tag{1} \]

Asparouhov and Muthén suggest that this shifting and scaling procedure produces a test statistic that is more desirable than the previously proposed methods (e.g., \(T_2\)), given the statistical property of \(T_3\) is more similar to a \(\chi^2\) statistic than the other methods. The importance of this claim is that researchers who rely on other methods may mistakenly detect differences between models.

2.1 The two step estimation procedure

When estimating structural equation models in Mplus, the estimators known as “MLMV”, “WLSMV”, and “ULSMV” include a feature to create a model summary snapshot file (the so-called DIFFTEST file). The syntax of the SAVEDATA command is:

SAVEDATA:
    DIFFTEST = m0.dif;

This saves a model snapshot that we have chosen to name m0.dif (for clarity). A second model is then estimated, and the input file (say, m1.inp) includes an ANALYSIS stanza that requests a comparison against m0.dif by a DIFFTEST statement:

ANALYSIS:
    ESTIMATOR = WLSMV;
    DIFFTEST = m0.dif

In this nested model comparison, following Asparouhov and Muthén ( 2006) terminology, the first model is called the parent (or the \(H_1\)) model. The dif file for that model (m0.dif) is necessary to calculate the DIFFTEST which compares the parent against the second, nested model (\(H_0\)). In order to calculate the \(T_3\) estimate with R, we need to save the dif snapshot files for both \(H_1\) and \(H_0\) models.

3 Calculating \(T_3\)

3.1 Step 1 - Extracting model information from an Mplus `DIFFTEST` File

An Mplus DIFFTEST file contains a single column vector. It combines, without punctuation or labels, the values of 4 scalars, \(T\), \(g\), \(s\) and \(p\), along with consecutive sets of values that must be separated and rearranged into 3 matrices, \(\mathbf{\Delta}_i\), \(\mathbf{P}_i\), and \(\mathbf{V}_i\).

\(T\) is the minimized value of the fit function; this matches the value reported for the last iteration in the TECH5 output.
The number of groups (\(g\)).
The number of sample statistics (\(s\)).
The number of model parameters (\(p\)).
\(\mathbf{\Delta}_i\), an \(s \cdot g \times p\) matrix to be reconstructed from the next \(s \cdot g \cdot p\) elements in the dif file (there is one stanza for each group to be constructed).
\(\mathbf{P}_i\), a symmetric \(p \times p\) matrix, reconstructed from the next \(p(p + 1)/2\) elements in the dif file.
\(\mathbf{V}_i\), a symmetric \(p \times p\) matrix, reconstructed from the final \(p(p + 1)/2\) elements in the dif file.

Because the DIFFTEST file provides the one long column of numbers, some trial and error was necessary to separate the values and place them into \(\mathbf{\Delta}_i\), \(\mathbf{P}_i\) and \(\mathbf{V}_i\). The individual values are placed into the matrices row wise. The matrix \(\mathbf{\Delta}_i\) has separate stanzas for each of \(g\) groups. Because \(\mathbf{P}\) and \(\mathbf{V}\) are symmetric matrices, the dif file provides only \(p(p+1)/2\) distinct values for each one. The matrix is reconstructed by filling in a lower triangle row-wise (e.g., insert \(\mathbf{P}[1,1]\), then \(\mathbf{P}[2,1]\), \(\mathbf{P}[2,2]\), \(\mathbf{P}[3,1]\) and so forth). The fact that these are symmetric is relied upon to complete their reconstruction.

3.2 Step 2 - Calculating \(T.chisq\)

T.chisq is computed by first taking the difference between the \(T\) values (the minimized fit function),

\[ T_\Delta = T_{nested} - T_{parent}, \tag{2} \]

and then rescaling \(T_\Delta\) by multiplying \(N*2\), where N is the total sample size:

\[ T.chisq = T_\Delta*N*2. \tag{3} \]

The value of \(T.chisq\) cannot be reconstructed from the output of the two fitted models. In particular, T.chisq is not based on the model \(\chi^2\) values that are reported in Mplus output. When a model is estimated with “WLSMV”, the chi-square value reported is a mean and variance adjusted version of \(T\). In the \(T_3\) test, T.chisq incorporates unscaled values of \(T_{nested}\) and \(T_{parent}\).

3.3 Step 3 - Calculating the \(\mathbf{H}\) Matrix

The calculation of \(T_3\) will require the construction of several separate matrices that will be used to derive the scaling and shifting parameters. Here we are concerned with \(\mathbf{H}\). This matrix is mentioned in Asparouhov and Muthén (2006), where the details of its calculation are left as an exercise for the reader.

For that reason, we do not know how this is calculated in Mplus, but the following will prove to be correct:

\[ H = (\Delta'_1\Delta_1)^{-1}\Delta_1'\Delta_0. \tag{4} \]

This method is suggested by Yves Rosseel in the source code of the R package lavaan (Rosseel, 2017). Note that to obtain the two matrices, \(\mathbf{\Delta}_0\) and \(\mathbf{\Delta}_1\), it is necessary parse both of the model dif files.

3.4 Step 4 - Calculating The \(\mathbf{M}\) Matrix

\(\mathbf{M}\) is an ingredient in the equations for the scaling factor and shift parameter.

As shown by Asparouhov and Muthen (2006), \(\mathbf{M}\) can be calculated as:

\[ M = (P_1 - P_1(H(H'P_1H)^{-1}H'P_1)V_1 \tag{5} \]

using the matrices constructed previously.

3.5 Step 5 - Calculating the Scaling Factor and the Shift Parameter

The scaling factor \(a\) is computed with the following formula:

\[ a = \sqrt{\frac{D}{Tr(M*M)}}. \tag{6} \]

Here \(D\) is the degrees of freedom, calculated as the difference between the numbers of parameters estimated in the two models.

The shift parameter \(b\) is computed as

\[ b = D - \sqrt{\frac{D*Tr(M)^2}{Tr(M*M)}}. \tag{7} \]

In these formulas, \(Tr\) is the trace operator, the sum of the elements along the main diagonal.

3.6 Step 6 - Compute the Scaled \(T_3\) Statistic

Once \(a\) and \(b\) have been computed, they are used in Equation 1 along with \(T.chisq\) to compute the \(T_3\) chi-square difference statistic. Asparouhov and Muthén (2010) suggest that the \(T_3\) statistic follows a chi-square distribution with \(D\) degrees of freedom.

3.7 Using the `crmda_difftest` Function

The crmda_difftest function takes the file names of the Mplus output files for two nested models as input. For example:

crmda_difftest("parent.out", "nested.out")

We use functions provided by the MplusAutomation package (Hallquist and Wiley, 2017) for R to parse the Mplus output files.

The function detects the location of the companion dif files, parent.dif and nested.dif. In our examples, these are available in the same folder as the output files. The dif files are parsed by an auxiliary function named diffReader. If there are multiple groups, the user must also specify the total sample size as a third argument to crmda_difftest. The function returns the \(T_3\) value calculated by the aforementioned formulas, along with the estimate of \(T_3\) retrieved from the Mplus output file.

4 Matching Mplus `DIFFTEST` Calculations

The crmda_difftest function imports the dif files and reproduces estimates of the \(\chi^2\) test that match the \(T_3\) estimates reported in Mplus (Asparouhov and Muthén, 2010). This section shows the results obtained from the crmda_difftest function match the values reported by the Mplus DIFFTEST procedure on a variety of models.

4.1 The demonstration data

A subset of the HBSC data (Health Behavior in School-Aged Children; Iannotti, 2005-2006) was used to obtain the demonstration results. The HBSC data contains responses from students in grades 6, 7, 8, 9, and 10. The subset data being used here contains responses to ordinal measures (five-point Likert-type) on self-reported depression level, physical well-being, bullying experiences (got bullied by others and bullied others), and alcohol use. A total of 25 nested model comparison scenarios were devised. The Mplus estimator “WLSMV” was used for all of the model estimations.

4.2 Example models

The following models were fitted to the HBSC data.

Model Number	Model Description	Output File
Model 00	Two-Factor CFA, Two Indicators per Factor, Correlation Free	m00-2_indicators.out
Model 01	Two-Factor CFA, Two Indicators per Factor, Correlation Fixed Zero	m01-2_indicators.out
Model 02	Two-Factor CFA, Three Indicators per Factor, Correlation Free	m02-3_indicators.out
Model 03	Two-Factor CFA, Three Indicators per Factor, Correlation Fixed Zero	m03-3_indicators.out
Model 04	Two-Factor CFA, Five/Six Indicators per Factor, Correlation Free	m04-5_6_indicators.out
Model 05	Two-Factor CFA, Five/Six Indicators per Factor, Correlation Fixed Zero	m05-5_6_indicators.out
Model 06	Three-Factor CFA, Five/Six/Nine Indicators per Factor, All Correlations Free	m06-3_factors.out
Model 07	Three-Factor CFA, Five/Six/Nine Indicators per Factor, One Correlation Fixed Zero	m07-3_factors.out
Model 08	Three-Factor CFA, Five/Six/Nine Indicators per Factor, Two Correlations Equate	m08-3_factors.out
Model 09	Structural, Two IVs Predict One DV, IVs Correlation Free	m09-structural.out
Model 10	Structural, Two IVs Predict One DV, IVs Correlation Fixed Zero	m10-structural.out
Model 11	Structural, Two IVs Predict One DV, IVs Correlation Free, Two Regressive Paths Equate	m11-structural.out
Model 12	Structural, Two IVs Predict One DV, IVs Correlation Fixed to Zero, Two Regressive Paths Equate	m12-structural.out
Model 13	Structural, Two IVs Predict One DV, IVs Correlation and Two Regressive Paths Equate	m13-structural.out
Model 14	Indirect, One IV, One DV, One Mediator, All (a, b, c) Paths Free, Bootstrapping Not Allowed	m14-indirect.out
Model 15	Indirect, One IV, One DV, One Mediator, a and b Paths Free, c (IV-DV) Path Fixed Zero, Bootstrapping Not Allowed	m15-indirect.out
Model 16	Interaction, Two IVs, One DV, One Latent Interaction Term, .dif Not Allowed to Save	m16-interaction.out
Model 17	One Factor Two Group (6th and 7th Grade) Configural Model	m17-1_factor-2_groups-config.out
Model 18	One Factor Two Group Weak Model	m18-1_factor-2_groups-weak.out
Model 19	One Factor Two Group Strong Model	m19-1_factor-2_groups-strong.out
Model 20	One Factor Two Group (6th and 7th Grade) Stict Model	m20-1_factor-2_groups-strict.out
Model 21	One Factor Three Group (6th, 7th, and 8th Grade) Configural Model	m21-1_factor-3_groups-config.out
Model 22	One Factor Three Group (6th, 7th, and 8th Grade) Weak Model	m22-1_factor-3_groups-weak.out
Model 23	One Factor Three Group (6th, 7th, and 8th Grade) Strong Model	m23-1_factor-3_groups-strong.out
Model 24	One Factor Three Group (6th, 7th, and 8th Grade) Strict Model	m24-1_factor-3_groups-strict.out
Model 25	One Factor Four Group (6th, 7th, 8th, and 9th Grade) Configural Model	m25-1_factor-4_groups-config.out
Model 26	One Factor Four Group (6th, 7th, 8th, and 9th Grade) Weak Model	m26-1_factor-4_groups-weak.out
Model 27	One Factor Four Group (6th, 7th, 8th, and 9th Grade) Strong Model	m27-1_factor-4_groups-strong.out
Model 28	One Factor Four Group (6th, 7th, 8th, and 9th Grade) Strict Model	m28-1_factor-4_groups-strict.out
Model 29	One Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Configural Model	m29-1_factor-5_groups-config.out
Model 30	One Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Weak Model	m30-1_factor-5_groups-weak.out
Model 31	One Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Strong Model	m31-1_factor-5_groups-strong.out
Model 32	One Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Strict Model	m32-1_factor-5_groups-strict.out
Model 33	Six Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Configural Model	m33-5_factors-5_groups-config.out
Model 34	Six Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Weak Model	m34-5_factors-5_groups-weak.out
Model 35	Five Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Strong Model	m35-5_factors-5_groups-strong.out
Model 36	Five Factor Five Group (6th, 7th, 8th, 9th, and 10th Grade) Strict Model	m36-5_factors-5_groups-strict.out

4.3 Nested model comparisons

In the following table, we summarize the \(T_3\) values calculated with crmda_difftest, based on the dif files from Mplus, and compare against the values of the \(T_3\) estimates reported in the Mplus output files. These results show that across a variety of model comparison scenarios the crmda_difftest function reproduces the test statistic provided by Mplus exactly to three decimal places.

Comparison	crmda_difftest_T3	Mplus_DIFFTEST_T3	df	p
Model 00 vs. 01	145.270	145.27	1.000	0.000
Model 02 vs. 03	239.494	239.494	1.000	0.000
Model 04 vs. 05	440.748	440.748	1.000	0.000
Model 06 vs. 07	63.328	63.328	1.000	0.000
Model 06 vs. 08	25.534	25.534	1.000	0.000
Model 09 vs. 10	717.946	717.946	1.000	0.000
Model 09 vs. 11	68.182	68.182	1.000	0.000
Model 10 vs. 12	67.541	67.541	1.000	0.000
Model 11 vs. 13	166.535	166.535	1.000	0.000
Model 14 vs. 15	0.786	0.786	1.000	0.375
Model 17 vs. 18	2.143	2.143	5.000	0.829
Model 18 vs. 19	21.789	21.789	17.000	0.193
Model 19 vs. 20	34.030	34.03	6.000	0.000
Model 21 vs. 22	5.803	5.803	5.000	0.326
Model 22 vs. 23	40.958	40.958	17.000	0.001
Model 23 vs. 24	51.198	51.198	6.000	0.000
Model 25 vs. 26	11.685	11.685	5.000	0.039
Model 26 vs. 27	60.603	60.603	17.000	0.000
Model 27 vs. 28	101.549	101.549	6.000	0.000
Model 29 vs. 30	26.145	26.145	5.000	0.000
Model 30 vs. 31	79.822	79.822	17.000	0.000
Model 31 vs. 32	148.519	148.519	6.000	0.000
Model 33 vs. 34	157.958	157.958	32.000	0.000
Model 34 vs. 35	360.917	360.917	106.000	0.000
Model 35 vs. 36	367.814	367.814	37.000	0.000

5 Summary

We have outlined the procedures necessary to import and parse the Mplus DIFFTEST files and to calculate a difference of fit test known as \(T_3\). While some elements in the calculations are based on our best guess about calculations in Mplus, we believe that the combination of elements is correct. The difference in our estimates and the values reported by Mplus is smaller than \(10^{-4}\).

The larger project from which this report is drawn will compare the hypothesis tests based on \(T_2\), \(T_3\), and others. With the completion of this phase, we are one step closer to the final goal. The next phase in this project is to create R functions that provide replacements for \(\mathbf{\Delta}_i\), \(\mathbf{P}_i\), \(\mathbf{V}_i\).

5.1 R Session Info

R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.3

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] psych_1.7.8         MplusAutomation_0.7 xtable_1.8-2       
[4] crmda_0.49          kutils_1.34        

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.15       knitr_1.18         magrittr_1.5      
 [4] mnormt_1.5-5       pbivnorm_0.6.0     lattice_0.20-35   
 [7] quadprog_1.5-5     stringr_1.2.0      plyr_1.8.4        
[10] tools_3.4.3        parallel_3.4.3     grid_3.4.3        
[13] nlme_3.1-131       texreg_1.36.23     coda_0.19-1       
[16] htmltools_0.3.6    yaml_2.1.16        rprojroot_1.3-2   
[19] digest_0.6.15      lavaan_0.5-23.1097 codetools_0.2-15  
[22] evaluate_0.10.1    gsubfn_0.6-6       rmarkdown_1.8     
[25] openxlsx_4.0.17    stringi_1.1.6      pander_0.6.1      
[28] compiler_3.4.3     backports_1.1.2    boot_1.3-20       
[31] stats4_3.4.3       foreign_0.8-69     proto_1.0.0

Available under Created Commons license 3.0 CC BY

Replicating the Mplus DIFFTEST Procedure: An R Function to Reproduce Nested Model Comparisons