Estimation alternatives using the LMS approach
Accelerated EM and Adaptive Quadrature
By default (as of v1.0.9), the LMS approach uses an accelerated EM procedure ("EMA"), which applies Quasi-Newton and Fisher Scoring optimization steps when needed. If desirable, this can be switched to the standard Expectation-Maximization (EM) algorithm by setting algorithm = "EM".
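As a minimal sketch of switching between the two algorithms (here `m` and `df` are placeholders for a modsem model syntax string and a suitable data frame; only arguments described in this article are used):

```r
library(modsem)

# `m` is a modsem model syntax string, `df` a data.frame (placeholders)
fit_ema <- modsem(m, data = df, method = "lms")  # default: algorithm = "EMA"
fit_em  <- modsem(m, data = df, method = "lms",
                  algorithm = "EM")              # standard EM instead
```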
By default, the LMS approach also uses a fixed Gauss-Hermite quadrature to compute a numerical approximation of the log-likelihood. Instead of a fixed quadrature, it is possible to use a quasi-adaptive quadrature. For performance reasons, the adaptive quadrature does not fit an individual quadrature to each participant, but instead one for the entire sample (at each EM iteration), based on the whole-sample densities of the likelihood function. It essentially works by removing irrelevant nodes that do not contribute to the integral, and increasing the number of nodes that actually do contribute. This usually means that more nodes are placed towards the center of the distribution than in a standard fixed Gauss-Hermite quadrature. Using EMA together with the adaptive quadrature may yield estimates that are closer to results from Mplus.
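For instance, the quasi-adaptive quadrature can be enabled like this (again treating `m` and `df` as placeholders, and using only arguments documented in this article):

```r
library(modsem)

# `m` is a modsem model syntax string, `df` a data.frame (placeholders)
fit <- modsem(m, data = df, method = "lms",
              adaptive.quad = TRUE, # quasi-adaptive quadrature
              nodes = 32)           # more nodes than the default of 24
summary(fit)
```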
If the model struggles to converge, you can try changing the EM procedure by setting algorithm = "EMA" or algorithm = "EM", and setting adaptive.quad = TRUE in the modsem() function. Additionally, it is possible to tweak these parameters:
- max.iter: Maximum number of iterations for the EM algorithm (default is 500).
- max.step: Maximum number of steps used in the Maximization step of the EM algorithm (default is 1).
- convergence.rel: Relative convergence criterion for the EM algorithm.
- convergence.abs: Absolute convergence criterion for the EM algorithm.
- nodes: Number of nodes for numerical integration (default is 24). Increasing this number can improve the accuracy of the estimates, especially for complex models.
- quad.range: Integration range for the quadrature. Smaller ranges mean that the integral is more focused. Applies only when using a quasi-adaptive quadrature.
- adaptive.frequency: How often the quasi-adaptive quadrature should be recalculated. Defaults to every third EM iteration.
Here is an example using the TPB_UK dataset, which is more troublesome to estimate than the simulated TPB dataset.
tpb_uk <- "
# Outer Model (Based on Hagger et al., 2007)
ATT =~ att3 + att2 + att1 + att4
SN =~ sn4 + sn2 + sn3 + sn1
PBC =~ pbc2 + pbc1 + pbc3 + pbc4
INT =~ int2 + int1 + int3 + int4
BEH =~ beh3 + beh2 + beh1 + beh4
# Inner Model (Based on Steinmetz et al., 2011)
INT ~ ATT + SN + PBC
BEH ~ INT + PBC
BEH ~ INT:PBC
"
fit <- modsem(tpb_uk,
              data = TPB_UK,
              method = "lms",
              nodes = 32,              # Number of nodes for numerical integration
              adaptive.quad = TRUE,    # Use quasi-adaptive quadrature
              adaptive.frequency = 3,  # Update the quasi-adaptive quadrature every third EM iteration
              algorithm = "EMA",       # Use accelerated EM algorithm (default)
              convergence.abs = 1e-4,  # Absolute convergence criterion
              convergence.rel = 1e-10, # Relative convergence criterion
              max.iter = 500,          # Maximum number of iterations
              max.step = 1)            # Maximum number of steps in the maximization step
summary(fit)
#> Estimating baseline model (H0)
#>
#> modsem (version 1.0.11):
#>
#> Estimator LMS
#> Optimization method EMA-NLMINB
#> Number of observations 1169
#> Number of iterations 118
#> Loglikelihood -33404.36
#> Akaike (AIC) 66946.73
#> Bayesian (BIC) 67296.14
#>
#> Numerical Integration:
#> Points of integration (per dim) 32
#> Dimensions 1
#> Total points of integration 32
#>
#> Fit Measures for Baseline Model (H0):
#> Loglikelihood -35523
#> Akaike (AIC) 71181.74
#> Bayesian (BIC) 71526.09
#> Chi-square 5519.01
#> Degrees of Freedom (Chi-square) 162
#> P-value (Chi-square) 0.000
#> RMSEA 0.168
#>
#> Comparative Fit to H0 (LRT test):
#> Loglikelihood change 2118.51
#> Difference test (D) 4237.01
#> Degrees of freedom (D) 1
#> P-value (D) 0.000
#>
#> R-Squared Interaction Model (H1):
#> INT 0.898
#> BEH 0.922
#> R-Squared Baseline Model (H0):
#> INT 0.896
#> BEH 0.867
#> R-Squared Change (H1 - H0):
#> INT 0.002
#> BEH 0.055
#>
#> Parameter Estimates:
#> Coefficients unstandardized
#> Information observed
#> Standard errors standard
#>
#> Latent Variables:
#> Estimate Std.Error z.value P(>|z|)
#> PBC =~
#> pbc2 1.000
#> pbc1 0.859 0.021 41.34 0.000
#> pbc3 0.935 0.017 55.09 0.000
#> pbc4 0.818 0.021 39.86 0.000
#> ATT =~
#> att3 1.000
#> att2 0.965 0.011 86.35 0.000
#> att1 0.812 0.017 47.18 0.000
#> att4 0.870 0.019 45.46 0.000
#> SN =~
#> sn4 1.000
#> sn2 1.313 0.041 32.30 0.000
#> sn3 1.350 0.041 32.72 0.000
#> sn1 1.000 0.038 26.61 0.000
#> INT =~
#> int2 1.000
#> int1 0.970 0.011 92.13 0.000
#> int3 0.984 0.010 98.42 0.000
#> int4 0.992 0.009 104.58 0.000
#> BEH =~
#> beh3 1.000
#> beh2 0.986 0.013 77.71 0.000
#> beh1 0.814 0.019 42.71 0.000
#> beh4 0.803 0.019 41.50 0.000
#>
#> Regressions:
#> Estimate Std.Error z.value P(>|z|)
#> INT ~
#> PBC 1.037 0.036 28.45 0.000
#> ATT -0.060 0.030 -2.04 0.041
#> SN 0.051 0.033 1.55 0.121
#> BEH ~
#> PBC 0.398 0.052 7.62 0.000
#> INT 0.595 0.048 12.28 0.000
#> PBC:INT 0.140 0.008 17.65 0.000
#>
#> Intercepts:
#> Estimate Std.Error z.value P(>|z|)
#> .pbc2 4.028 0.066 61.19 0.000
#> .pbc1 3.994 0.063 63.16 0.000
#> .pbc3 3.764 0.063 59.90 0.000
#> .pbc4 3.798 0.061 62.04 0.000
#> .att3 3.731 0.064 58.52 0.000
#> .att2 3.846 0.062 62.30 0.000
#> .att1 4.217 0.060 70.15 0.000
#> .att4 3.697 0.065 56.78 0.000
#> .sn4 4.505 0.051 87.75 0.000
#> .sn2 4.354 0.054 80.09 0.000
#> .sn3 4.393 0.055 80.20 0.000
#> .sn1 4.474 0.052 85.99 0.000
#> .int2 3.731 0.067 56.01 0.000
#> .int1 3.876 0.066 58.90 0.000
#> .int3 3.748 0.066 56.58 0.000
#> .int4 3.792 0.066 57.16 0.000
#> .beh3 2.667 0.076 34.98 0.000
#> .beh2 2.585 0.076 34.17 0.000
#> .beh1 2.546 0.073 35.07 0.000
#> .beh4 2.688 0.072 37.12 0.000
#>
#> Covariances:
#> Estimate Std.Error z.value P(>|z|)
#> PBC ~~
#> ATT 3.679 0.177 20.80 0.000
#> SN 1.937 0.117 16.60 0.000
#> ATT ~~
#> SN 1.681 0.110 15.30 0.000
#>
#> Variances:
#> Estimate Std.Error z.value P(>|z|)
#> .pbc2 0.701 0.042 16.68 0.000
#> .pbc1 1.455 0.069 21.00 0.000
#> .pbc3 0.802 0.043 18.62 0.000
#> .pbc4 1.458 0.068 21.28 0.000
#> .att3 0.296 0.023 12.81 0.000
#> .att2 0.306 0.022 13.81 0.000
#> .att1 1.286 0.057 22.58 0.000
#> .att4 1.584 0.071 22.42 0.000
#> .sn4 1.362 0.065 21.03 0.000
#> .sn2 0.491 0.032 15.39 0.000
#> .sn3 0.377 0.032 11.89 0.000
#> .sn1 1.445 0.068 21.13 0.000
#> .int2 0.237 0.014 16.97 0.000
#> .int1 0.404 0.020 20.25 0.000
#> .int3 0.335 0.017 19.35 0.000
#> .int4 0.271 0.015 17.90 0.000
#> .beh3 0.456 0.030 15.26 0.000
#> .beh2 0.513 0.031 16.55 0.000
#> .beh1 1.836 0.082 22.31 0.000
#> .beh4 1.916 0.085 22.42 0.000
#> PBC 4.362 0.209 20.83 0.000
#> ATT 4.454 0.197 22.60 0.000
#> SN 1.718 0.119 14.49 0.000
#> .INT 0.507 0.038 13.28 0.000
#> .BEH 0.449 0.034 13.24 0.000