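Two simulation studies compared the WLSc and MWLSc estimation methods in the SEM framework with the MML/EM estimation method in the IRT framework. The results showed that: (1) of the three methods, WLSc produced the largest bias in parameter estimates, whereas MWLSc and MML/EM differed little; (2) the precision of all item parameter estimates improved as sample size increased; (3) the precision of item factor loading and item difficulty estimates was affected by test length; (4) the precision of item factor loading and item discrimination estimates was affected by the magnitude of the corresponding population parameters; (5) the distribution of thresholds across test items affected the precision of parameter estimation, with item discrimination affected most; (6) overall, item parameter estimates were more precise in the SEM framework than in the IRT framework.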
Factor analysis models and estimation methods developed for continuous (i.e., interval or ratio scale) data are not appropriate for item-level data that are categorical in nature. The authors provide a brief review and synthesis of the item factor analysis estimation literature for categorical data (e.g., 0-1 response scales). Popular categorical item factor analysis models and estimation methods from the structural equation modeling and item response theory literature are presented. Two Monte Carlo simulation studies were conducted and revealed the following: (1) The SEM and IRT parameterizations yielded similar parameter estimates. Even with a small sample, and after the IRT estimates were converted to SEM parameters, the MWLSc and MML/EM results were strikingly similar. With a small sample and a long test, however, WLSc failed to converge to a set of parameter estimates. In short tests WLSc did converge, but its estimates were consistently more discrepant than those yielded by the other estimation techniques. (2) The precision of the estimates improved as sample size increased. (3) The precision of the item factor loading and item difficulty estimates was influenced by test length. (4) The precision of the item factor loading and item discrimination estimates was influenced by the magnitude of the population factor loading (discrimination). (5) The distribution of item thresholds affected the precision of the parameter estimates, and item discrimination was the parameter most sensitive to the thresholds. (6) On the whole, item parameter estimates were more precise in the SEM framework than in the IRT framework. Both structural equation modeling (SEM) and item response theory (IRT) can be used for factor analysis of dichotomous item responses. In this case, the measurement models of the two approaches are formally equivalent.
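The formal equivalence and the conversion of IRT estimates to SEM parameters mentioned above can be illustrated with the textbook relations between the two parameterizations for a single dichotomous item. The sketch below is not taken from the study itself. It assumes a unidimensional normal-ogive model, a standard normal factor, and a latent response variable with unit variance. The function names `sem_to_irt` and `irt_to_sem` and the `logistic` flag are hypothetical, introduced only for illustration.

```python
import math

# Conventional constant that rescales a normal-ogive discrimination
# to the logistic metric (an approximation, not an exact identity).
D = 1.702


def sem_to_irt(loading, threshold, logistic=False):
    """Convert a standardized loading and threshold to IRT (a, b).

    Assumes a standard normal factor and a unit-variance latent response.
    """
    if not 0 < abs(loading) < 1:
        raise ValueError("loading must be nonzero and strictly between -1 and 1")
    a = loading / math.sqrt(1 - loading ** 2)  # discrimination, normal-ogive metric
    b = threshold / loading                    # difficulty
    if logistic:
        a *= D                                 # move to the logistic metric
    return a, b


def irt_to_sem(a, b, logistic=False):
    """Convert IRT (a, b) back to a standardized loading and threshold."""
    if logistic:
        a /= D                                 # back to the normal-ogive metric
    loading = a / math.sqrt(1 + a ** 2)
    threshold = b * loading
    return loading, threshold


if __name__ == "__main__":
    a, b = sem_to_irt(0.6, 0.3)
    print(a, b)
    print(irt_to_sem(a, b))
```

Under these assumptions a loading of 0.6 with a threshold of 0.3 corresponds to a discrimination of 0.75 and a difficulty of 0.5. Because discrimination grows without bound as the loading approaches 1, an estimation error of a given size in a high loading translates into a much larger error in discrimination.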
They were refined within and across different disciplines, and made complementary contributions to