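In test equating under the traditional non-equivalent groups with anchor test (NEAT) design, judging whether the equating requirements are met is subjective, and the method's use is limited when anchor items fail or are hard to obtain. This study therefore draws on the Q matrix theory of the Rule Space Model to generate common examinees for two tests that share the same Q matrix but have no anchor items. It equates the tests with concurrent estimation under a common-group design and examines how the response slip rate and the test structure affect equating stability. The results show that concurrent estimation under the common-group design is at least as stable as concurrent estimation under the NEAT design; that a higher slip rate lowers equating stability; and that test structure also affects stability, with linear and convergent structures being more stable and divergent and unstructured ones less stable.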
Test equating involves equating requirements, equating designs, and equating methods. Among the requirements, the equal-construct requirement has no formal criterion by which it can be verified, which can lead to misuse of equating. In practice, the non-equivalent groups with anchor test (NEAT) design is usually used to obtain a common scale, and equating accuracy depends mainly on the quality of the common items (Dorans & Holland, 2006). When the parameters of the common items have drifted or their content has been exposed, the equating result may be biased, so a new equating method is needed. In this study, the Rule Space Model (RSM) was applied to address these problems. The RSM, proposed by Tatsuoka (1983), is one of the cognitive diagnostic models, and Q matrix theory is its foundation. The A matrix obtained from the Q matrix can serve as the formal criterion for the equal-construct requirement: two tests with the same Q matrix have equal constructs. The same Q matrix also generates the same ideal response patterns, so a common group can be generated for two tests that share a Q matrix. The common-group design with concurrent estimation can then be used to equate tests that have no common items. We compare this method with concurrent estimation under the NEAT design. Q matrix theory also raises a further equating issue. There are four basic attribute hierarchy patterns for the A matrix (Leighton, Gierl & Hunka, 2004), and different hierarchy patterns represent different test constructs, but no one has yet explored the impact of different constructs on test equating. In addition, the probability of examinees' guessing or slipping is also considered. Simulated data are used for this purpose.
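The idea that one Q matrix yields one set of ideal response patterns can be illustrated with a minimal sketch. It is not the study's simulation code: the three-attribute linear hierarchy, the three-item Q matrix (stored here with rows as items and columns as attributes), the function names, and the slip mechanism (each response flipped independently with a fixed probability) are all assumptions made for illustration.

```python
from itertools import product
import random


def permissible_states(n_attrs, prerequisites):
    """Return every mastery vector that respects the attribute hierarchy.

    prerequisites is a list of (parent, child) index pairs: the child
    attribute can be mastered only if the parent attribute is mastered.
    """
    states = []
    for alpha in product((0, 1), repeat=n_attrs):
        if all(alpha[parent] >= alpha[child] for parent, child in prerequisites):
            states.append(alpha)
    return states


def ideal_response(alpha, q_matrix):
    """Item j is correct only if every attribute it requires is mastered."""
    return tuple(
        int(all(a >= q for a, q in zip(alpha, q_row)))
        for q_row in q_matrix
    )


def add_slips(pattern, slip_rate, rng):
    """Flip each response independently with probability slip_rate (assumed model)."""
    return tuple(x ^ 1 if rng.random() < slip_rate else x for x in pattern)


# Hypothetical linear hierarchy: attribute 0 -> attribute 1 -> attribute 2.
LINEAR = [(0, 1), (1, 2)]

# Hypothetical Q matrix shared by two forms X and Y (rows = items).
Q = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]

states = permissible_states(3, LINEAR)
ideal_x = [ideal_response(alpha, Q) for alpha in states]  # form X
ideal_y = [ideal_response(alpha, Q) for alpha in states]  # form Y, same Q

# Observed responses for the simulated common group on form X.
rng = random.Random(0)
observed_x = [add_slips(p, 0.05, rng) for p in ideal_x]
```

Because both forms use the same Q matrix, `ideal_x` and `ideal_y` coincide, so each simulated examinee has a well-defined ideal pattern on either form. Swapping `LINEAR` for a convergent, divergent, or unstructured set of prerequisite pairs changes the permissible states and hence the test structure.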
The results of the study suggest that the new method with the common group is at least as stable as the traditional method with common items, using concurrent es