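Building on Tatsuoka's rule space method, this study extends the ideal response pattern and the aberrant-response (caution) index, and derives the formulas of the rule space method for polytomously scored items. Under a 4 (attribute hierarchy: divergent, convergent, linear, and unstructured) × 4 (slip probability: 2%, 5%, 10%, and 15%) design, the classification accuracy of the rule space method for polytomously scored items was examined with four indices: the attribute-pattern match ratio, the examinee attribute (marginal) match ratio, sensitivity, and specificity. The results show that: (1) the aberrant-response index constructed for polytomously scored items classifies and interprets examinees effectively, and the index for 0-1 scored items, together with its properties, is a special case of the polytomous version; (2) as the slip probability increases, classification accuracy decreases for all four attribute hierarchies; (3) classification accuracy for the linear and convergent hierarchies is clearly better than for the divergent and unstructured hierarchies; (4) the distribution of pure rule points has a significant effect on the classification accuracy of the rule space method.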
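The four accuracy indices named above can be illustrated with a minimal sketch. It assumes that true and estimated attribute profiles are available as 0/1 rows (one row per examinee), that sensitivity is the share of truly mastered attributes classified as mastered, and that specificity is the share of truly non-mastered attributes classified as non-mastered. These definitions, the function name `classification_accuracy`, and the toy data are assumptions for illustration, not the study's own code or results.

```python
def classification_accuracy(true_alpha, est_alpha):
    """Compute four agreement indices between true and estimated 0/1 attribute profiles.

    Assumes both inputs are lists of equal-length lists of 0/1 values, with at
    least one mastered and one non-mastered entry in true_alpha (otherwise the
    sensitivity or specificity denominator would be zero).
    """
    n_examinees = len(true_alpha)
    n_attributes = len(true_alpha[0])

    # Pattern match ratio: the whole attribute profile must be recovered.
    pattern_hits = sum(t == e for t, e in zip(true_alpha, est_alpha))

    # Marginal match ratio: agreement counted separately for each attribute.
    attribute_hits = [
        sum(t[k] == e[k] for t, e in zip(true_alpha, est_alpha))
        for k in range(n_attributes)
    ]

    # Flatten to (true, estimated) pairs for sensitivity and specificity.
    pairs = [(t_k, e_k)
             for t, e in zip(true_alpha, est_alpha)
             for t_k, e_k in zip(t, e)]
    true_pos = sum(1 for t_k, e_k in pairs if t_k == 1 and e_k == 1)
    true_neg = sum(1 for t_k, e_k in pairs if t_k == 0 and e_k == 0)
    n_mastered = sum(1 for t_k, _ in pairs if t_k == 1)
    n_not_mastered = len(pairs) - n_mastered

    return {
        "pattern_ratio": pattern_hits / n_examinees,
        "marginal_ratio": [h / n_examinees for h in attribute_hits],
        "sensitivity": true_pos / n_mastered,
        "specificity": true_neg / n_not_mastered,
    }


# Toy data (hypothetical): 4 examinees, 3 attributes.
true_alpha = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0]]
est_alpha = [[1, 1, 0], [1, 1, 0], [1, 1, 1], [0, 0, 0]]
result = classification_accuracy(true_alpha, est_alpha)
print(result)
```

In this toy case only the second examinee is misclassified (one attribute is wrongly credited), so the pattern-level ratio drops more sharply than the attribute-level ratios, which is why both levels are usually reported together.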
Polytomous items have become an important item type in many large-scale educational assessments. However, cognitive diagnostic models were initially proposed for dichotomous data, which limits the application and development of Cognitive Diagnostic Assessment (CDA). For example, the rule space methodology (RSM) is still limited to dichotomous items. This article therefore extends the dichotomous RSM to polytomous items on the basis of the Graded Response Model, focusing on the definition and formula of the ideal item-response pattern (IRP) and the caution index in the polytomous case. In this paper, the IRP is defined for each item as the number of the item's attributes that an examinee can answer correctly without slipping or guessing, and it can be obtained by matrix multiplication. The degree of aberrant response, that is, relatively high score ratios on hard items and relatively low score ratios on easy items, is described by the caution index $f(V) = \bigl(R(\theta) - V,\ R(\theta) - T(\theta)\bigr)$. The expectation of $f(V)$ is zero, its variance is $\sum_{i=1}^{n}[R_i(\theta) - T(\theta)]^{2}\,\sigma_{R_i}^{2}(\theta)$, and it is orthogonal to the ability $\theta$ at a given level of ability. In the simulation study, a Monte Carlo experiment was used to test the classification accuracy under 4 (attribute hierarchy) × 4 (slip probability) conditions, as indicated by four indices: the pattern match ratio, the marginal match ratio, sensitivity, and specificity. Under each of the 16 conditions, the IRPs were first ranked by raw score in ascending order, and the distribution of the IRPs in the population was made to approximate a normal distribution. A sample of 5000 observed response patterns was then generated from the ideal item-response patterns by randomly adding slips to each component of the ideal item-response patterns. For example, to generate 2% random errors, a random number was drawn from the distribution U(0,1). For each item, if the random number was less than 0.01 and the ideal item score was not zero, then 1 was subtracted from the score; otherwise 1 was added; if random numbe