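Traditionally, group assessment has been based on the mean of individual assessment results. Group-level IRT (GIRT), by contrast, can assess groups directly without assessing individuals, and it has many advantages that traditional methods can hardly match. This paper applies a group-level IRT model to assess the ability of 410 schools on the English reading comprehension section of one province's 2007 college entrance examination. The results show that the English reading comprehension ability of nearly all 410 schools falls within the interval [-1, 1], with no schools of extremely high or extremely low ability. For these schools, all items in the test were relatively easy, with moderate discrimination. All assessment results correlated significantly with those obtained under the IRT model, so the GIRT model is a viable option for group assessment in practice.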
Traditionally, the assessment of groups is based on the assessment of individuals, with means or similar statistics of the individual results taken as the group-level findings. This method requires that every individual answer all items, which is hard to achieve in practice. A newer method, based on group-level item response theory (GIRT), can assess groups without assessing individuals. Although it cannot provide individual assessment, some simulation studies show that it has advantages over IRT when only group assessment is required. However, few studies have examined how the GIRT framework performs for group assessment in practice. This paper explores the issue using data from an English reading comprehension test and a two-parameter logistic GIRT model (2GPLM). The study has two purposes: one is to conduct a group (school) assessment that allows comparison between schools; the other is to examine group assessment under the 2GPLM by comparing it with the group-level findings under the two-parameter logistic IRT model (2PLM). The findings showed that: (1) Under the 2GPLM, for the 410 schools, the mean and standard deviation of school ability were -0.1743 and .4300, and almost all school ability estimates fell within the interval [-1, 1]. Overall, group-level ability was at a moderate to low level. When the schools were classified into three types (superior school, senior school and ordinary school) according to the rules of this province, the mean school ability differed significantly among the types in a multiple comparison test (LSD), which was consistent with the intent of the rules. (2) Under the IRT framework, for the 82000 students, the mean and standard deviation of ability were zero and .90913, and the distribution of students' ability was normal. The correlation between school ability under GIRT and that under IRT was .953, which suggests that the findings of the two models were consistent