Two New Online Calibration Methods for Computerized Adaptive Testing
  • ISSN: 0439-755X
  • Journal: Acta Psychologica Sinica (《心理学报》)
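  • Classification: B841 [Philosophy and Religion: Basic Psychology; Philosophy and Religion: Psychology]
  • Author affiliation: Collaborative Innovation Center of Assessment toward Basic Education Quality, Beijing Normal University, Beijing 100875
  • Funding: Supported by the National Natural Science Foundation of China Young Scientists Fund (31300862), the Specialized Research Fund for the Doctoral Program of Higher Education, New Teacher category (20130003120002), and the Open Project of the Key Laboratory of Applied Statistics of the Ministry of Education, Northeast Normal University (KLAS 130028614).
Author: Chen Ping (陈平)
Abstract (translated from the Chinese):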

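Online calibration is widely used to calibrate new items in computerized adaptive testing (CAT) because of its many advantages. Among CAT online calibration methods, Method A is the most direct in concept and the simplest in algorithm, but it has an obvious theoretical flaw: during calibration it treats ability estimates as true ability values. This study incorporates the error-correction ideas of the full functional maximum likelihood estimator (FFMLE) and of the estimator that makes use of the "consequences of sufficiency" (ECSE) into Method A (the new methods are denoted FFMLE-Method A and ECSE-Method A, respectively), correcting for ability estimation error in theory and thereby overcoming Method A's calibration flaw. The simulation results show that: (1) under most experimental conditions, the two new methods improve calibration precision overall relative to Method A, with the largest improvement on short tests of length 10; (2) when the CAT is short or of medium length (10 or 20 items), the two new methods perform very close to MEM, the best-performing method, and when the test is long (30 items), ECSE-Method A performs best overall and outperforms MEM; (3) the larger the sample size, the higher the calibration precision of every method.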

English abstract:

With the development of computerized adaptive testing (CAT), many new issues and challenges have been raised. For example, as the test is continuously administered, some new items should be written, calibrated, and added to the item bank periodically to replace the flawed, obsolete, and overexposed items. The new items have to be precisely calibrated because the calibration precision will directly affect the accuracy of ability estimation. The technique of online calibration has been widely used to calibrate new items on-the-fly in CAT, since it offers several advantages over the traditional offline calibration approach. As the simplest and most straightforward online calibration method, Method A (Stocking, 1988) has an obvious theoretical limitation in that it treats the estimated abilities as true values and ignores the measurement errors in ability estimation. To overcome this weakness, we combined a full functional maximum likelihood estimator (FFMLE) and an estimator which made use of the consequences of sufficiency (ECSE) (Stefanski & Carroll, 1985) with Method A respectively to correct for the estimation error of ability, and the new methods are referred to as FFMLE-Method A and ECSE-Method A. A simulation study was conducted to compare the two new methods with three other methods: the original Method A [denoted as Method A (Original)], the original Method A which plugs in the true abilities of examinees [Method A (True)], and the “multiple EM cycles” method (MEM). These five methods were evaluated in terms of item-parameter recovery and calibration efficiency under three levels of sample sizes (1000, 2000 and 3000) and three levels of CAT test lengths (10, 20 and 30), assuming the new items are randomly assigned to examinees. Under the two-parameter logistic model, the true abilities for the three groups of examinees were randomly drawn from the standard normal distribution [N (0,1)]. For all conditions, 1000 operational items were simulated to constitut
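The limitation of Method A described in the abstracts can be illustrated with a minimal sketch. It is not the paper's implementation: the item parameters (a = 1.2, b = 0.3), the additive normal error with standard deviation 0.5 standing in for CAT ability estimation error, and all function names are assumptions made for illustration. The sketch fits one new two-parameter logistic item by maximum likelihood while treating the supplied abilities as known, once with true abilities (in the spirit of Method A (True)) and once with error-contaminated abilities (in the spirit of Method A (Original)).

```python
import numpy as np


def p_2pl(theta, a, b):
    """Probability of a correct response under the two-parameter logistic model."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))


def calibrate_treating_theta_as_known(theta, responses, n_iter=50, tol=1e-8):
    """MLE of (a, b) for one item, with the supplied abilities treated as fixed.

    Works in slope-intercept form, logit P = a * theta + d with d = -a * b,
    and maximizes the likelihood by Newton-Raphson.
    """
    X = np.column_stack([theta, np.ones_like(theta)])
    beta = np.array([1.0, 0.0])  # starting values: a = 1, d = 0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (responses - p)                # score vector
        info = (X * (p * (1.0 - p))[:, None]).T @ X  # observed information
        step = np.linalg.solve(info, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    a, d = beta
    return a, -d / a


rng = np.random.default_rng(0)
n = 3000                       # largest sample size mentioned in the abstract
a_true, b_true = 1.2, 0.3      # hypothetical new-item parameters
theta_true = rng.standard_normal(n)  # abilities drawn from N(0, 1)
responses = (rng.random(n) < p_2pl(theta_true, a_true, b_true)).astype(float)

# Assumed error model: ability estimate = true ability + N(0, 0.5^2) noise.
theta_hat = theta_true + rng.normal(0.0, 0.5, size=n)

a_with_true, b_with_true = calibrate_treating_theta_as_known(theta_true, responses)
a_with_hat, b_with_hat = calibrate_treating_theta_as_known(theta_hat, responses)

print("true abilities:      a = %.3f, b = %.3f" % (a_with_true, b_with_true))
print("estimated abilities: a = %.3f, b = %.3f" % (a_with_hat, b_with_hat))
```

Under this assumed error model, the discrimination estimate obtained from the noisy abilities tends to be smaller than the one obtained from the true abilities. Ignoring measurement error in the ability estimates produces this kind of distortion, which is what the FFMLE and ECSE corrections are meant to address.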

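Journal information
  • Acta Psychologica Sinica (《心理学报》)
  • Peking University Core Journal (2011 edition)
  • Governing body: Chinese Academy of Sciences
  • Sponsors: Chinese Psychological Society; Institute of Psychology, Chinese Academy of Sciences
  • Editor-in-chief: Zhang Kan (张侃)
  • Address: 16 Lincui Road, Chaoyang District, Beijing
  • Postal code: 100101
  • Email: xuebao@psych.ac.cn
  • Telephone: 010-64850861
  • International standard serial number: ISSN 0439-755X
  • Domestic serial number: CN 11-1911/B
  • Postal distribution code: 82-12
  • Awards:
  • Indexed in domestic and international databases:
  • Japan Science and Technology Agency database (Japan); Chinese Humanities and Social Sciences Core Journals (China); Chinese Science and Technology Core Journals (China); Peking University Core Journals, 2000, 2004, 2008, 2011, and 2014 editions (China); National Philosophy and Social Sciences Academic Journal Database (China)
  • Citations: 33136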