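Research on verb subcategorization and its automatic acquisition has produced good results for many languages, including English and Chinese, but cross-lingual subcategorization research remains scarce and unsystematic. This paper describes a systematic experiment that performs statistical analysis on a Chinese-English bilingual corpus and acquires cross-lingual subcategorization correspondences. First, sentence pairs whose predicates may be aligned are identified using a bilingual dictionary and syntactic similarity; then, a statistical filtering method based on a two-fold maximum likelihood test is applied to automatically acquire 654 types of subcategorization frame correspondence. Analysis of the experimental results shows that these correspondence types are consistent in both statistical and syntactic terms.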
Research on verb subcategorization and its acquisition has achieved a great deal for single languages such as English and Chinese, whereas cross-lingual subcategorization still demands more systematic effort. This paper describes a systematic experiment in the statistical analysis and acquisition of bilingual subcategorization relations based on a Chinese-English parallel corpus. First, sentence pairs with possibly parallel predicates are extracted. Then, 654 basic bilingual types of subcategorization frames are acquired by means of the two-fold MLE filtering method. Analysis of the results shows that the acquired bilingual subcategorization frames are statistically and syntactically compatible.
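The abstract does not spell out how the two-fold MLE filtering works, so the following is only a minimal sketch of one plausible reading: a relative-frequency (maximum likelihood estimate) threshold applied in both directions, Chinese frame to English frame and English frame to Chinese frame. The function name `two_fold_mle_filter`, the frame labels, the toy counts, and the threshold value are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def two_fold_mle_filter(pairs, threshold):
    """Keep (zh_frame, en_frame) pairs whose relative frequency
    meets the threshold in BOTH conditional directions.

    This is an assumed interpretation of "two-fold", not the paper's
    documented procedure.
    """
    pair_counts = Counter(pairs)
    zh_counts = Counter(zh for zh, _ in pairs)
    en_counts = Counter(en for _, en in pairs)

    kept = {}
    for (zh, en), n in pair_counts.items():
        p_en_given_zh = n / zh_counts[zh]  # MLE of P(en_frame | zh_frame)
        p_zh_given_en = n / en_counts[en]  # MLE of P(zh_frame | en_frame)
        if p_en_given_zh >= threshold and p_zh_given_en >= threshold:
            kept[(zh, en)] = (p_en_given_zh, p_zh_given_en)
    return kept


# Toy observations: each item is the frame pair seen in one aligned sentence pair.
observations = (
    [("NP_V_NP", "NP_V_NP")] * 8
    + [("NP_V_NP", "NP_V_PP")] * 1
    + [("NP_V_NP", "NP_V")] * 1
    + [("NP_V", "NP_V")] * 5
)

kept = two_fold_mle_filter(observations, threshold=0.2)
for pair, probs in sorted(kept.items()):
    print(pair, probs)
```

With these toy counts, the two rare pairings fall below the threshold in the Chinese-to-English direction and are discarded, while the two frequent pairings survive. In a real setting the threshold would need to be tuned on held-out data.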