Based on a large-scale corpus, the usages and contexts of Chinese directional verbs are counted and analyzed. This yields, for each directional verb, the probability that it serves as a directional complement when it follows a predicate (a verb or an adjective), together with context information for the low-probability cases, in which that probability lies between two thresholds. A statistical model built on these probabilities is then used to identify the usage of directional verbs. In addition, the dictionary is supplemented according to how word senses change when a directional verb combines with a predicate. Identification reaches 99.01% precision and 96.67% recall in the closed test, and 98.14% precision and 96.19% recall in the open test.
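The two-threshold decision and the precision and recall measures mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the threshold values `LOW` and `HIGH`, the function names, the romanized sample verbs, and the boolean `context_says_complement` (standing in for whatever context information the system consults) are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical thresholds; the abstract does not state the values actually used.
LOW, HIGH = 0.2, 0.9


def estimate_probabilities(samples):
    """Estimate P(directional complement | verb follows a predicate).

    `samples` is an iterable of (directional_verb, is_complement) pairs, one per
    annotated occurrence of a directional verb right after a predicate.
    """
    total, as_complement = Counter(), Counter()
    for verb, is_complement in samples:
        total[verb] += 1
        if is_complement:
            as_complement[verb] += 1
    return {verb: as_complement[verb] / total[verb] for verb in total}


def classify(verb, probs, context_says_complement):
    """Return True if the occurrence is judged to be a directional complement."""
    p = probs.get(verb)
    if p is None:
        # Unseen verb: only the context cue is available (an assumption).
        return context_says_complement
    if p >= HIGH:
        return True
    if p <= LOW:
        return False
    # Probability lies between the two thresholds: defer to context information.
    return context_says_complement


def precision_recall(predicted, gold):
    """Precision and recall of the 'complement' label over parallel bool lists."""
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy counts with romanized verbs (lai, qu, qilai); not data from the paper.
samples = (
    [("lai", True)] * 19 + [("lai", False)]
    + [("qu", True)] + [("qu", False)] * 9
    + [("qilai", True)] * 5 + [("qilai", False)] * 5
)
probs = estimate_probabilities(samples)
```

With these toy counts, `lai` falls above the upper threshold and `qu` below the lower one, so both are decided by probability alone, while `qilai` falls between the thresholds and is decided by the context cue.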