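Comparative sentences contain a large amount of comparative opinion information. Based on the characteristics of Chinese comparative sentences, this paper uses a sequential pattern mining algorithm to obtain comparison patterns. To improve the performance of the mining algorithm, the MS-PS algorithm is modified: the items that contribute most to comparative sentence identification, namely certain nouns and comparison feature words, are assigned a low minimum support, while the minimum support of every other item is set to the larger of a multiple of the item's support and 1/N (where N is the size of the sequence set). Finally, the mined sequential patterns are matched directly against the sentences to be identified. Experiments on two data sets show that the proposed SeqPattMine algorithm is feasible.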
Comparative sentences carry a great deal of comparative opinion information, and they form a special sentence pattern in modern Chinese. A sequential pattern mining algorithm is used to acquire comparison patterns. To enhance the performance of the algorithm, the sequential pattern mining procedure is improved: small values are set as the minimum supports of certain items (nouns), comparison feature words, and parts of speech, while the minimum support of each remaining item is the larger of a multiple of the item's support and 1/N (where N is the size of the sequence set). The comparative sentences are then identified by directly matching the patterns obtained from the sequential pattern mining algorithm. The experimental results indicate that the proposed SeqPattMine algorithm is feasible.
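The minimum support rule and the direct matching step described above can be illustrated with a minimal Python sketch. It is not the authors' SeqPattMine implementation: the function names, the token representation (part-of-speech tags mixed with feature words), and the parameters `low_mis` (the small support given to key items) and `beta` (the multiplier applied to item support) are all hypothetical, since the abstract does not state their values. Direct matching is assumed here to mean that a pattern occurs as an ordered, not necessarily contiguous, subsequence of the sentence.

```python
from collections import Counter


def item_supports(sequences):
    """Fraction of sequences in which each item occurs at least once."""
    n = len(sequences)
    counts = Counter(item for seq in sequences for item in set(seq))
    return {item: count / n for item, count in counts.items()}


def minimum_item_supports(sequences, key_items, low_mis=0.05, beta=0.5):
    """Assign a minimum support to every item.

    Key items (assumed: selected nouns and comparison feature words) get the
    small value low_mis; every other item gets max(beta * support, 1 / N).
    low_mis and beta are illustrative values, not taken from the paper.
    """
    n = len(sequences)
    supports = item_supports(sequences)
    return {
        item: low_mis if item in key_items else max(beta * sup, 1 / n)
        for item, sup in supports.items()
    }


def matches(pattern, sentence):
    """True if pattern is an ordered subsequence of sentence (gaps allowed)."""
    remaining = iter(sentence)
    # "tok in remaining" advances the iterator, so order is enforced.
    return all(tok in remaining for tok in pattern)


# Toy sequence set: POS tags ("n", "v", "a") plus the feature word "比".
sequences = [
    ["n", "比", "n", "a"],
    ["n", "v", "n"],
    ["n", "比", "a"],
    ["v", "n"],
]
mis = minimum_item_supports(sequences, key_items={"比"})
is_comparative = matches(["n", "比", "a"], ["n", "比", "n", "a"])
```

Giving key items a low threshold keeps rare but informative feature words from being pruned, while the 1/N floor means every other item must appear in at least one sequence to remain a candidate.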