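An attribute expresses the intension of a concept: it describes a characteristic or property of the concept, and attributes make it possible to distinguish different concepts and identify the differences between them. Attributes thus serve both to describe and to discriminate concepts. Web-based attribute acquisition is the task of automatically obtaining the set of attributes of a given concept from Web pages. Attribute acquisition is the starting point of concept knowledge acquisition and a key step in the automatic construction of domain ontologies. This paper classifies attributes from the perspective of knowledge acquisition from text and, drawing on the meta-properties of attributes, examines the basic ways in which attribute names are expressed in Web corpora (lexico-syntactic patterns). These patterns are used to extract attribute names from a large-scale corpus, and a validation method for candidate attributes based on statistics and semantics is proposed. Finally, attribute iteration patterns are applied to acquire further attributes iteratively. Experiments on several groups of concepts show that the proposed method acquires attributes with high precision.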
An attribute is an expression of a concept's connotation: it explains some property of the concept word, distinguishes different concepts, and reveals the differences between them. A concept word with attribute names is no longer an isolated vocabulary entry. Web-based attribute acquisition automatically acquires a set of attribute names from Web pages for each given concept, enriching the semantics of the concept. Attribute acquisition is also a significant step in general knowledge acquisition from text and an important task in the automatic construction of domain ontologies. This paper presents a basic classification of attributes from the perspective of text knowledge acquisition and explores the basic expressions (lexico-syntactic patterns) of attribute names in multilingual Web corpora. After attribute names are acquired from a large-scale corpus using these patterns, a method based on statistics and semantics is proposed to validate the candidates. Finally, attribute iteration patterns are applied to acquire new attribute names iteratively. Experiments on several groups of concepts show that the precision of attribute acquisition is high.
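
To make the pattern-based step concrete, the following is a minimal sketch of extracting candidate attribute names with lexico-syntactic patterns and then filtering them by frequency. The two English patterns ("the A of the C" and "C's A"), the function names, the `min_count` threshold, and the toy corpus are all assumptions made for illustration; the abstract does not list the paper's actual patterns, and the semantic validation and iteration steps are not modelled here.

```python
import re
from collections import Counter


def build_patterns(concept):
    """Return hypothetical lexico-syntactic patterns for one concept word."""
    c = re.escape(concept)
    return [
        # "the <attribute> of (the|a|an) <concept>"
        re.compile(rf"\bthe (\w+) of (?:the |a |an )?{c}\b", re.IGNORECASE),
        # "<concept>'s <attribute>"
        re.compile(rf"\b{c}'s (\w+)\b", re.IGNORECASE),
    ]


def candidate_attributes(concept, sentences):
    """Count how often each candidate attribute name matches a pattern."""
    patterns = build_patterns(concept)
    counts = Counter()
    for sentence in sentences:
        for pattern in patterns:
            for match in pattern.finditer(sentence):
                counts[match.group(1).lower()] += 1
    return counts


def validate(counts, min_count=2):
    """Keep candidates seen at least min_count times (statistical filter only)."""
    return sorted(name for name, n in counts.items() if n >= min_count)


# Toy corpus standing in for Web text (illustrative only).
corpus = [
    "The price of the camera dropped last week.",
    "Reviewers praised the camera's resolution.",
    "The resolution of a camera depends on its sensor.",
    "The price of the camera is listed online.",
    "The owner of the camera was unhappy.",
]

counts = candidate_attributes("camera", corpus)
attributes = validate(counts)
print(attributes)  # ['price', 'resolution']
```

In this toy run, "owner" matches a pattern only once and is dropped by the frequency filter, which illustrates why pattern matches alone need a validation step.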