Security vulnerabilities are central to network security, and vulnerability databases are designed to collect, assess, and publish vulnerability information. However, data across vulnerability databases is redundant and heterogeneous, which makes vulnerability information difficult to share. To address this problem, 842 thousand vulnerability records from 15 mainstream vulnerability databases were collected and analyzed. Based on text mining, a rule for removing duplicate vulnerabilities, with an accuracy of 94.4%, and a vulnerability database fusion framework, the uniform vulnerability database alliance (UVDA), were proposed. Finally, the UVDA framework was implemented on three representative vulnerability databases, and the implementation is fully automated. The resulting UVDA database has been applied in the national security vulnerability database and supports unified retrieval by product version and date, advancing the standardization of vulnerability information release mechanisms.
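The abstract does not spell out the deduplication rule, so the following is only a minimal sketch of what a text-based duplicate check across databases might look like, not the rule used in UVDA. It assumes a hypothetical record layout (`source`, `cve`, `description`), a hypothetical rule (same CVE identifier, or Jaccard similarity of description tokens at or above a threshold), and an arbitrary threshold of 0.6. The sample records and identifiers are made up for illustration.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric tokens in a description."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def is_duplicate(r1, r2, threshold=0.6):
    """Hypothetical rule: same CVE ID, or sufficiently similar descriptions."""
    if r1.get("cve") and r1.get("cve") == r2.get("cve"):
        return True
    similarity = jaccard(tokens(r1["description"]), tokens(r2["description"]))
    return similarity >= threshold


def fuse(records, threshold=0.6):
    """Greedily group records that match any member of an existing group."""
    groups = []
    for rec in records:
        for group in groups:
            if any(is_duplicate(rec, other, threshold) for other in group):
                group.append(rec)
                break
        else:
            # No existing group matched, so this record starts a new one.
            groups.append([rec])
    return groups


# Made-up records from three hypothetical source databases.
records = [
    {"source": "db_a", "cve": "CVE-0000-0001",
     "description": "Buffer over-read in the heartbeat extension of ExampleTLS"},
    {"source": "db_b", "cve": "CVE-0000-0001",
     "description": "ExampleTLS heartbeat information disclosure"},
    {"source": "db_c", "cve": None,
     "description": "SQL injection in the login form of ExampleCMS"},
]
groups = fuse(records)
```

Here the first two records are grouped because they share an identifier, while the third stays separate. A real fusion pipeline would also need to normalize product names and versions before comparison, which this sketch leaves out.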