Modern enterprises generate large volumes of log files every day, and processing this log data in real time would let them extract greater business value. Managing log data at this scale is a major challenge, however, because traditional techniques do not handle very large data sets efficiently. The Hadoop ecosystem offers a new way to process big data, and ElasticSearch is a real-time search engine designed for cloud environments. This paper proposes a software integration scheme for real-time search over big log data based on ElasticSearch. A virtual machine environment is created on the underlying hardware; ElasticSearch evaluates the search conditions and returns the list of matching rowkeys, and HBase then uses those rowkeys to fetch the data directly from the database. Experiments show that search response time does not increase linearly as the log event search volume grows, which indicates that the proposed integration scheme for real-time big log search based on ElasticSearch is feasible.
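The two-step lookup described above (search for rowkeys first, then fetch rows by key) can be illustrated with a minimal sketch. It is not the paper's implementation: an in-memory inverted index stands in for ElasticSearch and a plain dictionary stands in for the HBase table, and the sample log events, rowkey format, and function names (`build_index`, `search_rowkeys`, `fetch_rows`) are all hypothetical.

```python
from collections import defaultdict

# Hypothetical log events keyed by rowkey; this dict stands in for the HBase table.
LOG_STORE = {
    "host1-0001": "ERROR disk full on /var",
    "host1-0002": "INFO backup finished",
    "host2-0001": "ERROR connection timeout",
}


def build_index(store):
    """Build an inverted index (term -> set of rowkeys), standing in for ElasticSearch."""
    index = defaultdict(set)
    for rowkey, message in store.items():
        for term in message.lower().split():
            index[term].add(rowkey)
    return index


def search_rowkeys(index, query):
    """Return the sorted rowkeys whose message contains every term in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    # Intersect the posting sets of all terms; unknown terms yield an empty set.
    matches = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(matches)


def fetch_rows(store, rowkeys):
    """Fetch rows by point lookup on rowkey, as a key-value store would."""
    return [(key, store[key]) for key in rowkeys if key in store]


INDEX = build_index(LOG_STORE)
rowkeys = search_rowkeys(INDEX, "error")   # step 1: search index returns rowkeys
results = fetch_rows(LOG_STORE, rowkeys)   # step 2: store returns the full rows
```

The point of the split is that the search index only has to return compact keys, while the full log records are retrieved by direct key lookup rather than by scanning the table.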