The development and application of emerging information technologies such as cloud computing, the Internet of Things, and big data have raised the level of information services in scenic areas, but they also pose a serious challenge to the effective use of the massive information resources that scenic areas hold. Faced with very large-scale, unstructured data, traditional data warehouses built on relational databases can no longer effectively support data storage and analysis for scenic areas. This paper therefore proposes a scenic-area data warehouse based on cloud computing technology. It uses HDFS for distributed storage and management of data, MapReduce to design the analysis model for massive data, and HiveQL to implement the interaction between the data warehouse and the front-end presentation layer, which effectively addresses the management of massive scenic-area data. Experimental results with the Huangshan Scenic Area as the real-world setting demonstrate the correctness and effectiveness of the proposed data warehouse.
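As a rough illustration of the MapReduce-style analysis named in the abstract, the following sketch simulates the map, shuffle, and reduce phases in plain Python to total visitors per scenic spot. It is not the paper's implementation: the record layout, the sample values, and all function names are assumptions made for this example, and nothing here runs on Hadoop.

```python
from collections import defaultdict

# Hypothetical ticket-gate records: (date, scenic spot id, visitors admitted).
# The layout and values are illustrative only, not data from the paper.
RECORDS = [
    ("2014-05-01", "spot_a", 120),
    ("2014-05-01", "spot_b", 80),
    ("2014-05-02", "spot_a", 95),
]


def map_phase(records):
    """Emit one (spot, visitors) pair per input record."""
    for _date, spot, visitors in records:
        yield spot, visitors


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the grouped values for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def visitors_per_spot(records):
    """Run the three phases in sequence on an in-memory record list."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    print(visitors_per_spot(RECORDS))
```

In a Hive-based warehouse, an aggregate of this kind would typically be written as a `GROUP BY` query in HiveQL and compiled into MapReduce jobs by Hive, so the front end does not need to handle the individual phases directly.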