针对Hadoop集群部署过程繁琐复杂、耗时费力、运维难度大,且不利于快速扩展的问题,提出一种结合Docker容器技术部署集群的解决方法。该方法把Ambari及其运行环境和配置构建成Docker镜像,并把多节点容器的运行和Hadoop集群的部署过程写成Shell脚本,只需一条命令,即可实现集群的自动化部署。实验结果表明,该方法简单可靠并极大地提高了集群部署的效率。因此,对海量数据的处理和分析具有重要的推动作用。
The deployment of hadoop cluster is complicated, time-consuming and difficult, and is not conducive to the rapid expansion of the cluster. In order to solve these problems, the paper proposed a new solution of combining docker container based on fast, simple and flexible deployment of cluster. Docker images built with ambari and its operating environment and configuration. Write shell script with the multi-node container and process of hadoop deployment. With one command, it could realize the cross platform of arbitrary nodes Hadoop cluster' s automation installation and deployment. Experiments show that the proposed scheme is simple and reliable and can greatly improve the efficiency of cluster deployment. As a result, the pro- posed scheme has an important role in promoting the processing and analysis of mass data.