Hadoop is a free, reliable, efficient, and scalable open-source cloud platform: a software framework that allows big data to be processed on distributed clusters. Taking Hadoop as its basis, this paper describes in detail the technologies involved, namely the VMware virtual machine, the JDK, CentOS, and Hadoop itself. A virtual cloud platform is built in a pseudo-distributed environment, and testing shows that the system correctly runs distributed programs written for MapReduce. The paper also discusses in detail the issues of user permissions, path configuration, and use of the SSH service, providing a basis for research on Hadoop-based cloud platforms and for application development.