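This paper outlines the system architecture commonly adopted by today's large data centers, in which computing clusters and storage systems are designed as separate modules, and describes the main cluster systems deployed on each module. It analyzes the feasibility of storing independent, structured data locally on the computing nodes, presents a basic framework for such a system, and assesses its value in terms of total cost of ownership (TCO). Considering the characteristics of raw data in high-energy physics research, it argues that storing data locally on the nodes helps improve overall utilization. It identifies the design essentials of the key component, the file metadata management system, analyzes three schemes for integrating this system with the PBS batch job system, and gives a detailed design of the first scheme together with the corresponding changes in how users submit jobs. A file metadata management system was preliminarily deployed in a test environment, the three integration schemes were tested, and a brief comparative analysis is given.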
Large data centers typically adopt a modular architecture in which the computing cluster and the storage systems are separated and connected by a high-speed network. Several popular implementations are listed. Based on an analysis of the characteristics of independent structured data, a system that stores and processes data locally on the computing nodes is proposed; compared with existing systems, it has a better TCO (Total Cost of Ownership) and saves considerable network bandwidth on large data transfers. The distributed file metadata manager is important for job scheduling, and its required features are discussed. PBS (Portable Batch System), which serves as the cluster resource manager, is briefly introduced, and how it queries the file metadata manager is discussed in detail. Merging computing and storage onto the same node changes the way users submit jobs. Test results for the three integration mechanisms on the prototype system are briefly discussed; they show that the file metadata manager is stable and that all three solutions are acceptable.