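File system metadata is the data that describes the namespace, access permissions, and data location information. Because 50% to 80% of file system accesses involve metadata, the performance of the metadata service greatly affects the performance of the entire distributed file system. This paper therefore focuses on the problems facing metadata management and surveys three main directions for metadata services: high-scalability, high-performance, and high-availability techniques. For each direction, it analyzes the main problems and the mainstream techniques developed so far. It also reviews and looks ahead to several issues in metadata management for future distributed file systems that deserve attention, providing a reference for related research.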
File system metadata is the data that maintains the namespace, permission semantics, and the location of file data blocks. Metadata operations can account for up to 80% of all file system operations. The performance of the metadata service therefore substantially affects the overall performance of a distributed file system, especially in the big data era, which places great pressure on the underlying storage systems. This paper surveys state-of-the-art research on metadata services in large-scale distributed file systems. The survey is organized around three properties commonly used to characterize file systems: high scalability, high performance, and high availability, with a focus on the major challenges of each and the mainstream techniques developed to address them. Open problems in this area are also identified and analyzed, which may serve as a reference for related studies.