To track file lifecycles quickly and effectively in big data environments, where files are of many types, widely distributed, and of varying sizes, a tree-shaped data storage structure and tracking scheme based on a distributed database is proposed. The implemented system also provides data visualization and analysis functions, including statistical rankings of file operation frequency, hot spot distribution, data type, and the breadth and depth of lifecycle trees, as well as data volume trend prediction. Compared with conventional data management systems, the proposed system queries and builds the full data lifecycle tree more efficiently and delivers visual analysis of files quickly.
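As a rough illustration of what a lifecycle tree and its breadth and depth statistics might look like, the following minimal sketch assumes each file version is stored as a flat record pointing to its parent. The record layout, the identifiers `f1` to `f6`, the operation names, and both function names are hypothetical and are not taken from the proposed system; an in-memory list stands in for the distributed database.

```python
from collections import defaultdict

# Hypothetical flat records, one per file version: (file_id, parent_id, operation).
# parent_id is None for the original file at the root of a lifecycle tree.
RECORDS = [
    ("f1", None, "create"),
    ("f2", "f1", "copy"),
    ("f3", "f1", "rename"),
    ("f4", "f2", "modify"),
    ("f5", "f2", "copy"),
    ("f6", "f4", "compress"),
]


def build_children_index(records):
    """Group child ids under their parent id, mimicking a lookup by parent key."""
    children = defaultdict(list)
    for file_id, parent_id, _operation in records:
        if parent_id is not None:
            children[parent_id].append(file_id)
    return children


def tree_depth_and_breadth(root_id, children):
    """Walk the tree level by level; return (number of levels, widest level size)."""
    depth, breadth = 0, 0
    level = [root_id]
    while level:
        depth += 1
        breadth = max(breadth, len(level))
        # Collect every child of the current level to form the next level.
        level = [child for node in level for child in children.get(node, [])]
    return depth, breadth


children = build_children_index(RECORDS)
depth, breadth = tree_depth_and_breadth("f1", children)
print(f"depth={depth}, breadth={breadth}")
```

Here depth is counted in levels including the root, and breadth is the size of the widest level; a real deployment would presumably fetch children by parent key from the database rather than from a list.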