In cloud computing, the number of replicas and the replica deployment strategy strongly affect both user requirements and storage efficiency. This paper therefore gives a new definition of file access popularity based on users' preferences, together with a prediction algorithm that forecasts file access trends from historical data. Files are ranked by priority according to their popularity. A mathematical model relating file access popularity to the number of replicas is built so that reliability is increased efficiently. Most importantly, we present an optimal dynamic replica deployment strategy based on file access popularity that takes overall account of node performance and load conditions. Under this strategy, files with high priority are deployed on nodes with better performance, so a higher quality of service is guaranteed. The strategy is implemented on the Hadoop platform, and its performance is compared with that of the default Hadoop strategy and the CDRM strategy. The results show that the proposed strategy not only maintains system load balance but also provides better service performance, which is consistent with the theoretical analysis.