Replication is widely used in unstructured overlay networks to improve system performance. A fundamental question about replication is frequently raised: given fixed file sizes, request rates, and limited storage capacity, how many replicas should the system keep for each data item? Square-Root Replication, in which the number of replicas of an item is proportional to the square root of its global request rate and proportional to its size, has usually been considered optimal in the sense that it minimizes the search size, that is, the number of messages forwarded during a search. Our work shows that this view does not always hold. First, we argue that, to reach the theoretical optimum, the number of replicas of an item should be inversely proportional to the square root of its size. Second, in practical settings, Square-Root Replication is not optimal when the TTL (Time to Live) is small or the replica density is low. In this paper, we first formalize and model the problem and derive the theoretical answer with formal proofs, then validate our claims with simulations, and finally analyze why our conclusions differ from Square-Root Replication. Although our conclusions are drawn in a P2P (Peer-to-Peer) setting, they also apply to fully distributed systems that manage their resources through unstructured application-layer overlays.
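The size-dependence claim can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: search is idealized as unbounded random probing (no TTL), so the expected search size for an item with r replicas is taken to be proportional to 1/r; replica counts may be fractional; and the storage budget is spent exactly, so the sum of size times replicas equals the capacity. Under this simplified model, a Lagrange-multiplier argument gives a replica count proportional to sqrt(q / s) for request rate q and item size s. The names `allocate`, `expected_search_size`, and the sample rates, sizes, and capacity are hypothetical.

```python
import math

def allocate(rates, sizes, capacity, weight):
    """Scale per-item weights so the replicas exactly fill the storage budget.

    Replica counts are fractional here; this is an illustrative assumption.
    """
    raw = [weight(q, s) for q, s in zip(rates, sizes)]
    scale = capacity / sum(s * w for s, w in zip(sizes, raw))
    return [scale * w for w in raw]

def expected_search_size(rates, replicas):
    """Assumed cost model: a query for an item with r replicas costs ~1/r
    (random probing without TTL, constant factors dropped)."""
    return sum(q / r for q, r in zip(rates, replicas))

# Hypothetical workload: three items with different request rates and sizes.
rates = [0.5, 0.3, 0.2]
sizes = [1.0, 4.0, 16.0]
capacity = 100.0

# Replicas proportional to sqrt(q), ignoring item size.
sqrt_rate = allocate(rates, sizes, capacity, lambda q, s: math.sqrt(q))
# Replicas proportional to sqrt(q / s), i.e. inversely proportional to sqrt(size).
size_aware = allocate(rates, sizes, capacity, lambda q, s: math.sqrt(q / s))

cost_sqrt_rate = expected_search_size(rates, sqrt_rate)
cost_size_aware = expected_search_size(rates, size_aware)

print(f"sqrt(q) allocation:   cost = {cost_sqrt_rate:.4f}")
print(f"sqrt(q/s) allocation: cost = {cost_size_aware:.4f}")
```

Both allocations use the same storage, and under the assumed cost model the size-aware allocation never costs more (by the Cauchy-Schwarz inequality), with a strict improvement whenever item sizes differ. The sketch does not model the small-TTL or low-density regimes discussed in the abstract.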