To address the performance degradation of garbage collection (GC) on non-uniform memory access (NUMA) architectures, which is caused by the large number of remote memory reads and writes performed during GC, each phase of the GC process was analyzed and an efficient, real-time, and stable GC algorithm for NUMA was proposed. The algorithm first adapts the heap layout of the traditional generational GC mechanism to the NUMA architecture. It then greatly reduces the number of remote accesses during GC by controlling three choices: the initial root objects in the live-object scanning phase, the task queue to steal from in the dynamic load-balancing phase, and the destination of each copy in the live-object copying phase. The improved GC mechanism is applicable to all NUMA platforms. Experiments on a Godson-3 NUMA platform show that the proposed algorithm shortens the stop-the-world (STW) time of GC and improves both the performance and the stability of applications. On the SPECjvm2008 benchmarks, the algorithm reduced STW time by 14.6% on average (total GC time fell by 4.1% to 41.58%), improved application performance by 4.68% on average (17.8% at most), and improved the stability of application performance by 76.2%.
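The load-balancing step can be illustrated with a minimal sketch. The abstract says only that the choice of the task queue to steal from is controlled; the concrete policy below (prefer a victim on the thief's own NUMA node, fall back to a remote node only when no local queue has work) is an assumption made for illustration, and the names `Worker`, `pick_victim`, and `steal` are hypothetical rather than taken from the paper.

```python
from collections import deque


class Worker:
    """A GC worker thread pinned to one NUMA node (hypothetical model)."""

    def __init__(self, worker_id, node):
        self.worker_id = worker_id
        self.node = node
        self.queue = deque()  # pending scan/copy tasks owned by this worker


def pick_victim(thief, workers):
    """Return the worker to steal from, or None if no other queue has work."""
    candidates = [w for w in workers if w is not thief and w.queue]
    if not candidates:
        return None
    # Assumed policy: same-node victims sort first (False < True),
    # and among equals the longest queue is preferred.
    return min(candidates, key=lambda w: (w.node != thief.node, -len(w.queue)))


def steal(thief, workers):
    """Move one task to the thief; return (task, was_remote) or None."""
    victim = pick_victim(thief, workers)
    if victim is None:
        return None
    task = victim.queue.popleft()  # take the victim's oldest task
    thief.queue.append(task)
    return task, victim.node != thief.node


if __name__ == "__main__":
    workers = [Worker(0, node=0), Worker(1, node=0), Worker(2, node=1)]
    workers[1].queue.extend(["a", "b"])
    workers[2].queue.extend(["c", "d", "e"])
    # Worker 0 prefers worker 1 (same node) even though worker 2 has more work.
    print(steal(workers[0], workers))
```

With this ordering a remote steal happens only when every queue on the thief's node is empty, which is one way the number of remote accesses during load balancing could be kept low.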