With the widespread adoption of flash-based solid-state drives (SSDs) in personal computers and enterprise servers, SSDs have attracted growing attention from both academia and industry. Beyond the desirable properties of flash memory itself, SSDs offer rich internal parallelism. In traditional database systems, the physical table scan operation and the aggregation operations built on top of it were designed around the mechanical characteristics and symmetric read/write behavior of hard disks, so they cannot exploit this internal parallelism when the database runs on an SSD. In this paper, we first probe the SSD as a black box to characterize its internal parallelism. Building on these findings, we modify the traditional table scan and propose a parallel table scan model, ParaSSDScan, that takes full advantage of the SSD's internal parallelism. Next, based on ParaSSDScan, we propose an efficient parallel aggregation model, ParaSSDAggr, and use it to implement several common aggregation operations. Finally, experiments show that, compared with traditional table scan and aggregation, ParaSSDScan and ParaSSDAggr achieve 3x and 4x performance improvements, respectively, and that ParaSSDAggr requires little memory. These speedups demonstrate the benefit of exploiting the internal parallelism of SSDs.
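To make the general idea concrete, the following is a minimal sketch of a scan that keeps several read requests outstanding at once and merges per-chunk partial aggregates. It is not the ParaSSDScan or ParaSSDAggr design, whose details are not given in this abstract. The record layout (one signed 64-bit value per record), the chunk size, the worker count, and the names `scan_chunk`, `parallel_aggregate`, and `write_demo_table` are all assumptions made for illustration.

```python
import os
import struct
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fixed-width record: a single little-endian signed 64-bit value.
RECORD = struct.Struct("<q")


def scan_chunk(path, first_record, n_records):
    """Read one contiguous range of records and return a partial aggregate."""
    # Each call opens a private handle so concurrent reads do not share an offset.
    with open(path, "rb") as f:
        f.seek(first_record * RECORD.size)
        data = f.read(n_records * RECORD.size)
    values = [v for (v,) in RECORD.iter_unpack(data)]
    return len(values), sum(values), min(values), max(values)


def parallel_aggregate(path, n_workers=4, chunk_records=1024):
    """Issue chunk reads concurrently, then merge the partial aggregates."""
    total_records = os.path.getsize(path) // RECORD.size
    if total_records == 0:
        return {"count": 0, "sum": 0, "min": None, "max": None, "avg": None}

    starts = range(0, total_records, chunk_records)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(
            lambda s: scan_chunk(path, s, min(chunk_records, total_records - s)),
            starts,
        ))

    # Each partial is a constant-size tuple, so the merge step needs little memory.
    count = sum(p[0] for p in partials)
    total = sum(p[1] for p in partials)
    return {
        "count": count,
        "sum": total,
        "min": min(p[2] for p in partials),
        "max": max(p[3] for p in partials),
        "avg": total / count,
    }


def write_demo_table(values):
    """Write values to a temporary file of fixed-width records; return its path."""
    fd, path = tempfile.mkstemp(suffix=".tbl")
    with os.fdopen(fd, "wb") as f:
        for v in values:
            f.write(RECORD.pack(v))
    return path
```

In this sketch each worker holds at most one chunk in memory and returns only a four-field summary, which is one simple way an aggregation over a parallel scan can keep its memory footprint small. Whether concurrent requests actually translate into higher throughput depends on the device and the I/O stack, which is what black-box probing of the SSD is meant to establish.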