Approximate Continuous Top-k Query over Sliding Window

ISSN: 1000-9000
Journal: Journal of Computer Science and Technology (《计算机科学技术学报：英文版》)
Date: 0
Classification: TP393 [Automation and Computer Technology / Computer Application Technology; Automation and Computer Technology / Computer Science and Technology] TP311.13 [Automation and Computer Technology / Computer Software and Theory; Automation and Computer Technology / Computer Science and Technology]
Author affiliation: College of Computer Science and Engineering, Northeastern University, Shenyang 110819, China
Funding: This work is partially supported by the National Natural Science Fund for Distinguished Young Scholars of China under Grant No. 61322208, the National Basic Research 973 Program of China under Grant No. 2012CB316201, the National Natural Science Foundation of China under Grant Nos. 61272178 and 61572122, and the Key Program of the National Natural Science Foundation of China under Grant No. 61532021.

Authors: Rui Zhu, Bin Wang, Shi-Ying Luo, Xiao-Chun Yang, Guo-Ren Wang

Keywords: sliding window, query, merging algorithm, incremental maintenance, data distribution, database, real-time, framework, continuous top-k query, approximate

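Chinese abstract (translated):

The continuous top-k query over a sliding window is a fundamental database problem: each time the window slides, it retrieves the k objects with the highest scores. Existing work mainly handles such queries with exact algorithms, whose key idea is to maintain a subset of the objects in the window and to retrieve answers from that subset. However, all existing algorithms are sensitive to query parameters and data distribution. They also incur expensive overhead for incremental maintenance and therefore cannot meet real-time requirements. This paper defines a novel query, the (ε, δ)-approximate continuous top-k query, which returns approximate answers to a top-k query. To support it efficiently, the paper proposes a framework named PABF (Probabilistic Approximate Based Framework) for approximate top-k queries over a sliding window. The framework first maintains a self-adaptive pruning value that filters out newly arrived objects whose probability of being a query result is less than 1 - δ. Objects that are not filtered are combined when the score difference among them is below a threshold. To maintain these combined results efficiently, PABF also includes a multi-phase merging algorithm. Theoretical analysis shows that, even in the worst case, maintaining each candidate requires only logarithmic complexity.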

English abstract:

Continuous top-k query over a sliding window is a fundamental problem in databases: it retrieves the k objects with the highest scores each time the window slides. Existing studies mainly adopt exact algorithms for this type of query, whose key idea is to maintain a subset of the objects in the window and to retrieve answers from it. However, all the existing algorithms are sensitive to query parameters and data distribution. In addition, they suffer from expensive overhead for incremental maintenance and thus cannot satisfy real-time requirements. In this paper, we define a novel query, named the (ε, δ)-approximate continuous top-k query, which returns approximate answers for a top-k query. To support this query efficiently, we propose a framework named PABF (Probabilistic Approximate Based Framework) for approximate top-k queries over a sliding window. We first maintain a self-adaptive pruning value, which filters out newly arrived objects whose probability of being a query result is less than 1 - δ. The objects that are not filtered are combined if the score difference among them is less than a threshold. To maintain these combined results efficiently, PABF also proposes a multi-phase merging algorithm. Theoretical analysis indicates that, even in the worst case, maintaining each candidate requires only logarithmic complexity.
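
The exact baseline that the abstract contrasts itself with (keep a subset of the window's objects and answer from that subset) can be illustrated with a small Python sketch. This is not PABF and not the authors' code: the function names, the count-based window, and the use of plain numeric scores are all assumptions made for illustration. The sketch relies on a standard pruning argument: an object can be dropped once k newer objects with strictly higher scores have arrived, because those objects stay in the window at least as long as it does.

```python
import heapq
from collections import deque


def brute_force_top_k(scores, window, k):
    """Recompute the top-k scores from scratch for every full window."""
    buf = deque(maxlen=window)  # count-based window (an assumption)
    out = []
    for s in scores:
        buf.append(s)
        if len(buf) == window:
            out.append(heapq.nlargest(k, buf))
    return out


def candidate_top_k(scores, window, k):
    """Answer each window from a pruned candidate subset (hypothetical helper)."""
    cands = []  # each entry: [arrival_index, score, newer_higher_count]
    out = []
    for t, s in enumerate(scores):
        # Expire candidates that have slid out of the window.
        cands = [c for c in cands if c[0] > t - window]
        # The new object outlives every older one, so it counts against
        # each older candidate with a strictly lower score.
        for c in cands:
            if c[1] < s:
                c[2] += 1
        # Drop candidates that k newer, higher-scored objects rule out.
        cands = [c for c in cands if c[2] < k]
        cands.append([t, s, 0])
        if t + 1 >= window:
            out.append(heapq.nlargest(k, (c[1] for c in cands)))
    return out
```

Both functions return the same list of top-k score lists; the second one scans only the surviving candidates per slide. Its linear scan is kept for readability and does not reflect the logarithmic per-candidate maintenance cost the abstract claims for PABF, nor its approximate (ε, δ) semantics.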