Efficient Skyline and Top-k Retrieval in Subspaces

作者: Yufei Tao , Xiaokui Xiao , Jian Pei

DOI: 10.1109/TKDE.2007.1051

关键词:

摘要: Skyline and top-k queries are two popular operations for preference retrieval. In practice, applications that require these usually provide numerous candidate attributes, whereas, depending on their interests, users may issue regarding different subsets of the dimensions. The existing algorithms inadequate subspace skyline/top-k search because they have at least one following defects: 1) scanning entire database once, 2) optimized but incur significant overhead other subspaces, or 3) demand expensive maintenance cost space consumption. this paper, we propose a technique SUBSKY, which settles both types by using purely relational technologies. core SUBSKY is transformation converts multidimensional data to one-dimensional (1D) values. These values indexed simple B-tree, allows us answer accessing fraction database. entails low overhead, equals updating traditional B-tree. Extensive experiments with real confirm our outperforms alternative solutions significantly in efficiency scalability.

参考文章(33)
Gerhard Weikum, Sebastian Michel, Peter Triantafillou, KLEE: a framework for distributed top-k query algorithms very large data bases. pp. 637- 648 ,(2005)
Jarek Gryz, Ryan Shipley, Parke Godfrey, Maximal vector computation in large data sets very large data bases. pp. 229- 240 ,(2005)
Wolf-Tilo Balke, Ulrich Güntzer, Jason Xin Zheng, Efficient Distributed Skylining for Web Information Systems extending database technology. ,vol. 2992, pp. 256- 273 ,(2004) , 10.1007/978-3-540-24741-8_16
Peter Triantafillou, Gerhard Weikum, Beng Chin Ooi, Martin L. Kersten, Sebastian Michel, Per-{ AA}ke Larson, Klemens Böhm, Laura M. Haas, Christian S. Jensen, KLEE: A Framework for Distributed Top-k Query Algorithms Untitled Event. pp. 637- 648 ,(2005)
Beng Chin Ooi, Pin-Kwang Eng, Kian-Lee Tan, Efficient Progressive Skyline Computation very large data bases. pp. 301- 310 ,(2001)
Dan Pelleg, Andrew W. Moore, X-means: Extending K-means with Efficient Estimation of the Number of Clusters international conference on machine learning. pp. 727- 734 ,(2000)
Parke Godfrey, Skyline Cardinality for Relational Processing foundations of information and knowledge systems. pp. 78- 97 ,(2004) , 10.1007/978-3-540-24627-5_7
Yufei Tao, Vagelis Hristidis, Dimitris Papadias, Yannis Papakonstantinou, Branch-and-bound processing of ranked queries Information Systems. ,vol. 32, pp. 424- 445 ,(2007) , 10.1016/J.IS.2005.12.001
Gísli R. Hjaltason, Hanan Samet, Distance browsing in spatial databases ACM Transactions on Database Systems. ,vol. 24, pp. 265- 318 ,(1999) , 10.1145/320248.320255
Ronald Fagin, Combining fuzzy information from multiple systems (extended abstract) symposium on principles of database systems. pp. 216- 226 ,(1996) , 10.1145/237661.237715