Efficient bundle sorting

作者: Jeffrey Scott Vitter , Yossi Matias , Eran Segal

DOI: 10.5555/338219.338647

关键词: BundleBlock sizeDisk spaceEfficient algorithmInternal memoryCounting sortUpper and lower boundsAuxiliary memoryCombinatoricsMathematics

摘要: Many data sets to be sorted consist of a limited number distinct keys. Sorting such can thought as bundling together identical keys and having the bundles placed in order; we therefore denote this bundle sorting. We describe an efficient algorithm for sorting external memory that requires at most c(N/B)logM/Bk disk accesses, where N is keys, M size internal memory, k B transfer block size, 2

参考文章(13)
Richard Franklin Lary, Brian Charles Edem, Richard Perham Helliwell, John Thomas Johnston, Sorting system for serially processing records ,(1995)
Weiye Zhang, Per-Åke Larson, Buffering and Read-Ahead Strategies for External Mergesort very large data bases. pp. 523- 533 ,(1998)
B. A. W. Baugstø, J. F. Greipsland, J. Kamerbeek, Sorting Large Data Files on POOMA joint international conference on vector and parallel processing parallel processing. pp. 536- 547 ,(1990) , 10.1007/3-540-53065-7_131
Lars Arge, Paolo Ferragina, Roberto Grossi, Jeffrey Scott Vitter, On sorting strings in external memory (extended abstract) symposium on the theory of computing. pp. 540- 548 ,(1997) , 10.1145/258533.258647
Alok Aggarwal, Jeffrey,S. Vitter, The input/output complexity of sorting and related problems Communications of The ACM. ,vol. 31, pp. 1116- 1127 ,(1988) , 10.1145/48529.48535
Jeffrey Scott Vitter, External memory algorithms and data structures External memory algorithms. pp. 1- 38 ,(1999)
Ramesh C. Agarwal, A super scalar sort algorithm for RISC processors international conference on management of data. ,vol. 25, pp. 240- 246 ,(1996) , 10.1145/233269.233336
Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, David E. Culler, Joseph M. Hellerstein, David A. Patterson, High-performance sorting on networks of workstations international conference on management of data. ,vol. 26, pp. 243- 254 ,(1997) , 10.1145/253260.253322
Min Wang, Bala Iyer, Jeffrey Scott Vitter, Scalable mining for classification rules in relational databases international database engineering and applications symposium. pp. 58- 67 ,(1998) , 10.1214/LNMS/1196285404
M. Farach, P. Ferragina, S. Muthukrishnan, Overcoming the memory bottleneck in suffix tree construction foundations of computer science. pp. 174- 183 ,(1998) , 10.1109/SFCS.1998.743441