Performance of database workloads on shared-memory systems with out-of-order processors

Authors: Parthasarathy Ranganathan, Kourosh Gharachorloo, Sarita V. Adve, Luiz André Barroso

DOI: 10.1145/291069.291067

Keywords: Multiprocessing, Cache, Consistency model, Instruction prefetch, Out-of-order execution, Database, Database engine, Shared memory, Computer science, Operating system, Oracle, Server, Online transaction processing

Abstract: Database applications such as online transaction processing (OLTP) and decision support systems (DSS) constitute the largest and fastest-growing segment of the market for multiprocessor servers. However, most current system designs have been optimized to perform well on scientific and engineering workloads. Given the radically different behavior of database workloads (especially OLTP), it is important to re-evaluate key design decisions in the context of this class of applications. This paper examines the behavior of database workloads on shared-memory multiprocessors with aggressive out-of-order processors, and considers simple optimizations that can provide further performance improvements. Our study is based on detailed simulations of the Oracle commercial database engine. The results show that the combination of out-of-order execution and multiple instruction issue is indeed effective in improving the performance of database workloads, providing gains of 1.5 and 2.6 times over an in-order single-issue processor for OLTP and DSS, respectively. In addition, speculative techniques enable optimized implementations of memory consistency models that significantly improve the performance of stricter consistency models, bringing their performance to within 10-15% of that of more relaxed models. The second part of our study focuses on the more challenging OLTP workload. We show that an instruction stream buffer is effective in reducing the remaining instruction stalls in OLTP, providing a 17% reduction in execution time (approaching a perfect instruction cache to within 15%). Furthermore, our characterization shows that a large fraction of the data communication misses in OLTP exhibit migratory behavior; our preliminary results show that software prefetch and writeback/flush hints can be used for this data to further reduce execution time by 12%.
