Hardware-based profiling: an effective technique for profile-driven optimization

Authors: Thomas M. Conte, Burzin A. Patel, Kishore N. Menezes, J. Stan Cox

DOI: 10.1007/BF03356747

Abstract: Profile-based optimization can be used for instruction scheduling, loop scheduling, data preloading, function in-lining, and instruction cache performance enhancement. However, these techniques have not been embraced by software vendors because programs instrumented for profiling run significantly slower, an awkward compile-run-recompile sequence is required, and a test input suite must be collected and validated for each program. This paper introduces hardware-based profiling that uses traditional branch handling hardware to generate profile information in real time. Techniques are presented for both one-level and two-level branch hardware organizations. The approach produces high accuracy with a small slowdown in execution (0.4% to 4.6%). This allows a program to be profiled while it is used, eliminating the need for a test input suite. With contemporary processors driven increasingly by compiler support, hardware-based profiling is important for high-performance systems.
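The abstract does not describe the mechanism in detail. As a rough illustration of the general idea (reading profile information out of branch prediction state rather than instrumenting the program), the following is a minimal sketch, not the authors' design. It assumes a one-level, direct-mapped table of 2-bit saturating counters that is sampled at a fixed interval; the names `BranchTable`, `profile`, and `sample_interval` are hypothetical.

```python
from collections import defaultdict


class BranchTable:
    """Hypothetical one-level table of 2-bit saturating counters.

    Each slot holds (branch address, counter in 0..3). This models generic
    branch prediction hardware, not the organization used in the paper.
    """

    def __init__(self, entries=64):
        self.entries = entries
        self.slots = {}  # slot index -> (branch address, counter)

    def update(self, addr, taken):
        slot = addr % self.entries
        tag, ctr = self.slots.get(slot, (addr, 1))
        if tag != addr:
            ctr = 1  # conflict: the new branch evicts the old one
        ctr = min(3, ctr + 1) if taken else max(0, ctr - 1)
        self.slots[slot] = (addr, ctr)

    def snapshot(self):
        # A counter of 2 or 3 means the hardware currently predicts "taken".
        return {tag: ctr >= 2 for tag, ctr in self.slots.values()}


def profile(trace, table, sample_interval=100):
    """Estimate per-branch taken bias by periodically sampling the table.

    trace is an iterable of (branch address, taken) pairs. The result maps
    each branch address to the fraction of samples in which the table
    predicted "taken" for it.
    """
    taken_samples = defaultdict(int)
    total_samples = defaultdict(int)
    for i, (addr, taken) in enumerate(trace, start=1):
        table.update(addr, taken)
        if i % sample_interval == 0:
            for branch, predicted_taken in table.snapshot().items():
                total_samples[branch] += 1
                taken_samples[branch] += int(predicted_taken)
    return {b: taken_samples[b] / total_samples[b] for b in total_samples}


if __name__ == "__main__":
    # Synthetic trace: one always-taken branch, one never-taken branch.
    trace = [(0x10, True), (0x14, False)] * 500
    for branch, bias in sorted(profile(trace, BranchTable()).items()):
        print(hex(branch), bias)
```

Sampling only every `sample_interval` branches keeps the modeled overhead low, at the cost of a coarser estimate. A compiler could then treat the per-branch bias as an approximate branch weight.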
