Scalar optimizations for shaders

Authors: Yuri Dotsenko, Derek Sessions, Andy Glaister, Blaise Pascal Tine, Mikhail Lyapunov

Abstract: Described herein are optimizations of thread loop intermediate representation (IR) code. One embodiment involves an algorithm that, based on data-flow analysis, computes sets of temporary variables that are loaded at the beginning of a thread loop and stored upon exit from a thread loop. Another embodiment involves reducing the size of a thread loop trip for a commonly-found case where a piece of compute shader is executed by a single thread (or a compiler-analyzable range of threads). In yet another embodiment, compute shader thread indices are cached to avoid excessive divisions, further improving execution speed.
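The abstract gives no implementation detail, so the following is only an illustrative sketch of two of the ideas it names: caching thread indices so that a thread loop does not repeat divisions, and shrinking the loop's trip range when only one thread (or a known range of threads) runs a region. It assumes a thread group of size `dim_x * dim_y * dim_z` is emulated on a CPU by a loop over a flat thread index. All function and parameter names (`thread_ids_naive`, `build_thread_id_cache`, `run_region`, `trip_range`) are hypothetical and do not come from the patent.

```python
# Hypothetical sketch: names and structure are illustrative assumptions,
# not taken from the patent text.

def thread_ids_naive(dim_x, dim_y, dim_z):
    """Recover (x, y, z) from the flat index with div/mod on every iteration."""
    ids = []
    for t in range(dim_x * dim_y * dim_z):
        x = t % dim_x
        y = (t // dim_x) % dim_y
        z = t // (dim_x * dim_y)
        ids.append((x, y, z))
    return ids


def build_thread_id_cache(dim_x, dim_y, dim_z):
    """Compute the same indices once, using only increments and compares."""
    cache = []
    x = y = z = 0
    for _ in range(dim_x * dim_y * dim_z):
        cache.append((x, y, z))
        x += 1
        if x == dim_x:
            x = 0
            y += 1
            if y == dim_y:
                y = 0
                z += 1
    return cache


def run_region(cache, body, trip_range=None):
    """Run one thread-loop region over the cached indices.

    trip_range is an assumed (lo, hi) pair that a compiler could supply
    when it proves only threads lo..hi-1 execute the region.
    """
    lo, hi = trip_range if trip_range is not None else (0, len(cache))
    for t in range(lo, hi):
        body(t, cache[t])


# Usage: a region guarded by "flat index == 0" needs a single iteration.
cache = build_thread_id_cache(4, 2, 2)
visited = []
run_region(cache, lambda t, tid: visited.append(tid), trip_range=(0, 1))
```

In this sketch the cache is built once per thread group and reused by every region, so the per-iteration cost of recovering the indices becomes a table lookup. Whether a table or incrementally updated counters is the better choice would depend on the target and is not specified by the source.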

Similar articles (6)

Program execution optimization using uniform variable identification

System and method for using ubershader variants without preprocessing macros

Thread scheduling over compute blocks for power optimization

Method and system for transforming source code into target code on computer

Apparatus that generates optimal launch configurations

Method and apparatus for processing image data
