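Abstract: Even with powerful parallel hardware, it remains difficult to improve application performance and reduce energy consumption without identifying the bottlenecks of programs on GPU architectures. To give programmers better insight into the performance and energy-saving bottlenecks of applications on GPU architectures, we propose two models: an execution time prediction model and an energy consumption prediction model. The execution time prediction model (ETPM) estimates the execution time of massively parallel programs, taking instruction-level and thread-level parallelism into consideration. ETPM contains two components: a memory sub-model and a computation sub-model. The memory sub-model estimates the cost of memory instructions by considering the number of active threads and the memory bandwidth. Correspondingly, the computation sub-model estimates the cost of computation instructions by considering the number of active threads and the application's arithmetic intensity. We use Ocelot to analyze PTX code and obtain several input parameters for the two sub-models, such as the number of memory transactions and the data size. Based on the two sub-models, the analytical model estimates the cost of each instruction while considering instruction-level and thread-level parallelism, thereby estimating the overall execution time of an application. The energy consumption prediction model (ECPM) estimates the total energy consumption based on the data from ETPM. We compare the outcome of the models with actual execution on a GTX260 and a Tesla C2050. The results show that the models reach almost 90 percent accuracy on average for the benchmarks used.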