Authors: Anthony Roide, Antonio Valles, Dattatraya Kulkarni, Gautam Doshi
DOI:
Keywords:
Abstract: An embodiment of a compiler technique for decreasing sparse matrix computation runtime parallelizes loads from adjacent iterations of unrolled loop code. Dependence check code is statically inserted to identify dependence between a store and a load dynamically, and this information is passed to a scheduler, which schedules independent computations in parallel and potentially dependent computations at suitable latencies.
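
The idea in the abstract can be illustrated with a minimal sketch. The record does not give the loop form, so this assumes a sparse scatter-add of the shape `y[idx[i]] += vals[i]`, unrolled by two. The function names `scatter_add_reference` and `scatter_add_unrolled` are hypothetical and not taken from the source. A real compiler would apply this at the instruction level and hand the check result to its scheduler. Here the check is modeled as an ordinary branch.

```python
def scatter_add_reference(y, idx, vals):
    """Plain loop: y[idx[i]] += vals[i], one iteration at a time."""
    for i in range(len(idx)):
        y[idx[i]] += vals[i]
    return y


def scatter_add_unrolled(y, idx, vals):
    """Unrolled by two, with a runtime dependence check (illustrative only)."""
    n = len(idx)
    i = 0
    while i + 1 < n:
        j0, j1 = idx[i], idx[i + 1]
        # Both loads are issued up front, before either store.
        t0 = y[j0]
        t1 = y[j1]
        if j0 != j1:
            # Independent: the two updates do not touch the same element,
            # so they could be scheduled in parallel.
            y[j0] = t0 + vals[i]
            y[j1] = t1 + vals[i + 1]
        else:
            # Dependent: the second load would be stale after the first
            # store, so the updates are chained instead.
            y[j0] = t0 + vals[i] + vals[i + 1]
        i += 2
    if i < n:  # remainder iteration when n is odd
        y[idx[i]] += vals[i]
    return y
```

The check `j0 != j1` stands in for the statically inserted dependence check code: it costs one comparison per unrolled pair and lets the common independent case overlap its loads, while the rare colliding case falls back to a serialized update.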