From High-Level Specification to High-Performance Code

Z Wang, M O’Boyle, PS Rawat, M Vaidya
Proceedings of the IEEE 106 (11)

2018
APOLLO: Automatic speculative POLyhedral Loop Optimizer

Philippe Clauss, Juan Manuel Martinez Caamaño, Manuel Selva, Artiom Baloian
IMPACT 2017 - 7th International Workshop on Polyhedral Compilation Techniques 8

7
2017
Associative instruction reordering to alleviate register pressure

Prashant Singh Rawat, Aravind Sukumaran-Rajam, Atanas Rountev, Fabrice Rastello
IEEE International Conference on High Performance Computing Data and Analytics 46

4
2018
A code generator for high-performance tensor contractions on GPUs

Jinsung Kim, Aravind Sukumaran-Rajam, Vineeth Thumma, Sriram Krishnamoorthy
Symposium on Code Generation and Optimization 85-95

11
2019
Efficient Tiled Sparse Matrix Multiplication through Matrix Signatures

Süreyya Emre Kurt, Aravind Sukumaran-Rajam, Fabrice Rastello, Ponnuswamy Sadayappan
IEEE International Conference on High Performance Computing Data and Analytics 1-14

14
2020
Analytical Characterization and Design Space Exploration for Optimization of CNNs

Rui Li, Yufan Xu, Aravind Sukumaran-Rajam, Atanas Rountev
arXiv: Learning

2021
Efficient distributed algorithms for Convolutional Neural Networks

Rui Li, Yufan Xu, Aravind Sukumaran-Rajam, Atanas Rountev

2021
ALO-NMF: Accelerated Locality-Optimized Non-negative Matrix Factorization

Gordon E Moon, J Austin Ellis, Aravind Sukumaran-Rajam, Srinivasan Parthasarathy
Knowledge Discovery and Data Mining 1758-1767

2020
Adaptive sparse tiling for sparse matrix multiplication

Changwan Hong, Aravind Sukumaran-Rajam, Israt Nisa, Kunal Singh
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming 300-314

89
2019
An efficient mixed-mode representation of sparse tensors

Israt Nisa, Jiajia Li, Aravind Sukumaran-Rajam, Prashant Singh Rawat
IEEE International Conference on High Performance Computing Data and Analytics 49

7
2019
Analytical cache modeling and tilesize optimization for tensor contractions

Rui Li, Aravind Sukumaran-Rajam, Richard Veras, Tze Meng Low
IEEE International Conference on High Performance Computing Data and Analytics 74

5
2019
Domain-Specific Optimization and Generation of High-Performance GPU Code for Stencil Computations

Prashant Singh Rawat, Miheer Vaidya, Aravind Sukumaran-Rajam, Mahesh Ravishankar
Proceedings of the IEEE 106 (11) 1902-1920

18
2018
On improving performance of sparse matrix-matrix multiplication on GPUs

Rakshith Kunchum, Ankur Chaudhry, Aravind Sukumaran-Rajam, Qingpeng Niu
International Conference on Supercomputing 14

9
2017
The Polyhedral Model of Nonlinear Loops

Aravind Sukumaran-Rajam, Philippe Clauss
ACM Transactions on Architecture and Code Optimization 12 (4) 48

10
2015
GPU code optimization using abstract kernel emulation and sensitivity analysis

Changwan Hong, Aravind Sukumaran-Rajam, Jinsung Kim, Prashant Singh Rawat
Programming Language Design and Implementation 53 (4) 736-751

5
2018
Register optimizations for stencils on GPUs

Prashant Singh Rawat, Aravind Sukumaran-Rajam, Atanas Rountev, Fabrice Rastello
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming 53 (1) 168-182

20
2018
Performance modeling for GPUs using abstract kernel emulation

Changwan Hong, Aravind Sukumaran-Rajam, Jinsung Kim, Prashant Singh Rawat
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming 53 (1) 397-398

1
2018
Parallel CCD++ on GPU for Matrix Factorization

Israt Nisa, Aravind Sukumaran-Rajam, Rakshith Kunchum, P Sadayappan
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming 73-83

15
2017
Speculative Program Parallelization with Scalable and Decentralized Runtime Verification

Aravind Sukumaran-Rajam, Juan Manuel Martinez Caamaño, Willy Wolff, Alexandra Jimborean
Runtime Verification 8734 124-139

8
2014
Load-Balanced Sparse MTTKRP on GPUs

Israt Nisa, Jiajia Li, Aravind Sukumaran-Rajam, Richard Vuduc
International Parallel and Distributed Processing Symposium 123-133

37
2019