GPUBenchmark: un banco de pruebas para GPUs

Sergio Barrachina Mir, María Isabel Castillo Catalán, Adrián Castelló Gimeno, Rafael Mayo Gual

On the Use of Remote GPUs and Low-Power Processors for the Acceleration of Scientific Applications

Enrique S. Quintana-Ortí, José Duato, Federico Silla, Adrián Castelló
ENERGY 2014, The Fourth International Conference on Smart Grids, Green Communications and IT Energy-aware Technologies 57-62

11
2014
05. Vídeo Sesión 1, parte 3

Sergio Iserte, Adrián Castelló

2017
PyDTNN: A user-friendly and extensible framework for distributed deep learning

Manuel F. Dolz, Adrián Castelló, Sergio Barrachina, Jose I. Mestre
The Journal of Supercomputing 1-17

2021
Analysis of model parallelism for distributed neural networks

Adrián Castelló, Manuel F. Dolz, Enrique S. Quintana-Ortí, José Duato
Proceedings of the 26th European MPI Users' Group Meeting

1
2019
On the adequacy of lightweight thread approaches for high-level parallel programming models

Adrián Castelló, Rafael Mayo, Kevin Sala, Vicenç Beltran
Future Generation Computer Systems 84 22-31

4
2018
Programming parallel dense matrix factorizations with look-ahead and OpenMP

Sandra Catalán, Adrián Castelló, Francisco D. Igual, Rafael Rodríguez-Sánchez
Cluster Computing 23 (1) 359-375

3
2020
Reformulating the direct convolution for high-performance deep learning inference on ARM processors

Sergio Barrachina, Adrián Castelló, Manuel F. Dolz, Tze Meng Low
Journal of Systems Architecture 135 102806

2
2023
Efficient and portable Winograd convolutions for multi-core processors

Manuel F. Dolz, Héctor Martínez, Adrián Castelló, Pedro Alonso-Jordá
The Journal of Supercomputing 1-22

2023
A BLIS-like matrix multiplication for machine learning in the RISC-V ISA-based GAP8 processor

Cristian Ramírez, Adrián Castelló, Enrique S. Quintana-Ortí
The Journal of Supercomputing 78 (16) 18051-18060

3
2022
Accelerating distributed deep neural network training with pipelined MPI allreduce

Adrián Castelló, Enrique S. Quintana-Ortí, José Duato
Cluster Computing 24 (4) 3797-3813

10
2021
Analyzing the impact of the MPI allreduce in distributed training of convolutional neural networks

Adrián Castelló, Mar Catalán, Manuel F. Dolz, Enrique S. Quintana-Ortí
Computing 105 (5) 1101-1119

2
2023
Un planificador de GPUs remotas para clusters HPC

Sergio Iserte, Adrián Castelló, Carlos Reaño, Antonio J. Peña
Actas de las XXIV Jornadas de Paralelismo 193-198

2013
RED-SEA: Network Solution for Exascale Architectures

Andrea Biagioni, Paolo Cretaro, Ottorino Frezza, Francesca Lo Cicero
2022 25th Euromicro Conference on Digital System Design (DSD) 712-719

3
2022
GLT: A unified API for lightweight thread libraries

Adrián Castelló, Sangmin Seo, Rafael Mayo, Pavan Balaji
Euro-Par 2017: Parallel Processing: 23rd International Conference on Parallel and Distributed Computing, Santiago de Compostela, Spain, August 28–September 1, 2017, Proceedings 23 470-481

8
2017
Anatomy of the BLIS family of algorithms for matrix multiplication

Adrián Castelló, Enrique S. Quintana-Ortí, Francisco D. Igual
2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP) 92-99

12
2022
Micro-kernels for portable and efficient matrix multiplication in deep learning

Guillermo Alaejos, Adrián Castelló, Héctor Martínez, Pedro Alonso-Jordá
The Journal of Supercomputing 79 (7) 8124-8147

9
2023
Automatic generation of ARM NEON micro-kernels for matrix multiplication

Guillermo Alaejos, Héctor Martínez, Adrián Castelló, Manuel F. Dolz
The Journal of Supercomputing 1-27

1
2024
Experiences with nested parallelism in task-parallel applications using malleable BLAS on multicore processors

Rafael Rodríguez-Sánchez, Adrián Castelló, Sandra Catalán, Francisco D. Igual
The International Journal of High Performance Computing Applications 38 (2) 55-68

1
2024