StreamDrive: a Dynamic Dataflow Framework for Clustered Embedded Architectures

Authors: Arthur Stoutchinin, Luca Benini

DOI: 10.1007/S11265-018-1351-1

Keywords: Orb (optics), Multi-core processor, Speedup, Computer science, Multiprocessing, Parallel computing, Dataflow, Runtime system, Overhead (computing), Shared memory

Abstract: In this paper, we present StreamDrive, a dynamic dataflow framework for programming clustered embedded multicore architectures. StreamDrive simplifies the development of dataflow applications starting from sequential reference C code and allows applications to seamlessly handle heterogeneous, application-specific processing elements. We address issues of efficient implementation of the runtime system in the context of constrained environments, which have not been sufficiently addressed by previous research. We conducted a detailed performance evaluation on our Application Specific MultiProcessor (ASMP) cluster using the Oriented FAST and Rotated BRIEF (ORB) algorithm, which is typical of the image domain. We used the proposed incremental flow to transform the original ORB code into an optimized implementation. Our implementation has less than 10% parallelization overhead and near-linear speedup when the number of processors increases from 1 to 8. It achieves 15 VGA frames per second with a small configuration of 4 processors and 64 KB of shared memory, and 30 VGA frames per second with 8 processors and 128 KB of shared memory.
