Authors: Rahul Mangharam, Aminreza Abrahimi Saba
DOI: 10.1109/RTSS.2011.41
Keywords: CUDA, Flow network, Instrumentation (computer programming), Graphics processing unit, Algorithm, Traffic congestion, Parallel algorithm, Computer science, Algorithm design, Distributed computing, Tardiness
Abstract: Most algorithms are run-to-completion and provide one answer upon completion and no answer if interrupted before completion. On the other hand, anytime algorithms have a monotonically increasing utility with the length of execution time. Our investigation focuses on the development of time-bounded anytime algorithms on Graphics Processing Units (GPUs) to trade off the quality of output with execution time. Given a time-varying workload, the algorithm continually measures its progress and the remaining contract time to decide its execution pathway and select the system resources required to maximize the quality of the result. To exploit the quality-time tradeoff, the focus is on the construction, instrumentation, on-line measurement, and decision making of algorithms capable of efficiently managing GPU resources. We demonstrate this with a parallel A* routing algorithm on a CUDA-enabled GPU. The algorithm's resource usage is described in terms of CUDA kernels constructed at design time. At runtime, the algorithm selects a subset of kernels and composes them, using feedback control between the GPU and CPU to achieve controllable computation tardiness by throttling request admissions and processing precision. As a case study, we implemented AutoMatrix, a GPU-based vehicle traffic simulator for real-time congestion management which scales up to 16 million vehicles on a US street map. This is an early effort to enable imprecise and approximate computation on parallel architectures for stream-based time-bounded applications such as congestion prediction and route allocation in large transportation networks.
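The abstract's core mechanism is an anytime contract loop: the algorithm repeatedly checks its progress against the remaining contract time and either refines its answer or stops and returns the best result so far. The sketch below illustrates that control pattern only; it is a hypothetical Python illustration, not the paper's CUDA implementation, and the function name, refinement placeholder, and quality formula are all invented for this example.

```python
import time

def anytime_route_refine(deadline_s, max_levels=1000, step_cost_s=0.001):
    """Hypothetical anytime contract loop (illustration, not the paper's code).

    Refines an answer level by level until either the estimated cost of the
    next step would overrun the contract deadline or the maximum precision
    level is reached, then returns the best answer found so far.
    """
    start = time.monotonic()
    answer = None
    quality = 0.0
    level = 1
    while level <= max_levels:
        remaining = deadline_s - (time.monotonic() - start)
        # Admission control: skip the next refinement step if it would
        # overrun the remaining contract time (controllable tardiness).
        if remaining < step_cost_s:
            break
        # Placeholder refinement: each level monotonically improves quality,
        # standing in for running a more precise kernel.
        answer = f"route@precision={level}"
        quality = 1.0 - 1.0 / (level + 1)
        level += 1
    return answer, quality
```

Under a short contract the loop still yields a usable low-precision route, and under a generous one it keeps improving, which is the monotonic quality-time tradeoff the abstract describes.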