Authors: Antonino Laudani, Gabriele Maria Lozito, Francesco Riganti Fulginei, Alessandro Salvini
Keywords:
Abstract: The present paper documents research towards the development of an efficient algorithm to compute the result of a multiple-input-single-output (MISO) Neural Network using floating-point arithmetic on FPGA. The proposed algorithm focuses on optimizing pipeline delays by splitting the "Multiply and accumulate" algorithm into separate steps using partial products. It is a revisit of the classical algorithm for NN computation, able to overcome the main computation bottleneck in an FPGA environment. The proposed algorithm can be implemented in an architecture that fully exploits the pipeline performance of the floating-point arithmetic blocks, thus allowing a very fast computation of the neural network. The performance of the proposed architecture is presented using a Cyclone II FPGA device as target.
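As a rough software illustration of the partial-products idea, the sketch below computes one neuron's weighted sum with several independent accumulators and combines them only at the end. It is not the authors' architecture: the function name `weighted_sum_partial` and the `lanes` parameter are hypothetical, and `lanes` merely stands in for the latency of a pipelined floating-point adder, which the record does not specify.

```python
from typing import Sequence


def weighted_sum_partial(
    inputs: Sequence[float],
    weights: Sequence[float],
    bias: float = 0.0,
    lanes: int = 4,  # assumption: stands in for the adder pipeline depth
) -> float:
    """Weighted sum of a single neuron using interleaved partial sums."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    if lanes < 1:
        raise ValueError("lanes must be at least 1")

    # Each product is added to a different accumulator in round-robin order,
    # so consecutive additions never depend on the result of the previous one.
    partial = [0.0] * lanes
    for i, (x, w) in enumerate(zip(inputs, weights)):
        partial[i % lanes] += x * w

    # Final reduction: combine the partial sums and the bias.
    total = bias
    for p in partial:
        total += p
    return total
```

In a single serial accumulator every addition waits for the previous one, so a multi-cycle adder pipeline sits mostly idle; independent partial sums remove that dependency between consecutive additions. Because floating-point addition is not associative, the result can differ in the last bits from a strictly sequential sum.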