Temporally Consistent Depth Prediction with Flow-Guided Memory Units.

Authors: Bumsub Ham, Chanho Eom, Hyunjong Park

Keywords: Artificial intelligence, Coherence (signal processing), Leverage (statistics), Memory module, Convolutional neural network, Pattern recognition, Memorization, Optical flow, Computer science

Abstract: Predicting depth from a monocular video sequence is an important task for autonomous driving. Although it has advanced considerably in the past few years, recent methods based on convolutional neural networks (CNNs) discard temporal coherence in the video sequence and estimate depth independently for each frame, which often leads to undesired inconsistent results over time. To address this problem, we propose to memorize temporal consistency in the video sequence and leverage it for the task of depth prediction. To this end, we introduce a two-stream CNN with a flow-guided memory module, where each stream encodes visual and temporal features, respectively. The memory module, implemented using convolutional gated recurrent units (ConvGRUs), inputs visual and temporal features sequentially together with optical flow tailored to our task. It memorizes trajectories of individual features selectively and propagates spatial information over time, enforcing long-term temporal consistency on the prediction results. We evaluate our method on the KITTI benchmark dataset in terms of depth prediction accuracy, temporal consistency, and runtime, and achieve a new state of the art. We also provide an extensive experimental analysis, clearly demonstrating the effectiveness of our approach to memorizing temporal consistency for depth prediction.
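The idea of feeding features into a gated recurrent memory together with optical flow can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single-channel feature grid, a backward flow convention (the value at pixel `(x, y)` in the current frame is read from `(x + dx, y + dy)` in the previous state), nearest-neighbor sampling, and scalar weights standing in for the convolution kernels of a ConvGRU. The reset gate is omitted for brevity, and the names `warp_nearest` and `gated_update` are hypothetical.

```python
import math


def warp_nearest(state, flow):
    """Backward-warp a 2D grid with nearest-neighbor sampling.

    flow[y][x] = (dx, dy) is assumed to point from the current pixel
    to its source location in the previous state; out-of-range
    coordinates are clamped to the border.
    """
    h, w = len(state), len(state[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            sx = min(max(int(round(x + dx)), 0), w - 1)
            sy = min(max(int(round(y + dy)), 0), h - 1)
            out[y][x] = state[sy][sx]
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_update(feature, prev_state, flow,
                 w_z=1.0, u_z=1.0, w_h=1.0, u_h=1.0):
    """One simplified GRU-style step on a flow-aligned memory.

    The scalar weights are placeholders for learned convolution
    kernels; a real ConvGRU also has a reset gate and many channels.
    """
    warped = warp_nearest(prev_state, flow)  # align memory to current frame
    h, w = len(feature), len(feature[0])
    new_state = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            f, s = feature[y][x], warped[y][x]
            z = sigmoid(w_z * f + u_z * s)       # update gate
            cand = math.tanh(w_h * f + u_h * s)  # candidate state
            new_state[y][x] = (1.0 - z) * s + z * cand
    return new_state
```

Warping the previous state before the gated update is what lets the memory follow a feature along its motion trajectory instead of mixing values that happen to share a pixel location across frames.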
