Efficiently Annotating Object Images with Absolute Size Information Using Mobile Devices

Authors: Martin Hofmann, Marco Seeland, Patrick Mäder

DOI: 10.1007/S11263-018-1093-3

Keywords: Mobile device; Projection (set theory); Artificial intelligence; Scale (map); Process (computing); Object (computer science); Image sensor; Focus (optics); Computer science; Computer vision; Pattern recognition (psychology)

Abstract: The projection of a real-world scenery to a planar image sensor inherits the loss of information about the 3D structure as well as the absolute dimensions of the scene. For image analysis and object classification tasks, however, absolute size information can make results more accurate. Today, the creation of size-annotated image datasets is effort intensive and typically requires measurement equipment not available to public image contributors. In this paper, we propose an effective annotation method that utilizes the camera within smart mobile devices to capture the missing size information along with the image. The approach builds on the fact that with a camera calibrated to a specific object distance, lengths can be measured in the object’s plane. We use the camera’s minimum focus distance as the calibration distance and propose an adaptive feature matching process for precise computation of the scale change between two images, facilitating measurements at larger object distances. Eventually, the measured object is segmented and its size information is annotated for later analysis. A user study showed that humans are able to retrieve the calibration distance with low variance. The proposed approach facilitates a measurement accuracy comparable to manual measurement with a ruler and outperforms state-of-the-art methods in terms of repeatability. Consequently, the proposed method allows in-situ size annotation of objects in images without the need for additional equipment or an artificial reference object in the scene.
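
The measurement principle mentioned in the abstract (a camera calibrated to a known object distance, plus the scale change between two images) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an ideal pinhole camera with a fixed focal length expressed in pixels, and the function name `object_plane_length`, its parameters, and the example numbers are all hypothetical. Under that model a length x in pixels at object distance Z corresponds to a real length X = x * Z / f, and an object that appears s times smaller than in the calibration image lies at distance s times the calibration distance.

```python
def object_plane_length(pixel_length: float,
                        calib_distance_m: float,
                        focal_length_px: float,
                        scale_change: float = 1.0) -> float:
    """Estimate a real-world length in metres from a length measured in pixels.

    Assumption (not from the paper): ideal pinhole camera, X = x * Z / f.

    pixel_length     -- length measured in the image, in pixels
    calib_distance_m -- known calibration distance, e.g. the camera's
                        minimum focus distance, in metres
    focal_length_px  -- focal length expressed in pixels
    scale_change     -- factor by which the object appears smaller in the
                        measurement image than in the calibration image;
                        1.0 means the image was taken at the calibration
                        distance itself
    """
    if min(pixel_length, calib_distance_m, focal_length_px, scale_change) <= 0:
        raise ValueError("all inputs must be positive")
    # Distance of the object plane in the measurement image.
    distance_m = scale_change * calib_distance_m
    # Pinhole relation: real length = pixel length * distance / focal length.
    return pixel_length * distance_m / focal_length_px


if __name__ == "__main__":
    # Hypothetical values: 400 px measured, 0.10 m calibration distance,
    # focal length of 3000 px.
    close_up = object_plane_length(400, 0.10, 3000)
    farther = object_plane_length(400, 0.10, 3000, scale_change=3.0)
    print(f"{close_up:.4f} m at calibration distance, {farther:.4f} m at 3x distance")
```

In this sketch the scale change is taken as a given number; in the method described above it would come from feature matching between the two images. Real smartphone cameras also change their effective focal length slightly when refocusing, which this simplified model ignores.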
