Automated Detection of Handwritten Whiteboard Content in Lecture Videos for Summarization

Authors: Bhargava Urala Kota, Kenny Davila, Alexander Stone, Srirangaraj Setlur, Venu Govindaraju

DOI: 10.1109/ICFHR-2018.2018.00013

Abstract: Online lecture videos are a valuable resource for students across the world. The ability to find videos based on their content could make them even more useful. Methods for automatic extraction of this content reduce the amount of manual effort required to make indexing and retrieval of such videos possible. We adapt a deep learning based method for scene text detection for the purpose of detection of handwritten text, math expressions and sketches in lecture videos. We detect content elements in the whiteboard and generate a summary of all content over time in the lecture, while also dealing with content occluded due to the motion of the lecturer. We train and test on the publicly available AccessMath video dataset and evaluate our framework on the basis of the number of summary frames, as well as the recall and precision of the set of content elements found. We found that our method increases recall over the state-of-the-art and that there is potential to increase precision as well. We have added to the existing ground truth of this dataset by providing timestamp-based, semantically meaningful bounding box annotations of lecture video content, which has been released.
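
The abstract evaluates summaries by the recall and precision of the set of content elements found. The sketch below illustrates how such scores could be computed from bounding boxes. It is not the paper's evaluation procedure: the greedy intersection-over-union (IoU) matching, the 0.5 threshold, and the names `iou` and `recall_precision` are all assumptions made for illustration, since the abstract does not state the matching criterion.

```python
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = overlap_w * overlap_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_precision(
    truth: List[Box], found: List[Box], threshold: float = 0.5
) -> Tuple[float, float]:
    """Greedily match each detection to at most one ground-truth box.

    Hypothetical criterion: a detection counts as a hit when its best
    remaining ground-truth box overlaps it with IoU >= threshold.
    """
    unmatched = list(range(len(truth)))
    hits = 0
    for det in found:
        best = max(unmatched, key=lambda i: iou(truth[i], det), default=None)
        if best is not None and iou(truth[best], det) >= threshold:
            unmatched.remove(best)  # each ground-truth box is used once
            hits += 1
    recall = hits / len(truth) if truth else 0.0
    precision = hits / len(found) if found else 0.0
    return recall, precision


if __name__ == "__main__":
    truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
    found = [(0, 0, 10, 10), (50, 50, 60, 60), (21, 21, 30, 30)]
    print(recall_precision(truth, found))
```

Under this criterion, recall is the fraction of ground-truth elements recovered and precision is the fraction of detections that correspond to a ground-truth element, so a method can raise recall while precision still has room to improve.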

References (25)
Yingying Zhu, Cong Yao, Xiang Bai. Scene text detection and recognition: recent advances and future trends. Frontiers of Computer Science, vol. 10, pp. 19-36 (2016). DOI: 10.1007/S11704-015-4488-0
Lukas Neumann, Jiri Matas. A method for text localization and recognition in real-world images. Asian Conference on Computer Vision, pp. 770-783 (2010). DOI: 10.1007/978-3-642-19318-7_60
Szilárd Vajda, Leonard Rothacker, Gernot A. Fink. A method for camera-based interactive whiteboard reading. CBDAR'11 Proceedings of the 4th International Conference on Camera-Based Document Analysis and Recognition, pp. 112-125 (2011). DOI: 10.1007/978-3-642-29364-1_9
Purnendu Banerjee, Ujjwal Bhattacharya, Bidyut B. Chaudhuri. Automatic Detection of Handwritten Texts from Video Frames of Lectures. International Conference on Frontiers in Handwriting Recognition, pp. 627-632 (2014). DOI: 10.1109/ICFHR.2014.110
C. Choudary, Tiecheng Liu. Summarization of Visual Content in Instructional Videos. IEEE Transactions on Multimedia, vol. 9, pp. 1443-1455 (2007). DOI: 10.1109/TMM.2007.906602
D. Comaniciu, P. Meer. Mean shift: a robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, pp. 603-619 (2002). DOI: 10.1109/34.1000236
Rajiv Ratn Shah, Yi Yu, Anwar Dilawar Shaikh, Suhua Tang, Roger Zimmermann. ATLAS: Automatic Temporal Segmentation and Annotation of Lecture Videos Based on Modelling Transition Time. ACM Multimedia, pp. 209-212 (2014). DOI: 10.1145/2647868.2656407
M. Onishi, M. Izumi, K. Fukunaga. Blackboard segmentation using video image of lecture and its applications. International Conference on Pattern Recognition, vol. 4, pp. 615-618 (2000). DOI: 10.1109/ICPR.2000.902994
Keni Bernardin, Rainer Stiefelhagen. Evaluating multiple object tracking performance: the CLEAR MOT metrics. EURASIP Journal on Image and Video Processing, vol. 2008, pp. 1-10 (2008). DOI: 10.1155/2008/246309
Nobuyuki Otsu. A Threshold Selection Method from Gray-Level Histograms. IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, pp. 62-66 (1979). DOI: 10.1109/TSMC.1979.4310076