Authors: Irfan Mehmood, Muhammad Sajjad, Seungmin Rho, Sung Wook Baik
DOI: 10.1016/J.NEUCOM.2015.05.126
Keywords:
Abstract: Recent advances in multimedia technology have led to a tremendous increase in the available volume of video data, thereby creating a major requirement for efficient systems to manage such huge data volumes. Video summarization is one of the key techniques for accessing and managing large video libraries. It can be used to extract the affective contents of a video sequence and generate a concise representation of its content. Human attention models are an effective means of affective content extraction. Existing visual-attention-driven frameworks incur high computational cost and memory requirements, and lack efficiency in accurately perceiving human attention. To cope with these issues, we propose a divide-and-conquer based summarization framework for big video data. We divide the original video into shots, and an attention model is computed for each shot in parallel. The viewer's attention is modeled from multiple sensory perceptions, i.e., aural and visual cues, as well as the viewer's neuronal signals. The aural attention model is based on the Teager energy, instant amplitude, and instant frequency, whereas the visual attention model employs multi-scale contrast and motion intensity. Moreover, the neuronal attention is computed using the beta-band frequencies of neuronal signals. Next, an aggregated attention curve is generated through an intra- and inter-modality fusion mechanism. Finally, the affective content of each shot is extracted. The fusion of multimedia and neuronal signals provides a bridge that links the digital representation of multimedia with the viewer's perceptions. Our experimental results indicate that the proposed shot-detection based divide-and-conquer strategy mitigates time complexity, and that the accurate reflection of user preferences facilitates the extraction of highly personalized summaries.
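The sketch below is a minimal, illustrative approximation of the divide-and-conquer idea described in the abstract, not the authors' implementation: it scores each shot in parallel using only the discrete Teager energy of its audio (standing in for the full aural/visual/neuronal attention fusion) and keeps the highest-scoring shots. The function names, the `top_k` parameter, and the synthetic audio are assumptions made for this example.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def teager_energy(x: np.ndarray) -> np.ndarray:
    """Discrete Teager energy operator: psi(x[n]) = x[n]^2 - x[n-1]*x[n+1]."""
    return x[1:-1] ** 2 - x[:-2] * x[2:]

def shot_attention(shot_audio: np.ndarray) -> float:
    """Toy per-shot aural attention score: mean Teager energy of the shot's audio.
    A stand-in for the paper's combined aural, visual, and neuronal attention model."""
    return float(np.mean(teager_energy(shot_audio)))

def summarize(shots: list[np.ndarray], top_k: int = 3) -> list[int]:
    """Divide-and-conquer: score shots in parallel, return indices of the top_k shots."""
    with ProcessPoolExecutor() as pool:
        scores = list(pool.map(shot_attention, shots))
    return sorted(range(len(shots)), key=lambda i: scores[i], reverse=True)[:top_k]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in: ten shots, each with one second of 16 kHz audio.
    shots = [rng.standard_normal(16000) for _ in range(10)]
    print(summarize(shots))
```

In this toy setup the parallel per-shot scoring mirrors the shot-level parallelism the abstract credits with mitigating time complexity; the real framework would replace `shot_attention` with the intra- and inter-modality fusion of aural, visual, and beta-band neuronal attention curves.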