Authors: Karan Sikka, Abhinav Dhall, Marian Bartlett
Keywords:
Abstract: Automatic pain recognition from videos is a vital clinical application and, owing to its spontaneous nature, poses interesting challenges for automatic facial expression recognition (AFER) research. Previous pain vs. no-pain systems have highlighted two major challenges: (1) ground truth is provided for the whole sequence, but the presence or absence of the target expression in any given frame is unknown, and (2) the time point and duration of the pain event(s) in each video are unknown. To address these issues we propose a novel framework (referred to as MS-MIL) in which each sequence is represented as a bag containing multiple segments, and multiple instance learning (MIL) is employed to handle this weakly labeled data in the form of sequence-level ground truth. These segments are generated via clustering or by running a multi-scale temporal scanning window, and are represented using a state-of-the-art Bag of Words (BoW) representation. This work extends the idea of detecting expressions through `concept frames' to `concept segments' and argues through extensive experiments that algorithms like MIL are needed to reap the benefits of such a representation. The key advantages of our approach are: (1) joint detection and localization of painful frames using only sequence-level ground truth, (2) incorporation of temporal dynamics by representing segments rather than individual frames, and (3) extraction of multiple segments, which is well suited to signals with uncertain location and duration in the video. Experiments on the UNBC-McMaster Shoulder Pain dataset highlight the effectiveness of the approach, achieving promising results on the problem of pain detection and localization in videos.
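The core MIL idea in the abstract — a video is a bag of segments, and the bag is positive if at least one segment shows pain — can be illustrated with a minimal sketch. This is not the paper's MILBoost/MS-MIL implementation; the linear scorer, feature values, and max-pooling rule below are hypothetical stand-ins chosen only to show how sequence-level labels yield joint detection and localization:

```python
import numpy as np

def bag_score(w, b, bag):
    """Score a bag (video) as the max over its instance (segment) scores.

    Under the standard MIL assumption a bag is positive iff at least
    one instance is positive, so max-pooling over instance scores both
    classifies the bag and localizes the responsible segment.
    """
    scores = bag @ w + b          # linear score per segment (toy scorer)
    return scores.max(), int(scores.argmax())

# Toy example: 3 segments per video, 2-D BoW-like features (made-up numbers).
w = np.array([1.0, -1.0])         # hypothetical trained weights
b = 0.0
pos_bag = np.array([[0.1, 0.2], [2.0, 0.1], [0.0, 0.5]])  # one "painful" segment
neg_bag = np.array([[0.1, 0.2], [0.0, 0.3], [0.2, 0.4]])  # no painful segment

s_pos, loc = bag_score(w, b, pos_bag)
s_neg, _ = bag_score(w, b, neg_bag)
print(s_pos > 0, loc)   # bag detected as painful; loc indexes the detected segment
print(s_neg > 0)        # negative bag stays below threshold
```

Max-pooling is what makes the weak supervision work: training only needs sequence-level labels, yet at test time `argmax` over segment scores points to where in the video the pain occurred.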