Authors: John See, Raphael C.W. Phan, Weiyao Lin, Huai-Qian Khor
DOI:
Keywords:
Abstract: Facial micro-expression (ME) recognition has posed a huge challenge to researchers because of its subtlety in motion and the limited databases available. Recently, handcrafted techniques have achieved superior performance, but at the cost of domain specificity and cumbersome parametric tuning. In this paper, we propose an Enriched Long-term Recurrent Convolutional Network (ELRCN) that first encodes each frame into a feature vector through CNN module(s), then predicts the micro-expression by passing the feature vectors through a Long Short-term Memory (LSTM) module. The framework contains two different network variants: (1) channel-wise stacking of input data for spatial enrichment, and (2) feature-wise stacking of features for temporal enrichment. We demonstrate that the proposed approach is able to achieve reasonably good performance without data augmentation. In addition, we also present ablation studies conducted on the framework and visualizations of what the network "sees" when predicting the micro-expression classes.
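The difference between the two enrichment variants is easiest to see in terms of array shapes. The following is a minimal sketch, not the authors' implementation: the two input maps (`input_a`, `input_b`), the sizes, and `toy_encoder` (a pooling-plus-projection stand-in for the CNN module) are all hypothetical, and the LSTM stage is omitted. It only shows where the stacking happens in each variant.

```python
import numpy as np

rng = np.random.default_rng(0)

T, H, W = 5, 8, 8  # frames, height, width (illustrative sizes, not from the paper)

# Two hypothetical single-channel input maps per frame, shape (T, H, W, 1).
input_a = rng.standard_normal((T, H, W, 1))
input_b = rng.standard_normal((T, H, W, 1))


def toy_encoder(frames, out_dim, seed):
    """Stand-in for a CNN module: average-pool each frame, then project linearly."""
    channels = frames.shape[-1]
    weights = np.random.default_rng(seed).standard_normal((channels, out_dim))
    pooled = frames.mean(axis=(1, 2))  # (T, channels)
    return pooled @ weights            # (T, out_dim)


# Variant 1: channel-wise stacking (spatial enrichment).
# Inputs are concatenated along the channel axis BEFORE a single encoder.
stacked_input = np.concatenate([input_a, input_b], axis=-1)   # (T, H, W, 2)
seq_spatial = toy_encoder(stacked_input, out_dim=16, seed=1)  # (T, 16)

# Variant 2: feature-wise stacking (temporal enrichment).
# Each input gets its own encoder; the per-frame feature vectors are
# concatenated AFTER encoding, giving a wider vector per time step.
feat_a = toy_encoder(input_a, out_dim=16, seed=2)             # (T, 16)
feat_b = toy_encoder(input_b, out_dim=16, seed=3)             # (T, 16)
seq_temporal = np.concatenate([feat_a, feat_b], axis=-1)      # (T, 32)

print(stacked_input.shape, seq_spatial.shape, seq_temporal.shape)
```

In both cases the result is a sequence of one feature vector per frame, which is what a recurrent module such as an LSTM would consume. The variants differ only in whether the enrichment widens the encoder input or the recurrent input.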