Authors: Xiaoping Wang, Yufeng Hao, Degang Fu, Chunwei Yuan
DOI: 10.1109/ICNNSP.2008.4590335
Keywords: Hidden Markov model, Image processing, Feature extraction, Region of interest, Artificial intelligence, Discrete cosine transform, Edge detection, Computer vision, Pattern recognition, Edge enhancement, Image segmentation, Computer science
Abstract: The region of interest (ROI) is the key basis of visual feature extraction in the lip-reading process. In this paper, we discuss ROI processing methods and explore their impact on recognition accuracy by comparing four kinds of processed ROIs obtained with basic image processing methods: gray-scale normalization, difference enhancement, edge enhancement, and segmentation. Speaker-independent recognition tasks were then carried out with the aid of a continuous hidden Markov model (CHMM). The experimental results show that discrete cosine transform (DCT) based features extracted from the gray-scale normalized ROI achieve the best performance among these ROIs.
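As an illustration of the best-performing configuration named in the abstract (DCT-based features computed from a gray-scale normalized ROI), the following is a minimal sketch and not the authors' implementation. The abstract does not specify the normalization formula, the DCT variant, or how coefficients are selected, so the min-max rescaling, the orthonormal DCT-II, and the hypothetical `keep` parameter (size of the low-frequency block retained) are all assumptions made for this example.

```python
import math


def normalize_gray(roi):
    """Min-max rescale pixel intensities to [0, 1] (assumed normalization)."""
    flat = [p for row in roi for p in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A flat ROI carries no contrast; map it to all zeros.
        return [[0.0 for _ in row] for row in roi]
    return [[(p - lo) / span for p in row] for row in roi]


def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    # cols[v][u] holds coefficient (u, v); transpose back to [u][v].
    return [list(r) for r in zip(*cols)]


def dct_features(roi, keep=3):
    """Return the keep x keep low-frequency DCT coefficients as a flat vector.

    `keep` is a hypothetical parameter; the source does not state how many
    coefficients were used.
    """
    coeffs = dct_2d(normalize_gray(roi))
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]


if __name__ == "__main__":
    # Toy 4x4 "mouth ROI" with made-up intensities.
    roi = [
        [10, 20, 30, 40],
        [20, 80, 90, 30],
        [20, 85, 95, 30],
        [10, 20, 30, 40],
    ]
    print(dct_features(roi, keep=2))
```

In a lip-reading pipeline, one such feature vector would be computed per video frame, and the resulting sequence would serve as the observation sequence for a recognizer such as the CHMM mentioned in the abstract.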