An image/speech relational database and its application

Author: D. Shah

DOI: 10.1049/IC:19961150

Keywords: Motor theory of speech perception; Cued speech; Computer science; Voice activity detection; Speech synthesis; Artificial intelligence; Speech recognition; Intelligibility (communication); Natural language processing; Speech processing; Speech corpus; Speech analytics

Abstract: There are many reasons to investigate the simultaneous analysis of corresponding speech and image information. For example, in the case of video telephony and conferencing there is clearly a strong connection between the sound (the phonemes of the voiced data) and the mouth shape of the speaker. It is also known that most verbal communications use cues from both the visual and acoustic modalities to convey messages. During the production of speech, the visible information provided by the external articulatory organs can influence the understanding of language, with visual and acoustic cues interpreted and combined into meaningful linguistic expressions. Although there is a belief that only the hearing impaired make use of visual stimuli in perceiving speech, reports have shown that people with normal hearing also use all available visual information that accompanies speech, especially when the speech is degraded. The objective of this project is therefore to quantify the relationship between speech and image, such that the knowledge gained will assist longer-term multimedia and videophone research. To achieve this, a statistical database of key parameters derived from the speech and image channels is discussed.
