Authors: Samuel Thomas, Sriram Ganapathy, Hynek Hermansky
DOI: 10.1109/ICASSP.2012.6288862
Keywords: Low resource, Speech recognition, Natural language processing, Feature extraction, Vocabulary, Computer science, Perceptron, Artificial intelligence, Training set
Abstract: We introduce a new approach to training multilayer perceptrons (MLPs) for large vocabulary continuous speech recognition (LVCSR) in languages which have only a few hours of annotated in-domain data (for example, 1 hour of data). In our approach, large amounts of out-of-domain data from multiple languages are used to train multilingual MLP systems without dealing with the different phoneme sets of these languages. Features are then extracted for LVCSR of the low-resource language, similar to the Tandem approach. In our experiments, the proposed features provide a relative improvement of about 30% in an LVCSR setting with one hour of in-domain data.
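The Tandem approach referenced in the abstract uses MLP phoneme posteriors as additional acoustic features. Below is a minimal sketch of that idea, assuming log compression followed by PCA decorrelation before concatenation with the base acoustic features; the array shapes, random placeholder data, and the 25-dimension cutoff are illustrative assumptions, not the paper's actual pipeline.

```python
# Hedged sketch of Tandem-style feature extraction: MLP posteriors are
# log-compressed, decorrelated with PCA, and appended to the standard
# acoustic features before LVCSR training. All inputs here are synthetic.
import numpy as np

rng = np.random.default_rng(0)

# Placeholder inputs: 500 frames of 39-dim acoustic features (e.g. MFCC+deltas)
# and 40-dim MLP posterior outputs (one per phoneme-like unit).
n_frames, n_acoustic, n_posteriors = 500, 39, 40
acoustic = rng.normal(size=(n_frames, n_acoustic))
logits = rng.normal(size=(n_frames, n_posteriors))
posteriors = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# 1. Log compression of the posteriors (standard in Tandem systems).
log_post = np.log(posteriors + 1e-10)

# 2. PCA decorrelation, keeping a reduced number of dimensions
#    (25 is an arbitrary choice for this sketch).
centered = log_post - log_post.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1][:25]
tandem = centered @ eigvecs[:, order]

# 3. Append the decorrelated posterior features to the acoustic features.
features = np.hstack([acoustic, tandem])
print(features.shape)  # (500, 64)
```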