The L2F Broadcast News Speech Recognition System

Authors: Hugo Meinedo, Thomas Pellegrini, Alberto Abad, Isabel Trancoso (INESC-ID Lisboa)

Abstract: Broadcast news plays an important role in our lives, providing access to news, information and entertainment. Automatic transcription is an important medium, not only because it can provide subtitles for the inclusion of people with special needs or be an advantage in noisy, populated environments, but also because it enables data search and retrieval capabilities over the multimedia streams. In this work we describe and evaluate the speech recognition systems developed for two Iberian languages, European Portuguese and Spanish, and for Brazilian Portuguese, African Portuguese and English. The developed systems are fully automatic and capable of subtitling a real-time Broadcast News stream with a very small delay.

Index Terms: Speech Recognition, Broadcast News, Iberian languages, Accent, Online processing

1. Introduction

The Broadcast News (BN) system at the Spoken Language Systems Lab of INESC-ID integrates several core technologies in a pipeline architecture: jingle detection, audio segmentation, speech recognition, punctuation, capitalization, topic segmentation/indexation, summarization, and translation. The first modules were optimized for on-line performance, given their deployment in the subtitling system that has been running on the main news shows of the public TV channel in Portugal (RTP) since March 2008.

To our knowledge, the majority of the subtitling systems described in the literature rely on speech-to-text alignment rather than full speech recognition [1]. Re-speakers are also commonly used to simplify the original speech, with speech recognition engines adapted to the captioner's voice [2].

This paper concerns the third module, speech recognition, emphasizing the most recent improvements and the efforts to port it to other languages (English and Spanish) and to other varieties of Portuguese, namely those spoken in the South American and African continents. The development of a speech recognition system for a new language is a challenging task due to the need for acoustic training data, vocabulary definition, lexicon generation and language model estimation [3].

The paper starts with a description of our speech recognition engine and its language-independent components: feature extraction and decoder. The next three sections are devoted to the three varieties of Portuguese covered by the system: European Portuguese (henceforth designated as EP), Brazilian Portuguese (BP), and African Portuguese (AP). The porting efforts to two other languages (Spanish and English) are described in Sections 6 and 7, respectively. For each of these sections, we detail the corpora, vocabulary, and lexical and language model generation, ending with performance results. The final section discusses the advantages and shortcomings of the systems with regard to real-time closed captioning applications.
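
The pipeline architecture named in the introduction can be illustrated with a minimal sketch. Only the stage names and their order come from the text; the `Document` container, `make_stage`, and `run_pipeline` are hypothetical names introduced here for illustration, and each stage is a placeholder that merely records that it ran. This is not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Stage names follow the order listed in the introduction.
# Everything else in this sketch is an assumption made for illustration.
STAGE_NAMES = [
    "jingle detection",
    "audio segmentation",
    "speech recognition",
    "punctuation",
    "capitalization",
    "topic segmentation/indexation",
    "summarization",
    "translation",
]


@dataclass
class Document:
    """Hypothetical container handed from one stage to the next."""
    audio_id: str
    annotations: List[str] = field(default_factory=list)


Stage = Callable[[Document], Document]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that appends its own name to the annotations."""
    def stage(doc: Document) -> Document:
        doc.annotations.append(name)
        return doc
    return stage


def run_pipeline(doc: Document, stages: List[Stage]) -> Document:
    """Apply each stage in order, feeding each output to the next stage."""
    for stage in stages:
        doc = stage(doc)
    return doc


PIPELINE = [make_stage(name) for name in STAGE_NAMES]

if __name__ == "__main__":
    result = run_pipeline(Document(audio_id="example-show"), PIPELINE)
    print(result.annotations)
```

Because every stage takes and returns the same container type, a stage could in principle be replaced (for example, a recognizer for another language variety) without touching its neighbours; whether the actual system is organized this way is not stated in the text.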

References (7)
Céu Viana, Alberto Abad, Nelson Neto, Isabel Trancoso. Porting an European Portuguese broadcast news recognition system to Brazilian Portuguese. Conference of the International Speech Communication Association, pp. 92-95 (2009).
Céu Viana, Oscar Koller, Alberto Abad, Isabel Trancoso. Exploiting variety-dependent phones in Portuguese variety identification applied to broadcast news transcription. Conference of the International Speech Communication Association, pp. 749-752 (2010).
Eduardo Lleida, Antonio Miguel, Alfonso Ortega, José Enrique García Laínez. Real-time live broadcast news subtitling system for Spanish. Conference of the International Speech Communication Association, pp. 2095-2098 (2009).
Sameer Badaskar, John Kominek, Alan W Black, Tanja Schultz, Matthew Hornyak. SPICE: Web-based Tools for Rapid Language Adaptation in Speech Processing Systems. Conference of the International Speech Communication Association, pp. 2125-2128 (2007).
Ciro Martins, Antonio Teixeira, Joao Neto. Dynamic language modeling for a daily broadcast news transcription system. IEEE Automatic Speech Recognition and Understanding Workshop, pp. 165-170 (2007). DOI: 10.1109/ASRU.2007.4430103
J. Neto, H. Meinedo, M. Viveiros, R. Cassaca, C. Martins, D. Caseiro. Broadcast news subtitling system in Portuguese. International Conference on Acoustics, Speech, and Signal Processing, pp. 1561-1564 (2008). DOI: 10.1109/ICASSP.2008.4517921
D. Caseiro, I. Trancoso. A specialized on-the-fly algorithm for lexicon and language model composition. IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, pp. 1281-1291 (2006). DOI: 10.1109/TSA.2005.860838