Authors: Mikel Penagarikano, Luis Javier Rodriguez-Fuentes, German Bordel, Mireia Diez, Amparo Varona
DOI:
Keywords: Czech, Average cost, Language recognition, Software, Natural language processing, Computer science, Artificial intelligence, Speech recognition, Set (abstract data type), NIST, Measure (data warehouse), Duration (project management)
Abstract: This paper describes the systems developed by the Software Technologies Working Group (http://gtts.ehu.es) of the University of the Basque Country for the 2011 NIST Language Recognition Evaluation. Four different systems (one primary and three contrastive) were submitted, each consisting of a fusion of five subsystems: a Linearized Eigenchannel GMM (LE-GMM) subsystem, an iVector subsystem, and three phone-lattice-SVM subsystems based on the publicly available BUT decoders for Czech, Hungarian and Russian. The four submitted systems were identical except for the backend approach and the development dataset used to estimate parameters. Multiclass fusion was performed separately for each nominal duration. A development set was defined, including the evaluation sets of LRE07 and LRE09 and the data provided for the 9 additional languages in LRE11. Systems were evaluated on 10 random partitions of the development set, using one half for estimating parameters and the other half for testing. The average cost as defined in the LRE11 evaluation plan was used as the performance measure. The primary system yielded an actual average cost of 0.038 (±0.002), with Hindi-Urdu being, by far, the most challenging pair, with a cost of 0.222.
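
The evaluation protocol in the abstract (10 random partitions, one half for estimating parameters, the other half for testing) can be illustrated with a minimal Python sketch. It is not the authors' code. The names `repeated_half_split_eval`, `fit_threshold` and `error_rate` are hypothetical, the toy threshold classifier stands in for the actual backend and fusion, and the plain error rate stands in for the LRE11 average cost, which is not reproduced here. Reporting the result as mean and standard deviation across partitions is also an assumption; the abstract does not say what the ± figure denotes.

```python
import random
import statistics


def repeated_half_split_eval(trials, fit, cost, n_partitions=10, seed=0):
    """Return (mean, stdev) of `cost` over random half/half partitions.

    fit(estimation_half) -> params; cost(params, testing_half) -> float.
    Both callables are supplied by the caller (hypothetical interface).
    """
    rng = random.Random(seed)
    costs = []
    for _ in range(n_partitions):
        shuffled = list(trials)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        estimation, testing = shuffled[:half], shuffled[half:]
        params = fit(estimation)              # estimate parameters on one half
        costs.append(cost(params, testing))   # measure cost on the other half
    return statistics.mean(costs), statistics.stdev(costs)


def fit_threshold(trials):
    """Toy stand-in for a backend: midpoint between the two class means."""
    targets = [score for score, is_target in trials if is_target]
    nontargets = [score for score, is_target in trials if not is_target]
    return (statistics.mean(targets) + statistics.mean(nontargets)) / 2


def error_rate(threshold, trials):
    """Toy stand-in for the cost: fraction of wrong accept/reject decisions."""
    errors = sum((score >= threshold) != is_target for score, is_target in trials)
    return errors / len(trials)
```

Each trial here is a `(score, is_target)` pair. Calling `repeated_half_split_eval(trials, fit_threshold, error_rate)` returns the mean cost and its spread over the 10 partitions, and fixing `seed` makes the partitions reproducible.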