Authors: Niko Brummer, Lukas Burget, Jan Cernocky, Ondrej Glembek, Frantisek Grezl
Keywords:
Abstract: This paper describes and discusses the "STBU" speaker recognition system, which performed well in the NIST Speaker Recognition Evaluation 2006 (SRE). STBU is a consortium of four partners: Spescom DataVoice (Stellenbosch, South Africa), TNO (Soesterberg, The Netherlands), BUT (Brno, Czech Republic), and the University of Stellenbosch (Stellenbosch, South Africa). The STBU system was a combination of three main kinds of subsystems: 1) GMM, with short-time Mel frequency cepstral coefficient (MFCC) or perceptual linear prediction (PLP) features, 2) Gaussian mixture model-support vector machine (GMM-SVM), using GMM mean supervectors as input to an SVM, and 3) maximum-likelihood linear regression-support vector machine (MLLR-SVM), using MLLR speaker adaptation coefficients derived from an English large vocabulary continuous speech recognition (LVCSR) system. All subsystems made use of supervector subspace channel compensation methods, either eigenchannel adaptation or nuisance attribute projection. We document the design and performance of all subsystems, as well as their fusion and calibration via logistic regression. Finally, we also present a cross-site fusion that was done with several additional systems from other NIST SRE-2006 participants.
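The fusion and calibration step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each trial has one score per subsystem and fits a weighted sum plus an offset using plain, unweighted logistic regression trained by batch gradient descent. The function names (`train_fusion`, `fuse`), the learning rate, and the iteration count are hypothetical choices made for illustration; the abstract does not specify the training objective or optimizer, so this should not be read as the authors' recipe.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def fuse(trial_scores, w, b):
    # Fused score: weighted sum of the subsystem scores plus an offset.
    return sum(wi * si for wi, si in zip(w, trial_scores)) + b


def train_fusion(scores, labels, lr=0.1, n_iter=500):
    """Fit fusion weights w and offset b by minimizing cross-entropy.

    scores: list of trials, each a list with one score per subsystem.
    labels: list with 1 for target trials and 0 for non-target trials.
    lr, n_iter: illustrative gradient-descent settings (assumptions).
    """
    n_trials = len(scores)
    n_sys = len(scores[0])
    w = [0.0] * n_sys
    b = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * n_sys
        grad_b = 0.0
        for s, y in zip(scores, labels):
            # Gradient of the cross-entropy with respect to the fused score.
            err = sigmoid(fuse(s, w, b)) - y
            for k in range(n_sys):
                grad_w[k] += err * s[k]
            grad_b += err
        w = [wi - lr * g / n_trials for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n_trials
    return w, b
```

With this unweighted objective, the fused score behaves as a log posterior odds at the target proportion of the training trials; turning it into a log-likelihood ratio would additionally require removing that prior log odds.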