Authors: Upendra V. Chaudhari, Stephane H. Maes, Jeffrey S. Sorensen, Homayoon S. Beigi
DOI:
Keywords: Speaker verification, Speech processing, Hierarchical clustering, Computer science, Normalization (statistics), Speech recognition, Speaker diarisation, Speaker recognition
Abstract: A method for unsupervised environmental normalization for speaker verification using hierarchical clustering is disclosed. Training data (speech samples) are taken from T enrolled (registered) speakers over any one of M channels, e.g., different microphones, communication links, etc. For each speaker, a speaker model is generated, containing a collection of distributions of audio feature data derived from the speech sample of that speaker. A speaker model tree is created by merging similar speaker models on a layer-by-layer basis. Each speaker is also grouped into a cohort of similar speakers. For each cohort, one or more complementary speaker models are generated from speaker models outside that cohort. When training data from a new speaker to be enrolled is received over a new channel, the speaker model tree as well as the complementary models are updated. Consequently, adaptation to new environments is possible by incorporating such data whenever it is encountered.
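
The tree-building and cohort steps described in the abstract can be illustrated with a minimal Python sketch. It is not the disclosed implementation: it assumes each speaker model is a single diagonal Gaussian (the abstract describes a collection of distributions), uses a variance-scaled distance between means as the similarity measure, pairs models greedily within each layer, and updates the tree for a new speaker by simply rebuilding it. The names `Model`, `distance`, `merge`, `build_layer`, `build_tree`, and `complementary_model` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Model:
    """Assumed speaker model: one diagonal Gaussian plus the speakers it covers."""
    mean: List[float]
    var: List[float]
    count: int
    speakers: Tuple[str, ...]


def distance(a: Model, b: Model) -> float:
    # Assumed similarity measure: squared mean difference scaled by summed variances.
    return sum((ma - mb) ** 2 / (va + vb)
               for ma, mb, va, vb in zip(a.mean, b.mean, a.var, b.var))


def merge(a: Model, b: Model) -> Model:
    # Moment-matched merge: the result has the mean and variance of the pooled data.
    n = a.count + b.count
    mean = [(a.count * ma + b.count * mb) / n for ma, mb in zip(a.mean, b.mean)]
    var = [(a.count * (va + ma * ma) + b.count * (vb + mb * mb)) / n - m * m
           for ma, mb, va, vb, m in zip(a.mean, b.mean, a.var, b.var, mean)]
    return Model(mean, var, n, a.speakers + b.speakers)


def build_layer(models: Sequence[Model]) -> List[Model]:
    # Greedily pair each model with its nearest remaining neighbour and merge them.
    remaining = list(models)
    parents = []
    while len(remaining) > 1:
        first = remaining.pop(0)
        j = min(range(len(remaining)), key=lambda i: distance(first, remaining[i]))
        parents.append(merge(first, remaining.pop(j)))
    parents.extend(remaining)  # an unpaired model is carried up unchanged
    return parents


def build_tree(leaves: Sequence[Model]) -> List[List[Model]]:
    # layers[0] holds the per-speaker models; the last layer holds the single root.
    layers = [list(leaves)]
    while len(layers[-1]) > 1:
        layers.append(build_layer(layers[-1]))
    return layers


def complementary_model(leaves: Sequence[Model], cohort: Sequence[str]) -> Model:
    # Merge every speaker model that lies outside the given cohort.
    outside = [m for m in leaves if not set(m.speakers) & set(cohort)]
    if not outside:
        raise ValueError("cohort covers all speakers")
    merged = outside[0]
    for m in outside[1:]:
        merged = merge(merged, m)
    return merged


# Toy one-dimensional data: speakers A and B are close, as are C and D.
leaves = [Model([0.0], [1.0], 10, ("A",)), Model([0.2], [1.0], 10, ("B",)),
          Model([5.0], [1.0], 10, ("C",)), Model([5.3], [1.0], 10, ("D",))]
tree = build_tree(leaves)
cohorts = [node.speakers for node in tree[1]]      # nodes one layer above the leaves
background = complementary_model(leaves, cohorts[0])

# Enrolling a new speaker: in this sketch the tree is simply rebuilt.
updated_tree = build_tree(leaves + [Model([5.1], [1.0], 10, ("E",))])
```

In this sketch the nodes one layer above the leaves serve as cohorts, and the complementary model for a cohort acts as a background model built from everyone else. A practical system would likely update the tree incrementally rather than rebuild it for each new speaker.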