Authors: Sree Hari Krishnan Parthasarathi, Hynek Hermansky
DOI:
Keywords:
Abstract: We present a data-driven approach to weighting the temporal context of signal energy to be used in a simple speech/non-speech detector (SND). The optimal weights are obtained using linear discriminant analysis (LDA). Regularization is performed to handle numerical issues inherent in the usage of correlated features. The resulting weights are also interpreted as a filter in the modulation spectral domain. Experimental evaluations on the test data set, in terms of average frame-level error rate over different SNR levels, show that the proposed method yields an absolute performance gain of $10.9\%$, $17.5\%$, $7.9\%$ and $8.3\%$ over ITU's G.729B, ETSI's AMR1, AMR2 and a state-of-the-art multi-layer perceptron based system, respectively. This shows that even a simple feature such as full-band energy, when employed with a large-enough context, shows promise for applications.
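The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the context width, the ridge form of regularization, the midpoint threshold, and the synthetic log-energy data are all assumptions made for illustration, and the names `stack_context` and `lda_weights` are hypothetical helpers.

```python
import numpy as np

def stack_context(energy, half_width):
    """Stack each frame's energy with its +/- half_width neighbours (edge-padded)."""
    padded = np.pad(energy, half_width, mode="edge")
    # Row t holds the frames t - half_width .. t + half_width of the original signal.
    idx = np.arange(energy.size)[:, None] + np.arange(2 * half_width + 1)[None, :]
    return padded[idx]

def lda_weights(X, y, ridge=1e-3):
    """Two-class Fisher LDA with a ridge term on the within-class scatter."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)
    # Neighbouring energy frames are correlated, so Sw can be ill-conditioned.
    # The scaled identity below is an assumed form of regularization.
    Sw += ridge * np.trace(Sw) / Sw.shape[0] * np.eye(Sw.shape[0])
    w = np.linalg.solve(Sw, mu1 - mu0)
    threshold = 0.5 * (w @ mu0 + w @ mu1)  # midpoint of the projected class means
    return w, threshold

# Synthetic toy data (assumption): 50-frame segments, speech has higher log-energy.
rng = np.random.default_rng(0)
labels = np.repeat(rng.integers(0, 2, size=200), 50)
energy = 2.0 * labels + rng.normal(0.0, 2.0, size=labels.size)

X = stack_context(energy, half_width=10)
w, thr = lda_weights(X, labels)

# In-sample frame-level error rates, with and without temporal context.
frame_error = np.mean((X @ w > thr).astype(int) != labels)
single_frame_error = np.mean((energy > 1.0).astype(int) != labels)
print(f"context: {frame_error:.3f}  single frame: {single_frame_error:.3f}")
```

Because the learned vector `w` is applied as a weighted sum over neighbouring frames, it acts as an FIR filter on the energy trajectory, which is one way to read the abstract's modulation-spectrum interpretation.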