Device and method for processing audio information

Author: Satoshi Hasegawa

Keywords: Algorithm, Signal level, Radio spectrum, Scaling, Coding (social sciences), Dynamic range, Electronic engineering, Computer science, Threshold limit value, Audio signal flow

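Abstract: There is provided a device in which a scaling section calculates a scaling factor, indicating a multiplying power relative to a reference value, for each of the subbands into which the audio information is divided across a plurality of frequency bands, in order to align the dynamic range, and in which the signal output from the scaling section is coded by the MPEG audio system. The device comprises a level calculating section and a feature detection processing section. The level calculating section calculates signal levels using the scaling factor calculated in the scaling section. The feature detection processing section finds the maximum value and the minimum value of the calculated levels and the difference between them, and determines the interval to be a voice interval when the difference is greater than or equal to a predetermined threshold, and an interval other than voice when the difference is less than the threshold value. It thereby becomes possible to extract features of the input audio information while the coding processes of the audio information are being executed.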
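
The decision rule in the abstract (take the maximum and minimum of the calculated levels, then compare their difference against a threshold) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `frame_level_db` and `classify_interval`, the use of decibels, and the way a frame level is derived from scale factors (sum of squares as an energy proxy) are all hypothetical choices that the abstract does not specify.

```python
import math
from typing import Sequence


def frame_level_db(scale_factors: Sequence[float]) -> float:
    """Approximate one frame's level in dB from its per-subband scale factors.

    Assumption: summing the squared scale factors as an energy proxy is an
    illustrative choice, not a formula given in the source.
    """
    energy = sum(sf * sf for sf in scale_factors)
    # The floor keeps log10 defined when every scale factor is zero.
    return 10.0 * math.log10(max(energy, 1e-12))


def classify_interval(levels_db: Sequence[float], threshold_db: float) -> str:
    """Label an interval from the spread of its calculated levels.

    Returns "voice" when (max - min) >= threshold_db, otherwise "other",
    mirroring the rule stated in the abstract.
    """
    if not levels_db:
        raise ValueError("interval must contain at least one level")
    spread = max(levels_db) - min(levels_db)
    return "voice" if spread >= threshold_db else "other"


if __name__ == "__main__":
    # Hypothetical scale factors for three frames of four subbands each.
    frames = [
        [1.0, 0.5, 0.25, 0.125],
        [0.01, 0.01, 0.005, 0.005],
        [0.8, 0.4, 0.2, 0.1],
    ]
    levels = [frame_level_db(f) for f in frames]
    print(classify_interval(levels, threshold_db=10.0))
```

Because the levels are derived from scale factors that the encoder already computes, a rule of this kind needs no decoding or additional spectral analysis, which is consistent with the abstract's point that features can be extracted while coding is in progress.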
