Authors: Eita Nakamura, Kazuyoshi Yoshii, Shigeki Sagayama
DOI: 10.1109/TASLP.2017.2662479
Keywords: Hidden Markov model, Speech recognition, Rhythm, Piano, Inference, Source code, Computer science, Polyrhythm, Transcription (music), MIDI
Abstract: In a recent conference paper, we reported a rhythm transcription method based on a merged-output hidden Markov model (HMM) that explicitly describes the multiple-voice structure of polyphonic music. This model solves a major problem of conventional methods, which could not properly describe the nature of multiple voices, as in polyrhythmic scores or in the phenomenon of loose synchrony between voices. In this paper, we present a complete description of the proposed model and develop an inference technique, which is valid for any HMMs whose output probabilities depend on past events. We also examine the influence of the model architecture and parameters in terms of the accuracies of rhythm transcription and voice separation, and perform comparative evaluations with six other algorithms. Using MIDI recordings of classical piano pieces, we found that the proposed model outperformed the other methods by more than 12 points in accuracy for polyrhythmic performances and performed almost as well as the best of them for non-polyrhythmic performances. This reveals the state of the art of rhythm transcription for the first time in the literature. Publicly available source code is provided for future comparisons.