Authors: H.P. Graf, E. Cosatto, T. Ezzat
Keywords:
Abstract: This paper describes techniques for extracting bitmaps of facial parts from videos of a talking person. The goal is to synthesize photo-realistic talking heads of high quality that show picture-perfect appearance and realistic head movements with good lip-sound synchronization. For the synthesis of a talking head, bitmaps of facial parts are combined to form whole heads, and then sequences of such images are integrated with audio from a text-to-speech synthesizer. For a seamless integration of facial parts into an animation, their shape and visual appearance must be known with high accuracy. The recognition system has to find not only the locations of facial features, but must also be able to determine the head's orientation and recognize facial expressions. Our face recognition proceeds in multiple steps, each with increased precision. Using motion, color, and shape information, the head's position and the location of the main facial features are determined first. Then smaller areas are searched with matched filters, in order to identify specific facial features with high accuracy. From this information the head's 3D orientation is calculated. Facial parts are cut from the image and, using the head's orientation, are warped into bitmaps with 'normalized' orientation and scale.
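
The matched-filter search mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which filters the authors use, so this example assumes plain normalized cross-correlation, a common form of matched filtering, applied to a toy grayscale grid. The function name `match_template` and the sample data are hypothetical and are not taken from the paper.

```python
from math import sqrt


def match_template(image, template):
    """Return (row, col, score) of the best normalized cross-correlation match.

    Assumption: `image` and `template` are 2D lists of grayscale values.
    This is an illustrative stand-in, not the filter used in the paper.
    """
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])

    # Zero-mean version of the template and its Euclidean norm.
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_zero = [v - t_mean for v in t_vals]
    t_norm = sqrt(sum(v * v for v in t_zero))

    best = (0, 0, -1.0)
    # Slide the template over every position where it fits entirely.
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            patch = [image[r + i][c + j] for i in range(th) for j in range(tw)]
            p_mean = sum(patch) / len(patch)
            p_zero = [v - p_mean for v in patch]
            p_norm = sqrt(sum(v * v for v in p_zero))
            if p_norm == 0 or t_norm == 0:
                continue  # flat patch or flat template: correlation undefined
            score = sum(a * b for a, b in zip(p_zero, t_zero)) / (p_norm * t_norm)
            if score > best[2]:
                best = (r, c, score)
    return best


# Toy 5x5 "image" containing a 2x2 pattern with its top-left corner at (2, 1).
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 9, 7, 0, 0],
    [0, 7, 9, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 7], [7, 9]]
row, col, score = match_template(image, template)
```

Restricting the search to a small window around a coarse estimate, as the multi-step procedure in the abstract suggests, would amount to passing a cropped sub-grid as `image` and adding the crop offset to the returned position.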