EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling

(*Equal Contribution)
1 The University of Tokyo, 2 Keio University, 4 Japan Advanced Institute of Science and Technology,
3 Max Planck Institute for Intelligent Systems, 5 Tsinghua University


Holistic Dataset

Face Zoom In

Generated Results

Demo Video