EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling
(*Equal Contribution)
1The University of Tokyo,
2Keio University,
3Max Planck Institute for Intelligent Systems,
4Japan Advanced Institute of Science and Technology,
5Tsinghua University