Multimodal AI Lab @ KAIST

Jihoon Kim et al. (2024), "Let There Be Sound: Reconstructing High Quality Speech from Silent Videos", Proc. AAAI

Yeonghyeon Lee et al. (2024), "VoiceLDM: Text-to-Speech with Environmental Context", Proc. ICASSP

Youngjoon Jang et al. (2023), "That's What I Said: Fully-Controllable Talking Face Generation", Proc. ACM MM

Suyeon Lee et al. (2024), "Seeing Through the Conversation: Audio-Visual Speech Separation based on Diffusion Model", Proc. ICASSP

Sooyoung Park et al. (2024), "Can CLIP Help Sound Source Localization?", Proc. WACV

Arda Senocak et al. (2023), "Sound Source Localization is All about Cross-Modal Alignment", Proc. ICCV
