About me
My name is Yihan Wu. I am a fourth-year Ph.D. student in the Gaoling School of Artificial Intelligence, Renmin University of China. My advisor is Prof. Ruihua Song. Prior to my Ph.D. studies, I earned my B.S. degree from Shandong University in 2021.
I am broadly interested in speech-related research, including speech synthesis, speech recognition, and speech language models.
Recent news
- 🔍 I’m currently looking for summer internships!
- 🎉 Our paper “Enhancing Audiovisual Speech Recognition through Bifocal Preference Optimization” was accepted to AAAI 2025!
- 💼 I’m on the job market as well.
Work experience
Visiting Scholar
Language Technologies Institute, Carnegie Mellon University (Sep. 2023 - Sep. 2024). Worked with Prof. Shinji Watanabe.
Research Intern
Microsoft Research Asia (Oct. 2021 - Oct. 2022). Worked with Xu Tan.
Research Intern
Microsoft C+AI, Speech Team (May 2021 - Oct. 2021)
Selected papers
Please visit my Google Scholar profile for the full list.
Robust Audiovisual Speech Recognition Models with Mixture-of-Experts
Yihan Wu, Yifan Peng, Yichen Lu, Xuankai Chang, Ruihua Song, Shinji Watanabe
SLT 2024
paper
Espnet-codec: Comprehensive training and evaluation of neural codecs for audio, music, and speech
Jiatong Shi*, Jinchuan Tian*, Yihan Wu*, Jee-weon Jung, Jia Qi Yip, Yoshiki Masuyama, William Chen, Yuning Wu, Yuxun Tang, Massa Baali, Dareen Alharhi, Dong Zhang, Ruifan Deng, Tejes Srivastava, Haibin Wu, Alexander H Liu, Bhiksha Raj, Qin Jin, Ruihua Song, Shinji Watanabe
SLT 2024
paper
Tiva: Time-aligned video-to-audio generation
Xihua Wang*, Yuyue Wang*, Yihan Wu*, Ruihua Song, Xu Tan, Zehua Chen, Hongteng Xu, Guodong Sui
ACM MM 2024
paper
VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing
Yihan Wu, Junliang Guo, Xu Tan, Chen Zhang, Bohan Li, Ruihua Song, Lei He, Sheng Zhao, Arul Menezes, Jiang Bian
AAAI 2023
paper
Adaspeech 4: Adaptive text to speech in zero-shot scenarios
Yihan Wu, Xu Tan, Bohan Li, Lei He, Sheng Zhao, Ruihua Song, Tao Qin, Tie-Yan Liu
INTERSPEECH 2022
paper
Self-supervised context-aware style representation for expressive speech synthesis
Yihan Wu, Xi Wang, Shaofei Zhang, Lei He, Ruihua Song, Jian-Yun Nie
INTERSPEECH 2022
paper