CS Ph.D. student at HKU
I am a first-year Ph.D. student in Computer Science at The University of Hong Kong (HKU), privileged to be jointly supervised by Prof. Reynold C.K. Cheng, Prof. Francis C.M. Lau, and Prof. Yupeng Li. Before joining HKU, I had the honor of working under Prof. Zhizheng Wu at The Chinese University of Hong Kong, Shenzhen and the Shanghai AI Laboratory, where I was a core developer and maintainer of the open-source project Amphion. I also worked under Prof. Zhen Ming (Jack) Jiang at York University as a Mitacs Research Intern.
I am the creator of Emilia, a leading dataset in expressive and spontaneous text-to-speech (TTS) synthesis, along with its preprocessing pipeline, Emilia-Pipe. As of December 2024, Emilia has been downloaded over 150k times by more than 700 research institutions/companies, including Stanford, CMU, OpenAI, Google, and NVIDIA. It has become a foundational training dataset for state-of-the-art TTS models such as F5-TTS, MaskGCT, and Vevo.
My current research interests revolve around Social Computing and Large Language Models (LLMs), where I aim to leverage LLMs to address critical societal challenges such as misinformation, fake news, and deepfakes.
* denotes equal contribution, † denotes corresponding author.
Speech Generation 🎤
Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Speech Dataset for Speech Generation
Haorui He*, Zengqiang Shang*, Chaoren Wang*, Xuyuan Li*, Yicheng Gu, Hua Hua, Liwei Liu, Chen Yang, Jiaqi Li, Peiyang Shi, Yuancheng Wang, Kai Chen, Pengyuan Zhang, and Zhizheng Wu.
IEEE SLT 2024:
Extended Version (In Submission to IEEE TASLP):
Overview of the Amphion Toolkit (v0.2)
Jiaqi Li*, Xueyao Zhang*, Yuancheng Wang*, Haorui He*, Chaoren Wang*, Li Wang*, Huan Liao*, Junyi Ao*, Zeyu Xie*, Yiqiao Huang*, Junan Zhang*, Zhizheng Wu
Technical Report:
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
Xueyao Zhang*, Liumeng Xue*, Yicheng Gu*, Yuancheng Wang*, Haorui He, Chaoren Wang, Xi Chen, Zihao Fang, Haopeng Chen, Junan Zhang, Tze Ying Tang, Lexiao Zou, Mingxuan Wang, Jun Han, Kai Chen, Haizhou Li, and Zhizheng Wu.
IEEE SLT 2024:
SpMis: An Investigation of Synthetic Spoken Misinformation Detection
Peizhuo Liu, Li Wang, Renqiang He, Haorui He, Lei Wang, Huadi Zheng, Jie Shi, Tong Xiao, Zhizheng Wu.
IEEE SLT 2024 (Best Paper Finalist, Top 2.5%): [Paper]
Noro: A Noise-Robust One-shot Voice Conversion System with Hidden Speaker Representation Capabilities
Haorui He, Yuchen Song, Yuancheng Wang, Haoyang Li, Xueyao Zhang, Li Wang, Gongping Huang, Eng Siong Chng, Zhizheng Wu.
In Submission: [Paper]
Social Computing 🗞️
MCFEND: A Multi-source Benchmark Dataset for Chinese Fake News Detection
Yupeng Li, Haorui He†, Jin Bai, and Dacheng Wen.
ACM WWW 2024 (Oral Presentation, Top 9.4%): [Paper]
Contextual Target-Specific Stance Detection on Twitter: New Dataset and Method
Yupeng Li, Dacheng Wen, Haorui He, and Francis C.M. Lau.
IEEE ICDM 2023 (Regular Long Paper, Top 9.37%): [Paper]
Improved Target-specific Stance Detection on Social Media Platforms by Delving into Conversation Threads
Yupeng Li, Haorui He†, Shaonan Wang, Francis C.M. Lau, and Yunya Song.
IEEE TCSS: [Paper]
Aug. 2023 -- Aug. 2024: Research Intern, supervised by Prof. Zhizheng Wu at CUHK-Shenzhen & Shanghai AI Laboratory.
Sep. 2022 -- Dec. 2022: Mitacs-CSC Joint Globalink Research Intern, supervised by Prof. Zhen Ming (Jack) Jiang at YorkU. (Thanks to CSC and Mitacs for kindly sponsoring my internship.)
Jun. 2021 -- Aug. 2023: Research Intern, supervised by Prof. Francis C.M. Lau at HKU & Prof. Yupeng Li at HKBU.