Biography
I am a third-year PhD student in the Department of Data Science & AI at Monash University (2023–present), supervised by Prof. Gholamreza Haffari, Dr. Ehsan Shareghi, and Dr. Lizhen Qu.
My research focuses on the safety of audio large multimodal models (LMMs) and on speech-specific risks. We red-team current audio LMMs to reveal potential vulnerabilities and explore effective safeguarding mechanisms for building safer audio LMMs.
News
- 2025.08: Our paper “Reshaping Representation Space to Balance the Safety and Over-rejection in Large Audio Language Models” has been accepted to EMNLP 2025. The code and dataset will be available here soon.
- 2025.07: Our paper “Jigsaw Puzzles: Splitting Harmful Questions to Jailbreak Large Language Models in Multi-turn Interactions” has been accepted to COLM 2025.
- 2025.01: The repository for “Audio Is the Achilles’ Heel: Red Teaming Audio Large Multimodal Models” is available here.
- 2025.01: Our paper “Audio Is the Achilles’ Heel: Red Teaming Audio Large Multimodal Models” has been accepted to NAACL 2025.
- 2024.12: Our code for “Jigsaw Puzzles: Splitting Harmful Questions to Jailbreak Large Language Models in Multi-turn Interactions” is available here.
- 2024.11: Our speech-specific risk dataset is available here.
- 2024.09: Our paper “Towards Probing Speech-Specific Risks in Large Multimodal Models: A Taxonomy, Benchmark, and Insights” has been accepted to EMNLP 2024.
Publications
- [EMNLP 2025] Reshaping Representation Space to Balance the Safety and Over-rejection in Large Audio Language Models
  Hao Yang, Lizhen Qu, Ehsan Shareghi, Gholamreza Haffari
- [NAACL 2025] Audio Is the Achilles’ Heel: Red Teaming Audio Large Multimodal Models
  Hao Yang, Lizhen Qu, Ehsan Shareghi, Gholamreza Haffari
- [COLM 2025] Jigsaw Puzzles: Splitting Harmful Questions to Jailbreak Large Language Models in Multi-turn Interactions
  Hao Yang, Lizhen Qu, Ehsan Shareghi, Gholamreza Haffari
- [INTERSPEECH 2025] Continual Speech Learning with Fused Speech Features
  Guitao Wang, Jinming Zhao, Hao Yang, Guilin Qi, Tongtong Wu, Gholamreza Haffari
- [EMNLP 2024] Towards Probing Speech-Specific Risks in Large Multimodal Models: A Taxonomy, Benchmark, and Insights
  Hao Yang, Lizhen Qu, Ehsan Shareghi, Gholamreza Haffari
- [INTERSPEECH 2023] Investigating Pre-trained Audio Encoders in the Low-Resource Condition
  Hao Yang, Jinming Zhao, Gholamreza Haffari, Ehsan Shareghi
- [EMNLP 2022 Findings] Self-supervised Rewiring of Pre-trained Speech Encoders: Towards Faster Fine-tuning with Less Labels in Speech Processing
  Hao Yang, Jinming Zhao, Gholamreza Haffari, Ehsan Shareghi
- [EMNLP 2022 Findings] RedApt: An Adaptor for wav2vec 2 Encoding: Faster and Smaller Speech Translation without Quality Compromise
  Jinming Zhao, Hao Yang, Gholamreza Haffari, Ehsan Shareghi
- [INTERSPEECH 2022] M-Adapter: Modality Adaptation for End-to-End Speech-to-Text Translation
  Jinming Zhao, Hao Yang, Ehsan Shareghi, Gholamreza Haffari