I am a 2nd-year PhD student at the University of Edinburgh (started Sept. 2023) and a member of EdinburghNLP, supervised by Pasquale Minervini and Mirella Lapata. My research interests lie in foundation model pre-training and scalable mechanistic interpretability. I am currently working to "open the black box" to enable more efficient pre-training and inference.
I will be interning at Microsoft Research Cambridge from May to August 2025. Feel free to reach out if you'd like to meet in person!
Selected Works
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
Yu Zhao, Alessio Devoto, Giwon Hong, Xiaotang Du, Aryo Pradipta Gema, Hongru Wang, Xuanli He, Kam-Fai Wong, Pasquale Minervini
NAACL 2025, Oral
A Simple and Effective L2 Norm-Based Strategy for KV Cache Compression
Alessio Devoto*, Yu Zhao*, Simone Scardapane, Pasquale Minervini
EMNLP 2024, Oral
Analysing The Impact of Sequence Composition on Language Model Pre-Training
Yu Zhao, Yuanbin Qu, Konrad Staniszewski, Szymon Tworkowski, Wei Liu, Piotr Miłoś, Yuxiang Wu, Pasquale Minervini
ACL 2024, Oral
Analysing the Residual Stream of Language Models Under Knowledge Conflicts
Yu Zhao, Xiaotang Du, Giwon Hong, Aryo Pradipta Gema, Alessio Devoto, Hongru Wang, Xuanli He, Kam-Fai Wong, Pasquale Minervini
MINT @ NeurIPS 2024
Structured Packing in LLM Training Improves Long Context Utilization
Konrad Staniszewski, Szymon Tworkowski, Sebastian Jaszczur, Yu Zhao, Henryk Michalewski, Łukasz Kuciński, Piotr Miłoś
AAAI 2025, Oral
Are We Done with MMLU?
Aryo Pradipta Gema, Joshua Ong Jun Leang, Giwon Hong, Alessio Devoto, Alberto Carlo Maria Mancino, Rohit Saxena, Xuanli He, Yu Zhao, Xiaotang Du, Mohammad Reza Ghasemi Madani, Claire Barale, Robert McHardy, Joshua Harris, Jean Kaddour, Emile van Krieken, Pasquale Minervini
NAACL 2025
Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression
Nathan Godey, Alessio Devoto, Yu Zhao, Simone Scardapane, Pasquale Minervini, Éric de la Clergerie, Benoît Sagot
Preprint 2025
Check out all my publications on Google Scholar or Semantic Scholar.