I am a second-year PhD student at the University of Edinburgh (since Sept. 2023) and a member of EdinburghNLP, supervised by Pasquale Minervini and Mirella Lapata. My research interests lie in foundation model pre-training and scalable mechanistic interpretability. I am currently working on “opening the black box” to enable more efficient pre-training and inference.

I will be interning at Microsoft Research Cambridge from May to August 2025. Feel free to reach out if you’d like to meet in person!

Selected Works

Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
Yu Zhao, Alessio Devoto, Giwon Hong, Xiaotang Du, Aryo Pradipta Gema, Hongru Wang, Xuanli He, Kam-Fai Wong, Pasquale Minervini
NAACL 2025, Oral

A Simple and Effective L2 Norm-Based Strategy for KV Cache Compression
Alessio Devoto*, Yu Zhao*, Simone Scardapane, Pasquale Minervini
EMNLP 2024, Oral

Analysing The Impact of Sequence Composition on Language Model Pre-Training
Yu Zhao, Yuanbin Qu, Konrad Staniszewski, Szymon Tworkowski, Wei Liu, Piotr Miłoś, Yuxiang Wu, Pasquale Minervini
ACL 2024, Oral

Analysing the Residual Stream of Language Models Under Knowledge Conflicts
Yu Zhao, Xiaotang Du, Giwon Hong, Aryo Pradipta Gema, Alessio Devoto, Hongru Wang, Xuanli He, Kam-Fai Wong, Pasquale Minervini
MINT @ NeurIPS 2024

Structured Packing in LLM Training Improves Long Context Utilization
Konrad Staniszewski, Szymon Tworkowski, Sebastian Jaszczur, Yu Zhao, Henryk Michalewski, Łukasz Kuciński, Piotr Miłoś
AAAI 2025, Oral

Are We Done with MMLU?
Aryo Pradipta Gema, Joshua Ong Jun Leang, Giwon Hong, Alessio Devoto, Alberto Carlo Maria Mancino, Rohit Saxena, Xuanli He, Yu Zhao, Xiaotang Du, Mohammad Reza Ghasemi Madani, Claire Barale, Robert McHardy, Joshua Harris, Jean Kaddour, Emile van Krieken, Pasquale Minervini
NAACL 2025

Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression
Nathan Godey, Alessio Devoto, Yu Zhao, Simone Scardapane, Pasquale Minervini, Éric de la Clergerie, Benoît Sagot
Preprint 2025


Check out all my publications on Google Scholar or Semantic Scholar.