I am a Research Scientist at Google Research, focusing on multimodal generative models and large multimodal language models. Prior to joining Google, I was a lead research scientist at JEN Music AI, where I led the development of text-to-music AI systems, built and managed the team, and spearheaded the creation of the JEN-1 series of models. I hold a Ph.D. from the Australian Artificial Intelligence Institute (AAII) at the University of Technology Sydney (UTS), along with M.Eng. and B.Eng. degrees from Shanghai Jiao Tong University. My research interests span large-scale pre-training and transformer- and diffusion-based foundation models, with a particular emphasis on advancing AI-generated content across multiple modalities. My work has been published in top-tier venues including TPAMI, NeurIPS, CVPR, and ICCV. I am free to work in the United States, Australia, and New Zealand, and I welcome collaboration opportunities in both academia and industry.
To accelerate the advent of machine intelligence and a sustainable future
PhD in Computer Science
University of Technology Sydney
Master of Engineering
Shanghai Jiao Tong University
Visiting Student
Karlsruher Institut für Technologie
Bachelor of Engineering
Shanghai Jiao Tong University