Jing Yu Koh
jingyuk@cs.cmu.edu
I am a PhD student in the Machine Learning Department at Carnegie Mellon University, advised by Daniel Fried and Ruslan Salakhutdinov. I work on grounded language understanding, usually in the context of vision-and-language problems.
Prior to this, I was a Research Engineer (and previously an AI Resident) at Google Research in Jason Baldridge's team, where I worked on vision-and-language problems and generative models. Before that, I completed my undergraduate studies at the Singapore University of Technology and Design summa cum laude (highest honors) in 2019.
My first name is "Jing Yu" and informally I go by the nickname "JY". 许靖宇 is my name in Chinese. I'm from Singapore.

News
- (Sep 2023) 1 paper accepted to NeurIPS 2023!
- (Summer 2023) Giving an invited talk about GILL at DLCT and Cohere For AI (slides, recording).
- (Apr 2023) 1 paper accepted to ICML 2023!
- (Spring 2023) Gave invited talks at Microsoft Research, Apple AI/ML, Georgia Tech, and the London ML Meetup (recording, slides).
- (Dec 2022) I made a bet on LLM capabilities with my office mate Ben Chugg. Bubble tea is on the line.
- (Nov 2022) Parti was accepted to TMLR with a Featured Certification!
- (Oct 2022) In the spirit of paying it forward, I'm sharing my Statement of Purpose publicly. Hope it helps future applicants!
- (Jul 2022) After 2.73 wonderful years at Google, I've left to pursue my PhD at Carnegie Mellon University!
- (January 2022) 1 paper accepted to ICLR 2022!
- (December 2021) Serving as a reviewer for CVPR 2022.
- (July 2021) 1 paper accepted to ICCV 2021!
- (July 2021) Presenting an invited talk at Microsoft Research.
- (July 2021) Serving as a reviewer for NeurIPS 2021.
- (March 2021) 1 paper accepted to CVPR 2021!
- (January 2021) 1 paper accepted to ICLR 2021!
- (October 2020) 1 paper accepted to WACV 2021!
- (July 2020) 1 paper accepted to ECCV 2020!
- (October 2019) Officially joined Google as an AI Resident in Mountain View, California.
Selected Publications [Google Scholar]
2023

Generating Images with Multimodal Language Models
Advances in Neural Information Processing Systems (NeurIPS), 2023.

Grounding Language Models to Images for Multimodal Inputs and Outputs
The International Conference on Machine Learning (ICML), 2023.

VQ3D: Learning a 3D-Aware Generative Model on ImageNet
The International Conference on Computer Vision (ICCV), 2023.
2022

Vector-quantized Image Modeling with Improved VQGAN
The International Conference on Learning Representations (ICLR), 2022.
2021

Text-to-Image Generation Grounded by Fine-Grained User Attention
The IEEE Winter Conference on Applications of Computer Vision (WACV), 2021.