Yuchao Gu

Research Scientist at NVIDIA

Email: yuchaogu9710 [at] gmail.com


Biography

I am a research scientist within the Efficient AI team at NVIDIA Research, working with Prof. Song Han. Prior to that, I completed my Ph.D. at Show Lab @ NUS, advised by Prof. Mike Shou. I obtained my M.S. from Nankai University in 2022, advised by Prof. Ming-Ming Cheng, and my B.E. from Beijing University of Chemical Technology in 2019, advised by Prof. Wei Hu.

My research focuses on visual generative modeling, especially efficient generative models, representation for generation, and controllable visual generation. My current interest lies in diffusion model post-training. I am open to collaboration and discussion, so please feel free to reach out!

Invited Talk

News

Selected Publications

⚡ Efficient Generative Models
AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation
Yuchao Gu, Guian Fang, Yuxin Jiang, Weijia Mao, Song Han, Han Cai and Mike Zheng Shou.

arXiv, 2026
[project] [paper] [demo] [code]

DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space
Wenkun He*, Yuchao Gu*, Junyu Chen*, Dongyun Zou, Yujun Lin, Zhekai Zhang, Haocheng Xi, Muyang Li, Ligeng Zhu, Jincheng Yu, Junsong Chen, Enze Xie, Song Han and Han Cai.

arXiv, 2025
[project] [paper] [code]

Long-Context Autoregressive Video Modeling with Next-Frame Prediction
Yuchao Gu, Weijia Mao and Mike Zheng Shou.

arXiv, 2025
[project] [paper] [code]

🧩 Representation for Generation
Olaf-World: Orienting Latent Actions for Video World Modeling
Yuxin Jiang, Yuchao Gu, Ivor W. Tsang and Mike Zheng Shou.

International Conference on Machine Learning (ICML), 2026
[project] [paper] [code]

Show-O: One Single Transformer to Unify Multimodal Understanding and Generation
Jinheng Xie, Weijia Mao, Zechen Bai, David Junhao Zhang, Weihao Wang, Kevin Qinghong Lin, Yuchao Gu, Zhijie Chen, Zhenheng Yang and Mike Zheng Shou.

International Conference on Learning Representations (ICLR), 2025
[project] [paper] [code]

Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation
David Junhao Zhang, Jay Zhangjie Wu, Jia-Wei Liu, Rui Zhao, Lingmin Ran, Yuchao Gu, Difei Gao and Mike Zheng Shou.

International Journal of Computer Vision (IJCV), 2024
[project] [paper] [code]

Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis
Yuchao Gu, Xintao Wang, Yixiao Ge, Ying Shan, Xiaohu Qie and Mike Zheng Shou.

IEEE Computer Vision and Pattern Recognition Conference (CVPR), 2024
[paper]

VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder
Yuchao Gu, Xintao Wang, Liangbin Xie, Chao Dong, Gen Li, Ying Shan and Ming-Ming Cheng.

European Conference on Computer Vision (ECCV, Oral), 2022
[project] [paper] [code]

🎮 Controllable Visual Generation
ROICtrl: Boosting Instance Control for Visual Generation
Yuchao Gu, Yipin Zhou, Yunfan Ye, Yixin Nie, Licheng Yu, Pingchuan Ma, Kevin Qinghong Lin and Mike Zheng Shou.

IEEE Computer Vision and Pattern Recognition Conference (CVPR), 2025
[project] [paper] [code]

MotionDirector: Motion Customization of Text-to-Video Diffusion Models
Rui Zhao, Yuchao Gu, Jay Zhangjie Wu, David Junhao Zhang, Jiawei Liu, Weijia Wu, Jussi Keppo and Mike Zheng Shou.

European Conference on Computer Vision (ECCV, Oral), 2024
[project] [paper] [code]

VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence
Yuchao Gu, Yipin Zhou, Bichen Wu, Licheng Yu, Jiawei Liu, Rui Zhao, Jay Zhangjie Wu, David Junhao Zhang, Mike Zheng Shou and Kevin Tang.

IEEE Computer Vision and Pattern Recognition Conference (CVPR), 2024
[project] [paper] [code]

Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models
Yuchao Gu, Xintao Wang, Jay Zhangjie Wu, Yujun Shi, Yunpeng Chen, Zihan Fan, Wuyou Xiao, Rui Zhao, Shuning Chang, Weijia Wu, Yixiao Ge, Ying Shan and Mike Zheng Shou.

Neural Information Processing Systems (NeurIPS), 2023
[project] [paper] [code]

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
Jay Zhangjie Wu, Yixiao Ge, Xintao Wang, Weixian Lei, Yuchao Gu, Wynne Hsu, Ying Shan, Xiaohu Qie and Mike Zheng Shou.

International Conference on Computer Vision (ICCV), 2023
[project] [paper] [code]

Service

Acknowledgment

I have been fortunate to work with these wonderful people who generously provided me with collaboration and mentorship.

@ ARC Lab, Tencent
2021.10 - 2023.6

Xintao Wang
Yixiao Ge

@ GenAI, Meta
2023.6 - 2024.10

Yipin Zhou
Bichen Wu
Licheng Yu

@ NVIDIA Research
2025.6 - 2026.2

Han Cai
Prof. Song Han



© Yuchao Gu