CS Colloquium: Advancing Safe Autonomous Driving In-the-Wild: Generative Learning, Trustworthiness, & Connectivity 


Location
Kemper Hall 1131

This talk explores key advancements in deploying practical autonomous driving systems under real-world conditions. We begin with OpenEMMA, our fully open-source, end-to-end multimodal model that mirrors Waymo's closed-source EMMA, aimed at democratizing research on large vision-language models. We then introduce a novel benchmark for assessing the trustworthiness of these Vision-Language Models (VLMs) in autonomous driving, focusing on safety, robustness, and privacy, thereby paving the way for secure foundation models for autonomy. Next, we address critical deployment challenges, including adverse weather such as rain, snow, and haze, as well as difficult lighting and visibility conditions. We present two solutions: Light Diffusion (CVPR'24), which leverages pre-trained diffusion priors to enhance low-light driving scenarios, and MWFormer (TIP'24), a foundational transformer model for prompt-based, all-in-one multi-weather image restoration. We then explore how connectivity and cooperative perception can overcome the limitations of single-agent systems in complex transportation networks. This includes V2X-ViT (ECCV'22), a unified transformer architecture designed for robust perception in noisy environments, and our recent framework (accepted to ICLR'25) for a scalable, task-agnostic collaborative perception protocol that fosters heterogeneous and secure mobility systems for future networks. The talk concludes by looking to the future, proposing natural language as a universal communication protocol to unify automated vehicles, smart infrastructure, drones, and humans, a vision that aims to advance human-centric autonomous driving amid the complexities of the real world.

Bio

Dr. Zhengzhong Tu has been an assistant professor in the Department of Computer Science and Engineering at Texas A&M University since 2024. He received his Ph.D. in Electrical and Computer Engineering from the University of Texas at Austin in 2022, advised by Cockrell Family Regents Endowed Chair Professor Alan Bovik. Before joining Texas A&M, Dr. Tu worked as a research engineer at Google Research. An expert in artificial intelligence and computer vision, he has published more than 30 papers in top-tier venues, including IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Transactions on Image Processing, IEEE Open Journal of Signal Processing, and IEEE Signal Processing Letters, as well as conference proceedings such as NeurIPS, ICLR, CVPR, ECCV, ICCV, ICRA, WACV, and CoRL. He co-organized the 2nd and 3rd Workshops on Large Language and Vision Models for Autonomous Driving (LLVM-AD) at ITSC 2024 and WACV 2025, the 1st Workshop on Distillation of Foundation Models for Autonomous Driving (WDFM-AD) at CVPR 2025, and the 2nd MetaFood Workshop at CVPR 2025. He received the 1st-place winning solution award in the AI4Streaming 2024 UGC Video Quality Assessment Challenge at CVPR 2024 and 1st place in the NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement, Track 2 (KwaiSR). He has served as an Area Chair for ICCV 2025, the CVPR 2025 WDFM-AD workshop, and NeurIPS 2025. His research contributions have received multiple recognitions, including a CVPR 2022 Best Paper Nomination, a headline in the Google Research annual blog, a feature in Google I/O media coverage, and press coverage by YOLOvX, Argo Vision, Hugging Face Daily Papers, and other outlets.

Hosted by Dr. Muhao Chen (Department of Computer Science)
