Event Date
Dr. Yuchen Liang
AI-EDGE Postdoctoral Researcher
The Ohio State University
Abstract
Discrete diffusion models have shown strong performance on discrete data such as text, graphs, and molecules. In this talk, I will discuss two key design dimensions of such models: the choice of rate matrix and the design of the discretized sampler.
In the first part, I will focus on rate matrices. While empirical results suggest that masked diffusion models (which use absorbing rate matrices) outperform those using uniform rate matrices, theoretical understanding has been limited to the uniform case. We present the first finite-time error bounds and convergence-rate analysis for masked diffusion models. Our analysis derives a non-divergent upper bound on the KL divergence of the forward process via a surrogate initialization distribution. We then establish convergence guarantees for both $\tau$-leaping and uniformization samplers, showing improved rates over the uniform case. Furthermore, under suitable assumptions, we provide convergence guarantees without early stopping. Overall, our analysis introduces several new technical tools to address challenges unique to absorbing rate matrices.
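To fix intuition for the absorbing ("masked") forward process discussed above, here is a minimal sketch, not taken from the talk: each token independently jumps to a special mask state and never leaves it, so the process converges to the all-mask distribution. The `MASK` id, the constant rate `beta`, and the function name are illustrative assumptions.

```python
import numpy as np

MASK = -1  # hypothetical id for the absorbing mask state


def forward_mask(x0, t, beta=1.0, rng=np.random.default_rng(0)):
    """Sample x_t from an absorbing-rate forward process at time t.

    Under a constant jump rate beta, each coordinate of x0 has been
    absorbed into MASK by time t with probability 1 - exp(-beta * t);
    once masked, a coordinate stays masked.
    """
    p_masked = 1.0 - np.exp(-beta * t)
    absorbed = rng.random(x0.shape) < p_masked
    return np.where(absorbed, MASK, x0)


x0 = np.array([3, 1, 4, 1, 5, 9, 2, 6])
xt = forward_mask(x0, t=2.0)  # partially masked sequence
```

As t grows, `xt` approaches the all-mask sequence, which is why the all-mask distribution serves as a natural (surrogate) initialization for the reverse sampler.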
The second part addresses discretized samplers. We develop a new analytical framework that removes restrictive regularity assumptions and improves the dependence on vocabulary size from quadratic to linear. Our approach is also more broadly applicable: it provides the first convergence guarantees for other widely used samplers, including the Euler method and Tweedie $\tau$-leaping. Central to our approach is a novel technique based on differential inequalities, offering a more flexible alternative to the traditional Girsanov change-of-measure methods. This technique may also be of independent interest for the analysis of other stochastic processes.
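As background for the samplers mentioned above, the following is a generic sketch of one uniformization step for a continuous-time Markov chain with rate matrix `Q`; it is an illustration of the standard technique, not the talk's construction. Choosing `lam >= max_i |Q[i, i]|` makes `P = I + Q/lam` a valid transition matrix, and applying it a Poisson(`lam * tau`) number of times reproduces the CTMC transition kernel over a window of length `tau`.

```python
import numpy as np


def uniformization_step(x, Q, tau, lam, rng):
    """Advance state x by time tau under rate matrix Q via uniformization."""
    P = np.eye(Q.shape[0]) + Q / lam        # embedded jump kernel (rows sum to 1)
    n_jumps = rng.poisson(lam * tau)        # number of candidate jump epochs
    for _ in range(n_jumps):
        x = rng.choice(Q.shape[0], p=P[x])  # apply one candidate jump
    return x


# Two-state toy chain: rate 1 for 0 -> 1, rate 2 for 1 -> 0.
Q = np.array([[-1.0,  1.0],
              [ 2.0, -2.0]])
rng = np.random.default_rng(0)
samples = [uniformization_step(0, Q, tau=1.0, lam=2.0, rng=rng)
           for _ in range(2000)]
```

In the discrete-diffusion setting the reverse-time rates are estimated by a learned network and vary with time, so samplers such as $\tau$-leaping freeze them over each window; the analysis in the talk bounds the error such discretizations incur.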
Bio
Yuchen Liang is an AI-EDGE Postdoctoral Researcher at The Ohio State University and is currently on the faculty job market. He received his Ph.D. in Electrical and Computer Engineering from the University of Illinois Urbana-Champaign in 2023. His research focuses on diffusion models, generative modeling, and statistical signal processing, with a particular emphasis on the theoretical foundations of discrete and continuous diffusion processes. His work has been published in venues such as NeurIPS, ICLR, AISTATS, IEEE Transactions on Information Theory, and IEEE Transactions on Signal Processing.