
AMCS Colloquium

Monday, February 10, 2025 - 11:00am

Yaoyu Zhang, Institute of Natural Sciences and the School of Mathematical Sciences

Shanghai Jiao Tong University

Location

University of Pennsylvania

LRSM Auditorium

Condensation (also known as quantization, weight clustering, or alignment) is a widely observed phenomenon in which neurons in the same layer tend to align with one another during the nonlinear training of deep neural networks (DNNs). It is a key characteristic of the feature learning process of neural networks. However, because the phenomenon is strongly nonlinear, establishing a theoretical understanding of it remains challenging. In this talk, I will present our systematic efforts to tackle this challenge in recent years. First, I will present results on identifying the dynamical regime of condensation in the infinite-width limit, where small initialization is crucial. Then, I will discuss the mechanism of condensation at the initial training stage and the global loss landscape structure underlying condensation in later training stages, highlighting the prevalence of condensed critical points and global minimizers. Finally, I will present results on the quantification of condensation and its generalization advantage, including a novel estimate of sample complexity in the best possible scenario. These results underscore the effectiveness of the phenomenological approach to understanding DNNs, paving the way for a deeper understanding of deep learning in the near future.
