Friday, October 19, 2018 - 2:00pm

Weijie Su

University of Pennsylvania, Statistics Department

Location

University of Pennsylvania

A6 DRL

Gradient-based optimization methods have been increasingly modeled and interpreted using ordinary differential equations (ODEs). The ODEs in the existing literature are, however, too coarse to distinguish between two fundamentally different methods: Nesterov’s accelerated gradient method for strongly convex functions (NAG-SC) and Polyak’s heavy-ball method. In this talk, we derive high-resolution ODEs as more accurate surrogates for these two methods, as well as for Nesterov’s accelerated gradient method for (weakly) convex functions (NAG-C). These novel ODEs fit into a general framework that allows for a fine-grained analysis of the discrete optimization algorithms by translating properties of the amenable ODEs into properties of their discrete counterparts. As a first application of this framework, we identify the effect of a term, referred to as the “gradient correction,” that appears in NAG-SC but not in the heavy-ball method, shedding light on why the former achieves acceleration while the latter does not. Moreover, within this high-resolution ODE framework, NAG-C is shown to minimize the squared gradient norm at an inverse cubic rate, which is the sharpest rate known for NAG-C itself. Finally, by modifying the high-resolution ODE of NAG-C, we obtain a family of new optimization methods that are shown to maintain the same accelerated convergence rates as NAG-C for smooth convex functions.
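As background for the abstract, the sketch below contrasts the two discrete methods in their standard textbook forms: Polyak’s heavy-ball method evaluates the gradient at the current iterate, whereas NAG-SC evaluates it at an extrapolated point, which is the discrete source of the “gradient correction” discussed in the talk. The quadratic objective, step size, and momentum parameter are illustrative assumptions, not taken from the talk.

# A minimal, illustrative sketch of the two discrete methods contrasted in the
# talk, assuming their standard textbook forms (step size s, momentum beta);
# the quadratic objective and all parameter values are assumptions for
# demonstration only.
import numpy as np

def grad(x, A, b):
    # Gradient of the strongly convex quadratic f(x) = 0.5 * x'Ax - b'x.
    return A @ x - b

def heavy_ball(x0, A, b, s, beta, iters):
    # Polyak's heavy-ball method: gradient evaluated at the current iterate.
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(iters):
        x_next = x - s * grad(x, A, b) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x

def nag_sc(x0, A, b, s, beta, iters):
    # NAG-SC: gradient evaluated at an extrapolated point y, which is the
    # discrete origin of the extra "gradient correction" relative to heavy-ball.
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(iters):
        y = x + beta * (x - x_prev)            # look-ahead (extrapolation) step
        x_prev, x = x, y - s * grad(y, A, b)   # gradient step at the look-ahead point
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = np.diag([1.0, 10.0, 100.0])            # eigenvalues give mu = 1, L = 100
    b = rng.standard_normal(3)
    x_star = np.linalg.solve(A, b)
    mu, L = 1.0, 100.0
    s = 1.0 / L
    beta = (1 - np.sqrt(mu * s)) / (1 + np.sqrt(mu * s))
    x0 = np.ones(3)
    for name, method in (("heavy-ball", heavy_ball), ("NAG-SC", nag_sc)):
        x = method(x0, A, b, s, beta, iters=200)
        print(f"{name:10s} distance to optimum: {np.linalg.norm(x - x_star):.2e}")

The only difference between the two updates is where the gradient is evaluated; per the abstract, it is this difference that gives rise to the gradient-correction term distinguishing the high-resolution ODE of NAG-SC from that of the heavy-ball method.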

This is based on joint work with Stephen Boyd, Emmanuel Candes, Simon Du, Michael Jordan, and Bin Shi.