A Tale of Preconditioning and Overparameterization in Ill-conditioned Low-rank Estimation
Speaker: Yuejie Chi – Pittsburgh, PA, United States
Topic(s): Artificial Intelligence, Machine Learning, Computer Vision, Natural Language Processing, Computational Theory, Algorithms and Mathematics
Abstract
Many problems encountered in science and engineering can be formulated as estimating a low-rank object from incomplete, and possibly corrupted, linear measurements. Through the lens of matrix and tensor factorization, one of the most popular approaches is to employ simple iterative algorithms such as gradient descent to recover the low-rank factors directly, which allows for small memory and computation footprints. However, the convergence rate of gradient descent degrades quickly with the condition number of the low-rank object, so the method becomes painfully slow when the problem is ill-conditioned. This talk introduces an algorithmic approach, dubbed scaled gradient descent (ScaledGD), that provably converges linearly at a constant rate independent of the condition number at near-optimal sample complexities, while maintaining the low per-iteration cost of gradient descent. In addition, ScaledGD continues to admit fast global convergence from a small random initialization when the rank is over-specified, confirming the benign generalization of ScaledGD in learning overparameterized models. Taken together, ScaledGD highlights the power of appropriate preconditioning in accelerating nonconvex statistical estimation and learning, where the iteration-varying preconditioners promote desirable invariance properties of the trajectory with respect to the symmetry in low-rank factorization.
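For intuition, below is a minimal NumPy sketch of a ScaledGD-style update for low-rank matrix factorization with full observations, where each factor's gradient is right-preconditioned by the inverse Gram matrix of the other factor. The loss, step size, initialization scale, and the optional damping term `lam` are illustrative choices, not specifics from the talk; the talk's results cover more general measurement models (e.g., matrix sensing and completion) and the overparameterized regime.

```python
# Minimal ScaledGD-style sketch for low-rank matrix factorization with full
# observations, using the loss f(L, R) = 0.5 * ||L @ R.T - M||_F^2.  Each
# factor's gradient is preconditioned by the inverse Gram matrix of the other
# factor, which is what removes the dependence of the convergence rate on the
# condition number of M.
import numpy as np


def scaled_gd(M, r, eta=0.5, iters=500, init_scale=1e-3, lam=0.0, seed=0):
    """Recover a rank-r factorization M ~= L @ R.T with ScaledGD-style updates.

    `lam` adds optional damping to the Gram matrices; a positive value is a
    reasonable safeguard when r is over-specified and the factors become
    nearly rank-deficient (an illustrative choice, not a prescription from
    the talk).
    """
    rng = np.random.default_rng(seed)
    n1, n2 = M.shape
    # Small random initialization of both factors.
    L = init_scale * rng.standard_normal((n1, r))
    R = init_scale * rng.standard_normal((n2, r))
    for _ in range(iters):
        residual = L @ R.T - M              # derivative of the loss w.r.t. L @ R.T
        grad_L = residual @ R               # gradient w.r.t. L
        grad_R = residual.T @ L             # gradient w.r.t. R
        # Gram matrices of the *current* iterates (both updates use them).
        gram_L = L.T @ L + lam * np.eye(r)
        gram_R = R.T @ R + lam * np.eye(r)
        # Preconditioned updates: grad_L @ inv(gram_R) and grad_R @ inv(gram_L),
        # computed via linear solves instead of explicit inverses.
        L = L - eta * np.linalg.solve(gram_R, grad_L.T).T
        R = R - eta * np.linalg.solve(gram_L, grad_R.T).T
    return L, R


if __name__ == "__main__":
    # Example: an ill-conditioned rank-3 matrix (condition number 1e4).
    rng = np.random.default_rng(1)
    U, _ = np.linalg.qr(rng.standard_normal((50, 3)))
    V, _ = np.linalg.qr(rng.standard_normal((40, 3)))
    M = U @ np.diag([100.0, 1.0, 0.01]) @ V.T
    L, R = scaled_gd(M, r=3)
    print("relative error:", np.linalg.norm(L @ R.T - M) / np.linalg.norm(M))
```

Because the preconditioners depend only on the current factors, the extra per-iteration cost over plain gradient descent is a pair of r-by-r linear solves, which is negligible when r is small.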
About this Lecture
Number of Slides: 40
Duration: 45 minutes
Languages Available: English
Request this Lecture
To request this particular lecture, please complete this online form.
Request a Tour
To request a tour with this speaker, please complete this online form.
All requests will be sent to ACM headquarters for review.