Abstract: The purpose of this talk is to highlight three recent directions in the study of implicit bias, a promising approach to developing a tight generalization theory for deep networks that is interwoven with optimization. The first direction is a warm-up with purely linear predictors: here, the implicit bias perspective gives the fastest known hard-margin SVM solver! The second direction concerns the early training phase with shallow networks: here, implicit bias leads to good training and testing error, not just for narrow networks but also for arbitrarily large ones. The talk concludes with deep networks, providing a variety of structural lemmas that capture foundational aspects of how weights evolve, for any width and after sufficiently long training. This is joint work with Ziwei Ji.
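The abstract itself gives no technical detail, but the first direction alludes to a standard implicit-bias fact: gradient descent on an exponential-type loss over linearly separable data grows the weights without bound while the normalized direction converges to the hard-margin SVM solution. The following is a minimal illustrative NumPy sketch of that phenomenon only; it is not the talk's actual solver, and the data, step size, and iteration count are made-up toy choices.

import numpy as np

# Hypothetical toy data: two linearly separable Gaussian blobs (not from the talk).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(2.0, 0.5, size=(20, 2)),
               rng.normal(-2.0, 0.5, size=(20, 2))])
y = np.concatenate([np.ones(20), -np.ones(20)])

# Gradient descent on the exponential loss sum_i exp(-y_i <w, x_i>).
# On separable data ||w|| diverges, but w / ||w|| approaches the
# hard-margin SVM direction -- the implicit bias of the optimizer.
w = np.zeros(2)
lr = 0.1
for _ in range(20000):
    margins = y * (X @ w)
    grad = -((y * np.exp(-margins))[:, None] * X).sum(axis=0)
    w -= lr * grad

print("normalized direction:", w / np.linalg.norm(w))
print("smallest margin in that direction:", (y * (X @ w)).min() / np.linalg.norm(w))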
To register: https://berkeley.zoom.us/webinar/register/WN_iEXcldw1QPOuUofhS0WT4g