
AMS Special Seminar Series | Pratik Patil
February 11 @ 3:00 pm - 4:00 pm
Location: Clark 110
Abstract: Modern machine learning often operates in an overparameterized regime in which the number of parameters far exceeds the number of observations. In this regime, models can exhibit surprising generalization behaviors: (1) Models can overfit with zero training error yet still generalize well (benign overfitting); in some cases, even when explicit regularization is added and tuned, the optimal choice turns out to be no regularization at all (obligatory overfitting). (2) The generalization error can vary non-monotonically with the model size or the sample size (double/multiple descent). These behaviors challenge classical notions of overfitting and the role of explicit regularization.
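As a concrete illustration of the second behavior (not part of the talk; the data model and all sizes below are arbitrary choices for illustration), the following minimal Python sketch fits minimum-norm least squares on synthetic data while sweeping the number of features p for a fixed sample size n: the test error spikes near the interpolation threshold p = n and then falls again in the overparameterized regime.

# Minimal sketch of double descent for minimum-norm least squares.
# All sizes and the data model are illustrative choices, not from the talk.
import numpy as np

rng = np.random.default_rng(0)
n, n_test, p_max = 50, 1000, 200
beta = rng.normal(size=p_max) / np.sqrt(p_max)   # dense synthetic signal

X_full = rng.normal(size=(n, p_max))
y = X_full @ beta + 0.5 * rng.normal(size=n)
X_test_full = rng.normal(size=(n_test, p_max))
y_test = X_test_full @ beta                      # noiseless test targets

for p in [10, 25, 40, 50, 60, 100, 200]:
    X, X_test = X_full[:, :p], X_test_full[:, :p]
    b_hat = np.linalg.pinv(X) @ y                # minimum-norm least squares fit
    err = np.mean((X_test @ b_hat - y_test) ** 2)
    print(f"p = {p:4d}  test MSE = {err:10.3f}") # error spikes near p = n = 50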
In this talk, I will present theoretical and methodological results related to these behaviors, focusing primarily on the concrete case of ridge regularization. First, I will identify conditions under which the optimal ridge penalty is zero (or even negative) and show that standard techniques such as leave-one-out and generalized cross-validation, when analytically continued, remain uniformly consistent for the generalization error and thus yield the optimal penalty, whether positive, negative, or zero. Second, I will introduce a general framework for mitigating double/multiple descent in the sample size based on subsampling and ensembling, and show its intriguing connection to ridge regularization. As an implication of this connection, I will show that the generalization error of optimally tuned ridge regression is monotonic in the sample size (under mild data assumptions), which in turn mitigates double/multiple descent. Key to both parts is the role of implicit regularization, either self-induced by the overparameterized data or externally induced by subsampling and ensembling. Finally, I will briefly mention some extensions and variants beyond ridge regularization.
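To make the two ingredients concrete, here are two minimal Python sketches. Both are illustrative only: the data models, grids, and sizes are hypothetical choices rather than the setups of the papers, and the analytic continuation to zero or negative penalties discussed in the talk is not attempted. The first computes generalized cross-validation (GCV) for ridge regression from a single SVD, so sweeping the penalty grid requires no refitting:

# GCV for ridge regression via one SVD (illustrative sketch).
import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 300                                   # overparameterized: p > n
beta = rng.normal(size=p) / np.sqrt(p)
X = rng.normal(size=(n, p))
y = X @ beta + rng.normal(size=n)
X_test = rng.normal(size=(2000, p))
y_test = X_test @ beta

U, s, Vt = np.linalg.svd(X, full_matrices=False)  # X = U diag(s) V^T
Uty = U.T @ y

for lam in [1e-3, 1e-2, 1e-1, 1.0, 10.0]:
    d = s**2 / (s**2 + n * lam)                   # shrinkage factors of the ridge smoother
    beta_hat = Vt.T @ ((d / s) * Uty)             # ridge solution from the SVD
    resid = y - X @ beta_hat
    gcv = (resid @ resid / n) / (1 - d.sum() / n) ** 2
    test_mse = np.mean((X_test @ beta_hat - y_test) ** 2)
    print(f"lambda = {lam:7.3f}  GCV = {gcv:10.3f}  test MSE = {test_mse:.3f}")

The second sketches the subsample-and-ensemble idea: averaging minimum-norm interpolators fit on random subsamples of size k < n acts like an implicit ridge penalty and typically smooths the risk spike that the full-sample fit suffers near the interpolation threshold.

# Subsample-and-ensemble ("subagging") of min-norm least squares (illustrative sketch).
import numpy as np

rng = np.random.default_rng(2)
n, p = 60, 50                                     # full sample sits near p = n
beta = rng.normal(size=p) / np.sqrt(p)
X = rng.normal(size=(n, p))
y = X @ beta + 0.5 * rng.normal(size=n)
X_test = rng.normal(size=(2000, p))
y_test = X_test @ beta

def min_norm_fit(X, y):
    return np.linalg.pinv(X) @ y                  # minimum-norm least squares

b_full = min_norm_fit(X, y)                       # single fit on all n points

k, M = 40, 50                                     # subsample size and ensemble size
fits = []
for _ in range(M):
    idx = rng.choice(n, size=k, replace=False)    # random subsample without replacement
    fits.append(min_norm_fit(X[idx], y[idx]))
b_ens = np.mean(fits, axis=0)                     # average the ensemble members

for name, b in [("full-sample", b_full), ("subagged", b_ens)]:
    print(f"{name:12s} test MSE = {np.mean((X_test @ b - y_test) ** 2):.3f}")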
The talk will feature joint work with the following collaborators (in alphabetical order): Pierre Bellec, Jin-Hong Du, Takuya Koriyama, Arun Kumar Kuchibhotla, Alessandro Rinaldo, Kai Tan, Ryan Tibshirani, Yuting Wei. The corresponding papers (in talk-chronological order) are: optimal ridge landscape (https://pratikpatil.io/papers/ridge-ood.pdf), ridge cross-validation (https://pratikpatil.io/papers/functionals-combined.pdf), risk monotonization (https://pratikpatil.io/papers/risk-monotonization.pdf), ridge equivalences (https://pratikpatil.io/papers/generalized-equivalences.pdf), and extensions and variants (https://pratikpatil.io/papers/cgcv.pdf, https://pratikpatil.io/papers/subagging-asymptotics.pdf).
Zoom link: https://wse.zoom.us/j/92755277282?pwd=iULpLaFnWAcWl6tQYUbeyZaN3zwBzn.1