
AMS Special Seminar Series | Soufiane Hayou

January 30 @ 10:00 am - 11:00 am

Location: Clark 110

When: January 30th at 10:00 a.m.

Title: A Unified Framework for Efficient Learning at Scale

Abstract: State-of-the-art performance in Deep Learning is usually achieved via a series of modifications to existing neural architectures and their training procedures. A common feature of these networks is their large scale: modern neural networks usually have billions – if not hundreds of billions – of trainable parameters. While empirical evaluations generally support the claim that increasing the scale of neural networks (width, depth, etc.) boosts model performance if done correctly, optimizing the training process across different scales remains a significant challenge, and practitioners tend to follow extrapolated scaling rules.
In this talk, I will present a unified framework for efficient learning at large scale. The framework allows us to derive efficient learning rules that automatically adjust to model scale, ensuring stability and optimal performance. By analyzing the interplay between network architecture, optimization dynamics, and scale, we demonstrate how these theoretically grounded learning rules can be applied to both pretraining and finetuning. The results offer new insights into the fundamental principles governing neural network scaling and provide practical guidelines for training large-scale models efficiently.
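To give a concrete flavor of a "learning rule that adjusts to model scale," here is a minimal sketch of muP-style per-layer learning-rate scaling (in the spirit of the hyperparameter-transfer work mentioned in the bio). The function name, base width, and rate values are illustrative assumptions, not the speaker's actual framework.

```python
def mup_adam_lr(base_lr: float, base_width: int, width: int, layer: str) -> float:
    """Return an Adam learning rate adjusted for model width.

    Under muP, hidden-layer (matrix-like) parameters have their Adam
    learning rate scaled by base_width/width relative to the width at
    which the rate was tuned, while input/output (vector-like)
    parameters keep the base rate. Names here are illustrative.
    """
    if layer == "hidden":
        return base_lr * base_width / width
    return base_lr  # embedding / readout parameters keep the base rate

# Example: a rate tuned at width 256 transfers to width 1024.
print(mup_adam_lr(1e-3, 256, 1024, "hidden"))  # 0.00025
print(mup_adam_lr(1e-3, 256, 1024, "embed"))   # 0.001
```

The point of such rules is that hyperparameters tuned on a small proxy model remain near-optimal at larger widths, avoiding a fresh sweep at every scale.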

Bio: Soufiane Hayou is currently a postdoctoral researcher at the Simons Institute, UC Berkeley. Before that, he was a visiting assistant professor of mathematics at the National University of Singapore for three years. He obtained his PhD in statistics and machine learning in 2021 from the University of Oxford, and graduated from École Polytechnique (France) in 2018 before joining Oxford. His research is mainly focused on the theory and practice of learning at scale: theoretical analysis of large-scale neural networks with the goal of obtaining principled methods for training and finetuning. Topics include depth scaling (Stable ResNet), hyperparameter transfer (Depth-muP parametrization), efficient finetuning (LoRA+), etc.

Zoom link: https://wse.zoom.us/j/92755277282?pwd=iULpLaFnWAcWl6tQYUbeyZaN3zwBzn.1

Venue

Clark 110
3400 North Charles Street
Baltimore, Maryland 21218