Location: Olin 305

When: March 28, 2024 at 1:30 p.m.

Title: Deep neural network stability at initialization: Nonlinear activations' impact on the Gaussian process

Abstract: Randomly initialised deep neural networks are known to generate a Gaussian process in their pre-activation intermediate layers. We will review this line of research, with extensions to deep networks whose weight matrices have structured random entries, such as block-sparse or low-rank matrices. We will then discuss how the choice of nonlinear activation impacts the evolution of the Gaussian process. Specifically, we will discuss why sparsifying nonlinear activations such as soft thresholding are unstable, show conditions to overcome such issues, and show how non-sparsifying activations can be improved to be more stable when acting on a data manifold.
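
For context, the following is a minimal numerical sketch (not part of the talk materials, and not the speaker's code) of the behaviour the abstract refers to: with i.i.d. Gaussian weights and large layer width, the pre-activations at each layer are approximately Gaussian at initialisation, and the choice of nonlinear activation governs how their variance evolves with depth. The widths, depths, and threshold value below are illustrative assumptions.

import numpy as np

# Sketch: track the empirical variance of pre-activations layer by layer in a
# randomly initialised fully connected network, comparing a non-sparsifying
# activation (tanh) with a sparsifying one (soft thresholding).

rng = np.random.default_rng(0)

def soft_threshold(x, tau=0.5):
    # Sparsifying activation: entries with magnitude below tau are set exactly to zero.
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def layerwise_variances(activation, width=1000, depth=10, sigma_w=1.0):
    # Propagate one input through a random network; record the empirical
    # variance of the pre-activations z = W h at each layer. For large width,
    # each z is approximately Gaussian at initialisation.
    h = rng.standard_normal(width)
    h /= np.linalg.norm(h) / np.sqrt(width)   # normalise so entries have unit mean square
    variances = []
    for _ in range(depth):
        W = (sigma_w / np.sqrt(width)) * rng.standard_normal((width, width))
        z = W @ h
        variances.append(float(z.var()))
        h = activation(z)
    return variances

print("tanh           :", np.round(layerwise_variances(np.tanh), 3))
print("soft threshold :", np.round(layerwise_variances(soft_threshold), 3))

Printing the layer-wise variances for the two activations shows how the variance recursion of the induced Gaussian process evolves differently with depth depending on the choice of activation, which is the stability question the talk addresses.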

Bio: Prof. Jared Tanner is the Professor of the Mathematics of Information at the University of Oxford Mathematical Institute, where he leads the Machine Learning and Data Science Research Group. His research focuses on the design, analysis, and application of algorithms for the processing of information. His current work centres on understanding how to improve the stability and computational efficiency of deep networks, including both theory and experimental work on the design of network activations and weight structures. He is currently applying these techniques to medical imaging, multispectral sensing, and hardware-aware deep learning algorithms. Previous research contributions include theory, algorithms, and applications of compressed sensing, matrix completion, low-rank plus sparse models, and grid-free super-resolution.

Zoom link: https://wse.zoom.us/j/94601022340