Dissertation Defense: Arun Nair
Note: This is a virtual presentation.
Title: Machine Learning for Beamforming in Ultrasound, Radar, and Audio
Abstract: Multi-sensor signal processing plays a crucial role in many everyday technologies, from correctly understanding speech on smart home devices to ensuring aircraft fly safely. A specific type of multi-sensor signal processing called beamforming forms a central part of this thesis. Beamforming combines the information from several spatially distributed sensors to filter signals by direction, boosting the signal from a chosen direction while suppressing signals from other directions. The idea of beamforming is key to the domains of ultrasound, radar, and audio.
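As a rough illustration of the directional filtering described above, the following minimal sketch implements a classical narrowband delay-and-sum beamformer for a uniform linear array. It is not the method developed in the dissertation. The array size, the half-wavelength sensor spacing, the angles, and the function names `steering_vector` and `delay_and_sum_gain` are all assumptions chosen for illustration.

```python
import cmath
import math


def steering_vector(angle_deg, num_sensors, spacing_wavelengths=0.5):
    """Per-sensor phase of a narrowband plane wave on a uniform linear array.

    Assumes a far-field source and a sensor spacing given in wavelengths
    (0.5 is an illustrative choice, not a value from the abstract).
    """
    phase_step = 2.0 * math.pi * spacing_wavelengths * math.sin(math.radians(angle_deg))
    return [cmath.exp(-1j * phase_step * n) for n in range(num_sensors)]


def delay_and_sum_gain(look_deg, arrival_deg, num_sensors=8):
    """Magnitude response of a beamformer steered to look_deg for a
    unit-amplitude plane wave arriving from arrival_deg."""
    weights = steering_vector(look_deg, num_sensors)
    signal = steering_vector(arrival_deg, num_sensors)
    # Undo the phase expected from the look direction (conjugate weights),
    # then average across sensors.
    combined = sum(w.conjugate() * s for w, s in zip(weights, signal)) / num_sensors
    return abs(combined)


gain_on_target = delay_and_sum_gain(look_deg=20.0, arrival_deg=20.0)
gain_off_target = delay_and_sum_gain(look_deg=20.0, arrival_deg=-40.0)
print(f"on-target gain:  {gain_on_target:.3f}")
print(f"off-target gain: {gain_off_target:.3f}")
```

When the arrival direction matches the look direction, the per-sensor contributions add in phase and the gain is 1. For other directions they partially cancel, which is the suppression the abstract refers to.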
Machine learning, succinctly defined by Tom Mitchell as “the study of algorithms that improve automatically through experience,” is the other central part of this thesis. Machine learning, especially its sub-field of deep learning, has enabled breakneck progress on several problems that were previously thought intractable. Today, machine learning powers many of the cutting-edge systems we see on the internet for image classification, speech recognition, language translation, and more.
In this dissertation, we look at beamforming pipelines in ultrasound, radar, and audio through a machine learning lens and endeavor to improve different parts of these pipelines using ideas from machine learning. Starting in the ultrasound domain, we use deep learning as an alternative to beamforming and improve the information extraction pipeline by simultaneously generating both a segmentation map and a high-quality B-mode image directly from raw received ultrasound data.
Next, we move to the radar domain and study how deep learning can be used to improve signal quality in ultra-wideband synthetic aperture radar by suppressing radio frequency interference, random spectral gaps, and contiguous block spectral gaps. Because the networks are trained and applied on raw single-aperture data prior to beamforming, the approach can work with myriad sensor geometries and different beamforming equations, a crucial requirement in synthetic aperture radar.
Finally, we move to the audio domain and derive a machine-learning-inspired beamformer to tackle the problem of ensuring that the audio captured by a camera matches its visual content, a problem we term audiovisual zoom. Unlike prior work, which can enhance only a few individual directions, our method enhances audio from a contiguous field of view.
- Trac Tran, Department of Electrical and Computer Engineering
- Muyinatu Bell, Department of Electrical and Computer Engineering
- Vishal Patel, Department of Electrical and Computer Engineering