Thesis Proposal: Nanxin Chen
Mar 18 @ 3:00 pm

Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.

Title: Towards End-to-end Non-autoregressive speech applications

Abstract: Sequence labeling is a fascinating and challenging topic in the speech research community. Sequence-to-sequence models, a particularly popular family of end-to-end models, have been proposed for various sequence labeling tasks. Autoregressive models are the dominant approach: they predict the labels one by one, conditioning on previous results. This makes training easier and more stable. However, this simplicity also makes inference inefficient, particularly for lengthy output sequences. To speed up inference, researchers have become interested in another type of sequence-to-sequence model, known as non-autoregressive models. In contrast to autoregressive models, non-autoregressive models predict the whole sequence within a constant number of iterations.

In this proposal, two different types of non-autoregressive models for speech applications are proposed: a mask-based approach and a noise-based approach. To demonstrate the effectiveness of the two proposed methods, we explore their use in two important tasks: speech recognition and speech synthesis. Experiments reveal that the proposed methods can match the performance of state-of-the-art autoregressive models with a much shorter inference time.
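To make the contrast in decoding cost concrete, the following is a minimal sketch, not the method of the proposal. It compares a one-token-per-call autoregressive loop with a generic mask-and-refine loop that fills all masked positions in parallel and re-masks the least confident ones for a fixed number of iterations. The two "models" (`toy_parallel_model`, `toy_next_token_model`), the target string, and the linear re-masking schedule are all illustrative assumptions.

```python
MASK = "<mask>"
TARGET = list("speech recognition")  # toy reference output (assumption)


def toy_parallel_model(tokens):
    """Hypothetical stand-in for a trained non-autoregressive model.

    Returns one (token, confidence) pair per position in a single call.
    The confidences are arbitrary but deterministic.
    """
    return [(TARGET[i], ((i * 7) % 11 + 1) / 12) for i in range(len(tokens))]


def toy_next_token_model(prefix):
    """Hypothetical stand-in for a trained autoregressive model."""
    return TARGET[len(prefix)]


def autoregressive_decode(length, predict_next):
    """Predict one label per model call, conditioning on the prefix so far."""
    tokens, calls = [], 0
    for _ in range(length):
        tokens.append(predict_next(tokens))
        calls += 1
    return tokens, calls


def mask_refine_decode(length, predict_all, iterations):
    """Fill masked positions in parallel, then re-mask the least confident."""
    tokens = [MASK] * length
    confidence = [0.0] * length
    calls = 0
    for step in range(iterations):
        predictions = predict_all(tokens)  # one parallel call per iteration
        calls += 1
        for i, (token, score) in enumerate(predictions):
            if tokens[i] == MASK:
                tokens[i], confidence[i] = token, score
        # The number of re-masked positions decays linearly to zero.
        n_remask = length * (iterations - 1 - step) // iterations
        order = sorted(range(length), key=lambda i: confidence[i])
        for i in order[:n_remask]:
            tokens[i] = MASK
    return tokens, calls


ar_tokens, ar_calls = autoregressive_decode(len(TARGET), toy_next_token_model)
nar_tokens, nar_calls = mask_refine_decode(len(TARGET), toy_parallel_model, 4)
print("autoregressive model calls:", ar_calls)
print("mask-and-refine model calls:", nar_calls)
```

In this toy setting the autoregressive loop needs one model call per output label, while the mask-and-refine loop needs only as many calls as the chosen iteration count, regardless of sequence length.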

Committee Members

  • Najim Dehak, Department of Electrical and Computer Engineering
  • Sanjeev Khudanpur, Department of Electrical and Computer Engineering
  • Hynek Hermansky, Department of Electrical and Computer Engineering
  • Jesús Villalba, Department of Electrical and Computer Engineering
Thesis Proposal: Blake Dewey
Mar 25 @ 3:00 pm

Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.

Title: Harmonization of Structural MRI for Consistent Image Analysis

Abstract: Magnetic resonance imaging (MRI) is a flexible, non-invasive medical imaging modality that uses strong magnetic fields and radio-frequency pulses to produce images with excellent contrast in the soft tissues of the body. MRI is commonly used in diagnosis and monitoring of many conditions, but is especially useful in disorders of the central nervous system, such as multiple sclerosis (MS), where the brain and spinal cord are heavily involved. An MRI scan normally contains a number of imaging volumes, where different pulse sequence parameters are selected to highlight different tissue properties. These volumes can then be used together to provide complementary information about the imaged area. Flexible design of the imaging system allows for a variety of questions to be answered during a single scanning session, but also comes with a cost. As there are many parameters to define when designing an imaging sequence, there is no common standard that is widely used. These differences lead to variability in image appearance between manufacturers, imaging centers, and even individual scanners. As an example, a commonly acquired MR volume is a T1-weighted image, where differences in a specific magnetic property (longitudinal relaxation time, or T1) are highlighted. However, this general effect can be achieved with a myriad of different pulse sequences even before the individual parameters are considered. This is perhaps most apparent in the difference between T1-weighted images with and without a preparatory inversion pulse, where images with an inversion pulse tend to have a much clearer contrast between grey and white matter in the brain. With the advent of advanced machine learning methods, variations such as the example above create a large problem, as accurate methods become closely tied to the data used to train them and any variation in inputs can have unknown effects on output quality. This problem sets the stage for image harmonization, where synthetic “harmonized” images are produced after acquisition to provide consistent inputs to image analysis routines.

This thesis aims to develop harmonization strategies for structural brain MR images that will allow for the synthesis of harmonized images from differing inputs. These images can then be used downstream in automated analysis pipelines, most commonly whole-brain segmentation for volumetric analysis. Recently, deep learning-based techniques have been shown to be excellent candidates in the realm of image synthesis and can be readily incorporated in harmonization tasks. However, this is complicated, as training data (especially in multi-site settings) is rarely available. This work will approach these problems by covering three main topics:

  1. Development of a supervised harmonization technique for structural MRI that utilizes overlapping subjects scanned using multiple protocols.
  2. Development of a semi-supervised learning strategy that exploits existing multi-contrast MRI information within a single scan session to perform the harmonization task without overlapping subjects.
  3. Demonstration of the feasibility of MRI harmonization from the viewpoint of clinical research through validation and investigation in real-world data samples.

Committee Members

  • Jerry L. Prince, Department of Electrical and Computer Engineering
  • Vishal M. Patel, Department of Electrical and Computer Engineering
  • Muyinatu A. Lediju Bell, Department of Electrical and Computer Engineering
  • Peter C.M. van Zijl, Department of Radiology and Radiological Sciences
  • Peter A. Calabresi, Department of Neurology
Thesis Proposal: Shoujing Guo
Mar 25 @ 3:00 pm

Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.

Title: Intraoperative Optical Coherence Tomography Guided Deep Anterior Lamellar Keratoplasty

Abstract: Deep anterior lamellar keratoplasty (DALK) is a highly challenging procedure requiring micron accuracy to guide a “big bubble” needle into the stroma of the cornea down to Descemet’s Membrane (DM). It has important advantages over penetrating keratoplasty (PK), including a lower rejection rate, less endothelial cell loss, and increased graft survival. Currently, this procedure relies heavily on visualization through a surgical microscope, the surgeon’s own surgical experience, and tactile feel to determine the relative position of the needle and DM. Optical coherence tomography (OCT) is a well-established, non-invasive optical imaging technology that can provide high-speed, high-resolution, three-dimensional images of biological samples. Since it was first demonstrated in 1991, OCT has emerged as a leading technology for ophthalmic visualization, especially for retinal structures, and has been widely applied in ophthalmic surgery and research. Common-path (CP) OCT systems use a single A-scan image to deduce the tissue layer information and can be operated at a much higher speed. This synergizes well with handheld tools and automated surgical systems, which require a fast response time. CP-OCT has been integrated into a wide range of microsurgical tools for procedures such as epiretinal membrane peeling and subretinal injection.

In this proposal, a common-path swept-source OCT system (CP-SSOCT) is proposed to guide DALK procedures. An OCT distal-sensor-integrated needle and an OCT-guided micro-control ocular surgical system (AUTO-DALK) will be designed and evaluated. This device will allow for the autonomous insertion of a needle for pneumo-dissection based on the depth-sensing results from the OCT system. An earlier prototype of AUTO-DALK was tested on ex vivo porcine corneas, including a comparison with expert manual needle insertion. The results showed that the precision and consistency of needle placement increased, which could lead to better visual outcomes and fewer complications. Future work will include improving the overall design for in vivo testing and clinical use, advanced convolutional neural network based tracking, and system validation on a larger sample size.

Committee Members

  • Jin U. Kang (adviser), Department of Electrical and Computer Engineering
  • Israel Gannot, Department of Electrical and Computer Engineering
  • Xingde Li, Department of Biomedical Engineering

Thesis Proposal: Alycen Wiacek
Apr 1 @ 3:00 pm

Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.

Title: Coherence-based learning from raw ultrasound data for breast mass diagnosis

Abstract: Breast cancer is the most prevalent cancer among women in the United States, with approximately one in eight women being diagnosed in their lifetimes. Imaging modalities such as mammography, MRI, and ultrasound are employed to non-invasively visualize breast masses in order to determine the need for a biopsy. However, each of these methods results in a significant number of patients requiring biopsies of benign masses. Ultrasound in particular is praised for its low cost, painlessness, and portability, yet the false positive rate of breast ultrasound can be as high as 93% depending on the type of mass in question. Most commonly, diagnosis is performed using the brightness-mode (B-mode) image present on most clinical ultrasound scanners, which transitions naturally to the use of B-mode images for segmentation and classification of breast masses. Ultimately, segmentation and classification of breast masses can be summarized as analysis of a grayscale image. While this approach has been successful, information is lost during the B-mode image formation process.

An alternative approach to the lossy process of information extraction from B-mode images is to leverage features (e.g., spatial coherence) of backscattered ultrasound waves to determine the content of a breast mass. I will first describe my contributions to improve the diagnostic quality of breast ultrasound images by leveraging spatial coherence information. Next, I will present my deep learning approach to overcome limitations with real-time implementation of coherence-based imaging techniques. Finally, I will present a new method to learn the high-dimensional features encoded within backscattered ultrasound waves in order to differentiate benign from malignant breast masses.

Committee Members

  • Muyinatu Bell, Department of Electrical and Computer Engineering
  • Vishal Patel, Department of Electrical and Computer Engineering
  • Najim Dehak, Department of Electrical and Computer Engineering
Thesis Proposal: Arlene Chiu
Apr 8 @ 3:00 pm

Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.

Title: Engineering Colloidal Quantum-Confined Nanomaterials for Multi-junction Solar Cell Applications

Abstract: Current single junction solar cell technologies are rapidly approaching their theoretical limits of approximately 33% power conversion efficiency. Semiconductor nanoparticles such as colloidal quantum dots (CQDs) are of interest for photovoltaic applications due to their infrared absorption, size-tunable optical properties and low-cost solution processability. Lead sulfide (PbS) CQDs offer the potential to increase solar cell efficiencies via multi-junction architectures due to these properties. This project aims to develop new strategies for implementing PbS CQDs as a material for multi-junction architectures to improve solar cell efficiencies and expand potential applications.

The first phase of the proposed research begins with developing a better-performing single junction PbS CQD solar cell by improving the performance-limiting hole transport layer (HTL) in these devices. We will employ two methods to improve and replace this layer. First, we will use sulfur infusion via electron beam evaporation to alter the stoichiometry of the standard HTL. We also plan to completely replace the standard HTL with 2D nanoflakes of tungsten diselenide, an atomically thin semiconducting transition metal dichalcogenide. The second phase of the research involves developing a PbS CQD multi-junction solar cell, including a novel recombination layer. The third phase of the research involves developing a hybrid multi-junction strategy in which PbS CQD films employing photonic band engineering for spectral selectivity serve as the infrared cell and other materials serve as the visible cell. The ultimate goal of these three research phases is to use photonic and materials engineering to improve efficiency and flexibility in CQD-based multi-junction solar cells to meet the demand for affordable, sustainable solar energy.

Committee Members

  • Susanna Thon, Department of Electrical and Computer Engineering
  • Jacob Khurgin, Department of Electrical and Computer Engineering
  • Amy Foster, Department of Electrical and Computer Engineering
Thesis Proposal: Sanjukta Nandi Bose
Apr 15 @ 3:00 pm

Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.

Title: Early prediction of adverse clinical events and optimal intervention in ICUs

Abstract: Personalized healthcare is a rapidly evolving research area with tremendous potential for optimizing patient care strategies and improving patient outcomes. Traditionally, clinical decision making relies on assessment and intervention based on the collective experience of physicians. Using big-data analytics techniques, we can now harness data-driven models to enable early prediction of patients at risk of adverse clinical events. These predictive models can provide timely analytical information to physicians facilitating early therapeutic intervention and efficient management of patients in intensive care units (ICUs).

In addition to early prediction, it is equally important to optimize intervention strategies for critically ill patients. One such urgent need is to optimally oxygenate COVID-19 patients diagnosed with acute respiratory distress syndrome (ARDS). Moderate to severe ARDS patients generally require mechanical ventilation to improve oxygen saturation and to reduce the risk of organ failure and death. The most common ventilator settings across all modes of mechanical ventilation are positive end-expiratory pressure (PEEP) and fraction of inspired oxygen (FiO2). Increasing either of these settings is expected to increase oxygen saturation. However, prolonged ventilation of patients with high PEEP and FiO2 significantly increases the risk of ventilator associated lung injury. Therefore, an optimal strategy is required to improve patient outcomes.

This thesis presents two overarching aims: (1) early prediction of adverse events and (2) optimal intervention for mechanically ventilated patients. In contrast to fixed lead-time prediction models in prior work, our methodology proposes a new framework which hypothesizes the presence of a time-varying pre-event physiologic state that differentiates the target patients from the control group. We also present a unique approach to patient risk-stratification using an unsupervised clustering technique that could enable identification of a high-risk group among all positive predicted cases, with a positive predictive value of more than 93% when applied to multiple organ dysfunction prediction.
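For readers unfamiliar with the metric, positive predictive value (PPV) is the fraction of predicted-positive cases that are truly positive, TP / (TP + FP). The sketch below computes it for all predicted positives and for a subgroup. The counts are made up purely for illustration and are not results from this work; the function name is likewise an assumption.

```python
def positive_predictive_value(true_positives, false_positives):
    """Return TP / (TP + FP), the share of positive predictions that are correct."""
    predicted_positive = true_positives + false_positives
    if predicted_positive == 0:
        raise ValueError("PPV is undefined when there are no positive predictions")
    return true_positives / predicted_positive


# Hypothetical counts, for illustration only (not data from the thesis).
ppv_all_predicted = positive_predictive_value(true_positives=60, false_positives=40)
ppv_high_risk_group = positive_predictive_value(true_positives=19, false_positives=1)

print(f"PPV over all predicted positives: {ppv_all_predicted:.2f}")
print(f"PPV within the high-risk subgroup: {ppv_high_risk_group:.2f}")
```

The point of the illustration is only that a subgroup selected from the predicted positives can have a higher PPV than the predicted positives as a whole.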

In the second aim, we propose a novel application of data-driven linear parameter varying (LPV) systems to capture the time-varying dynamics of oxygen saturation in response to ventilator settings as the physiological state of a patient changes, and we compare these models with linear time invariant models. Most prior studies on closed loop ventilator control have used stepwise, rule-based procedures, fuzzy logic, and a combination of rule-based methods and proportional integral derivative (PID) controllers for closed loop control of FiO2. Other studies have worked on control strategies based on ventilator measured variables and on various mathematical lung models. In contrast, we design optimal closed-loop ventilator strategies that are model based. A simulation of optimal ventilation settings for maintaining desired oxygen saturation using feedback control of LPV systems is presented.

Committee Members

  • Raimond L. Winslow, Department of Biomedical Engineering
  • Sridevi V. Sarma, Department of Biomedical Engineering
  • Enrique Mallada, Department of Electrical and Computer Engineering
  • Melania M. Bembea, Department of Anesthesiology and Critical Care Medicine
Thesis Proposal: Michelle Graham
Apr 29 @ 3:00 pm

Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.

Title: Photoacoustic imaging to detect major blood vessels and nerves during neurosurgery and head and neck surgery

Abstract: Real-time intraoperative guidance during minimally invasive neurosurgical and head and neck procedures is often limited to endoscopy, CT-guided image navigation, and electromyography, which are generally insufficient to locate major blood vessels and nerves hidden by tissue. Accidental damage to these hidden structures has incidence rates of 6.8% in surgeries to remove pituitary tumors (i.e., endonasal transsphenoidal surgery) and 3-4% in surgeries to remove parotid tumors (i.e., parotidectomy), often resulting in severe consequences, such as patient blindness, paralysis, and death. Photoacoustic imaging is a promising emerging imaging technique to provide real-time guidance of subsurface blood vessels and nerves during these surgeries.

Limited optical penetration through bone and the presence of acoustic clutter, reverberations, aberration, and attenuation can degrade photoacoustic image quality and potentially corrupt the usefulness of this promising intraoperative guidance technique. In order to mitigate image degradation, photoacoustic imaging system parameters may be adjusted and optimized to cater to the specific imaging environment. In particular, parameter adjustment can be categorized into the optimization of photoacoustic signal generation and the optimization of photoacoustic image formation (i.e., beamforming) and image display methods.

In this talk, I will describe my contributions to leverage amplitude- and coherence-based beamforming techniques to improve photoacoustic image display for the detection of blood vessels during endonasal transsphenoidal surgery. I will then present my contributions to the derivation of a novel photoacoustic spatial coherence theory, which provides a fundamental understanding critical to the optimization of coherence-based photoacoustic images. Finally, I will present a plan to translate this work from the visualization of blood vessels during neurosurgery to the visualization of nerves during head and neck surgery. Successful completion of this work will lay the foundation necessary to introduce novel, intraoperative, photoacoustic image guidance techniques that will eliminate the incidence of accidental injury to major blood vessels and nerves during minimally invasive surgeries.

Committee Members

  • Muyinatu Bell, Department of Electrical and Computer Engineering
  • Xingde Li, Department of Biomedical Engineering
  • Jin Kang, Department of Electrical and Computer Engineering
Thesis Proposal: Honghua Guan
Jul 6 @ 12:30 pm

Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.

Title: High-throughput Optical Explorer in Freely-behaving Rodents

Abstract: One critical goal for neuroscience is to explore the mechanisms underlying neuronal information processing. A brain imaging tool capable of recording clear neuronal signals over prolonged periods is therefore of great significance. Among different imaging modalities, multiphoton microscopy has become the method of choice for in vivo brain applications owing to its subcellular resolution, optical sectioning, and deep penetration. The current experimental routine, however, requires head-fixation of animals during data acquisition. This configuration inevitably introduces unwanted stress and limits many behavior studies, such as those of social interaction. The scanning two-photon fiberscope is a promising technical direction to bridge this gap. Benefiting from its ultra-compact design and light weight, it is an ideal optical brain imaging modality to assess dynamic neuronal activities in freely-behaving rodents with subcellular resolution. One significant challenge with the compact scanning two-photon fiberscope is its suboptimal imaging throughput due to the limited choices of miniature optomechanical components.

In this project, we present a compact multicolor two-photon fiberscope platform. We achieve three-wavelength excitation by synchronizing the pulse trains from a femtosecond OPO and its pump. The imaging results demonstrate that we can excite several different fluorescent proteins simultaneously with an optimal excitation efficiency. In addition, we propose a deep neural network (DNN) based solution that significantly improves the imaging frame rate with minimal loss in image quality. This innovation enables 10-fold speed enhancement for the scanning two-photon fiberscope, making it feasible to perform video-rate (26 fps) two-photon imaging in freely-moving mice with excellent imaging resolution and SNR that were previously not possible.

Committee Members

  • Xingde Li, Department of Biomedical Engineering
  • Mark Foster, Department of Electrical and Computer Engineering
  • Jin U. Kang, Department of Electrical and Computer Engineering
  • Israel Gannot, Department of Electrical and Computer Engineering
  • Hui Lu, Department of Pharmacology and Physiology, George Washington University
Closing Ceremonies for Computational Sensing and Medical Robotics (CSMR) REU
Aug 6 @ 9:00 am – 3:00 pm

The closing ceremonies of the Computational Sensing and Medical Robotics (CSMR) REU are set to take place Friday, August 6 from 9am until 3pm at this Zoom link. Seventeen undergraduate students from across the country are eager to share the culmination of their work for the past 10 weeks this summer.

The schedule for the day is listed below, but each presentation is featured in more detail in the program. Please invite your students and faculty, and feel free to distribute this flyer to advertise the event.

We would love for everyone to come learn about the amazing summer research these students have been conducting!


2021 REU Final Presentations

| Time | Presenter | Project Title | Faculty Mentor | Student/Postdoc/Research Engineer Mentors |
| --- | --- | --- | --- | --- |
| | Ben Frey | Deep Learning for Lung Ultrasound Imaging of COVID-19 Patients | Muyinatu Bell | Lingyi Zhao |
| | Camryn Graham | Optimization of a Photoacoustic Technique to Differentiate Methylene Blue from Hemoglobin | Muyinatu Bell | Eduardo Gonzalez |
| | Ariadna Rivera | Autonomous Quadcopter Flying and Swarming | Enrique Mallada | Yue Shen |
| | Katie Sapozhnikov | Force Sensing Surgical Drill | Russell Taylor | Anna Goodridge |
| | Savannah Hays | Evaluating SLANT Brain Segmentation using CALAMITI | Jerry Prince | Lianrui Zuo |
| | Ammaar Firozi | Robustness of Deep Networks to Adversarial Attacks | René Vidal | Kaleab Kinfu, Carolina Pacheco |
| 10:30 | Break | | | |
| | Karina Soto Perez | Brain Tumor Segmentation in Structural MRIs | Archana Venkataraman | Naresh Nandakumar |
| | Jonathan Mi | Design of a Small Legged Robot to Traverse a Field of Multiple Types of Large Obstacles | Chen Li | Ratan Othayoth, Yaqing Wang, Qihan Xuan |
| | Arko Chatterjee | Telerobotic System for Satellite Servicing | Peter Kazanzides, Louis Whitcomb, Simon Leonard | Will Pryor |
| | Lauren Peterson | Can a Fish Learn to Ride a Bicycle? | Noah Cowan | Yu Yang |
| | Josiah Lozano | Robotic System for Mosquito Dissection | Russell Taylor, Iulian Iordachita | Anna Goodridge |
| | Zulekha Karachiwalla | Application of dual modality haptic feedback within surgical robotic | Jeremy Brown | |
| 12:15 | Break | | | |
| | James Campbell | Understanding Overparameterization from Symmetry | René Vidal | Salma Tarmoun |
| | Evan Dramko | Establishing FDR Control For Genetic Marker Selection | Soledad Villar, Jeremias Sulam | N/A |
| | Chase Lahr | Modeling Dynamic Systems Through a Classroom Testbed | Jeremy Brown | Mohit Singhala |
| | Anire Egbe | Object Discrimination Using Vibrotactile Feedback for Upper Limb Prosthetic Users | Jeremy Brown | |
| | Harrison Menkes | Measuring Proprioceptive Impairment in Stroke Survivors (Pre-Recorded) | Jeremy Brown | |
| 3:00 | Winner Announced | | | |

Thesis Proposal: Jaejin Cho
Sep 23 @ 3:00 pm

Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.

Title: Improving speaker embedding in speaker verification: Beyond speaker discriminative training

Abstract: Speaker verification (SV) is a task to verify a claimed identity from the voice signal. A well-performing SV system requires a method to transform a variable-length recording into a fixed-length representation (a.k.a. embedding vector), compacting the speaker biometric information that captures distinctive features over different speakers. There are two popular methods: i-vector and x-vector. Although i-vector is still used nowadays, x-vector outperforms i-vector in many SV tasks as deep learning research surges. The x-vector, however, has limitations, and we mainly tackle two of them in this proposal: 1) the embedding still includes information about the spoken text, 2) it cannot leverage data that do not have speaker labels since the training requires the labels.

In the first half, we tackle the text-dependency in the x-vector speaker embedding. Spoken text remaining in the x-vector can degrade its performance in text-independent SV because utterances of the same speaker may have different embeddings due to different spoken text. This could lead to a false rejection, i.e., the system rejects a valid target speaker. To tackle this issue, we propose to disentangle the spoken text and speaker identity into separate latent factors using a text-to-speech (TTS) model. First, the multi-speaker end-to-end TTS system has text and speech encoders, each of which focuses on encoding information in its corresponding modality. These encoders enable text-independent speaker embedding learning by reconstructing the frames of a target speech segment, given a speaker embedding of another speech segment of the same utterance. Second, considerable effort in neural TTS research over recent years has improved speech synthesis quality. We hypothesize that speech synthesis and speaker embedding qualities positively correlate, since the speaker encoder in a TTS system needs to learn well for better speech synthesis of multiple speakers. We confirm the above two points through a series of experiments.

In the second half, we focus on leveraging unlabeled data to learn embeddings. Considering that much more unlabeled data exists than labeled data, leveraging the unlabeled data is essential, which is not straightforward with x-vector training. This, however, is possible with the proposed TTS method. First, we show how to use the TTS method for this purpose. The results show that it can leverage the unlabeled data, but it still requires some labeled data to post-process the embeddings for the final SV system. To develop a completely unsupervised SV system, we apply a self-supervised technique proposed in computer vision research, distillation with no labels (DINO), and compare this to the TTS method. The results show that the DINO method outperforms the TTS method in unsupervised scenarios and enables SV with no labels.

Future work will focus on 1) exploring the DINO-based method in semi-supervised scenarios and 2) fine-tuning the network for downstream tasks such as emotion recognition.
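The basic verification step the abstract builds on, mapping a variable-length recording to a fixed-length embedding and then comparing two embeddings, can be sketched as follows. This is a toy illustration under stated assumptions, not the x-vector, TTS, or DINO systems described above: mean pooling stands in for a trained speaker encoder, cosine similarity is used as one common scoring choice, and the threshold value and all function names are hypothetical.

```python
import math


def mean_pool(frames):
    """Toy stand-in for a speaker encoder: average frame vectors of any count
    into a single fixed-length vector."""
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / len(frames) for d in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def verify(enroll_embedding, test_embedding, threshold=0.5):
    """Accept the claimed identity if the score reaches a (hypothetical) threshold."""
    return cosine_similarity(enroll_embedding, test_embedding) >= threshold


# Recordings of different lengths map to embeddings of the same size.
enroll = mean_pool([[1.0, 0.0], [0.8, 0.2], [0.9, 0.1]])
same_speaker = mean_pool([[0.9, 0.1], [1.0, 0.0]])
other_speaker = mean_pool([[0.0, 1.0], [0.1, 0.9], [0.0, 1.0], [0.2, 0.8]])

print(verify(enroll, same_speaker), verify(enroll, other_speaker))
```

A false rejection, as discussed in the abstract, corresponds to `verify` returning `False` for two recordings of the same speaker.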

Committee Members

  • Najim Dehak, Department of Electrical and Computer Engineering
  • Jesús Villalba, Department of Electrical and Computer Engineering
  • Sanjeev Khudanpur, Department of Electrical and Computer Engineering
  • Hynek Hermansky, Department of Electrical and Computer Engineering
  • Laureano Moro-Velazquez, Department of Electrical and Computer Engineering