Calendar

Nov
5
Thu
Thesis Proposal: Jeff Craley
Nov 5 @ 3:00 pm

Note: This is a virtual presentation. Here is the link for where the presentation will be taking place. 

Title: Localizing Seizure Foci with Deep Neural Networks and Graphical Models

Abstract: Worldwide estimates of the prevalence of epilepsy range from 1-3% of the total population, making it one of the most common neurological disorders. With its wide prevalence and dramatic effects on quality of life, epilepsy represents a large and ongoing public health challenge. Critical to the treatment of focal epilepsy is the localization of the seizure onset zone. The seizure onset zone is defined as the region of the cortex responsible for the generation of seizures. In the clinic, scalp electroencephalography (EEG) recording is the first modality used to localize the seizure onset zone.

My work focuses on developing machine learning techniques to localize this zone from these recordings. Using Bayesian techniques, I will present graphical models designed to capture the observed spreading of seizures in clinical EEG recordings. These models directly encode clinically observed seizure spreading phenomena to capture seizure onset and evolution. Using neural networks, the raw EEG signal is evaluated for seizure activity. In this talk I will propose extensions to these techniques employing semi-supervised learning and architectural improvements for training sophisticated neural networks designed to analyze scalp EEG signals. In addition, I will propose modeling improvements to current graphical models for evaluating the confidence of localization results.

Committee Members

Archana Venkataraman (Department of Electrical and Computer Engineering)

Sri Sarma (Department of Biomedical Engineering)

Rene Vidal (Department of Biomedical Engineering)

Richard Leahy (Department of Electrical Engineering Systems – University of Southern California)

Thesis Proposal: Yan Jiang
Nov 5 @ 3:00 pm

Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.

Title: Leveraging Inverter-Based Frequency Control in Low-Inertia Power Systems

Abstract: The shift from conventional synchronous generation to renewable converter-interfaced sources has led to a noticeable degradation of power system frequency dynamics. Fortunately, recent technology advancements in power electronics and electric storage facilitate the potential to enable higher renewable energy penetration by means of inverter-interfaced storage units. With proper control approaches, fast inverter dynamics can ensure the rapid response of storage units to mitigate degradation. A straightforward choice is to emulate the damping effect and/or inertial response of synchronous generators through droop control or virtual inertia, yet they do not necessarily fully exploit the benefits of inverter-interfaced storage units. For instance, droop control sacrifices steady-state effort share to improve dynamic performance, while virtual inertia amplifies frequency measurement noise. This work thus seeks to challenge this naive choice of mimicking synchronous generator characteristics and instead advocate for a principled control design perspective. To achieve this goal, we build our analysis upon quantifying power network dynamic performance using $\mathcal L_2$ and $\mathcal L_\infty$ norms so as to perform a systematic study evaluating the effect of different control approaches on both frequency response metrics and storage economic metrics. 
The main contributions of this project will be as follows: (i) We will propose a novel dynamic droop control approach for grid-following inverters that can be tuned to achieve low noise sensitivity, fast synchronization, and Nadir elimination without affecting steady-state performance; (ii) We will propose a new frequency shaping control approach that allows a trade-off between the rate of change of frequency (RoCoF) and storage control effort; (iii) We will further extend the proposed solutions to operate in a grid-forming setting that is suitable for a non-stiff power grid where the amplitude and frequency of the grid voltage are not well regulated.
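
For intuition about the droop and virtual-inertia trade-offs mentioned in the abstract, a standard single-bus swing-equation model can be written down. This is a textbook sketch assumed here for illustration, not necessarily the model used in the proposal. The symbols are assumptions: $\omega$ is the frequency deviation, $m$ the inertia, $d$ the damping, $p$ a constant power imbalance, $u$ the inverter power injection, $r^{-1}$ the droop gain, and $m_v$ the virtual inertia.

```latex
\begin{aligned}
m\,\dot{\omega} &= -d\,\omega + p + u, \\
\text{droop control:}\quad u &= -r^{-1}\,\omega
  \;\Rightarrow\; \omega_{\mathrm{ss}} = \frac{p}{d + r^{-1}},
  \qquad u_{\mathrm{ss}} = -\frac{r^{-1}}{d + r^{-1}}\,p, \\
\text{virtual inertia:}\quad u &= -m_v\,\dot{\omega}
  \;\Rightarrow\; (m + m_v)\,\dot{\omega} = -d\,\omega + p .
\end{aligned}
```

In this sketch the single droop gain $r^{-1}$ sets both the transient damping and the steady-state injection $u_{\mathrm{ss}}$, so the two cannot be tuned independently. The virtual-inertia law needs $\dot{\omega}$, and differentiating a noisy frequency measurement amplifies the noise.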

Committee Members

Enrique Mallada (Department of Electrical & Computer Engineering)

Pablo A. Iglesias (Department of Electrical & Computer Engineering)

Dennice F. Gayme (Department of Mechanical Engineering)

Nov
19
Thu
Thesis Proposal: Puyang Wang
Nov 19 @ 3:00 pm

Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.

Title: Accelerating Magnetic Resonance Imaging using Convolutional Recurrent Neural Networks

Abstract: Fast and accurate MRI image reconstruction from undersampled data is critically important in clinical practice. Compressed sensing based methods are widely used in image reconstruction, but they are slow because they rely on iterative algorithms. Deep learning based methods have shown promising advances in recent years. However, recovering the fine details from highly undersampled data is still challenging. Moreover, the current protocol for Amide Proton Transfer-weighted (APTw) imaging commonly starts with the acquisition of high-resolution T2-weighted (T2w) images, followed by APTw imaging at a particular geometry and locations (i.e., slices) determined by the acquired T2w images. Although many advanced MRI reconstruction methods have been proposed to accelerate MRI, existing methods for APTw MRI cannot take advantage of the structural information in the acquired T2w images for reconstruction. In this work, we introduce a novel deep learning-based method with Convolutional Recurrent Neural Networks (CRNN) to reconstruct the image from multiple scales. Finally, we explore the use of the proposed Recurrent Feature Sharing (RFS) reconstruction module, which utilizes intermediate features extracted from the matched T2w image by the CRNN so that the missing structural information can be incorporated into the undersampled APT raw image, thus effectively improving the quality of the reconstructed APTw image.

Committee Members

Vishal M. Patel, Department of Electrical and Computer Engineering

Rama Chellappa, Department of Electrical and Computer Engineering

Shanshan Jiang, Department of Radiology and Radiological Science

Dec
3
Thu
Thesis Proposal: Xing Di
Dec 3 @ 3:00 pm

Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.

Title: Deep Learning-based Heterogeneous Face Recognition

Abstract: Face Recognition (FR) is one of the most widely studied problems in the computer vision and biometrics research communities due to its applications in authentication, surveillance, and security. Various methods have been developed over the last two decades that specifically attempt to address challenges such as aging, occlusion, disguise, and variations in pose, expression, and illumination. In particular, convolutional neural network (CNN) based FR methods have gained significant traction in recent years. Deep CNN-based methods have achieved impressive performance on the current FR benchmarks. Despite the success of CNN-based methods in addressing various challenges in FR, they are fundamentally limited to recognizing face images that are collected in the near-infrared spectrum. In many practical scenarios, such as surveillance in low-light conditions, one has to detect and recognize faces that are captured using thermal cameras. However, the performance of many deep learning-based methods degrades significantly when they are presented with thermal face images.

Thermal-to-visible face verification is a challenging problem due to the large domain discrepancy between the modalities. Existing approaches either attempt to synthesize visible faces from thermal faces or extract robust features from these modalities for cross-modal matching. We present a work in which we use attributes extracted from visible images to synthesize the attribute-preserved visible images from thermal imagery for cross-modal matching. A pre-trained VGG-Face network is used to extract the attributes from the visible image. Then, a novel multi-scale generator is proposed to synthesize the visible image from the thermal image guided by the extracted attributes. Finally, a pre-trained VGG-Face network is leveraged to extract features from the synthesized image and the input visible image for verification.

Committee Members

Rama Chellappa, Department of Electrical and Computer Engineering

Carlos Castillo, Department of Electrical and Computer Engineering

Vishal Patel, Department of Electrical and Computer Engineering

Dec
10
Thu
Thesis Proposal: Yufan He
Dec 10 @ 3:00 pm

Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.

Title: Retina OCT image analysis using deep learning methods

Abstract: Optical coherence tomography (OCT) is a non-invasive imaging modality which uses low-coherence light waves to take cross-sectional images of optical scattering media (e.g., the human retina). OCT has been widely used in diagnosing retinal and neural diseases by imaging the human retina. The thicknesses of retinal layers are important biomarkers for neurological diseases like multiple sclerosis (MS). The peripapillary retinal nerve fiber layer (pRNFL) and ganglion cell plus inner plexiform layer (GCIP) thicknesses can be used to assess global disease progression in MS patients. Automated OCT image analysis tools are critical for quantitatively monitoring disease progression and exploring biomarkers. With the development of more powerful computational resources, deep learning based methods have achieved much better performance in accuracy, speed, and algorithm flexibility for many image analysis tasks. However, these emerging deep learning methods are not satisfactory when directly applied to OCT image analysis tasks like retinal layer segmentation without task-specific knowledge.

This thesis aims to develop a set of novel deep learning based methods for retinal OCT image analysis. Specifically, we focus on retinal layer segmentation from macular OCT images. Image segmentation is the process of classifying each pixel in a digital image into different classes. Deep learning methods are powerful pixel classifiers, but it is hard to incorporate explicit rules into them. For retinal layer OCT images, pixels belonging to different layer classes must satisfy the anatomical hierarchy (topology): pixels of the upper layers should have no overlap or gap with pixels of the layers beneath them. This topological criterion is usually achieved by sophisticated post-processing methods, since current deep learning methods cannot guarantee it. To solve this problem, we aim to:

  • Develop an end-to-end deep learning segmentation method with guaranteed layer segmentation topology for retinal OCT images.
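
One common way to make a layer ordering hold by construction is to predict the top boundary plus non-negative layer thicknesses and accumulate them. The sketch below illustrates that idea only; it is an assumption for illustration, not the method proposed in the talk, and the function name and inputs are hypothetical.

```python
def ordered_boundaries(top, raw_thicknesses):
    """Return boundary depths for one image column, from top to bottom.

    Each raw thickness is clamped at zero (a ReLU), so every boundary lies at
    or below the one above it, whatever values the network outputs.
    """
    boundaries = [top]
    for raw in raw_thicknesses:
        boundaries.append(boundaries[-1] + max(0.0, raw))
    return boundaries


# Hypothetical raw outputs for one column. The negative value would cause a
# boundary crossing if each boundary were predicted independently.
print(ordered_boundaries(10.0, [4.0, -2.5, 3.0]))  # [10.0, 14.0, 14.0, 17.0]
```

Because the ordering is built into the output parameterization, no post-processing step is needed to repair crossings in this sketch.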

The deep learning model’s performance will degrade badly when test data is generated differently from the training data; thus, we aim to

  • Develop domain adaptation methods to increase robustness of the deep learning methods to OCT images generated differently from network training data.

The deep learning pipeline will be used to analyze longitudinal OCT images for MS patients, where the subtle changes due to MS should be captured; thus, we aim to:

  • Develop a longitudinal OCT image analysis pipeline for consistent longitudinal segmentation with deep learning.

 

Jan
28
Thu
Thesis Proposal: Poojan Oza
Jan 28 @ 3:00 pm

Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.

Title: Detecting Unknown Instances Using CNNs

Abstract: Deep convolutional neural networks (DCNNs) have shown impressive performance improvements for object detection and recognition problems.  However, a vast majority of DCNN-based recognition methods are designed for a closed world, where the primary assumption is that all categories are known a priori. In many real-world applications, this assumption does not necessarily hold. Generally, incomplete knowledge of the world is present at training time, and unknown classes can be submitted to an algorithm during testing. The goal of a visual recognition system is then to reject samples from unknown classes and classify samples from known classes.

In the first part of my talk, I will present new DCNNs for anomaly detection based on one-class classification. The main idea is to use zero-centered Gaussian noise in the feature space as the pseudo-negative class and train the network using the cross-entropy loss. I will also present a method in which both the classifier and the feature representations are learned together in an end-to-end fashion. In the second part of the talk, I will present a multi-class unknown-category detection method using a network which utilizes both global and local information to predict whether the test image belongs to one of the known classes or an unknown category. Specifically, the model is trained using one network to perform image-level category prediction and another network to perform patch-level category prediction. We evaluate the effectiveness of all these methods on multiple publicly available datasets and show that these approaches achieve better performance compared to previous state-of-the-art methods.
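
The pseudo-negative idea can be sketched in a few lines. The example below is a deliberately simplified illustration, not the network from the talk: it assumes fixed, hypothetical 2-D features and a linear classifier in place of a DCNN, and all names and hyperparameters are made up for the example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_one_class(features, noise_std=0.5, epochs=500, lr=0.1, seed=0):
    """Fit a linear classifier: known-class features (1) vs. Gaussian noise (0)."""
    rng = random.Random(seed)
    dim = len(features[0])
    # Pseudo-negative class: zero-centered Gaussian samples in feature space.
    negatives = [[rng.gauss(0.0, noise_std) for _ in range(dim)] for _ in features]
    data = [(x, 1.0) for x in features] + [(x, 0.0) for x in negatives]
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in data:
            # The cross-entropy gradient with respect to the logit is (p - y).
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(data)
    return w, b


def score(x, w, b):
    """Probability that x belongs to the known class."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Hypothetical features of the single known class, clustered away from the origin.
rng = random.Random(1)
known = [[3.0 + rng.gauss(0.0, 0.5), 3.0 + rng.gauss(0.0, 0.5)] for _ in range(50)]
w, b = train_one_class(known)
```

A sample whose features fall near the origin then receives a low known-class score and can be rejected as anomalous.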

Committee Members

  • Rama Chellappa, Department of Electrical and Computer Engineering
  • Carlos Castillo, Department of Electrical and Computer Engineering
  • Vishal Patel, Department of Electrical and Computer Engineering
Mar
4
Thu
Thesis Proposal: Jonathan Chang
Mar 4 @ 3:00 pm

Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.

Title: Student-Teacher Learning Techniques for Bilingual and Low Resource OCR

Abstract: Optical Character Recognition (OCR) is the automatic generation of a transcription given a line image of text. Current methods have been very successful on printed English text, with Character Error Rates of less than 1%. However, clean datasets are not commonly seen in real-life applications. There is a move in OCR towards ‘text in the wild’, conditions with lower-resolution images of, for example, storefronts, street signs, and billboards. Oftentimes these texts contain multiple scripts, especially in countries where multiple languages are spoken. In addition, Latin characters are widely seen regardless of the local language. The presence of multilingual text poses a unique challenge.

Traditional OCR methods involve text localization, script identification, and then text recognition. A separate system is used in each task and the results from one system are passed to the next. However, the downside of this pipeline approach is that errors propagate downstream and there is no way of providing feedback upstream. These downsides can be mitigated with fully integrated approaches, where one large system does text localization, script identification, and text recognition jointly. These approaches are also sometimes known as end-to-end approaches in literature.

With larger and larger networks, there is also a need for a greater amount of training data. However, this data may be difficult to obtain if the target language is low resource. There are also problems if the data that is obtained is in a slightly different domain, for example, printed versus handwritten text. This is where synthetic data generation techniques and domain adaptation techniques can be helpful.

Given these current challenges in OCR, this thesis proposal is focused on training integrated (i.e., end-to-end) bilingual systems and on domain adaptation techniques. Both objectives can be achieved using student-teacher learning methods. The basic idea of this approach is to have a trained teacher model contribute an additional loss function while training a student model. The outputs of the teacher are used as soft targets for the student to learn. The following experiments will be performed:

  • Create monolingual baselines.
  • Create bilingual baselines that do not require script identification.
  • Use Student-Teacher techniques to train bilingual models via teacher models specialized on different languages.
  • Use Student-Teacher techniques to train monolingual baselines with teacher models trained on out-of-domain data.
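
The soft-target loss described in the abstract can be sketched as follows. This is a minimal, generic student-teacher formulation, not the proposal's exact objective. It assumes that the mixing weight `alpha` and the softmax `temperature` are tunable choices, and all function names are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; a higher temperature gives softer targets."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs):
    return -sum(t * math.log(p)
                for t, p in zip(target_probs, predicted_probs) if t > 0.0)


def student_loss(student_logits, teacher_logits, label, alpha=0.5, temperature=2.0):
    """Weighted sum of the hard-label loss and the teacher soft-target loss."""
    one_hot = [1.0 if i == label else 0.0 for i in range(len(student_logits))]
    hard = cross_entropy(one_hot, softmax(student_logits))
    # The teacher's softened output distribution is the student's soft target.
    soft = cross_entropy(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    return (1.0 - alpha) * hard + alpha * soft


# Hypothetical logits over three characters for one output position.
teacher = [2.0, 0.5, -1.0]
agreeing_student = [2.0, 0.5, -1.0]
disagreeing_student = [-1.0, 0.5, 2.0]
```

In this sketch, a student whose outputs agree with the teacher incurs a lower loss than one that disagrees. The teacher could in principle be a monolingual or out-of-domain model, as in the experiments listed above.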

Committee Members

  • Sanjeev Khudanpur, Department of Electrical and Computer Engineering
  • Najim Dehak, Department of Electrical and Computer Engineering
  • Jesús Villalba, Department of Electrical and Computer Engineering
Mar
11
Thu
Thesis Proposal: Shuwen Wei
Mar 11 @ 3:00 pm

Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.

Title: Optical coherence tomography signal processing in complex domain

Abstract: Optical coherence tomography (OCT) plays an indispensable role in clinical fields such as ophthalmology and dermatology. Over the past 30 years, OCT has gone through tremendous developments, which come with both hardware improvements and novel signal processing techniques. Hardware improvements such as the use of adaptive optics (AO) and the use of vertical-cavity surface-emitting lasers (VCSELs) help push the fundamental limits of OCT imaging capability. Novel signal processing techniques aim to push the imaging capability beyond current hardware architecture limitations. Often, novel signal processing techniques achieve better performance than hardware modifications while keeping costs low. The purpose of this dissertation proposal is to develop novel OCT signal processing techniques that provide new imaging capabilities and overcome current imaging limitations.

OCT signal, as the result of the interference between the sample back-scattering light and the reference light, is complex and contains both amplitude and phase information. The amplitude information is mostly used for OCT structural imaging, while the phase information is mostly used for OCT functional imaging. Usually, the amplitude-based methods are more robust since they are less prone to noise, while the phase-based methods are better in quantifying precision measurements since they are more sensitive to micro displacements. This dissertation proposal focuses on three advanced OCT signal processing techniques in both amplitude and phase domain.

The first signal processing technique proposed is the amplitude-based BC-mode OCT image visualization for microsurgery guidance, where multiple sparsely sampled B-scans are combined to generate a single cross-section image with enhanced instrument and tissue layer visibility and reduced shadowing artifacts. The performance of the proposed method is demonstrated by guiding a 30-gauge needle into an ex-vivo human cornea.

The second signal processing technique proposed is the amplitude-based optical flow OCT (OFOCT) for determining accurate velocity fields. A modified continuity constraint is used to compensate for the Fourier-domain OCT (FDOCT) sensitivity fall-off. Spatial-temporal smoothness constraints are used to make the optical flow problem well-posed and reduce noise in the velocity fields. The accuracy of the proposed method is verified through phantom flow experiments using a diluted milk powder solution as the scattering medium, for both advective flow and turbulent flow.

The third signal processing technique proposed is phase-based. A wrapped Gaussian mixture model (WGMM) is proposed to stabilize the phase of swept-source OCT (SSOCT) systems. The OCT signal phase is divided into several components and each component is fully analyzed. The WGMM is developed based on this analysis. A closed-form iterative solution of the WGMM is derived using the expectation-maximization (EM) algorithm. The performance of the proposed method is demonstrated through OCT imaging of ex-vivo mouse cornea and anterior chamber.
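
For readers unfamiliar with wrapped distributions, the sketch below evaluates a wrapped Gaussian mixture density on the circle. It shows only the standard definition (a Gaussian summed over all 2π shifts, truncated here to a finite number of terms). It does not reproduce the proposal's phase model or its EM updates, and the function names and parameter values are hypothetical.

```python
import math


def wrapped_normal_pdf(theta, mu, sigma, n_wraps=10):
    """Gaussian density wrapped onto the circle, truncated to 2*n_wraps + 1 terms."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for k in range(-n_wraps, n_wraps + 1):
        d = theta - mu + 2.0 * math.pi * k
        total += math.exp(-0.5 * (d / sigma) ** 2)
    return norm * total


def wrapped_mixture_pdf(theta, weights, mus, sigmas):
    """Weighted sum of wrapped Gaussian components; weights should sum to 1."""
    return sum(w * wrapped_normal_pdf(theta, m, s)
               for w, m, s in zip(weights, mus, sigmas))


# Hypothetical two-component mixture; the second component sits near the
# +pi/-pi boundary, where wrapping matters most.
weights, mus, sigmas = [0.7, 0.3], [0.0, 3.0], [0.5, 0.8]
density_at_pi = wrapped_mixture_pdf(math.pi, weights, mus, sigmas)
```

Unlike an ordinary Gaussian mixture, this density is 2π-periodic, so phases just below π and just above −π are treated as neighbors.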

For all three proposed methods above, progress has been made in theoretical modeling, numerical implementation, and experimental verification. All the algorithms have been implemented on the graphics processing unit (GPU) in the OCT system for real-time data processing. Preliminary results demonstrate good performance of the proposed methods. The final thesis work will include optimizing the proposed methods and applying the implemented algorithms to both ex-vivo and in-vivo biomedical research for overall system testing and analysis.

Committee Members

  • Jin U. Kang (Advisor), Department of Electrical and Computer Engineering
  • Trac D. Tran, Department of Electrical and Computer Engineering
  • Xingde Li, Department of Biomedical Engineering
Mar
18
Thu
Thesis Proposal: Nanxin Chen
Mar 18 @ 3:00 pm

Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.

Title: Towards End-to-end Non-autoregressive speech applications

Abstract: Sequence labeling is a fascinating and challenging topic in the speech research community. Sequence-to-sequence models have been proposed for various sequence labeling tasks and are a particularly popular kind of end-to-end model. Autoregressive models, the dominant approach, predict labels one by one, conditioning on previous results. This makes training easier and more stable. However, this simplicity also makes inference inefficient, particularly for lengthy output sequences. To speed up the inference procedure, researchers have become interested in another type of sequence-to-sequence model, known as non-autoregressive models. In contrast to autoregressive models, non-autoregressive models predict the whole sequence within a constant number of iterations.

In this proposal, two different types of non-autoregressive models for speech applications are proposed: mask-based approach and noise-based approach. To demonstrate the effectiveness of the two proposed methods, we explored their usage for two important topics: speech recognition and speech synthesis. Experiments reveal that the proposed methods can match the performance of state-of-the-art autoregressive models with a much shorter inference time.

Committee Members

  • Najim Dehak, Department of Electrical and Computer Engineering
  • Sanjeev Khudanpur, Department of Electrical and Computer Engineering
  • Hynek Hermansky, Department of Electrical and Computer Engineering
  • Jesús Villalba, Department of Electrical and Computer Engineering
Mar
24
Wed
55th Annual Conference on Information Sciences and Systems (CISS 2021)
Mar 24 – Mar 26 all-day

55th Annual Conference on Information Sciences and Systems (CISS)

March 24, 25, & 26, 2021

Hosted by the
Department of Electrical and Computer Engineering, Johns Hopkins University
with technical co-sponsorship by the IEEE Information Theory Society

CISS 2021 is a forum for scientists, engineers, and academics to present their latest research results and developments in multiple areas of Information Sciences and Systems. Authors will present unpublished papers describing theoretical advances, applications, and ideas in the fields of:

  • Information Theory
  • Communications
  • Energy Systems
  • Signal Processing
  • Image Processing
  • Coding, Systems and Control
  • Optimization
  • Quantum Systems
  • Machine Learning
  • Security and Privacy
  • Statistical Inference
  • Biological Systems
  • Neuroscience