Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.
Title: Deep Learning-based Heterogeneous Face Recognition
Abstract: Face Recognition (FR) is one of the most widely studied problems in the computer vision and biometrics research communities due to its applications in authentication, surveillance, and security. Various methods have been developed over the last two decades that specifically attempt to address challenges such as aging, occlusion, disguise, and variations in pose, expression, and illumination. In particular, convolutional neural network (CNN) based FR methods have gained significant traction in recent years. Deep CNN-based methods have achieved impressive performance on the current FR benchmarks. Despite the success of CNN-based methods in addressing various challenges in FR, they are fundamentally limited to recognizing face images that are collected in the visible spectrum. In many practical scenarios, such as surveillance in low-light conditions, one has to detect and recognize faces that are captured using thermal cameras. However, the performance of many deep learning-based methods degrades significantly when they are presented with thermal face images.
Thermal-to-visible face verification is a challenging problem due to the large domain discrepancy between the modalities. Existing approaches either attempt to synthesize visible faces from thermal faces or extract robust features from these modalities for cross-modal matching. We present work in which attributes extracted from visible images are used to synthesize attribute-preserving visible images from thermal imagery for cross-modal matching. A pre-trained VGG-Face network is used to extract the attributes from the visible image. Then, a novel multi-scale generator is proposed to synthesize the visible image from the thermal image, guided by the extracted attributes. Finally, a pre-trained VGG-Face network is leveraged to extract features from the synthesized image and the input visible image for verification.
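As an illustration only, the final verification step described above can be sketched in Python. The abstract does not say how the two VGG-Face feature vectors are compared; cosine similarity with a fixed threshold is assumed here, and the names `cosine_similarity` and `verify`, as well as the threshold value, are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (norm_a * norm_b)


def verify(feat_synthesized, feat_visible, threshold=0.5):
    """Declare a match when the features are similar enough.

    feat_synthesized: features of the visible image synthesized from thermal.
    feat_visible:     features of the enrolled visible image.
    threshold:        assumed value; in practice it would be tuned on
                      held-out data.
    """
    return cosine_similarity(feat_synthesized, feat_visible) >= threshold
```

In such a scheme the threshold sets the trade-off between false accepts and false rejects.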
Carlos Castillo, Department of Electrical and Computer Engineering
Vishal Patel, Department of Electrical and Computer Engineering
Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.
Title: Retina OCT image analysis using deep learning methods
Abstract: Optical coherence tomography (OCT) is a non-invasive imaging modality which uses low-coherence light waves to take cross-sectional images of optical scattering media (e.g., the human retina). OCT has been widely used in diagnosing retinal and neural diseases by imaging the human retina. The thicknesses of retinal layers are important biomarkers for neurological diseases like multiple sclerosis (MS). The peripapillary retinal nerve fiber layer (pRNFL) and ganglion cell plus inner plexiform layer (GCIP) thicknesses can be used to assess global disease progression in MS patients. Automated OCT image analysis tools are critical for quantitatively monitoring disease progression and exploring biomarkers. With the development of more powerful computational resources, deep learning based methods have achieved much better performance in accuracy, speed, and algorithm flexibility for many image analysis tasks. However, these emerging deep learning methods are not satisfactory when directly applied to OCT image analysis tasks like retinal layer segmentation without task-specific knowledge.
This thesis aims to develop a set of novel deep learning based methods for retinal OCT image analysis. Specifically, we focus on retinal layer segmentation from macular OCT images. Image segmentation is the process of classifying each pixel in a digital image into different classes. Deep learning methods are powerful pixel classifiers, but it is hard to incorporate explicit rules into them. For retinal layer OCT images, pixels belonging to different layer classes must satisfy the anatomical hierarchy (topology): pixels of an upper layer should have no overlap or gap with pixels of the layers beneath it. Current deep learning methods cannot guarantee this topological criterion, so it is usually enforced by sophisticated post-processing methods. To solve this problem, we aim to:
The deep learning model’s performance will degrade badly when test data is generated differently from the training data; thus, we aim to:
The deep learning pipeline will be used to analyze longitudinal OCT images for MS patients, where the subtle changes due to the MS should be captured; thus, we aim to:
Note: This is a virtual presentation. Here is the link for where the presentation will be taking place.
Title: Medical Image Modality Synthesis and Resolution Enhancement Based on Machine Learning Techniques
Abstract: To achieve satisfactory performance from automatic medical image analysis algorithms such as registration or segmentation, medical imaging data with the desired modality/contrast and high isotropic resolution are preferred, yet they are not always available. We addressed this problem in this thesis using 1) image modality synthesis and 2) resolution enhancement.
The first contribution of this thesis is a computed tomography (CT)-to-magnetic resonance imaging (MRI) image synthesis method, which was developed to provide MRI when CT is the only modality that is acquired. The main challenges are that CT has poor contrast as well as high noise in soft tissues and that the CT-to-MR mapping is highly nonlinear. To overcome these challenges, we developed a convolutional neural network (CNN) which is a modified U-net. With this deep network for synthesis, we developed the first segmentation method that provides detailed grey matter anatomical labels on CT neuroimages using synthetic MRI.
The second contribution is a method for resolution enhancement for a common type of acquisition in clinical and research practice, one in which there is high resolution (HR) in the in-plane directions and low resolution (LR) in the through-plane direction. The challenge of improving the through-plane resolution for such acquisitions is that state-of-the-art convolutional neural network (CNN)-based super-resolution methods are sometimes not applicable due to a lack of external LR/HR paired training data. To address this challenge, we developed a self super-resolution algorithm called SMORE and its iterative version called iSMORE, which are CNN-based yet do not require LR/HR paired training data other than the subject image itself. SMORE/iSMORE create training data from the HR in-plane slices of the subject image itself, then train and apply CNNs to through-plane slices to improve spatial resolution and remove aliasing. In this thesis, we perform SMORE/iSMORE on multiple simulated and real data sets to demonstrate their accuracy and generalizability. Also, SMORE as a preprocessing step is shown to improve segmentation accuracy.
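The way a self super-resolution method can build its own training data may be sketched as follows. This is a minimal Python illustration, not the SMORE implementation: the degradation used here (averaging groups of rows, then repeating them) is an assumed stand-in for the true through-plane acquisition model, and `degrade_rows` and `make_training_pairs` are hypothetical names.

```python
def degrade_rows(slice_2d, factor):
    """Simulate low resolution along the row axis of an HR in-plane slice.

    Each group of `factor` rows is averaged (blur + downsample) and the
    averaged row is repeated `factor` times (nearest-neighbour upsample),
    so the output has the same shape as the input.
    """
    n_rows = len(slice_2d)
    if factor < 1 or n_rows % factor != 0:
        raise ValueError("row count must be a positive multiple of factor")
    n_cols = len(slice_2d[0])
    degraded = []
    for start in range(0, n_rows, factor):
        block = slice_2d[start:start + factor]
        mean_row = [sum(row[c] for row in block) / factor for c in range(n_cols)]
        degraded.extend(list(mean_row) for _ in range(factor))
    return degraded


def make_training_pairs(hr_slices, factor):
    """Pair each degraded slice (network input) with its HR original (target)."""
    return [(degrade_rows(s, factor), s) for s in hr_slices]
```

A network trained on such pairs could then be applied to the through-plane slices, which are low resolution along one axis in the same way.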
In summary, CT-to-MR synthesis, SMORE, and iSMORE were demonstrated in this thesis to be effective preprocessing algorithms for improving visual quality and for other automatic medical image analysis tasks such as registration or segmentation.
Jerry Prince, Department of Electrical and Computer Engineering
John Goutsias, Department of Electrical and Computer Engineering
Trac Tran, Department of Electrical and Computer Engineering
Title: Detecting Unknown Instances Using CNNs
Abstract: Deep convolutional neural networks (DCNNs) have shown impressive performance improvements for object detection and recognition problems. However, a vast majority of DCNN-based recognition methods are designed for a closed world, where the primary assumption is that all categories are known a priori. In many real-world applications, this assumption does not necessarily hold. Generally, incomplete knowledge of the world is present at training time, and unknown classes can be submitted to an algorithm during testing. The goal of a visual recognition system is then to reject samples from unknown classes and classify samples from known classes.
In the first part of my talk, I will present new DCNNs for anomaly detection based on one-class classification. The main idea is to use zero-centered Gaussian noise in the feature space as the pseudo-negative class and train the network using the cross-entropy loss. Also, a method in which both the classifier and the feature representations are learned together in an end-to-end fashion will be presented. In the second part of the talk, I will present a method for multi-class category detection using a network which utilizes both global and local information to predict whether the test image belongs to one of the known classes or an unknown category. Specifically, the model is trained using one network to perform image-level category prediction and another network to perform patch-level category prediction. We evaluate the effectiveness of all these methods on multiple publicly available datasets and show that these approaches achieve better performance compared to previous state-of-the-art methods.
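The pseudo-negative idea from the first part of the talk can be illustrated with a toy Python sketch. It is not the speaker's network: a linear logistic classifier on hand-made 2-D features stands in for the DCNN, and the function names, noise scale, learning rate, and toy data are all assumptions.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_one_class(target_feats, noise_std=0.5, epochs=300, lr=0.5, seed=0):
    """Fit a linear classifier: target features vs. Gaussian pseudo-negatives."""
    rng = random.Random(seed)
    dim = len(target_feats[0])
    # Pseudo-negative class: zero-centered Gaussian noise in feature space.
    negatives = [[rng.gauss(0.0, noise_std) for _ in range(dim)]
                 for _ in target_feats]
    data = [(x, 1.0) for x in target_feats] + [(x, 0.0) for x in negatives]
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in data:
            # Gradient of the cross-entropy loss with respect to the logit.
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        n = len(data)
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def score(w, b, x):
    """Estimated probability that x belongs to the known (target) class."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Toy target-class features clustered away from the origin (assumed data).
_rng = random.Random(1)
targets = [[3.0 + _rng.gauss(0.0, 0.3), 3.0 + _rng.gauss(0.0, 0.3)]
           for _ in range(100)]
w, b = train_one_class(targets)
```

In this toy setting, samples near the target cluster score high and samples near the origin score low, so thresholding the score gives a simple accept/reject rule.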
Title: Towards building a clinically-inspired ultrasound innovation hub: Design, Development and Clinical Validation of novel Ultrasound hardware for Imaging, Therapeutics, Sensing and other applications.
Abstract: Ultrasound is a relatively established modality with a number of exciting, yet not fully explored applications, ranging from imaging and image-guided navigation to tumor ablation, neuro-modulation, piezoelectric surgery, and drug delivery. In this talk, Dr. Manbachi will discuss some of his ongoing projects, which aim to address low-frequency bone sonography, minimally invasive ablation in neuro-oncology, and implantable sensors for spinal cord blood flow measurements.
Bio: Dr. Manbachi is an Assistant Professor of Neurosurgery and Biomedical Engineering at Johns Hopkins University. His research interests include applications of sound and ultrasound to various neurosurgical procedures. These applications include imaging the spine and brain, detection of foreign body objects, remote ablation of brain tumors, monitoring of blood flow and tissue perfusion, as well as other upcoming applications such as neuromodulation and drug delivery. His teaching activities include mentorship of BME Design Teams as well as close collaboration with clinical experts in Surgery and Radiology at Johns Hopkins.
His previous work included the development of ultrasound-guided spine surgery. He obtained his PhD from the University of Toronto, under the supervision of Dr. Richard S.C. Cobbold. Prior to joining Johns Hopkins, he was a postdoctoral fellow at the Harvard-MIT Division of Health Sciences and Technology (2015–16) and the founder and CEO of Spinesonics Medical (2012–2015), a spinoff from his doctoral studies.
Amir is an author on more than 25 peer-reviewed journal articles, more than 30 conference proceedings, 10 invention disclosures / patent applications, and a book entitled “Towards Ultrasound-guided Spinal Fusion Surgery.” He has mentored 150+ students and has so far raised $1.1M in funding, and his interdisciplinary research has been recognized by a number of awards, including the University of Toronto’s 2015 Inventor of the Year award, an Ontario Brain Institute 2013 fellowship, and Maryland Innovation Initiative and Cohen Translational Funding.
Dr. Manbachi has extensive teaching experience, particularly in the fields of engineering design, medical imaging, and entrepreneurship (both at Hopkins and Toronto), for which he received the University of Toronto’s Teaching Excellence award in 2014, a nomination for the Johns Hopkins University career centre’s students’ “Career Champion” award (2018), and the Johns Hopkins University Whiting School of Engineering’s Robert B. Pond Sr. Excellence in Teaching Award (2018).
Title: 5G Security – Opportunities and Challenges
Abstract: Software Defined Networking (SDN) and Network Function Virtualization (NFV) are the key pillars of future networks, including 5G and beyond, that promise to support emerging applications such as enhanced mobile broadband, ultra-low latency, and massive sensing-type applications while providing resiliency in the network. Service providers and other vertical industries (e.g., connected cars, IoT, eHealth) can leverage SDN/NFV to provide flexible and cost-effective service without compromising the end user quality of service (QoS). NFV and SDN open the door to flexible networks and rapid service creation; they offer security opportunities but, in some cases, also introduce additional challenges and complexities. With the rapid proliferation of 4G and 5G networks, operators have now started the trial deployment of network function virtualization, especially with the introduction of various virtualized network elements in the access and core networks. While several standardization bodies (e.g., ETSI, 3GPP, NGMN, ATIS, IEEE) have started looking into the many security issues introduced by SDN/NFV, additional work is needed with the larger security community, including vendors, operators, universities, and regulators.
This talk will address the evolution of cellular technologies towards 5G but will largely focus on the security challenges and opportunities introduced by SDN/NFV and 5G network enablers such as the hypervisor, Virtual Network Functions (VNFs), SDN controller, orchestrator, network slicing, cloud RAN, edge cloud, and security function virtualization. This talk will introduce a threat taxonomy for 5G security from an end-to-end system perspective, potential threats introduced by these enablers, and associated mitigation techniques. At the same time, some of the opportunities introduced by these pillars will also be discussed. This talk will also highlight some of the ongoing activities within various standards communities and will illustrate a few deployment use case scenarios for security, including a threat taxonomy for both operator and enterprise networks.
Bio: Ashutosh Dutta is currently senior scientist and 5G Chief Strategist at the Johns Hopkins University Applied Physics Laboratory (JHU/APL). He is also a JHU/APL Sabbatical Fellow and adjunct faculty at The Johns Hopkins University. Ashutosh also serves as the chair of the Electrical and Computer Engineering Department of the Engineering for Professionals program at Johns Hopkins University. His career, spanning more than 30 years, includes Director of Technology Security and Lead Member of Technical Staff at AT&T, CTO of Wireless for NIKSUN, Inc., Senior Scientist and Project Manager in Telcordia Research, Director of the Central Research Facility at Columbia University, adjunct faculty at NJIT, and Computer Engineer with TATA Motors. He has more than 100 conference and journal publications and standards specifications, three book chapters, and 31 issued patents. Ashutosh is co-author of the book titled “Mobility Protocols and Handover Optimization: Design, Evaluation and Application,” published by IEEE and John Wiley & Sons.
As a Technical Leader in 5G and security, Ashutosh has been serving as the founding Co-Chair for the IEEE Future Networks Initiative that focuses on 5G standardization, education, publications, testbed, and roadmap activities. Ashutosh serves as IEEE Communications Society’s Distinguished Lecturer for 2017-2020 and as an ACM Distinguished Speaker (2020-2022). Ashutosh has served as the general Co-Chair for the premier IEEE 5G World Forums and has organized 65 5G World Summits around the world.
Ashutosh served as the chair for IEEE Princeton / Central Jersey Section, Industry Relation Chair for Region 1 and MGA, Pre-University Coordinator for IEEE MGA, and vice chair of Education Society Chapter of PCJS. He co-founded the IEEE STEM conference (ISEC) and helped to implement EPICS (Engineering Projects in Community Service) projects in several high schools. Ashutosh has served as the general Co-Chair for the IEEE STEM conference for the last 10 years. Ashutosh served as the Director of Industry Outreach for IEEE Communications Society from 2014 to 2019. He was the recipient of the prestigious 2009 IEEE MGA Leadership award and the 2010 IEEE-USA professional leadership award. Ashutosh currently serves as Member-At-Large for IEEE Communications Society for 2020-2022.
Ashutosh obtained his BS in Electrical Engineering from NIT Rourkela, India; MS in Computer Science from NJIT; and Ph.D. in Electrical Engineering from Columbia University, New York under the supervision of Prof. Henning Schulzrinne. Ashutosh is a Fellow of IEEE and senior member of ACM.
Title: Student-Teacher Learning Techniques for Bilingual and Low Resource OCR
Abstract: Optical Character Recognition (OCR) is the automatic generation of a transcription given a line image of text. Current methods have been very successful on printed English text, with Character Error Rates of less than 1%. However, clean datasets are not commonly seen in real-life applications. There is a move in OCR towards “text in the wild,” conditions where there are lower resolution images of text such as store fronts, street signs, and billboards. Oftentimes these texts contain multiple scripts, especially in countries where multiple languages are spoken. In addition, Latin characters are widely seen no matter what the language. The presence of multilingual text poses a unique challenge.
Traditional OCR methods involve text localization, script identification, and then text recognition. A separate system is used for each task, and the results from one system are passed to the next. However, the downside of this pipeline approach is that errors propagate downstream and there is no way of providing feedback upstream. These downsides can be mitigated with fully integrated approaches, where one large system does text localization, script identification, and text recognition jointly. These approaches are also sometimes known as end-to-end approaches in the literature.
With larger and larger networks, there is also a need for a greater amount of training data. However, this data may be difficult to obtain if the target language is low resource. There are also problems if the data that is obtained is in a slightly different domain, for example, printed versus handwritten text. This is where synthetic data generation techniques and domain adaptation techniques can be helpful.
Given these current challenges in OCR, this thesis proposal is focused on training integrated (i.e., end-to-end) bilingual systems and on domain adaptation techniques. Both of these objectives can be achieved using student-teacher learning methods. The basis of this approach is to have a trained teacher model contribute an additional loss term while training a student model. The outputs of the teacher will be used as soft targets for the student to learn. The following experiments will be performed:
Title: Optical coherence tomography signal processing in complex domain
Abstract: Optical coherence tomography (OCT) plays an indispensable role in clinical fields such as ophthalmology and dermatology. Over the past 30 years, OCT has gone through tremendous developments, which come with both hardware improvements and novel signal processing techniques. Hardware improvements such as the use of adaptive optics (AO) and the use of vertical-cavity surface-emitting lasers (VCSELs) help push the fundamental limits of OCT imaging capability. Novel signal processing techniques aim to push the imaging capability beyond current hardware architecture limitations. Often, novel signal processing techniques achieve better performance than hardware modifications while keeping costs low. The purpose of this dissertation proposal is to develop novel OCT signal processing techniques that provide new imaging capabilities and overcome current imaging limitations.
The OCT signal, as the result of the interference between the sample back-scattered light and the reference light, is complex and contains both amplitude and phase information. The amplitude information is mostly used for OCT structural imaging, while the phase information is mostly used for OCT functional imaging. Usually, the amplitude-based methods are more robust since they are less prone to noise, while the phase-based methods are better for precise quantitative measurements since they are more sensitive to micro displacements. This dissertation proposal focuses on three advanced OCT signal processing techniques in both the amplitude and phase domains.
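The split of a complex OCT sample into amplitude and phase can be illustrated with a short Python sketch. It is a generic illustration, not the proposal's processing chain; the function names are hypothetical, and comparing the phase of two successive measurements at the same location is shown only as one common way phase is used.

```python
import cmath


def amplitude_and_phase(sample):
    """Return (amplitude, phase in radians) of one complex OCT sample."""
    return abs(sample), cmath.phase(sample)


def phase_difference(first, second):
    """Phase change from `first` to `second`, wrapped to [-pi, pi].

    Multiplying by the complex conjugate subtracts the phases, and
    cmath.phase wraps the result automatically.
    """
    return cmath.phase(second * first.conjugate())
```

Because phase is only defined modulo 2π, the difference is computed through the conjugate product rather than by subtracting two angles directly.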
The first signal processing technique proposed is the amplitude-based BC-mode OCT image visualization for microsurgery guidance, where multiple sparsely sampled B-scans are combined to generate a single cross-section image with enhanced instrument and tissue layer visibility and reduced shadowing artifacts. The performance of the proposed method is demonstrated by guiding a 30-gauge needle into an ex-vivo human cornea.
The second signal processing technique proposed is the amplitude-based optical flow OCT (OFOCT) for determining accurate velocity fields. A modified continuity constraint is used to compensate for the Fourier-domain OCT (FDOCT) sensitivity fall-off. Spatial-temporal smoothness constraints are used to make the optical flow problem well-posed and reduce noise in the velocity fields. The accuracy of the proposed method is verified through phantom flow experiments using a diluted milk powder solution as the scattering medium, for both advective flow and turbulent flow.
The third signal processing technique proposed is phase-based. A wrapped Gaussian mixture model (WGMM) is proposed to stabilize the phase of swept-source OCT (SSOCT) systems. The OCT signal phase is divided into several components and each component is fully analyzed. The WGMM is developed based on this analysis. A closed-form iterative solution of the WGMM is derived using the expectation-maximization (EM) algorithm. The performance of the proposed method is demonstrated through OCT imaging of ex-vivo mouse cornea and anterior chamber.
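For readers unfamiliar with wrapped distributions, the following Python sketch shows the standard wrapped normal density and a mixture of such components, together with the per-component responsibilities that an EM E-step would compute. The abstract does not give the exact form of the proposed WGMM, so this is a textbook-style illustration under that assumption; the function names and the truncation `n_wraps` are hypothetical.

```python
import math


def wrapped_normal_pdf(theta, mu, sigma, n_wraps=3):
    """Wrapped normal density on the circle, truncated to 2*n_wraps+1 terms.

    The density sums a Gaussian over all 2*pi shifts of the angle; a few
    terms suffice when sigma is small compared with 2*pi.
    """
    total = 0.0
    for k in range(-n_wraps, n_wraps + 1):
        z = (theta - mu + 2.0 * math.pi * k) / sigma
        total += math.exp(-0.5 * z * z)
    return total / (sigma * math.sqrt(2.0 * math.pi))


def wgmm_pdf(theta, weights, mus, sigmas):
    """Density of a mixture of wrapped normal components."""
    return sum(w * wrapped_normal_pdf(theta, m, s)
               for w, m, s in zip(weights, mus, sigmas))


def responsibilities(theta, weights, mus, sigmas):
    """E-step quantity: posterior probability of each component for theta."""
    parts = [w * wrapped_normal_pdf(theta, m, s)
             for w, m, s in zip(weights, mus, sigmas)]
    total = sum(parts)
    return [p / total for p in parts]
```

An M-step would then re-estimate the weights, means, and spreads from these responsibilities; that part is omitted here because its closed form depends on the specific model.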
For all three of the proposed methods above, progress has been made in theoretical modeling, numerical implementation, and experimental verification. All the algorithms have been implemented on the graphics processing unit (GPU) in the OCT system for real-time data processing. Preliminary results demonstrate good performance of the proposed methods. The final thesis work will include optimizing the proposed methods and applying the implemented algorithms to both ex-vivo and in-vivo biomedical research for overall system testing and analysis.
Title: Intraoperative Optical Coherence Tomography Guided Deep Anterior Lamellar Keratoplasty
Abstract: Deep anterior lamellar keratoplasty (DALK) is a highly challenging procedure requiring micron accuracy to guide a “big bubble” needle into the stroma of the cornea down to Descemet’s Membrane (DM). It has important advantages over penetrating keratoplasty (PK), including a lower rejection rate, less endothelial cell loss, and increased graft survival. Currently, this procedure relies heavily on visualization through a surgical microscope, the surgeon’s own surgical experience, and tactile feel to determine the relative position of the needle and DM. Optical coherence tomography (OCT) is a well-established, non-invasive optical imaging technology that can provide high-speed, high-resolution, three-dimensional images of biological samples. Since it was first demonstrated in 1991, OCT has emerged as a leading technology for ophthalmic visualization, especially for retinal structures, and has been widely applied in ophthalmic surgery and research. Common-path (CP) OCT systems use a single A-scan to deduce the tissue layer information and can be operated at a much higher speed. This synergizes well with handheld tools and automated surgical systems, which require fast response times. CP-OCT has been integrated into a wide range of microsurgical tools for procedures such as epiretinal membrane peeling and subretinal injection.
In this proposal, a common-path swept-source OCT system (CP-SSOCT) is proposed to guide DALK procedures. A needle with an integrated OCT distal sensor and an OCT-guided micro-control ocular surgical system (AUTO-DALK) will be designed and evaluated. This device will allow for the autonomous insertion of a needle for pneumo-dissection based on the depth-sensing results from the OCT system. An earlier prototype of AUTO-DALK was tested on ex-vivo porcine cornea, including a comparison with expert manual needle insertion. The results showed that the precision and consistency of the needle placement were increased, which could lead to better visual outcomes and fewer complications. Future work will include improving the overall design for in-vivo testing and clinical use, advanced convolutional neural network based tracking, and system validation on a larger sample size.
Jin U. Kang (adviser), Department of Electrical and Computer Engineering
Israel Gannot, Department of Electrical and Computer Engineering
Xingde Li, Department of Biomedical Engineering