Title: Consistent Inter-Model Specification for Stochastic Volatility and VIX Market Models
Abstract: This talk addresses the following question: if a stochastic model is specified for the curve of VIX futures, what restrictions must it satisfy in order to be consistent with a stochastic volatility model? In other words, once a stochastic volatility model is in place, a so-called market model must satisfy certain conditions so that there is no inter-model arbitrage and no mis-priced derivatives. The present work gives such a condition, and shows how to recover the correctly specified stochastic volatility function from the market model.
Title: Multivariate Records
Abstract: Given a vector-valued time series, a multivariate record is said to occur at some time if no previous observation dominates it in every coordinate. This notion generalizes the usual notion of a record in one dimension, and gives rise to some interesting phenomena, some of which will be presented. An efficient algorithm for sampling the multivariate records process will be described; it enables one to study the process empirically and to discover new phenomena related to record growth over time. Theoretical results illuminated by these simulations will also be presented. (This is joint work with Fred Torcaso and Vincent Lyzinski.)
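As a concrete illustration of the definition (not the talk's sampling algorithm), here is a minimal Python sketch that scans a sequence of vector observations and flags the multivariate record times, using one common convention in which an observation is a record if no earlier observation strictly exceeds it in every coordinate:

```python
import numpy as np

def record_times(observations):
    """Return the indices at which a multivariate record occurs.

    An observation x is a record if no earlier observation h dominates it,
    i.e. there is no earlier h with h > x in every coordinate (one common
    convention; weak dominance is another).
    """
    obs = [np.asarray(x) for x in observations]
    times = []
    for t, x in enumerate(obs):
        if not any((h > x).all() for h in obs[:t]):
            times.append(t)
    return times

# In one dimension this reduces to the familiar notion of a (weak) record.
print(record_times([[1, 3], [2, 2], [0, 4], [1, 1]]))  # [0, 1, 2]
```

This quadratic scan is only for checking small examples; efficient sampling of the records process, as in the talk, requires more care.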
Title: Hilbert’s Nullstellensatz and Linear Algebra: An Algorithm for Determining Combinatorial Infeasibility
Abstract: Unlike systems of linear equations, systems of multivariate polynomial equations over the complex numbers or finite fields can be used to compactly model combinatorial problems. In this way, a problem is feasible (e.g. a graph is 3-colorable, Hamiltonian, etc.) if and only if a given system of polynomial equations has a solution. Via Hilbert’s Nullstellensatz, we generate a sequence of large-scale, sparse linear algebra computations from these non-linear models, yielding an algorithm for solving the underlying combinatorial problem. As a byproduct of this algorithm, we produce algebraic certificates of the non-existence of a solution (i.e., non-3-colorability, non-Hamiltonicity, or non-existence of an independent set of size k).
In this talk, we present theoretical and experimental results on the size of these sequences and the complexity of the resulting Nullstellensatz algebraic certificates. For non-3-colorability over a finite field, we utilize this method to successfully solve graph problem instances having thousands of nodes and tens of thousands of edges. We also describe ways of optimizing this approach, such as finding alternative forms of the Nullstellensatz, adding carefully constructed polynomials to the system, branching, and exploiting symmetry.
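For readers who want to see the polynomial encoding concretely, here is a small Python sketch of the standard algebraic model for 3-colorability, with the three colors represented as cube roots of unity. The brute-force solver below exists purely to verify the encoding on tiny graphs; the talk's point is to avoid such enumeration via Nullstellensatz certificates.

```python
import cmath
from itertools import product

# The three colors are the cube roots of unity, the solutions of x^3 - 1 = 0.
ROOTS = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]

def edge_poly(xu, xv):
    # x_u^2 + x_u*x_v + x_v^2 vanishes exactly when x_u and x_v are
    # *distinct* cube roots of unity (factor x_u^3 - x_v^3 by x_u - x_v).
    return xu**2 + xu * xv + xv**2

def system_has_solution(n, edges, tol=1e-9):
    """Brute-force check: does the polynomial system for 3-colorability
    have a common zero?  Equivalent to: is the graph 3-colorable?"""
    return any(
        all(abs(edge_poly(a[u], a[v])) < tol for u, v in edges)
        for a in product(ROOTS, repeat=n)
    )

triangle = [(0, 1), (1, 2), (0, 2)]
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(system_has_solution(3, triangle))  # True  (K3 is 3-colorable)
print(system_has_solution(4, k4))        # False (K4 is not)
```

When the system has no common zero, the Nullstellensatz guarantees a certificate of infeasibility, which is what the linear algebra computations in the talk search for.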
Graduate students are happily advised that no background in algebraic geometry or familiarity with Hilbert’s Nullstellensatz is assumed for this talk. All theorems and terms are clearly explained with friendly pictures and examples. 🙂
Title: Generative Models to Decode Brain Pathology
Abstract: Clinical neuroscience is a field with all the difficulties that come from high-dimensional data, and none of the advantages that fuel modern-day breakthroughs in computer vision, automated speech recognition, and health informatics. It is a field of unavoidably small datasets, owing to costly acquisitions, environmental confounds, massive patient variability, and an arguable lack of ground-truth information. My lab tackles these challenges by combining analytical tools from signal processing and machine learning with hypothesis-driven insights about the brain.
This talk will highlight three ongoing projects in my lab that span a range of methodologies and clinical applications. First, I will develop a joint optimization framework to predict clinical severity from resting-state fMRI data. Our model is based on two coupled terms: a generative non-negative matrix factorization and a discriminative linear regression. This project is part of our larger effort to better characterize heterogeneous patient cohorts. Next, I will describe a spatio-temporal model to track the spread of epileptic seizures from EEG data. Unlike conventional approaches, our model relies on a latent network structure that captures the hidden state of each EEG channel; the latent variables are complemented by an intuitive likelihood model for the observed neuroimaging measures. This project takes the first steps toward noninvasive seizure localization. Finally, I will highlight a very recent initiative in my lab to manipulate emotional cues in human speech. Our long-term goal is to create a naturalistic therapy for autism.
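As a schematic illustration of the first project's coupled objective (a sketch under assumed notation, not the lab's actual model), one can combine a generative non-negative factorization X ≈ WH with a discriminative regression of severity y on the patient loadings H, and fit both terms jointly by alternating projected gradient steps:

```python
import numpy as np

def joint_nmf_regression(X, y, k=2, lam=1.0, lr=1e-3, steps=500):
    """Minimize ||X - W H||_F^2 + lam * ||y - H^T w||^2 subject to W, H >= 0.

    X : features x patients data matrix;  y : clinical severity per patient.
    Alternating projected gradient descent -- illustrative only.
    """
    rng = np.random.default_rng(0)
    d, n = X.shape
    W, H = rng.random((d, k)), rng.random((k, n))
    w = np.zeros(k)
    for _ in range(steps):
        R = W @ H - X          # generative (reconstruction) residual
        r = H.T @ w - y        # discriminative (regression) residual
        W = np.maximum(W - lr * 2 * R @ H.T, 0.0)
        H = np.maximum(H - lr * (2 * W.T @ R + 2 * lam * np.outer(w, r)), 0.0)
        w = w - lr * 2 * lam * (H @ r)
    return W, H, w

def loss(X, y, W, H, w, lam=1.0):
    return np.linalg.norm(X - W @ H) ** 2 + lam * np.linalg.norm(y - H.T @ w) ** 2
```

The coupling matters: the regression term pulls the factorization toward patient loadings that are predictive of severity, rather than factoring X and regressing afterward.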
Archana Venkataraman is a John C. Malone Assistant Professor in the Department of Electrical and Computer Engineering at Johns Hopkins University. She directs the Neural Systems Analysis Laboratory and is a core faculty member of the Malone Center for Engineering in Healthcare. Dr. Venkataraman’s research lies at the intersection of multimodal integration, network modeling and clinical neuroscience. Her work has yielded novel insights into debilitating neurological disorders, such as autism, schizophrenia and epilepsy, with the long-term goal of improving patient care. Dr. Venkataraman completed her B.S., M.Eng. and Ph.D. in Electrical Engineering at MIT in 2006, 2007 and 2012, respectively. She is a recipient of the CHDI Grant on network models for Huntington’s Disease, the MIT Lincoln Lab campus collaboration award, the NIH Advanced Multimodal Neuroimaging Training Grant, the National Defense Science and Engineering Graduate Fellowship, the Siebel Scholarship and the MIT Provost Presidential Fellowship.
Title: Subset selection in sparse matrices
Abstract: In subset selection, we search for the best linear predictor that involves a small subset of variables. From a computational complexity viewpoint, subset selection is NP-hard, and few classes of instances are known to be solvable in polynomial time. Using mainly tools from discrete geometry, we show that some sparsity conditions on the original data matrix allow us to solve the problem in polynomial time.
This is joint work with Alberto Del Pia and Robert Weismantel.
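To fix ideas, here is the brute-force version of the problem in Python: it enumerates all size-k column subsets, which is exponential in the number of variables in general; the talk's contribution is identifying sparsity conditions on the data matrix under which polynomial time suffices.

```python
import numpy as np
from itertools import combinations

def best_subset(A, b, k):
    """Best linear predictor of b using exactly k columns of A,
    found by exhaustive search over all size-k column subsets."""
    best_S, best_res = None, np.inf
    for S in combinations(range(A.shape[1]), k):
        x, *_ = np.linalg.lstsq(A[:, S], b, rcond=None)
        res = np.linalg.norm(A[:, S] @ x - b)
        if res < best_res:
            best_S, best_res = S, res
    return best_S, best_res

rng = np.random.default_rng(0)
A = rng.standard_normal((12, 5))
b = A[:, 1] + 2 * A[:, 3]        # b depends on columns 1 and 3 only
print(best_subset(A, b, 2)[0])   # (1, 3)
```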
Title: An Introduction to Randomized Algorithms for Matrix Computations
Abstract: The emergence of massive data sets over the past twenty or so years has led to the development of Randomized Numerical Linear Algebra. Fast and accurate randomized matrix algorithms are being designed for applications like machine learning, population genomics, astronomy, nuclear engineering, and optimal experimental design. We give a flavour of randomized algorithms for the solution of least squares/regression problems. Along the way we illustrate important concepts from numerical analysis (conditioning and pre-conditioning), probability (concentration inequalities), and statistics (sampling and leverage scores).
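A minimal sketch-and-solve example (a generic illustration of the randomized approach, not code from the talk): compress a tall least-squares problem with a Gaussian sketch and solve the small problem; the leverage scores mentioned above can be read off a thin QR factorization.

```python
import numpy as np

def sketched_lstsq(A, b, s, seed=0):
    """Sketch-and-solve least squares: draw a random s x m Gaussian
    sketch S and solve min_x ||S A x - S b|| instead of min_x ||A x - b||."""
    rng = np.random.default_rng(seed)
    S = rng.standard_normal((s, A.shape[0])) / np.sqrt(s)
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x

def leverage_scores(A):
    """Row leverage scores: squared row norms of the thin Q factor.
    They sum to rank(A) and drive importance-sampling schemes."""
    Q, _ = np.linalg.qr(A)
    return (Q ** 2).sum(axis=1)
```

With a sketch size s modestly larger than the number of columns, the sketched solution has near-optimal residual with high probability; sampling rows proportionally to their leverage scores is the structured alternative to a dense Gaussian sketch.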
Title: Determining the number of communities in degree-corrected stochastic block models
Abstract: We propose to estimate the number of communities in degree-corrected stochastic block models based on a pseudo likelihood ratio. For estimation, we combine spectral clustering with a binary segmentation method. This approach guarantees an upper bound for the pseudo likelihood ratio statistic when the model is over-fitted. We also derive its limiting distribution when the model is under-fitted. Based on these properties, we establish the consistency of our estimator for the true number of communities. Developing these theoretical properties requires only a mild condition on the average degree: it must grow at a rate faster than log(n), where n is the number of nodes. Our proposed method is further illustrated by simulation studies and analysis of real-world networks. The numerical results show that our approach has satisfactory performance when the network is sparse and/or has unbalanced communities.
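To illustrate the spectral clustering step (a generic sketch, not the paper's full procedure), one can embed the nodes via the leading eigenvectors of the adjacency matrix, row-normalize to absorb the degree-correction parameters, and cluster the rows:

```python
import numpy as np

def spectral_communities(A, k, iters=50):
    """Spectral clustering for a degree-corrected SBM: top-k eigenvectors
    of A, row-normalized (spherical clustering removes degree effects),
    followed by a tiny farthest-point-initialized k-means."""
    vals, vecs = np.linalg.eigh(A)
    U = vecs[:, np.argsort(-np.abs(vals))[:k]]         # leading eigenvectors
    norms = np.linalg.norm(U, axis=1, keepdims=True)
    U = U / np.where(norms > 0, norms, 1.0)            # row normalization
    # deterministic farthest-point initialization
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([((U - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((U[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = U[labels == j].mean(axis=0)
    return labels
```

The row normalization is the degree-correction-specific ingredient: without it, high-degree and low-degree nodes of the same community land at different radii in the embedding and get split apart.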
Title: A limit theorem for an omnibus embedding of multiple random graphs
Abstract: Performing statistical inference on collections of graphs is of import to many disciplines. Graph embedding, in which the vertices of a graph are mapped to vectors in a low-dimensional Euclidean space, has gained traction as a basic tool for graph analysis. We describe an omnibus embedding in which multiple graphs on the same vertex set are jointly embedded into a single space with a distinct representation for each graph. We prove a central limit theorem for this omnibus embedding, and show that this simultaneous embedding into a common space allows comparison of graphs without the need to perform pairwise alignments of graph embeddings. Experimental results demonstrate that the omnibus embedding improves upon existing methods, and in particular provides insight into analysis of real connectomic data.
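The construction itself is simple enough to sketch (an illustrative implementation following the standard definition of the omnibus matrix): place each adjacency matrix A_i on the diagonal blocks and the averages (A_i + A_j)/2 off the diagonal, then take a single adjacency spectral embedding of the big matrix, yielding one point per (graph, vertex) pair.

```python
import numpy as np

def omnibus_embedding(graphs, d):
    """Joint d-dimensional spectral embedding of m graphs on the same
    n vertices via the mn x mn omnibus matrix M, with block
    M_ij = (A_i + A_j) / 2 (so the diagonal blocks are the A_i themselves)."""
    m = len(graphs)
    M = np.block([[(graphs[i] + graphs[j]) / 2 for j in range(m)]
                  for i in range(m)])
    vals, vecs = np.linalg.eigh(M)
    top = np.argsort(-vals)[:d]                       # top-d eigenvalues
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))  # one row per (graph, vertex)
```

Because every graph gets its own block of rows in a common space, differences between graphs can be measured directly, with no post-hoc alignment of separately computed embeddings; identical input graphs produce numerically identical row blocks.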
Title: How might nature endow cortical regions of the mammalian brain with diffeomorphisms?
Abstract: A surface-based diffeomorphic algorithm for generating 3D coordinate grids in the cortical ribbon of the mammalian brain is described. In the grid, normal coordinate lines are generated by the diffeomorphic evolution from the grey/white (inner) surface to the grey/csf (outer) surface. Here, the cortical ribbon is described by two triangulated surfaces with open boundaries. It is assumed that the cortical ribbon consists of cortical columns which are orthogonal to the white matter surface; this might be viewed as a consequence of the embryonic development of the columns. It is also assumed that the columns are orthogonal to the outer surface, so that the resultant vector field is orthogonal to the evolving surface. The laminar properties of the cortical ribbon, i.e. cortical layers, are then characterized by the normal lines. The length traversed along the normal lines of the vector field, under which the inner surface evolves diffeomorphically towards the outer one, can be construed as a measure of thickness. Finally, an equivolumetric reparametrization of the diffeomorphism is developed to ensure volumetric preservation of cortical layers across highly folded regions such as gyri and sulci. Applications are described for human and feline brains.
Title: Investigating Spatially Complex Data with Topological Data Analysis
Abstract: Data exhibiting complicated spatial structures are common in many areas of science (e.g. cosmology, biology), but can be difficult to analyze. Persistent homology is a popular approach within the area of Topological Data Analysis (TDA) that offers a way to represent, visualize, and interpret complex data by extracting topological features, which can be used to infer properties of the underlying structures. For example, TDA may be useful for analyzing the large-scale structure (LSS) of the Universe, which is an intricate and spatially complex web of matter. The output of persistent homology, a persistence diagram, summarizes the holes of different orders in the data (e.g. connected components, loops, voids). I will introduce persistent homology, present functional transformations of persistence diagrams useful for inference, and discuss several applications.
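As a taste of how a persistence diagram is computed, here is a self-contained sketch for zeroth-order homology (connected components) of a Vietoris-Rips filtration on a point cloud; it is single-linkage clustering in disguise, with union-find tracking component merges. (Higher-order features like loops and voids require a genuine boundary-matrix reduction, which dedicated TDA libraries provide.)

```python
import numpy as np
from itertools import combinations

def h0_diagram(points):
    """Zeroth persistence diagram of the Rips filtration: every point is
    born at scale 0; a component dies at the pairwise distance that first
    merges it into another.  One component lives forever."""
    pts = [np.asarray(p, dtype=float) for p in points]
    n = len(pts)
    parent = list(range(n))

    def find(i):                      # union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted((np.linalg.norm(pts[i] - pts[j]), i, j)
                   for i, j in combinations(range(n), 2))
    deaths = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                  # two components merge: one dies
            parent[ri] = rj
            deaths.append(float(dist))
    return [(0.0, d) for d in deaths] + [(0.0, np.inf)]

print(h0_diagram([[0.0], [1.0], [10.0]]))  # [(0.0, 1.0), (0.0, 9.0), (0.0, inf)]
```

The long-lived pair (0.0, 9.0) reflects the well-separated outlier; in cosmological applications it is exactly such long bars that distinguish genuine structure from noise.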