Data Science Master’s Program



The Data Science Master’s degree at the Johns Hopkins University will provide the training in applied mathematics, statistics and computer science to serve as the basis for an understanding, and appreciation, of existing data science tools. Our program aims to produce the next generation of leaders in data science by emphasizing mastery of the skills needed to translate real-world data-driven problems into mathematical ones, and then to solve these problems using a diverse collection of scientific tools.



In addition to Introduction to Data Science (EN.553.636), students will take one course in each of the four core areas:  Statistics, Machine Learning, Optimization, and Computing.  Students will decide on an area of focus and take three courses in either Computational Medicine, Computational Machine Learning, Computer Vision, Computational Finance, Mathematics of Data Science, Language and Speech, or Statistical Theory.  The final capstone project is course EN.553.806 Big Data Design, or another project-oriented course approved by the faculty advisor and the Internal Oversight Committee, and a written paper on a related topic (approved by the instructor) which includes a deeper study of the pre-approved topic.  The goal of the final course and written paper is to allow the student to apply data analysis techniques learned in the program, and possibly to extend those ideas to more general settings or to new application areas.  Lastly, the paper will be summarized in a poster session organized at the end of each semester.


    • A two-day orientation program will precede the first Fall semester of enrollment.
    • EN.553.636 Introduction to Data Science, and

    One course in each of the four Core Areas below.  Courses chosen in this section must be distinct from the courses used to satisfy requirements 2 and 3.

    • Statistics – Introduction to Statistics (EN.553.630), Bayesian Statistics (EN.553.632), Statistical Theory I (EN.553.730), Statistical Theory II (EN.553.731)
    • Machine Learning – Statistical Machine Learning: Methods, Theory, and Applications (PH.140.644), Machine Learning (EN.601.675), Statistical Machine Learning (EN.601.775), Machine Learning (EN.553.740)
    • Optimization – Nonlinear Optimization I (EN.553.761), Nonlinear Optimization II (EN.553.762), Convex Optimization (EN.553.765)
    • Computing – Computing for Applied Mathematics (EN.553.688), Parallel Programming (EN.601.620)
  • Three courses from one of the following focus areas.

    • Computational Medicine: Computational medicine is an interdisciplinary field that combines mathematics, computer science, medicine, and engineering to analyze and interpret biological and medical data. The following courses are approved for this focus area: Introduction to Bioinformatics (AS.410.633), Bioinformatics: Tools for Genome Analysis (AS.410.635), Methods in Proteomics (AS.410.661), Gene Expression Data Analysis and Visualization (AS.410.671), Computational Molecular Medicine (EN.553.650), Algorithms for Bioinformatics (EN.605.620) or Foundations of Algorithms (EN.605.621), Computational Genomics (EN.605.653), Analysis of Gene Expression and High-Content Biological Data (EN.605.754).
    • Computational Machine Learning: Computational machine learning uses methods from linear algebra, statistics and optimization to build machines that can learn to make predictions from data. The following courses are approved for this focus area: Statistical Machine Learning: Methods, Theory, and Applications (PH.140.644), Information Theory (EN.520.447), Machine Learning for Signal Processing (EN.520.612), Compressed Sensing and Sparse Recovery (EN.520.648), Random Signal Analysis (EN.520.651), Machine Learning (EN.553.740), Graphical Models (EN.553.743), Machine Learning (EN.601.675), Machine Learning: Data to Models (EN.601.676), Machine Learning: Optimization (EN.601.681), Causal Inference (EN.601.677), Machine Learning: Representation Learning (EN.601.679), Statistical Machine Learning (EN.601.775), Unsupervised Learning: Big Data to Low-Dimensional Representations (EN.601.780).
    • Computer Vision: Compressed Sensing and Sparse Recovery (EN.520.648), Computer Vision (EN.601.661), Machine Learning: Deep Learning (EN.601.682), Vision as Bayesian Inference (EN.601.783).
    • Computational Finance: Stochastic Processes in Finance I, II (EN.553.627, EN.553.628), Equity Markets and Quantitative Trading (EN.553.641), Investment Science (EN.553.642), Introduction to Financial Derivatives (EN.553.644), Interest Rate and Credit Derivatives (EN.553.645), Risk Measurement and Management in Financial Markets (EN.553.646), Quantitative Portfolio Theory & Performance Analysis (EN.553.647), Financial Engineering and Structured Products (EN.553.648), Advanced Equity Derivatives (EN.553.649), Commodities and Commodity Markets (EN.553.753).
    • Mathematics of Data Science: Monte Carlo Methods (EN.553.633), Introduction to Convexity (EN.553.665), High-Dimensional Approximation, Probability, and Statistical Learning (EN.553.738), Machine Learning (EN.553.740), Nonlinear Optimization I, II (EN.553.761, EN.553.762), Stochastic Search and Optimization (EN.553.763), Convex Optimization (EN.553.765), Combinatorial Optimization (EN.553.766), Matrix Analysis (EN.553.792), Randomized and Big Data Algorithms (EN.601.634), Approximation Algorithms (EN.601.635).
    • Language and Speech: Semantics I, II (AS.050.617, AS.050.622), Syntax (AS.050.620), Phonology (AS.050.625), Information Extraction (EN.520.666), Speech and Auditory Processing by Humans and Machines (EN.520.680), Natural Language Processing (EN.601.665), Machine Learning: Linguistic and Sequence Modeling (EN.601.765).
    • Statistical Theory: Statistical Machine Learning: Methods, Theory, and Applications (PH.140.644), Bayesian Statistics (EN.553.632), Statistical Theory I (EN.553.730), Statistical Theory II (EN.553.731), Topics in Statistical Pattern Recognition (EN.553.735), Distribution-free Statistics and Resampling Methods (EN.553.737), High-Dimensional Approximation, Probability, and Statistical Learning (EN.553.738), Statistical Pattern Recognition Theory & Methods (EN.553.739), Causal Inference (EN.601.677), Statistical Machine Learning (EN.601.775).
    • The program requires the student to take one elective course. To maximize a student’s flexibility in choosing this course, the student may choose any course offered at JHU that is directly or indirectly related to data science. The elective course must be approved by the student’s advisor as well as the Internal Oversight Committee.
  • Capstone Experience in Data Science (EN.553.806), or another project-oriented course approved by the research supervisor, academic advisor and the Internal Oversight Committee.  Students must complete a Proposal Request for the Capstone Experience in Data Science form and follow instructions to submit for approval before being permitted to enroll in EN.553.806.


    1. The student must find and contact a research supervisor who will agree to supervise the capstone experience.  The research supervisor must be a JHU faculty member.
    2. The student must complete a proposal form, describing the project goals, and submit to their academic advisor, who will in turn send it to the Internal Oversight Committee for approval.
    3. The proposal will include the following and must be submitted using the approved proposal request form (above):
      • Title of proposed project
      • Project description, with sufficient details (e.g., 200 words)
      • Completion timeline
      • Name(s) and signature(s) of faculty supervisor(s)
    4. Upon approval, the student will be permitted to register for EN.553.806: Capstone Experience in Data Science.
    5. Upon completion, the research supervisor will provide a Pass/Fail (P/F) grade.
    6. As part of the experience, the student must write a paper or research report that must be approved by the research supervisor.  The final paper should be 6-12 pages in LaTeX full-page format (1-inch margins, 12-point Times font) or the MS Word equivalent.
    7. The written paper will be summarized in a poster presented in a poster session organized at the end of each semester.

  • Prior to Fall Semester of Year I

    Orientation Program (2 days)


    Year 1: Fall Semester

    Introduction to Data Science



    Area of Focus 1


    Year 1: Winter Break

    Online Data Ethics Course


    Year 1:  Spring Semester

    Machine Learning


    Area of Focus 2

    Area of Focus 3


    Year 2:  Fall Semester

    Capstone Experience


  • We are now accepting applications for the Spring 2021 semester.  For more information on admission, please visit our Admissions Process and Admissions Criteria web page.



  • Faculty Name Department Email
    Arora, Raman Computer Science/MINDS
    Basu, Amitabh Applied Mathematics/MINDS
    Braverman, Vladimir Computer Science/MINDS
    Budavari, Tamas Applied Mathematics/MINDS
    Caffo, Brian Biostatistics/MINDS
    Chellappa, Rama Electrical & Computer Eng/MINDS
    Eisner, Jason Computer Science/MINDS
    Fertig, Elana Oncology/MINDS
    Naiman, Daniel Applied Mathematics
    Patel, Vishal Electrical & Computer Eng/MINDS
    Priebe, Carey Applied Mathematics/MINDS
    Shpitser, Ilya Computer Science/MINDS
    Venkataraman, Archana Electrical & Computer Eng/MINDS
    Vidal, Rene Biomedical Engineering/MINDS
    Xu, Yanxun Applied Mathematics/MINDS
    Younes, Laurent, Program Director Applied Mathematics/MINDS

