Jaemin Cho is an assistant research professor in the Department of Computer Science and a member of the Data Science and AI Institute at Johns Hopkins University.
His research focuses on multimodal AI, integrating diverse data types—such as images, videos, text, audio, and motion—to develop models that are interpretable, controllable, and scalable. He is also interested in learning action knowledge from unlabeled visual demonstrations and in human-in-the-loop agents that enhance productivity in various applications, including education, medicine, filmmaking, programming, and dancing.
Cho’s work has been featured at top conferences in computer vision, including the Conference on Computer Vision and Pattern Recognition (CVPR), the International Conference on Computer Vision, and the European Conference on Computer Vision; in natural language processing, including the Conference on Empirical Methods in Natural Language Processing, the Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics, and the Conference on Language Modeling; and in machine learning, including the Conference on Neural Information Processing Systems (NeurIPS), the International Conference on Learning Representations (ICLR), and the AAAI Conference on Artificial Intelligence. His research has been recognized with multiple oral presentations at NeurIPS and ICLR, a Bloomberg Data Science PhD Fellowship, and media coverage by MIT Technology Review, IEEE Spectrum, and Wired. He also co-organized the Workshop on Transformers for Vision at CVPR 2023, 2024, and 2025.
Cho earned a PhD in computer science from the University of North Carolina at Chapel Hill in 2025, where he was advised by Mohit Bansal, and a BSc in industrial engineering from Seoul National University in 2018. He will spend a year at the Allen Institute for AI before joining Johns Hopkins full-time.