Impact: Faculty Innovation / Spring 2026

Accelerating Pancreatic Cancer Detection

Hopkins researchers have developed a database aimed at leveraging artificial intelligence to spot pancreatic cancer earlier, when it is potentially treatable. 

Pancreatic cancer is the third-leading cause of cancer-related deaths in the U.S., with 80% to 85% of cases diagnosed too late for effective treatment. Its silent progression and anatomical complexity make early detection difficult for radiologists. 

Now, a team including Johns Hopkins computer science researchers—collaborating with NVIDIA and institutions worldwide—has developed a database aimed at leveraging artificial intelligence to spot pancreatic cancer earlier, when it is potentially treatable. 

The Pancreatic Tumor Segmentation Dataset, or PanTS, is the largest fully open-source CT scan dataset for pancreatic cancer detection. The work, developed by PhD student Wenxuan Li, Engr ’23 (MSE); Assistant Research Professor Zongwei Zhou; and Bloomberg Distinguished Professor and computer scientist Alan Yuille, was presented at the 39th Annual Conference on Neural Information Processing Systems.

“Although current AI isn’t yet ready for population-wide screening, if we use imaging biomarkers, clinical notes, and deep neural networks to select high-risk patients, we can transform a blunt screener into a precision detection tool,” says Zhou, senior author of the project, who holds a joint appointment in oncology. 

The dataset contains over 36,000 3D CT scans from 145 medical centers, with expert-validated annotations of more than 993,000 anatomical structures, including the pancreas, tumors, and surrounding organs. Each scan includes metadata (patient age, sex, diagnosis, imaging protocols, and biomarkers) to help develop models that identify high-risk individuals across diverse populations and imaging conditions. PanTS was built with NVIDIA’s MONAI Label, an open-source AI framework for medical imaging that supports interactive 3D segmentation and scalable, human-in-the-loop annotation workflows.

Models trained on PanTS significantly outperform those trained on existing public datasets, a gain the team attributes to the dataset’s scale and anatomical detail. The public dataset includes a reserved test set for third-party validation, so developers and hospitals worldwide can train and evaluate models.

“Our team will keep promoting open science in medical computer vision—especially for cancer research, where public annotated datasets are limited,” says Li. 

By enabling earlier, more accurate tumor detection, PanTS could improve survival rates and transform pancreatic cancer care. Researchers are already developing an algorithm that reportedly detects pancreatic cancer in CT scans over a year earlier than most radiologists—thanks to PanTS training data.

— JAIMIE PATTERSON