Author: Salena Fitzgerald

Johns Hopkins researchers have developed a new framework that uses graph neural networks (GNNs) to classify images through geometry, an advance that could help improve how machines flag inappropriate content and identify diseases in medical images. GNNs are a type of AI that learns by mapping relationships between data points. The method promises better performance when data is limited by leveraging the hidden structure, or manifold, underlying data points such as images.

The team’s approach, which appears on the preprint site arXiv, captures the hidden geometric relationships in data that help models make more accurate and generalizable decisions. Even better, the researchers say the new method is simple enough to integrate into existing systems without requiring a major overhaul.

“We know that graph neural networks are powerful tools, especially when working with data that’s naturally connected, like social networks or route maps,” said Caio F. Deberaldini Netto, a PhD student in applied mathematics and statistics at the Whiting School of Engineering and lead author of the study. “But arbitrary data, like images, which can be seen as pixel grids, or language, which can be seen as token sequences, don’t naturally look like graphs.”

The trick, the researchers say, is to find the manifold—a hidden, low-dimensional geometric pattern that reveals how seemingly different images are actually related. This idea, known as the manifold hypothesis, suggests that although image data is high-dimensional (like millions of pixels), the real patterns that matter—like “this is a cat”—live in a much simpler structure. 
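A classic standalone illustration of the manifold hypothesis, separate from the team's work, is scikit-learn's "swiss roll": points generated in three dimensions that actually lie on a rolled-up two-dimensional sheet. A manifold-learning method such as Isomap can recover the flat sheet, just as the hypothesis predicts. The sample size and parameters below are arbitrary choices for the demonstration.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

# 2,000 points embedded in 3-D space that actually lie on a rolled-up 2-D sheet.
X, color = make_swiss_roll(n_samples=2000, noise=0.05, random_state=0)

# Isomap "unrolls" the sheet, recovering the simpler 2-D coordinates
# that describe where each point sits on the underlying manifold.
embedding = Isomap(n_neighbors=10, n_components=2).fit_transform(X)
print(X.shape, "->", embedding.shape)  # (2000, 3) -> (2000, 2)
```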

Using a machine learning model called a variational autoencoder (VAE), the team first mapped images into this lower-dimensional manifold. Then, they constructed a graph where each image was a node, and connections were made based on how close the images were in this geometric space. 
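To make the pipeline concrete, here is a minimal sketch of those two steps, not the team's actual code: a toy VAE encoder embeds images into a low-dimensional latent space, and each image is then connected to its nearest neighbors there. The encoder architecture, latent dimension, and neighbor count are all illustrative assumptions.

```python
import torch
import torch.nn as nn
from sklearn.neighbors import kneighbors_graph

class Encoder(nn.Module):
    """Toy VAE encoder mapping 32x32 RGB images to a latent mean.
    (A full VAE also predicts a log-variance; only the mean is used here.)"""
    def __init__(self, latent_dim=16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # 32x32 -> 16x16
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 16x16 -> 8x8
            nn.Flatten(),
        )
        self.mu = nn.Linear(64 * 8 * 8, latent_dim)

    def forward(self, x):
        return self.mu(self.features(x))

# Step 1: embed images (random here; in practice, real data and a trained encoder).
images = torch.randn(100, 3, 32, 32)
with torch.no_grad():
    latents = Encoder()(images)  # (100, 16) coordinates on the learned manifold

# Step 2: build the graph by linking each image to its 5 nearest latent neighbors.
adjacency = kneighbors_graph(latents.numpy(), n_neighbors=5, mode="connectivity")
print(adjacency.shape)  # (100, 100) sparse matrix: one node per image
```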

“If two pictures of dogs are close together in the manifold—even if the model doesn’t know they’re dogs—it still picks up the similarity based on features like shape or color,” explained co-author Luana Ruiz, assistant professor of applied mathematics and statistics. 

Once the graph of images is built, a GNN processes it by learning from these connections—propagating information between similar images. This allows the model to learn patterns not just from individual images, but also from the relationships between them. 
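In spirit, that propagation step resembles the simple graph-convolution sketch below, an illustrative simplification rather than the paper's exact architecture: each image node averages its neighbors' features and passes the result through a learned transformation before a final classifier.

```python
import torch
import torch.nn as nn

class SimpleGCNLayer(nn.Module):
    """One round of message passing: every node's features are averaged
    with its neighbors' features, then linearly transformed."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        adj = adj + torch.eye(adj.size(0))        # self-loops keep each node's own features
        adj = adj / adj.sum(dim=1, keepdim=True)  # row-normalize: average over neighbors
        return self.linear(adj @ x)               # propagate, then transform

# Toy run: 100 image nodes with 16-d latent features and a random sparse graph.
x = torch.randn(100, 16)
adj = (torch.rand(100, 100) < 0.05).float()
hidden = torch.relu(SimpleGCNLayer(16, 32)(x, adj))
logits = nn.Linear(32, 10)(hidden)  # e.g., 10 CIFAR-10 classes
print(logits.shape)  # (100, 10): one class prediction per image node
```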

The team found that across multiple datasets, including CIFAR-10 (common objects), CelebA (celebrity faces), and even medical images, the method consistently outperformed alternative strategies such as basic “vanilla” neural networks and superpixel-based graphs, which connect groups of visually similar pixels. Most importantly, it showed better generalization: the ability to correctly classify new, unseen images.

“We tested not just on new images, but also on larger and more complex graphs,” said Deberaldini Netto. “Our GNN’s generalization gap—the difference between training and test accuracy—shrinks as you add more data. That’s exactly what you want.” 

The researchers say this approach is ready to be plugged into real-world pipelines—from medical diagnosis to content recommendation—anywhere models need to make robust decisions based on complex data. 

“Say you’re classifying patient scans,” Deberaldini Netto explained. “You can’t just rely on pixel values. You need to learn from patterns across similar patients. Our method lets you build those patterns into the model from the start.” 

Additionally, the researchers are already applying the approach to textual data, in tasks like authorship attribution, where the relational structure between written passages could help determine whether a work belongs to Shakespeare or Marlowe. And because it’s compatible with existing models, like large language models or convolutional networks, it could become a powerful building block in general-purpose AI, they say.

“This work is about making models that are not just accurate, but also smarter and more adaptable,” said Ruiz. “It’s a step toward AI that really understands structure in the world—whether that’s in pixels, words, or people.”