Author: Salena Fitzgerald

Solving complex problems with minimal data is a critical challenge in fields like stock market predictions and supply chain logistics, where acquiring information can be costly or time-consuming. Building on decades-old optimization theories, a team of Johns Hopkins researchers has developed a new method to make problem-solving tools work effectively even with incomplete or imprecise data.

“We’re addressing the core question: what’s the smallest amount of data you need to solve a problem effectively? Once we establish that threshold, we can prove mathematically that no other method can perform better with less data,” said study leader Amitabh Basu, a professor in the Whiting School of Engineering’s Department of Applied Mathematics and Statistics.

The team’s results appeared on the preprint site arXiv.

The team’s “black box framework” tackles the issue of data inexactness, where information is noisy, incomplete, or unpredictable, by modifying any convex optimization algorithm so that it works with inexact information and still produces an accurate result. (A convex optimization algorithm finds the best solution to problems whose structure can be modeled using a mathematical property called convexity.) The new method takes an algorithm that works optimally in the idealized scenario where exact data exists and adapts the incomplete or imprecise information fed to it, so the original algorithm remains effective even on inexact data.
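To make the idea concrete, here is a minimal sketch of what "black box" use of an optimization algorithm with inexact data can look like. This is an illustration, not the researchers' actual framework: a generic gradient-descent routine that only interacts with an oracle, paired with an oracle that returns noisy gradients for a convex least-squares problem. The objective, noise model, step size, and function names are all illustrative assumptions.

```python
# Minimal sketch (illustrative only, not the published framework):
# a first-order method that treats its data source as a black-box oracle,
# run here with an oracle that returns inexact (noisy) gradients.
import numpy as np

rng = np.random.default_rng(0)

# Convex objective f(x) = 0.5 * ||A x - b||^2; its exact gradient is A^T (A x - b).
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
x_star = np.linalg.lstsq(A, b, rcond=None)[0]  # exact minimizer, for reference only

def inexact_gradient(x, noise_level=0.1):
    """Oracle returning the true gradient corrupted by bounded noise (inexact data)."""
    noise = noise_level * rng.standard_normal(x.shape)
    return A.T @ (A @ x - b) + noise

def gradient_descent(oracle, x0, step, iters):
    """Generic first-order method: it only queries the oracle, never sees A or b."""
    x = x0.copy()
    for _ in range(iters):
        x = x - step * oracle(x)
    return x

step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L for this quadratic
x_hat = gradient_descent(inexact_gradient, np.zeros(5), step, iters=500)
print("distance to exact minimizer:", np.linalg.norm(x_hat - x_star))
```

Under these assumptions, the iterates settle into a neighborhood of the exact minimizer whose size scales with the oracle's noise level, even though the algorithm itself was written as if the gradients were exact.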

“This new method does not need any knowledge of the internal logic or workings of the original algorithm, making it possible to deploy it in a wide variety of cases without needing any details about the original. This is especially useful when there are concerns about protecting the original algorithm’s intellectual property,” said Basu.

The team also says that the new method could allow companies such as FedEx and Amazon to optimize their supply chain operations using less data.

Basu explains that current algorithms often use more data than needed, which raises costs and adds complexity. “Our new method reduces the need for data while still getting the best results, ensuring solutions stay accurate even with imperfect data.”

While the approach prioritizes using data efficiently, the team acknowledges that it comes with an important tradeoff: using less data usually requires more computational power.

When data is expensive or challenging to gather and you have access to vast computing resources (like Google’s servers), this new method is ideal. However, in situations with limited computational power, such as smartphones, traditional methods that use more data but require less computational effort are more practical. The best approach depends on the tools and resources available in any given situation, the researchers say.

“As data becomes an increasingly valuable commodity, the ability to optimize its use is paramount. Our research not only advances theoretical understanding but also offers practical tools to tackle some of today’s most pressing computational challenges. The implications are clear: smarter algorithms, reduced costs, and more efficient systems across the board,” said Basu.

Team member Phillip Kerger, a former PhD student in the Department of Applied Mathematics and Statistics who is now an assistant teaching professor at UC Berkeley, adds, “Whether it’s training computer models or using them in real life, it’s crucial to make sure algorithms only use the necessary data without losing accuracy. By using this new method, companies can create AI systems that are both cheaper to run and more effective.”