Johns Hopkins Electrical and Computer Engineering Associate Professor Tinoosh Mohsenin recently sat down with the Edge AI Foundation to discuss one of the biggest challenges in artificial intelligence today: how to make powerful generative AI models run on small, energy-efficient devices.
Generative AI is often associated with massive cloud computers and large data centers. But many real-world technologies, such as medical imaging systems, robotics platforms, and smart sensors, need AI to run directly on compact hardware close to where data is collected.
In the conversation, Mohsenin shared how her research group is redesigning AI models so they can operate efficiently on these resource-limited “edge” devices. Her work focuses on shrinking large models and cutting the memory and energy they require while keeping their performance nearly the same. Using techniques such as model pruning and quantization, Mohsenin and her team have reduced model size by up to 43× and latency by up to 22×, making advanced AI far more practical outside the cloud.
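
For readers unfamiliar with the two techniques named above, the following is a minimal, illustrative sketch in plain Python. It is not the method used by Mohsenin's group. It assumes the simplest textbook variants: magnitude pruning, which zeroes out the weights with the smallest absolute values, and symmetric 8-bit quantization, which stores each weight as a small integer plus one shared scale factor. The function names and the toy weight list are hypothetical.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights):
    """Symmetric 8-bit quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale factor; fall back to 1.0 if every weight is zero.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the stored integers."""
    return [v * scale for v in q]


# Toy weights (hypothetical values, for illustration only).
weights = [0.02, -1.27, 0.5, -0.01, 0.9, 0.03, -0.6, 0.0]
pruned = prune_by_magnitude(weights, sparsity=0.5)  # half the weights become zero
q, scale = quantize_int8(pruned)                    # 8-bit integers plus one scale
restored = dequantize(q, scale)                     # approximate reconstruction
```

In this toy setting, the zeroed weights can be skipped or stored sparsely, and each remaining weight occupies 8 bits instead of the 32 bits of a standard float. The reconstruction error per weight is at most half the scale factor, which gives a rough sense of how size can shrink while accuracy stays close to the original.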