Imagine a day when people can point their smartphone cameras at someone using sign language and see the signs instantly translated on their screens. That's the research goal of Xuan Zhang, a fifth-year doctoral candidate in the Department of Computer Science.
Supported by a JHU + Amazon Initiative for Interactive AI (AI2AI) Fellowship, Zhang is researching sign language recognition and translation tasks with the aim of developing a large vision-language model capable of translating signed words into spoken language.
At present, she is working on integrating valuable linguistic insights into modern sign language processing methods, which currently neglect features the Deaf community uses frequently, such as fingerspelling, in which words are spelled out letter by letter, and classifiers, handshapes and movements that represent entire classes of words.
“My goal is to build a more robust, linguistically nuanced sign language processing system that transcends the conventional limitations of technology,” she says.