On any given day, 1,000 articles about my area of research may appear in various publications—and the same holds true for other engineers, scientists, and doctors. Twenty years ago, a doctor wouldn’t have been faulted for not having read a study in an obscure medical journal that could have prevented a patient’s death. Today, if the information is out there, you’re accountable.
Email has grown out of control. For the past six years, every three months, I’ve filed the emails I didn’t have time to read, intending to get to them. This past spring the number of emails in that folder had grown to 1,000—not counting spam.
The sheer volume of information has repercussions for industry, where critical data need to be gathered and synthesized. Companies need feedback from customers around the world about their products. Are there recurring problems? Patterns of opinion?
All of this presents the Center for Language and Speech Processing (CLSP) with a major new challenge—creating digital tools that enable computers to understand and categorize what constitutes relevant, critical information.
As I see it, there are two pieces to the problem. There’s “the fire hose”: With so much information coming at you at once, how do you organize it? The other is “the ocean”: How do you fish out the most pertinent information and reel in just what you want?
Language is ambiguous, variable, and context-dependent. So we’re investigating how to represent the meaning of a sentence, of words, so that a computer can understand and assess it well enough to index it correctly. Right now, under “friendly” conditions, we can do this. But “friendly” conditions mean the speaker is fluent in American English, isn’t too old, too young, or too emotional, and there’s no reverberation. These are big limitations, so interpreting text and understanding the intent of communication is a much larger problem than just writing down words.
In the future, computers will be smarter and will serve our individual information needs. While this will affect politics, industry, and consumers, it will be especially important in education. Eventually, every kid who goes to school will bring along a device that records, indexes, and contextualizes information. The device will be able to recall all that was said, read, and seen that day, and to retrieve a reminder about an assignment, a sample math problem from a week before, or a relevant part of a lecture.
Communications have exploded. The CLSP is enhancing our ability to take advantage of and manage information, ultimately improving communication between people.