Beware the Eavesdropper - JHU Engineering Magazine

In the past decade, millions of people have begun to use Voice Over Internet Protocol (VOIP) services that route their telephone calls over the Internet. Because the public nature of the Internet makes eavesdropping relatively easy, VOIP providers are increasingly encrypting the conversations to protect users’ privacy.

But the encryption scheme has a weak spot, according to Whiting School computer scientists. Although the words will be garbled, a clever eavesdropper will be able to figure out what language is being spoken. And the researchers suspect that the encryption scheme in some circumstances might “leak” even more information, such as who is talking.

In the arcane world of encryption, a vulnerability that gives away this much information is a pretty big deal.

“We can’t be naive about how determined or dedicated the attacker can be. Information, even in small amounts, if gathered in a timely way can have enormous benefits [to the eavesdropper],” says Gerald M. Masson, a professor in the Department of Computer Science and director of the Johns Hopkins University Information Security Institute.

Of course, intelligence services might want to use the newfound knowledge to zero in on potential terrorists. But corporate spies might also use it against competitors, perhaps getting a heads-up about the location of a new plant based on the language being spoken in intercepted phone calls.

Masson, along with associate professor Fabian Monrose and graduate students Charles V. Wright and Lucas Ballard, showed that weaknesses in VOIP encryption let them make pretty good guesses about which language a conversation was being conducted in, even though the actual words remained obscure. They presented their findings at the 16th USENIX Security Symposium in Boston last August.

Here’s the problem the researchers discovered: To send a phone call over the Internet, VOIP software first converts the analog signal to a digital one. To save bandwidth, most software uses a trick called Variable Bit Rate encoding: each 20-millisecond “frame” of the conversation is encoded using only as many bits as are needed to convey the information. For instance, a frame that contains a vowel sound will contain more bits than one that contains a consonant like an ‘s’.

Before these frames are sent over the Internet, they can be encrypted. But the encryption doesn’t hide the size of each frame. The researchers showed that an attacker who intercepts the conversation can run a statistical analysis of the frame sizes and use that analysis to make a good guess about what language is being spoken.
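To make the leak concrete, here is a minimal Python sketch. It is not the researchers' actual method. It assumes a length-preserving cipher, modeled here by a toy XOR keystream, and it stands in a simple frame-size frequency comparison for the statistical analysis. The frame sizes, the labels "language_a" and "language_b", and all function names are invented for illustration.

```python
import math
import os
from collections import Counter


def toy_encrypt(frame: bytes) -> bytes:
    """XOR the frame with a random keystream (toy stand-in for a real cipher)."""
    keystream = os.urandom(len(frame))
    return bytes(a ^ b for a, b in zip(frame, keystream))


def size_histogram(sizes):
    """Relative frequency of each frame size seen in training traffic."""
    counts = Counter(sizes)
    return {size: n / len(sizes) for size, n in counts.items()}


def log_likelihood(sizes, model, floor=1e-6):
    """Score a size sequence under a model; unseen sizes get a small floor."""
    return sum(math.log(model.get(size, floor)) for size in sizes)


def guess_language(sizes, models):
    """Pick the language whose size distribution best explains the traffic."""
    return max(models, key=lambda lang: log_likelihood(sizes, models[lang]))


# Hypothetical frame sizes in bytes; real values depend on the codec.
training = {
    "language_a": [20, 20, 38, 38, 38, 62, 62, 20, 38, 62],
    "language_b": [20, 20, 20, 20, 38, 20, 20, 38, 20, 62],
}
models = {lang: size_histogram(sizes) for lang, sizes in training.items()}

# The eavesdropper sees only ciphertext, but each ciphertext length
# equals the length of the frame it hides.
plaintext_frames = [bytes(n) for n in (20, 20, 38, 20, 20, 20, 62, 20)]
observed_sizes = [len(toy_encrypt(frame)) for frame in plaintext_frames]

guess = guess_language(observed_sizes, models)
print(observed_sizes, guess)
```

With this made-up data the observed sizes match the plaintext sizes exactly, and the guess comes out as "language_b", whose training traffic is dominated by small frames. Padding every frame to a common length would remove this signal, at the cost of the bandwidth that variable-bit-rate encoding was meant to save.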

The researchers tried it themselves, making encrypted VOIP calls between their own computers and playing files from a dataset of native speakers of 21 different languages. Using their technique, they could classify the language being spoken at a much better rate than random guessing. For instance, they correctly identified Indonesian 40 percent of the time, and Russian and Tamil 35 percent of the time. On average, the correct language was one of the top four guesses more than 50 percent of the time.
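For a sense of scale, the sketch below works out what random guessing among 21 languages would achieve and shows how a "top-k" accuracy figure can be computed. The trial scores and the labels "a", "b", and "c" are invented for illustration, the function names are hypothetical, and the chance baseline assumes every language is equally likely.

```python
def chance_top_k(k, num_languages):
    """Probability that the right language is among k uniformly random guesses."""
    return k / num_languages


def top_k(scores, k):
    """Return the k highest-scoring language labels, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def top_k_accuracy(trials, k):
    """Fraction of trials whose true language appears in the top k guesses."""
    hits = sum(1 for truth, scores in trials if truth in top_k(scores, k))
    return hits / len(trials)


# Made-up trials: (true language, classifier score per candidate language).
trials = [
    ("a", {"a": 0.50, "b": 0.30, "c": 0.20}),
    ("b", {"a": 0.45, "b": 0.35, "c": 0.20}),
    ("c", {"a": 0.50, "b": 0.30, "c": 0.20}),
    ("a", {"a": 0.60, "b": 0.25, "c": 0.15}),
]

top1 = top_k_accuracy(trials, 1)
top2 = top_k_accuracy(trials, 2)
chance_1_of_21 = chance_top_k(1, 21)
chance_4_of_21 = chance_top_k(4, 21)
print(top1, top2, round(chance_1_of_21, 3), round(chance_4_of_21, 3))
```

Under the uniform assumption, a single random guess among 21 languages is right about 5 percent of the time, and four random guesses include the right answer about 19 percent of the time. Those are the baselines against which the 35 to 40 percent and better-than-50-percent figures can be read.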

The software did even better when choosing between two given languages, such as Spanish and English. In 75 percent of the pairings, the technique picked the correct language with 70 percent or greater accuracy.

The researchers suspect that encrypted VOIP might leak even more information. For instance, an attacker might be able to tease out enough information about the voices speaking to tell if a particular person was involved. “I think that in general this technology would only have special purpose applications. But that one-tenth of one-tenth of the time where it can provide a slight advantage can have a lot of value,” Masson says.

— Kurt Kleiner