{"id":5052,"date":"2017-01-10T10:46:42","date_gmt":"2017-01-10T15:46:42","guid":{"rendered":"https:\/\/engineering.jhu.edu\/magazine-archive\/?p=5052"},"modified":"2020-02-14T16:16:18","modified_gmt":"2020-02-14T21:16:18","slug":"say-what","status":"publish","type":"post","link":"https:\/\/engineering.jhu.edu\/magazine-archive\/2017\/01\/say-what\/","title":{"rendered":"Say What?"},"content":{"rendered":"<a href=\"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_1.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-5496\" src=\"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_1.jpg\" alt=\"Say What?\" width=\"500\" height=\"500\" srcset=\"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_1.jpg 600w, https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_1-150x150.jpg 150w, https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_1-300x300.jpg 300w, https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_1-125x125.jpg 125w\" sizes=\"auto, (max-width: 500px) 100vw, 500px\" \/><\/a>\n<blockquote>\n<p style=\"text-align: left;\"><em>Think today&#8217;s computers are smart? Just look at what&#8217;s coming. Meet a multinational bullpen of computer scientists who are rapidly bridging the divide between humans and machines.<\/em><\/p>\n<\/blockquote>\n<p>It all began with a plastic puppy named Radio Rex, a 2-inch bulldog with a battery in his bunk. Simply bark his name and out he would bound from his little wooden house. 
It was 1922.<\/p>\n<p>\u201cAfter Rex is in position,\u201d read the instructions glued to the bottom of the toy, \u201ceither call sharply, REX, clap the hands, or blow a whistle of the proper pitch and Rex will respond by dashing forth, thus causing endless amusement to young and old.\u201d<\/p>\n<p>Ninety and more years ago, this must have seemed like magic\u2014a device controlled by voice alone. (Even the invention of the amazing Clapper\u2014\u201cClap on! Clap off!\u201d\u2014was still 60 years away.) But there was science behind the secret: The acoustic energy of the short \u201ceh\u201d sound, as spoken by a typical adult male at 500 hertz, triggered a spring in Rex\u2019s rear that launched him out of his doghouse.<\/p>\n<p>A century later, we hold a thin little gizmo in our hand and ask it, \u201cSiri, what pitcher won the fifth game of the 1983 World Series for the Orioles?\u201d Or we station a jet-black cylinder on our countertop and command it: \u201cAlexa, send Beyonc\u00e9\u2019s Lemonade to my sister as a birthday gift. And, by the way, we\u2019re out of toilet paper.\u201d And the wise and soothing genie inside the gadget makes it so.<\/p>\n<p>Such is the work\u2014and the wonder\u2014of speech recognition and machine learning in its infancy: multiplying our fallible memories, fulfilling our mercantile wishes, and putting us on hold for miracles yet to come.<\/p>\n<p>Already, we ask our automobiles to dial our telephones and to announce, turn by turn, the highway home. 
Move 10 years into the future and even more extraordinary advances become possible, as we and our children may live our lives accompanied by what <a href=\"https:\/\/www.cs.jhu.edu\/faculty\/david-yarowsky\/\" target=\"_blank\" rel=\"noopener noreferrer\">David Yarowsky<\/a>, professor of computer science at the Whiting School of Engineering and a member of the <a href=\"http:\/\/www.clsp.jhu.edu\/\" target=\"_blank\" rel=\"noopener noreferrer\">Johns Hopkins Center for Language and Speech Processing<\/a>, calls \u201ca computer and sensors presence that sees what you see, hears what you hear, knows what you read, and records every single thing that you encounter from the moment of your birth and remembers it forever.\u201d<\/p>\n<p>Picture yourself in a boat on a river and 20 years later being able to retrieve every detail of the day, every word, every sight, every splash.<\/p>\n<p>\u201cUltimately,\u201d says Yarowsky, \u201cour computer assistants will be able to do more than our human assistants. I think we will become so dependent on the technology\u2014sadly\u2014that we won\u2019t be able to function without them.<\/p>\n<p>\u201cWe\u2019ve already become so dependent on technology that I\u2019ve stopped remembering phone numbers. 
I don\u2019t know my own daughter\u2019s number, and that\u2019s scary.\u201d<\/p>\n<p>At the Johns Hopkins Center for Language and Speech Processing on the Homewood campus, a multinational bullpen of computer scientists is striving to enhance and perfect the voice-to-computer interface across a dazzling spectrum of tasks so that tomorrow\u2019s devices will be able to discern when a combat veteran is depressed and suicidal; isolate and identify individual voices from the babel of a frenetic crime scene; retrieve every word ever spoken by or to anyone we ever encounter; and eliminate the pen, the mouse, the touchpad, and the keyboard forever, engineering a world of supercyber intelligences controlled by the spoken word alone.<\/p>\n<p>&nbsp;<\/p>\n<h2>Testing the Meaning of Understanding<\/h2>\n<p>\u201cWhen I was a kid, I thought there was a lady in the radio,\u201d <a href=\"http:\/\/engineering.jhu.edu\/ece\/faculty\/hermansky-hynek\/\" target=\"_blank\" rel=\"noopener noreferrer\">Hynek Hermansky<\/a>, the Julian S. Smith Endowed Professor of Electrical Engineering, says. He is the director of the CLSP and the proud owner of an original\u2014and still working\u2014Radio Rex toy, which he keeps on the windowsill of his office in Hackerman Hall. Today, that lady\u2014Apple\u2019s Siri, Amazon\u2019s Alexa, Microsoft\u2019s Cortana, et al.\u2014may seem almost halfway human, but there are many problems yet to be solved before she truly is one of us.<\/p>\n<p>\u201cHuman decoding of speech is really robust; we have many ways of looking at a speech signal. We can look at ones that work and happily discard the ones that don\u2019t,\u201d says Hermansky. \u201cBut we do not fully understand precisely what kind of errors people can make and still be understood.<\/p>\n<p>\u201cYou can build a system for a computer to recognize individual words better than human beings can. 
But language is so flexible, so changeable, that to duplicate all that humans can do\u2014understand context, recognize new words\u2014is still very difficult.\u201d<\/p>\n<p>It has been more than 25 years since Hermansky, visiting his homeland (the former Czechoslovakia) just after the fall of its communist government, tried to use a rotary telephone to dial his office in the U.S. and the antiquated device was unable to connect to the relevant extension, which could only be reached by touch-tone.<\/p>\n<p>The experimental speech recognition system in Hermansky\u2019s laboratory saved the situation by being able to voice dial the required number.<\/p>\n<p>Today, of course, we merely recite the number vocally and the robot operator understands and obeys.<\/p>\n<p>\u201cIt was obvious even back then that we needed a better system of using our voices to control our computers,\u201d he remembers. \u201cNow, with Siri and Alexa and the others, I think that finally, our work is truly useful.\u201d<\/p>\n<p>At Johns Hopkins, the CLSP carries on the work of the late <a href=\"http:\/\/www.nytimes.com\/2010\/09\/24\/business\/24jelinek.html\" target=\"_blank\" rel=\"noopener noreferrer\">Frederick Jelinek<\/a>, the Czech-American patriarch of computer speech processing. 
(It may not be a coincidence that Radio Rex responds just as eagerly to a shout of \u201cCzechs!\u201d) Alumni of the department have been central to the development of Google Voice, Alexa, and other dramatic advances in the inevitable marriage of humans and machines.<\/p>\n<a href=\"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_2-e1483643832849.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5500\" src=\"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_2-e1483643832849.jpg\" alt=\"Researchers at CLSP\" width=\"600\" height=\"450\" srcset=\"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_2-e1483643832849.jpg 600w, https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_2-e1483643832849-300x225.jpg 300w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/a>\n<p>To <a href=\"http:\/\/engineering.jhu.edu\/ece\/faculty\/khudanpur-sanjeev-p\/\" target=\"_blank\" rel=\"noopener noreferrer\">Sanjeev Khudanpur<\/a>, the Kannada-speaking native of Pune, India, and associate professor of computer science and of electrical and computer engineering, the achievements of the CLSP and its alumni \u201cprove the supremacy of a system that rewards talent without prejudice.\u201d<\/p>\n<p>Although his role sometimes resembles that of a sheepdog trying to herd a dozen cats, Khudanpur sees in the CLSP \u201ca combination of analytical talent and creative talent. 
It is a little bit like the human brain; some people are eccentric and crazy, and some people are like the glial cells\u2014the glue that holds it all together.\u201d<\/p>\n<p>Fifty-five years ago, the pioneering American computer scientist Ida Rhodes wrote that \u201cthe heartbreaking problem that we face \u2026 is how to use the machine\u2019s considerable speed to overcome its lack of human cognizance.\u201d This conundrum still pulses in the heart of speech recognition.<\/p>\n<p>\u201cWhen someone asks, \u2018Can computers really understand language?\u2019 we have to have a test for the meaning of \u2018understand,\u2019\u201d says Khudanpur. \u201cComputers do not yet understand language in its fullest sense, but if we were only at the fifth-grade level before, we may be getting closer to the 10th grade now.\u201d<\/p>\n<p>One vexing problem, Khudanpur says, is the difficulty that existing speech recognition systems have in sorting out a cacophony of competing sounds; in his words, \u201cit cannot yet be the fly on the wall, sorting out all the voices in the room, the air vent cycling on and off, the 4-year-old child crying, and the fly\u2019s own buzzing.\u201d<\/p>\n<p>Khudanpur points to the snippet of a cellphone call that lay at the center of the Trayvon Martin shooting case in Florida in 2012 as an example of the shortcomings of the current state of his art. On the static-filled tape, a man cries \u201cHelp!\u201d But forensic audiologists were unable to determine whether the voice was Martin\u2019s or shooter George Zimmerman\u2019s.<\/p>\n<p>\u201cThese things still break our computers,\u201d Khudanpur says, \u201cbut that problem is within reach. Deep neural networks (<a href=\"https:\/\/engineering.jhu.edu\/magazine-archive\/2017\/01\/say-what\/2\/\" target=\"_blank\" rel=\"noopener noreferrer\">see the glossary<\/a>) are starting to do things that we don\u2019t understand. 
Given enough data, our computers may soon be able to figure out things that we cannot figure out. That\u2019s very promising, isn\u2019t it? To have a partner who can solve things that you cannot solve?\u201d<\/p>\n<p>Ultimately, solving the problem of translating one language into another with meaning, feeling, and nuance is the goal of <a href=\"https:\/\/www.cs.jhu.edu\/faculty\/philipp-koehn\/\" target=\"_blank\" rel=\"noopener noreferrer\">Philipp Koehn<\/a>, a German-born computer science professor at the CLSP, who is credited as being one of the inventors of a phrase-based translation approach.<\/p>\n<p>\u201cIn general,\u201d Koehn says, \u201cpeople will come to expect that if you see or hear something in a foreign language, whether it is a street sign or a Facebook post or something on the radio or a YouTube subtitle, you can just click on it and have it translated.\u201d<\/p>\n<p>Currently, Koehn says, machine translation is achieving useful results for such texts as technical manuals and law books, but not poetry or literature. 
Asked if a computer today could deliver a workable translation of, say, a Harry Potter book, Koehn says, \u201cIt could, but there would be mistakes in every sentence.\u201d<\/p>\n<p>&nbsp;<\/p>\n<h2>Code-Writing Masterminds<\/h2>\n<p>The work of bridging the divide between human and machine can be a deeply individual and intellectual pursuit, accomplished by code-writing masterminds working deep into the darkest hours of the night.<\/p>\n<p>Among these is Czech-born <a href=\"http:\/\/www.clsp.jhu.edu\/faculty\/jan-trmal\/\" target=\"_blank\" rel=\"noopener noreferrer\">Jan Trmal<\/a>, an assistant research scientist who is challenged by another imperfection in automatic speech recognition: the fact that, even when a system can understand one language, it is useless when spoken to in another tongue.<\/p>\n<p>Says Trmal: &#8220;The fact you have good American English ASR does not really help you in any way if you are facing the need to get a good ASR in a different language\u2014say, Russian or Japanese. The more distant the languages grow, the more problems you would have with getting it to work. So the work on each ASR starts with collecting lots of training data,\u201d which is time- and money-consuming.<\/p>\n<p>\u201cThis is what I think would be of tremendous benefit\u2014the ability to skip the training data collection part or, more precisely, to reduce the stage significantly,\u201d he says. But, as any human who has tried to learn a second language knows, being fluent in Tagalog is of little value when trying to order a meal in Greek. For computers, the hurdle is even higher\u2014to absorb and intuit not only vocabulary, but syntax and innuendo.<\/p>\n<p><a href=\"https:\/\/www.cs.jhu.edu\/faculty\/jason-eisner\/\" target=\"_blank\" rel=\"noopener noreferrer\">Jason Eisner<\/a>, a computer science professor from New Jersey, wants to surmount that hurdle using machine learning. 
&#8220;Babies can puzzle out the structure of a language\u2014both the pieces and how to smush them together,&#8221; he says. &#8220;They&#8217;re solving a great big statistical inference problem: &#8216;Why am I hearing what I&#8217;m hearing?'&#8221; Eisner tries to get his computers to crack various parts of that puzzle, and he&#8217;s developing new automated strategies each year.<\/p>\n<p>Next door to Eisner, Trmal&#8217;s office mate is Assistant Research Professor <a href=\"http:\/\/www.clsp.jhu.edu\/faculty\/daniel-povey-2\/\" target=\"_blank\" rel=\"noopener noreferrer\">Dan Povey<\/a>, whose enterprise is the creation, in conjunction with computer scientists around the world, of an entirely new, open-source speech recognition programming infrastructure called Kaldi.<\/p>\n<p>At its most basic level, Kaldi\u2014and Povey\u2014seeks to update Radio Rex by expanding the commands that he and his digital progeny can understand into the billions and beyond.<\/p>\n<p>\u201cThe challenge is that the more possible inputs there are, the more difficult speech recognition becomes,\u201d says Povey, who can be found most midnights bent over his laptop, subsisting on meals of microwaved potatoes, corn flakes, and a bottomless mug of green tea.<\/p>\n<p>\u201cSimple tasks, such as chess, are easy\u2014\u2018rook to king 6\u2019\u2014or a Google search.<\/p>\n<p>\u201cBefore deep neural networks came along, it was believed that speech recognition was at kind of a plateau. But now, with deep learning, it\u2019s not really possible to say why a computer came to a certain conclusion, and that\u2019s a property that it shares with the human brain.<\/p>\n<p>\u201cI don\u2019t think we know whether there\u2019s anything special about our brain,\u201d Povey muses. \u201cMaybe it\u2019s not magic.\u201d<\/p>\n<p>This conjecture leads to a tantalizing\u2014and controversial\u2014question. 
If the human brain is not animated by some unique or divine spark, then is it possible that, by building a computer network with an equal number of connections, the brain could be copied onto silicon?<\/p>\n<p>\u201cThe question is not whether it can be replicated,\u201d Povey says. \u201cThe question is how long it will take.\u201d<\/p>\n<p>&nbsp;<\/p>\n<a href=\"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_3.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5504\" src=\"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_3.jpg\" alt=\"Illustration of scientists examining speech input and output for translation\" width=\"600\" height=\"308\" srcset=\"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_3.jpg 600w, https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_3-300x154.jpg 300w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/a>\n<p>&nbsp;<\/p>\n<h2>Learning to Infer<\/h2>\n<p>The biggest problem at this stage of speech-to-computer conversation, says <a href=\"https:\/\/www.cs.jhu.edu\/faculty\/benjamin-van-durme\/\" target=\"_blank\" rel=\"noopener noreferrer\">Benjamin Van Durme<\/a>, a native of upstate New York and assistant computer science professor, is that machines do not yet possess \u201ca general understanding of how the world works.\u201d<\/p>\n<p>\u201cHuman beings know basic things about the world,\u201d Van Durme notes. \u201cWe know that birds fly, that people blink and breathe and eat food and sleep. But an iPhone doesn\u2019t know that if you push it off the desk, it will break.\u201d<\/p>\n<p>In the near term, Van Durme says, computers will become \u201cvery good at recognizing very explicit facts. 
But the ability to infer things that have never been described, that will take possibly 10 years, when we have these devices that follow all people around all of the time.\u201d<\/p>\n<p>The issue is the relative pokiness of the modern-day computer; according to Yarowsky, a machine needs to input 10 to 100 times more data to learn a new word or phrase than does a human child.<\/p>\n<p>Says Van Durme, \u201cAs we have video cameras on every building, on every pair of glasses, on drones that are constantly over our heads\u2014this totally pervasive video, feeding enormous amounts of data\u2014then the computer will know.\u201d<\/p>\n<p>Siri and Alexa already can proclaim that it was pitcher Scott McGregor on the mound for the Baltimore O\u2019s when they won the Series in \u201983. But it is what they do not yet know\u2014the human emotions that they cannot sense or feel\u2014that is the inspiration for the work of CLSP\u2019s <a href=\"http:\/\/engineering.jhu.edu\/ece\/faculty\/najim-dehak\/\" target=\"_blank\" rel=\"noopener noreferrer\">Najim Dehak<\/a>, a son of war-torn Algeria who is writing the code for what he calls \u201cdepression detection from speech.\u201d<\/p>\n<p>With so much of our lives now played out on social media, Dehak envisions a time when the content, keywords, and frequency of our tweets\u2014or the tone of our voice when we order from Alexa\u2014could serve as a trigger for professional intervention.<\/p>\n<p>\u201cIf there is a soldier with PTSD,\u201d says Dehak, an assistant professor of electrical and computer engineering, &#8220;we could create a system that can call him every day to see how he is feeling without having to wait for his next appointment.\u201d<\/p>\n<p>Dozens of machine learning specialists from around the world spent six weeks last summer at the <a href=\"http:\/\/www.clsp.jhu.edu\/workshops\/\" target=\"_blank\" rel=\"noopener noreferrer\">CLSP\u2019s famed summer workshop<\/a>, started in 1997 and now called the 
Frederick Jelinek Memorial Workshop, discussing such hypotheses as the possibility of deducing suicidal intent from the frequency and keywords of a person\u2019s tweets. This challenging topic\u2014whether \u201cclinically meaningful information can be derived from social media language\u201d\u2014echoes Dehak\u2019s work on emotion recognition at the dawn of the age of the social robot. Computers, he says, soon will learn to intuit sadness by contrasting it with billions of inputs from people who are happy.<\/p>\n<p>\u201cWouldn\u2019t it be good if [Alexa or Siri] could feel your emotions and react to them? We all want and need someone to talk to.\u201d Such a system, Dehak suggests, would help people with autism as well as combat veterans. But there also is a commercial engine powering the rocket of speech recognition.<\/p>\n<p>\u201cEvery company wants to know your behavior,\u201d states Dehak. \u201cThat\u2019s where the real money is. Imagine a single system that manages your phone, your car, your house\u2014all controlled by your voice alone.\u201d<\/p>\n<p>&nbsp;<\/p>\n<h2>Ethical Conundrums<\/h2>\n<p>At the CLSP, the accomplished specialists who toil away at the melding of human and artificial intelligences understand that Radio Rex is out of his doghouse\u2014and it is too late to slide him back in. They also understand that they are building a pathway to a future of perpetual surveillance and robotic companionship\u2014a future that no one truly can foresee.<\/p>\n<p>\u201cIn about five years, we will have the capacity to capture every utterance,\u201d predicts Van Durme. \u201cAs a scientist, yes, I would like access to that data and what it will enable. As an individual, it\u2019s very concerning. But there\u2019s no way around it.\u201d<\/p>\n<p>\u201cI don\u2019t think I want a computer to hear everything I say,\u201d Dan Povey flatly states. \u201cI\u2019d want the ability to turn it off. Or maybe we could have an erase button. 
We are simply providing the technology to make it possible to be recorded all the time. It\u2019s up to others to decide whether or not to use it.\u201d<\/p>\n<p>\u201cWe\u2019re not doomed by this,\u201d Khudanpur says. \u201cI don\u2019t worry about it. We will put in mechanisms. When we built nuclear weapons, we could all have been annihilated, but we\u2019re still here.\u201d<\/p>\n<p>Will our machines know if we are lying?<\/p>\n<p>\u201cThat\u2019s a tough one,\u201d he says. \u201cA well-thought-out lie, I don\u2019t think so.\u201d<\/p>\n<p><i>Will they be able to recognize not only what we are saying, but what we are thinking?<\/i> \u201cNo,\u201d Khudanpur replies. \u201cWhat&#8217;s in your mind is only in your mind. So far.\u201d<br \/>\n<!--nextpage--><\/p>\n<h3>Glossary<\/h3>\n<h4>Artificial Intelligence:<\/h4>\n<p>The theory and development of computer systems able to perform tasks normally requiring human intelligence, such as visual perception, speech recognition, decision-making, translation between languages, and winning on Jeopardy!, as IBM\u2019s Watson did in 2011. Or reminding you that today is your wife\u2019s birthday.<\/p>\n<h4>Machine Translation:<\/h4>\n<p>The process by which computer software is used to convert text or speech in one language into another language while preserving the original meaning, such as the \u201cclick to translate\u201d function on Facebook that can tell you what the \u201cGangnam style\u201d lyrics really mean.<\/p>\n<h4>Deep Learning\/Deep Neural Networks:<\/h4>\n<p>Deep learning software attempts to mimic the signal transformation that takes place in the deepest layers of neurons in the human brain. 
Instead of a single train of connected \u201cneurons,\u201d deep learning supercomputers are programmed with several wide but independent strata of processing units that can comb through massive amounts of data, making inferences about meaning, visual clues, and speech patterns that are then passed to the next higher level, and so on.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Think today&#8217;s computers are smart? Just look at what&#8217;s coming. Meet a multinational bullpen of computer scientists who are rapidly bridging the divide between humans and machines.<\/p>\n","protected":false},"author":4,"featured_media":5488,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[28],"tags":[1048,119,1040,131,148,222,223,225,226,996,1000,1004,1008,1012,1016,1020,1024,1028,1032,1036,1044],"class_list":["post-5052","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-features","tag-deep-neural-networks","tag-department-of-computer-science","tag-artificial-intelligence","tag-department-of-electrical-and-computer-engineering","tag-benjamin-van-durme","tag-sanjeev-khudanpur","tag-jason-eisner","tag-clsp","tag-center-for-language-and-speech-processing","tag-david-yarowsky","tag-speech-recognition","tag-machine-learning","tag-hynek-hermansky","tag-frederick-jelinek","tag-philipp-koehn","tag-machine-translation","tag-jan-trmal","tag-kaldi","tag-daniel-povey","tag-najim-dehak","tag-deep-learning","issue-winter-2017"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v28.0 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Say What? 
- JHU Engineering Magazine<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/engineering.jhu.edu\/magazine-archive\/2017\/01\/say-what\/\" \/>\n<link rel=\"next\" href=\"https:\/\/engineering.jhu.edu\/magazine-archive\/2017\/01\/say-what\/2\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Say What? - JHU Engineering Magazine\" \/>\n<meta property=\"og:description\" content=\"Think today&#039;s computers are smart? Just look at what&#039;s coming. Meet a multinational bullpen of computer scientists who are rapidly bridging the divide between humans and machines.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/engineering.jhu.edu\/magazine-archive\/2017\/01\/say-what\/\" \/>\n<meta property=\"og:site_name\" content=\"JHU Engineering Magazine\" \/>\n<meta property=\"article:published_time\" content=\"2017-01-10T15:46:42+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2020-02-14T21:16:18+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_THUMB.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"300\" \/>\n\t<meta property=\"og:image:height\" content=\"200\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Abby Lattes\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Abby Lattes\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"15 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"NewsArticle\",\"@id\":\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/2017\\\/01\\\/say-what\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/2017\\\/01\\\/say-what\\\/\"},\"author\":{\"name\":\"Abby Lattes\",\"@id\":\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/#\\\/schema\\\/person\\\/0244393be370fbc3ead8ec26062e9742\"},\"headline\":\"Say What?\",\"datePublished\":\"2017-01-10T15:46:42+00:00\",\"dateModified\":\"2020-02-14T21:16:18+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/2017\\\/01\\\/say-what\\\/\"},\"wordCount\":3050,\"commentCount\":0,\"image\":{\"@id\":\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/2017\\\/01\\\/say-what\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/wp-content\\\/uploads\\\/2017\\\/01\\\/Say-What_THUMB.jpg\",\"keywords\":[\"Deep Neural Networks\",\"Department of Computer Science\",\"Artificial Intelligence\",\"Department of Electrical and Computer Engineering\",\"Benjamin Van Durme\",\"Sanjeev Khudanpur\",\"Jason Eisner\",\"CLSP\",\"Center for Language and Speech Processing\",\"David Yarowsky\",\"Speech Recognition\",\"Machine Learning\",\"Hynek Hermansky\",\"Frederick Jelinek\",\"Philipp Koehn\",\"Machine Translation\",\"Jan Trmal\",\"Kaldi\",\"Daniel Povey\",\"Najim Dehak\",\"Deep 
Learning\"],\"articleSection\":[\"Features\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/2017\\\/01\\\/say-what\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/2017\\\/01\\\/say-what\\\/\",\"url\":\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/2017\\\/01\\\/say-what\\\/\",\"name\":\"Say What? - JHU Engineering Magazine\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/2017\\\/01\\\/say-what\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/2017\\\/01\\\/say-what\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/wp-content\\\/uploads\\\/2017\\\/01\\\/Say-What_THUMB.jpg\",\"datePublished\":\"2017-01-10T15:46:42+00:00\",\"dateModified\":\"2020-02-14T21:16:18+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/#\\\/schema\\\/person\\\/0244393be370fbc3ead8ec26062e9742\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/2017\\\/01\\\/say-what\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/2017\\\/01\\\/say-what\\\/#primaryimage\",\"url\":\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/wp-content\\\/uploads\\\/2017\\\/01\\\/Say-What_THUMB.jpg\",\"contentUrl\":\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/wp-content\\\/uploads\\\/2017\\\/01\\\/Say-What_THUMB.jpg\",\"width\":300,\"height\":200},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/#website\",\"url\":\"https:\\\/\\\/engineeri
ng.jhu.edu\\\/magazine-archive\\\/\",\"name\":\"JHU Engineering Magazine\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/engineering.jhu.edu\\\/magazine-archive\\\/#\\\/schema\\\/person\\\/0244393be370fbc3ead8ec26062e9742\",\"name\":\"Abby Lattes\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/c56cb7af5427f847aa288542444ba9ff3d2107bf85dc6c6d44a4d1315608258d?s=96&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/c56cb7af5427f847aa288542444ba9ff3d2107bf85dc6c6d44a4d1315608258d?s=96&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/c56cb7af5427f847aa288542444ba9ff3d2107bf85dc6c6d44a4d1315608258d?s=96&r=g\",\"caption\":\"Abby Lattes\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Say What? - JHU Engineering Magazine","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/engineering.jhu.edu\/magazine-archive\/2017\/01\/say-what\/","next":"https:\/\/engineering.jhu.edu\/magazine-archive\/2017\/01\/say-what\/2\/","og_locale":"en_US","og_type":"article","og_title":"Say What? - JHU Engineering Magazine","og_description":"Think today's computers are smart? Just look at what's coming. 
Meet a multinational bullpen of computer scientists who are rapidly bridging the divide between humans and machines.","og_url":"https:\/\/engineering.jhu.edu\/magazine-archive\/2017\/01\/say-what\/","og_site_name":"JHU Engineering Magazine","article_published_time":"2017-01-10T15:46:42+00:00","article_modified_time":"2020-02-14T21:16:18+00:00","og_image":[{"width":300,"height":200,"url":"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_THUMB.jpg","type":"image\/jpeg"}],"author":"Abby Lattes","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Abby Lattes","Est. reading time":"15 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"NewsArticle","@id":"https:\/\/engineering.jhu.edu\/magazine-archive\/2017\/01\/say-what\/#article","isPartOf":{"@id":"https:\/\/engineering.jhu.edu\/magazine-archive\/2017\/01\/say-what\/"},"author":{"name":"Abby Lattes","@id":"https:\/\/engineering.jhu.edu\/magazine-archive\/#\/schema\/person\/0244393be370fbc3ead8ec26062e9742"},"headline":"Say What?","datePublished":"2017-01-10T15:46:42+00:00","dateModified":"2020-02-14T21:16:18+00:00","mainEntityOfPage":{"@id":"https:\/\/engineering.jhu.edu\/magazine-archive\/2017\/01\/say-what\/"},"wordCount":3050,"commentCount":0,"image":{"@id":"https:\/\/engineering.jhu.edu\/magazine-archive\/2017\/01\/say-what\/#primaryimage"},"thumbnailUrl":"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_THUMB.jpg","keywords":["Deep Neural Networks","Department of Computer Science","Artificial Intelligence","Department of Electrical and Computer Engineering","Benjamin Van Durme","Sanjeev Khudanpur","Jason Eisner","CLSP","Center for Language and Speech Processing","David Yarowsky","Speech Recognition","Machine Learning","Hynek Hermansky","Frederick Jelinek","Philipp Koehn","Machine Translation","Jan Trmal","Kaldi","Daniel Povey","Najim Dehak","Deep 
Learning"],"articleSection":["Features"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/engineering.jhu.edu\/magazine-archive\/2017\/01\/say-what\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/engineering.jhu.edu\/magazine-archive\/2017\/01\/say-what\/","url":"https:\/\/engineering.jhu.edu\/magazine-archive\/2017\/01\/say-what\/","name":"Say What? - JHU Engineering Magazine","isPartOf":{"@id":"https:\/\/engineering.jhu.edu\/magazine-archive\/#website"},"primaryImageOfPage":{"@id":"https:\/\/engineering.jhu.edu\/magazine-archive\/2017\/01\/say-what\/#primaryimage"},"image":{"@id":"https:\/\/engineering.jhu.edu\/magazine-archive\/2017\/01\/say-what\/#primaryimage"},"thumbnailUrl":"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_THUMB.jpg","datePublished":"2017-01-10T15:46:42+00:00","dateModified":"2020-02-14T21:16:18+00:00","author":{"@id":"https:\/\/engineering.jhu.edu\/magazine-archive\/#\/schema\/person\/0244393be370fbc3ead8ec26062e9742"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/engineering.jhu.edu\/magazine-archive\/2017\/01\/say-what\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/engineering.jhu.edu\/magazine-archive\/2017\/01\/say-what\/#primaryimage","url":"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_THUMB.jpg","contentUrl":"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-content\/uploads\/2017\/01\/Say-What_THUMB.jpg","width":300,"height":200},{"@type":"WebSite","@id":"https:\/\/engineering.jhu.edu\/magazine-archive\/#website","url":"https:\/\/engineering.jhu.edu\/magazine-archive\/","name":"JHU Engineering 
Magazine","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/engineering.jhu.edu\/magazine-archive\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/engineering.jhu.edu\/magazine-archive\/#\/schema\/person\/0244393be370fbc3ead8ec26062e9742","name":"Abby Lattes","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/c56cb7af5427f847aa288542444ba9ff3d2107bf85dc6c6d44a4d1315608258d?s=96&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/c56cb7af5427f847aa288542444ba9ff3d2107bf85dc6c6d44a4d1315608258d?s=96&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/c56cb7af5427f847aa288542444ba9ff3d2107bf85dc6c6d44a4d1315608258d?s=96&r=g","caption":"Abby Lattes"}}]}},"_links":{"self":[{"href":"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-json\/wp\/v2\/posts\/5052","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-json\/wp\/v2\/comments?post=5052"}],"version-history":[{"count":16,"href":"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-json\/wp\/v2\/posts\/5052\/revisions"}],"predecessor-version":[{"id":13608,"href":"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-json\/wp\/v2\/posts\/5052\/revisions\/13608"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-json\/wp\/v2\/media\/5488"}],"wp:attachment":[{"href":"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-json\/wp\/v2\/media?pa
rent=5052"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-json\/wp\/v2\/categories?post=5052"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/engineering.jhu.edu\/magazine-archive\/wp-json\/wp\/v2\/tags?post=5052"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}