Intelligence: Machines That Talk Back

August 3, 2010: The U.S. Department of Defense is evaluating three handheld electronic translators that use only spoken input and output. That is, an American soldier speaks what he wants an Afghan to understand into the device and pushes a button, and out it comes, translated into the language the Afghan speaks (Pashto or Dari) and pronounced so the Afghan can understand it. The soldier then holds the device up to the Afghan, who answers in his own language. Another button is pressed, and the soldier hears what the Afghan said, translated into English.

This sort of technology is nothing new. Voice recognition software (which takes what you say and types it out on the computer) has made great strides over the last two decades, not just for people who don't want to type, but also for people seeking information over the phone. Many businesses now use these robotic voice recognition systems to handle telephone inquiries. These systems are reliable, if annoying, and it was a small step from that to a system that also translates. There are already laptop-based systems that take phrases spoken in a foreign language and translate them into English that is displayed on the laptop screen. The challenge has been getting this into a portable device, so that soldiers at a checkpoint or walking through a village can immediately converse with the locals.

For example, four years ago, the U.S. Army began using a language-translation device called Mastor (Multilingual Automatic Speech-to-Speech Translator). Mastor was translation software that, unlike earlier products, did not require the user to first speak a long list of words and phrases into a microphone so that the software could learn to understand that particular voice. The Mastor software understood anyone (well, almost anyone) immediately. It could run on most laptops, with the addition of a good microphone.

Language translation devices have been available since the beginning of the Iraq war. Most were handheld, smartphone-sized systems that held a collection of commonly used phrases. The user selected the English version of a phrase, and the device would speak it out loud in, say, Arabic. It was crude, but it was useful, and the troops liked it. However, a human translator was much preferred, as you could only do so much with a list of words and phrases. This made Mastor appear to be the future of battlefield machine translation. Mastor was basically a robot translator in the form of a laptop computer. An English or Arabic speaker talks to it, is understood, and has the speech translated. In addition to the synthetic speech, the conversation is also stored as text, which makes it even more useful for official business. The Mastor translation was crude compared to that of a human translator, but serviceable. Mastor was used in hospitals and other places where Americans and Iraqis (and Afghans) needed to speak with each other. There are never enough translators to go around, and Mastor took up some of the slack.

An example of how far this speech recognition software has come can be seen in a recent IBM system called Watson, which can be programmed to play the TV quiz game Jeopardy! and consistently win against the best human players. Watson uses the IBM Blue Gene supercomputer system to accomplish this. A key element of Watson is its ability to decipher language and determine what is being said.