What is Speech-to-Speech Translation?


No longer confined to science fiction, the real-time universal translator is under active development: pioneering companies such as Google and Facebook are acquiring and developing technologies that support speech recognition, language translation, and speech synthesis. In 2006, an advance that led to the development and use of layered models of inputs, termed deep neural networks (DNNs), brought speech recognition to its highest level of accuracy yet, clearing the way for speech-to-speech translation. As a result, today’s consumers habitually interact with voice-activated virtual assistants on their mobile phones, and even in their vehicles, with greater ease and comfort. Researchers are now applying DNNs to automatic translation engines in an effort to increase the semantic accuracy of interpreting the world’s languages, and Microsoft engineers have already demonstrated software that can synthesize an individual’s own voice in another language, translating from English to Mandarin. Progress in machine learning is bringing the universal translator closer to the consumer’s hand and is poised to transform communication and collaboration at the global level.