Google model instantly translates speech to foreign text using AI
Tue 4 Apr 2017
A deep neural network architecture which can directly translate speech from one language into text in another language is being developed by Google researchers.
The study, titled Sequence-to-Sequence Models Can Directly Transcribe Foreign Speech, describes using a modified sequence-to-sequence model, which has had previous success in speech recognition, to create a powerful encoder-decoder network for machine translation.
The paper explains that the new model does not explicitly transcribe the speech into text in the source language, nor does it require supervision from the source language transcription during training.
In testing, the research team reported ‘state-of-the-art performance’ on conversational Spanish-to-English speech translation tasks. The experiments used the Fisher Callhome Spanish-English dataset and found that the proposed model could outperform cascades of speech recognition and machine translation technologies.
Measured with BLEU (bilingual evaluation understudy), a metric that scores machine-translated text by its overlap with reference translations, the proposed system scored 1.8 points higher than other translation models.
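For readers unfamiliar with the metric, the idea behind a BLEU score can be sketched in a few lines of Python. This is a simplified, single-sentence, single-reference version without smoothing, not the evaluation script used in the study; the function names and the example sentences are illustrative assumptions.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Simplified BLEU on a 0-100 scale (one reference, no smoothing)."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clipped matches: an n-gram is credited at most as often
        # as it appears in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, one empty order zeroes the score
        log_precisions.append(math.log(overlap / total))

    # Brevity penalty: discourage candidates shorter than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return 100 * brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat"
print(sentence_bleu("the cat sat on the mat", reference))
print(sentence_bleu("the cat sat on the rug", reference))
```

An exact match scores 100, and each wrong word lowers several n-gram precisions at once. Reported BLEU figures are normally computed over a whole test set rather than a single sentence, so this sketch only gives a feel for the scale on which a gain of 1.8 points is measured.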
According to the study, using Spanish transcripts as additional supervision during training, with independent automatic speech recognition (ASR) and speech translation (ST) decoders, yielded further improvements of at least 1.4 BLEU points.
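The multi-task idea can be illustrated with a toy Python sketch: one encoder output feeds two separate decoders, one scored against the Spanish transcript (ASR) and one against the English translation (ST). The article does not say how the two objectives are combined, so the shared encoder and the equal-weight sum below are assumptions made purely for illustration, as are all function and parameter names. Each decoder here predicts a single token to keep the example short.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities (shifted for numerical stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-probability assigned to the correct token."""
    return -math.log(softmax(logits)[target_index])


def multitask_loss(encoder, asr_decoder, st_decoder,
                   audio, spanish_target, english_target):
    """Toy multi-task objective: one encoding, two decoders, summed losses."""
    hidden = encoder(audio)  # computed once and reused by both decoders
    asr_loss = cross_entropy(asr_decoder(hidden), spanish_target)
    st_loss = cross_entropy(st_decoder(hidden), english_target)
    # Assumption: equal weighting of the two tasks.
    return asr_loss + st_loss
```

Because both losses depend on the same encoder output, the Spanish transcripts influence the representation that the translation decoder relies on, even though no transcript is produced at translation time.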
In future work, the Google researchers plan to construct a multilingual speech translation system in which a single decoder is shared across multiple languages.
With big advances in deep learning, human-versus-AI competitions are springing up across the world. In February, human translators battled against AI machine translators in Seoul, South Korea. While Google Translate and Naver Papago, both based on Neural Machine Translation (NMT), won the high-profile battle in terms of speed, they fell considerably behind their human counterparts on quality. It is, however, expected that with further NMT developments, such as this latest Google research, machine translation will improve quickly and could soon reach human-level accuracy.