There has been a good deal of research in Japan on machine speech-to-speech translation (S2ST), and this article surveys that work together with our own recent research on automatic simultaneous speech translation. An S2ST system is basically composed of three modules: large-vocabulary continuous automatic speech recognition (ASR), machine text-to-text translation (MT), and text-to-speech synthesis (TTS). All of these modules need to be multilingual and thus require multilingual speech and text corpora for model training. S2ST performance has been drastically improved by deep learning and large training corpora, but many issues still remain, such as simultaneity, paralinguistics, context and situation dependency, intention, and cultural dependency. This article presents current on-going research and discusses these issues with a view to next-generation speech-to-speech translation.
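The cascade architecture described above can be sketched as three composed stages. The function names and toy implementations below are illustrative placeholders only, not the authors' system; a real pipeline would plug in trained ASR, MT, and TTS models at each stage.

```python
# Minimal sketch of a cascade S2ST pipeline (ASR -> MT -> TTS).
# All three stages are hypothetical stubs standing in for trained models.

def asr(audio: bytes) -> str:
    # A real module would run large-vocabulary continuous speech recognition
    # over the input waveform; here we pretend it recognized one word.
    return "konnichiwa"

def mt(text: str) -> str:
    # A real module would be a trained machine translation model;
    # this toy lexicon just maps one Japanese word to English.
    toy_lexicon = {"konnichiwa": "hello"}
    return " ".join(toy_lexicon.get(w, w) for w in text.split())

def tts(text: str) -> bytes:
    # A real module would synthesize a waveform from the translated text;
    # here we simply tag the text to mark the synthesis step.
    return f"<audio:{text}>".encode()

def speech_to_speech(audio: bytes) -> bytes:
    # The cascade: recognize, translate, then synthesize.
    return tts(mt(asr(audio)))

print(speech_to_speech(b"..."))  # prints b'<audio:hello>'
```

Because each stage runs to completion before the next begins, a plain cascade like this cannot translate simultaneously with the speaker, which is one of the open issues the article discusses.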
CITATION STYLE
Nakamura, S., Sudoh, K., & Sakti, S. (2019). Towards machine speech-to-speech translation. Revista Tradumatica, (17), 81–87. https://doi.org/10.5565/rev/tradumatica.238