UIJRT » United International Journal for Research & Technology

Review Paper for TTS Algorithm

Gauri S. Nandkhedkar, Prajakta A. Ghumatkar, Vinayak Kabra and Eesha Bhayya

Nandkhedkar, G.S., Ghumatkar, P.A., Kabra, V. and Bhayya, E., 2021. Review Paper for TTS Algorithm. United International Journal for Research & Technology (UIJRT), 2(7), pp.197-200.

Abstract

Text-to-speech (TTS) has evolved through many algorithms, from conventional concatenative synthesis to Google's Tacotron and its successor, Tacotron 2. This survey paper discusses the architecture and outcomes of three recent TTS models, WaveNet, Tacotron 1, and Tacotron 2, and compares them on their respective mean opinion scores (MOS). The MOS values for these models are 4.21, 3.82, and 4.58, respectively. The paper concludes that Tacotron 2 has the highest MOS value of the three and is accordingly widely used across text-to-speech applications.
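The comparison above can be sketched in a few lines of Python. This is only an illustrative sketch: the three MOS values are the ones quoted in the abstract, while the helper `mean_opinion_score` and the name `reported_mos` are hypothetical and simply assume the usual definition of MOS as the arithmetic mean of listener ratings.

```python
# MOS values exactly as quoted in the abstract.
reported_mos = {
    "WaveNet": 4.21,
    "Tacotron 1": 3.82,
    "Tacotron 2": 4.58,
}


def mean_opinion_score(ratings):
    """Return the arithmetic mean of a non-empty list of listener ratings.

    Hypothetical helper: assumes MOS is the plain average of the ratings.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    return sum(ratings) / len(ratings)


# Rank the models from highest to lowest reported MOS.
ranked = sorted(reported_mos.items(), key=lambda item: item[1], reverse=True)
best_model, best_mos = ranked[0]

for name, mos in ranked:
    print(f"{name}: {mos:.2f}")
```

Under these assumptions, the ranking places Tacotron 2 first, WaveNet second, and Tacotron 1 third, matching the paper's conclusion.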

Keywords: Text-to-speech, Deep learning, WaveNet, Tacotron.

References

  1. A. van den Oord, S. Dieleman, H. Zen, K. Simonyan, O. Vinyals, A. Graves, N. Kalchbrenner, A. Senior, and K. Kavukcuoglu, "WaveNet: A generative model for raw audio", arXiv:1609.03499, 2016.
  2. Y. Wang, R. Skerry-Ryan, D. Stanton, Y. Wu, R. J. Weiss, N. Jaitly, et al., "Tacotron: Towards end-to-end speech synthesis", Proc. Interspeech, pp. 4006-4010, Aug. 2017.
  3. J. Shen, R. Pang, R. J. Weiss, M. Schuster, N. Jaitly, Z. Yang, Z. Chen, Y. Zhang, Y. Wang, R. Skerry-Ryan, et al., "Natural TTS synthesis by conditioning WaveNet on mel spectrogram predictions", 2017.
