The Text to Speech service understands text and natural language to generate synthesized audio output complete with appropriate cadence and intonation. It is available in 27 voices (13 neural and 14 standard) across 7 languages. Select voices now offer Expressive Synthesis and Voice Transformation features.
This system is for demonstration purposes only and is not intended to process Personal Data. Do not enter Personal Data into this system, as it may not have the controls in place needed to meet the requirements of the General Data Protection Regulation (EU) 2016/679.
The text language must match the language of the selected voice: mixing languages (for example, English text with a Spanish voice) does not produce valid results. The synthesized audio is streamed to the client as it is produced, using HTTP chunked transfer encoding. The audio is returned in MP3 format, which can be played with applications such as VLC and Audacity.
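Because the audio arrives in pieces rather than as one complete file, a client can start writing or playing it before synthesis finishes. The sketch below illustrates that idea using only the Python standard library. It is not the service's own client: the function names and the URL passed to `download_audio` are hypothetical, and authentication and request parameters are omitted. It relies on `urllib.request` decoding chunked responses transparently, so the code simply reads blocks until the stream ends.

```python
import urllib.request
from typing import BinaryIO, Iterator

CHUNK_SIZE = 4096  # bytes per read; an arbitrary choice for this sketch


def iter_chunks(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield successive blocks from a binary stream until it is exhausted."""
    while True:
        block = stream.read(chunk_size)
        if not block:  # an empty read signals the end of the stream
            break
        yield block


def save_stream(stream: BinaryIO, sink: BinaryIO, chunk_size: int = CHUNK_SIZE) -> int:
    """Copy a stream to a sink block by block; return the number of bytes written."""
    total = 0
    for block in iter_chunks(stream, chunk_size):
        sink.write(block)
        total += len(block)
    return total


def download_audio(url: str, path: str) -> int:
    """Save streamed audio from a (hypothetical) synthesis URL to a local file."""
    # The response object behaves like a binary file; chunked transfer
    # encoding is decoded by the HTTP layer before read() returns data.
    with urllib.request.urlopen(url) as response, open(path, "wb") as out:
        return save_stream(response, out)
```

A call such as `download_audio(synthesis_url, "speech.mp3")`, where `synthesis_url` stands in for whatever endpoint the client actually uses, would leave a file that can be opened in VLC or Audacity. Reading in fixed-size blocks keeps memory use constant regardless of how long the synthesized audio is.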
For the most natural-sounding output, select one of the neural voices (V3, enhanced DNN) in the list below. See the Watson TTS blog for more information.