Different SSML values generate the same audio in Google Text-to-Speech

Question

I am unable to generate different audio when varying the SSML values with WaveNet voices.

<prosody rate="slow" pitch="-2st">Can you hear me now?</prosody>
<prosody rate="medium" pitch="1st">Can you hear me now?</prosody>
<prosody rate="high" pitch="5st">Can you hear me now?</prosody>

Using the emphasis tag produces the same results.

We are using the Python API from Google Cloud Text-to-Speech to request audio generation.

I would like to hear different voice intensities in each sample.

Please note, we also tried escaping the " characters, but it makes no difference in the generated audio.

https://issuetracker.google.com/issues/131618213

score 0 · Answer 1 · answered Aug 24 '20 at 19:10

I don't know what that looks like with the Python SDK, but I'm currently using their Node.js SDK for TTS.

It seems that these prosody properties (rate, volume, pitch) should be configured directly in the request object that is sent to the Google TTS API, rather than set through your SSML text.
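As a rough sketch of what that could look like from Python, the snippet below builds one request body per variant in the shape the REST `text:synthesize` method accepts, with the rate and pitch placed in `audioConfig` instead of a `<prosody>` tag. The helper name `build_synthesize_request`, the voice `en-US-Wavenet-D`, and the numeric rates chosen to stand in for "slow", "medium" and a faster setting are my own assumptions for illustration, not something taken from the question. If I recall correctly, the Python client exposes the same settings as `speaking_rate`, `pitch` and `volume_gain_db` on `texttospeech.AudioConfig`, but I have not tested that here.

```python
import json


def build_synthesize_request(text, speaking_rate, pitch_semitones,
                             voice_name="en-US-Wavenet-D"):
    """Build a request body for the REST text:synthesize method.

    Hypothetical helper: prosody is set in audioConfig rather than in SSML.
    """
    return {
        "input": {"text": text},
        "voice": {"languageCode": "en-US", "name": voice_name},
        "audioConfig": {
            "audioEncoding": "LINEAR16",
            "speakingRate": speaking_rate,  # 1.0 is the normal speed
            "pitch": pitch_semitones,       # offset in semitones
        },
    }


# Assumed stand-ins for the three prosody variants in the question.
variants = [
    build_synthesize_request("Can you hear me now?", 0.75, -2.0),
    build_synthesize_request("Can you hear me now?", 1.0, 1.0),
    build_synthesize_request("Can you hear me now?", 1.25, 5.0),
]

for body in variants:
    # Each body would be sent as the JSON payload of its own request.
    print(json.dumps(body["audioConfig"]))
```

Each body differs only in `audioConfig`, so if the three resulting files still sound identical, the problem is probably somewhere other than the SSML.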
