I need to use SSML to play an audio file with the <audio> tag in my Alexa Skill (as per Amazon's instructions).

The problem is that I don't know how to use SSML with Python. I know I can use it with Java, but I want to build my skills with Python. I've looked all over but haven't found any working examples of SSML in a Python script or program. Does anyone know of one?

SamYoungNY

6 Answers

This was asked two years ago but maybe someone will benefit from the below.

I've just checked, and if you use the Alexa Skills Kit SDK for Python you can simply add SSML to your response, for example:

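from ask_sdk_core.skill_builder import SkillBuilder
from ask_sdk_core.utils import is_request_type

sb = SkillBuilder()

@sb.request_handler(can_handle_func=is_request_type("LaunchRequest"))
def launch_request_handler(handler_input):
    # Single quotes outside so the SSML attribute can keep its double quotes.
    speech_text = 'Wait for it 3 seconds<break time="3s"/> Buuuu!'
    # speak() wraps the text in <speak> tags for you.
    return handler_input.response_builder.speak(speech_text).response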

Hope this helps.

wmatt

SSML audio resides in the response.outputSpeech.ssml attribute. Here is an example object with the other required parameters removed:

{
  "response": {
    "outputSpeech": {
      "type": "SSML",
      "ssml": "<speak>Welcome to Car-Fu. <audio src=\"https://carfu.com/audio/carfu-welcome.mp3\" /> You can order a ride, or request a fare estimate. Which will it be?</speak>"
    }
  }
}
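
If you are not using the SDK and build the response yourself (for example in a plain Lambda handler), the same structure can be sketched as a Python dict. This is only a minimal sketch: `build_ssml_response` is a hypothetical helper name, and the `version` and `shouldEndSession` fields are filled in here as assumptions about the "other required parameters" that the JSON above leaves out.

```python
import json


def build_ssml_response(ssml, end_session=True):
    """Return an Alexa-style response dict whose speech is SSML (hypothetical helper)."""
    return {
        "version": "1.0",
        "response": {
            # "type" must be "SSML" and the markup goes under "ssml", not "text".
            "outputSpeech": {"type": "SSML", "ssml": ssml},
            "shouldEndSession": end_session,
        },
    }


# Single quotes outside let the <audio> attribute keep its double quotes.
ssml = (
    '<speak>Welcome to Car-Fu. '
    '<audio src="https://carfu.com/audio/carfu-welcome.mp3" /> '
    'You can order a ride, or request a fare estimate. Which will it be?</speak>'
)

response = build_ssml_response(ssml, end_session=False)
print(json.dumps(response, indent=2))
```

json.dumps takes care of escaping the inner double quotes, so you don't have to write the backslashes by hand.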

Further reference:

BMW

These answers really helped me figure out how to make SSML work with ask-sdk-python. Instead of

speech_text = "Wait for it 3 seconds<break time="3s"/> Buuuu!" (from wmatt's answer)

I defined variables that represent the start and end of every tag that I'm using:

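# Tag values other than ssml_start are assumed; the original post did not show them.
ssml_start, ssml_end = '<speak>', '</speak>'
whispered_s, whispered_e = '<amazon:effect name="whispered">', '</amazon:effect>'
speech_text = ssml_start + whispered_s + "Here are the latest alerts from MMDA" + whispered_e + ssml_end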

I wrote the tags with single quotes and concatenated those strings into the speech output, and it worked! Thanks a lot, guys, I really appreciate it!
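
If the start/end variables get tedious, the same idea can be wrapped in small functions. This is just a sketch using the standard library; `whisper` and `speak` are hypothetical helper names (not part of ask-sdk-python), and it assumes the Alexa `amazon:effect` whisper tag is what you want.

```python
from xml.sax.saxutils import escape


def whisper(text):
    """Wrap plain text in Alexa's whisper effect, escaping &, < and >."""
    return '<amazon:effect name="whispered">' + escape(text) + '</amazon:effect>'


def speak(*parts):
    """Join already-built SSML fragments inside a single <speak> element."""
    return '<speak>' + ''.join(parts) + '</speak>'


speech_text = speak(whisper("Here are the latest alerts from MMDA"))
print(speech_text)
```

Escaping the text matters once it comes from a slot value or an external feed, because a stray & or < makes the SSML invalid.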

Len

Install ssml-builder with "pip install ssml-builder", then use it:

from ssml_builder.core import Speech

speech = Speech()
speech.add_text('sample text')
ssml = speech.speak()
print(ssml)

A pyssml package for Python exists.

You can install it with pip:

    $ pip install pyssml
    or
    $ pip3 install pyssml

An example is at the link below (sorry, it is in Korean):

http://blog.naver.com/chandong83/221145083125



    # -*- coding: utf-8 -*-
    # for amazon
    import time
    from boto3 import client
    from botocore.exceptions import BotoCoreError, ClientError
    import vlc
    from pyssml.PySSML import PySSML
    # AmazonSpeech is used below; this module path is assumed from the pyssml package layout
    from pyssml.AmazonSpeech import AmazonSpeech


    # Amazon Polly service function
    # if isSSML is True, the text is sent as SSML
    # otherwise it is sent as plain text
    def aws_polly(text, isSSML = False):
        voiceid = 'Joanna'

        try:
            polly = client("polly", region_name="ap-northeast-2")

            if isSSML:
                textType = 'ssml'
            else:
                textType = 'text'

            response = polly.synthesize_speech(
                    TextType=textType,
                    Text=text,
                    OutputFormat="mp3",
                    VoiceId=voiceid)

            # get Audio Stream (mp3 format)
            stream = response.get("AudioStream")

            # save the audio Stream File
            with open('aws_test_tts.mp3', 'wb') as f:
                data = stream.read()
                f.write(data)


            # VLC play audio
            # non block
            p = vlc.MediaPlayer('./aws_test_tts.mp3')
            p.play()

        except ( BotoCoreError, ClientError) as err:
            print(str(err))


    if __name__ == '__main__':
        # normal pyssml
        #s = PySSML()

        # amazon speech ssml
        s = AmazonSpeech()

        # normal 
        s.say('i am normal')

        #  speed is very slow
        s.prosody({'rate':"x-slow"}, 'i am very slow')

        #  volume is very loud
        s.prosody({'volume':'x-loud'}, 'my voice is very loud')

        #  pause for one second
        s.pause('1s')

        #  pitch is very high
        s.prosody({'pitch':'x-high'}, 'my tone is very high')

        # Amazon-specific: whisper
        s.whisper('i am whispering')
        # print to convert to ssml format
        print(s.ssml())

        # request aws polly and play
        aws_polly(s.ssml(), True)

        # Wait while playback.
        time.sleep(50)


chandong83

This question was somewhat vague; however, I did manage to figure out how to incorporate SSML into a Python script. Here's a snippet that plays some audio:

  if 'Item' in intent['slots']:
    chosen_item = intent['slots']['Item']['value']
    session_attributes = create_attributes(chosen_item)

    # Spaces around chosen_item keep the words from running together.
    speech_output = '<speak> Here is something to play ' + \
    chosen_item + \
    ' <audio src="https://s3.amazonaws.com/example/example.mp3" /> </speak>'
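
One thing to watch with the snippet above: a slot value containing & or < would break the SSML. Here is a minimal sketch of the same string built with escaping, using only the standard library. `audio_speech` is a hypothetical helper name, and the URL is just the placeholder from the snippet.

```python
from xml.sax.saxutils import escape, quoteattr


def audio_speech(chosen_item, audio_url):
    """Build SSML that names the item and then plays an audio clip."""
    return (
        "<speak>Here is something to play: "
        + escape(chosen_item)               # escapes &, < and > in the slot value
        + " <audio src=" + quoteattr(audio_url) + "/>"  # quoteattr adds the quotes
        + "</speak>"
    )


speech_output = audio_speech(
    "rock & roll", "https://s3.amazonaws.com/example/example.mp3"
)
print(speech_output)
```

The result can be used as the `ssml` value of an `outputSpeech` object whose `type` is `SSML`.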
SamYoungNY
User BMW has pointed out the correct answer. When you set the `type` param of the `outputSpeech` JSON object to `SSML` and use `ssml` instead of `text`, you can use SSML tags (as documented in the [Speech Synthesis Markup Language (SSML) Reference](https://developer.amazon.com/public/solutions/alexa/alexa-skills-kit/docs/speech-synthesis-markup-language-ssml-reference)). – abaumg Jan 29 '18 at 14:50