Amazon Alexa SSML Copy and Paste Worksheet for Skills Development.
Date: December 6, 2020, V. 04c
Mark Maestas, Voices.App.
Document Source: https://voices.app/?p=285
Reference: https://developer.amazon.com/docs/custom-skills/speech-synthesis-markup-language-ssml-reference.html
Below is a list of SSML tags used to make Amazon Alexa Skills, laid out for easy copying and pasting. Just replace the * with your own text. Longer tags are broken into two lines in case you wish to define them in variables.
If you are writing code, or using code-based tools, you may also need to include the <speak> and </speak> root element SSML tags. Some skill-building tools, such as Voiceflow, already include these behind the scenes.
Voiceflow users: If you are adding SSML tags either in-line, or configuring them using variables, do not use the pull-down effects in the associated Speak or other blocks where text is configured.
The say-as tags include sample text, to be replaced, that helps identify the context. Check the reference document for additional SSML tag options, parameters, constraints, incompatible tags, and recent updates not yet documented here.
There are separate copy and paste worksheets for voice/lang SSML tags and speechcons (see the voices.app website, tools section).
==========================================
AMAZON:DOMAIN
==========================================
1. amazon:domain - conversational
// en-US, it-IT and ja-JP. Native Alexa voice
// en-US. Matthew and Joanna
__________________________________________
2. amazon:domain - long-form
// en-US only. Native Alexa voice only
__________________________________________
3. amazon:domain - music
// en-US, en-CA and en-GB. Native Alexa voice only
__________________________________________
4. amazon:domain - news
// en-US and en-AU. Native Alexa voice
// en-US. Matthew and Joanna voices.
// es-US (Spanish/American). Lupe voice
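The tags for items 1 to 4 can be sketched as follows. This is a reconstruction based on the Amazon SSML reference, not the worksheet's original lines; the name values are assumed to mirror the item headings, and * marks the text to replace.

```xml
<!-- 1. conversational -->
<amazon:domain name="conversational">*</amazon:domain>
<!-- 2. long-form -->
<amazon:domain name="long-form">*</amazon:domain>
<!-- 3. music -->
<amazon:domain name="music">*</amazon:domain>
<!-- 4. news -->
<amazon:domain name="news">*</amazon:domain>
```

Locale and voice support differs per style, as noted in the comments under each item.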
==========================================
5. amazon:effect
<amazon:effect name="whispered">*</amazon:effect>
__________________________________________
6. amazon:emotion
// en-US, en-GB and ja-JP. Native Alexa voice only.
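A minimal sketch of the amazon:emotion tag, assuming the name values (excited, disappointed) and intensity values (low, medium, high) listed in the Amazon SSML reference. The medium intensity shown here is only an illustrative choice.

```xml
<amazon:emotion name="excited" intensity="medium">*</amazon:emotion>
<amazon:emotion name="disappointed" intensity="medium">*</amazon:emotion>
```

Swap medium for low or high to soften or strengthen the effect.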
__________________________________________
7. audio (several variations depending on URL source)
// Basic tag
// Tag with MP3 extension
// Tag for a sound file on Amazon AWS S3
// Tag for a sound file on Amazon AWS CloudFront
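The four audio variations listed above might look like the sketch below. Every URL is a hypothetical placeholder (example host, bucket, and distribution names), so substitute the HTTPS address of your own file.

```xml
<!-- Basic tag: replace * with the HTTPS URL of the sound file -->
<audio src="*"/>
<!-- MP3 file on an HTTPS host (hypothetical URL) -->
<audio src="https://www.example.com/audio/sample.mp3"/>
<!-- Sound file on Amazon S3 (hypothetical bucket and key) -->
<audio src="https://s3.amazonaws.com/example-bucket/sample.mp3"/>
<!-- Sound file on Amazon CloudFront (hypothetical distribution domain) -->
<audio src="https://d1234example.cloudfront.net/sample.mp3"/>
```

Check the reference document for the file format, length, and hosting constraints that apply to audio files.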
__________________________________________
8. audio from the Amazon Sound Library
Instructions:
a. Open the Amazon Sound Library Reference Page:
https://developer.amazon.com/en-US/docs/alexa/custom-skills/ask-soundlibrary.html
b. Search for an available sound.
c. When you find a sound, click on the row. This will open up the Source code.
d. Copy / Paste the source code, which is already in an SSML tag format.
Example:
Category: Animals/Bear
Name: Bear Groan Roar (1)
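For the example above, the copied source code would look roughly like the line below. The soundbank path is an assumption based on the library's naming pattern, so copy the exact string from the reference page rather than from here.

```xml
<audio src="soundbank://soundlibrary/animals/amzn_sfx_bear_groan_roar_01"/>
```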
__________________________________________
9. break
// Default value with no specified attribute is the same as "medium."
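A short sketch of the break tag, assuming the strength and time attributes described in the Amazon SSML reference. The specific values are illustrative only.

```xml
<!-- No attribute: same as strength="medium" -->
<break/>
<!-- strength: none, x-weak, weak, medium, strong, x-strong -->
<break strength="strong"/>
<!-- time: a duration in seconds (s) or milliseconds (ms) -->
<break time="2s"/>
<break time="500ms"/>
```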
__________________________________________
10. emphasis
// Default value with no specified attribute is the same as "moderate."
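<emphasis>*</emphasis>
<emphasis level="strong">*</emphasis>
<emphasis level="moderate">*</emphasis>
<emphasis level="reduced">*</emphasis>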
_________________________________________
11. lang
Separate worksheet:
https://voices.app/?p=2054
_________________________________________
12. p (paragraph)
// Equivalent to break strength = x-strong before and after the tag.
<p>*</p>
__________________________________________
13. phoneme
// IPA
<phoneme alphabet="ipa" ph="*">
*</phoneme>
// Example:
You say, pecan.
I say, pecan.
// X-SAMPA
<phoneme alphabet="x-sampa" ph="*">
*</phoneme>
// Example:
bottle
//NOTE: Lists of supported symbols, by language, are in the Amazon SSML reference.
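The pecan and bottle examples above could be marked up as in the sketch below. The pecan transcriptions are assumed to match the ones used in the Amazon SSML reference, and the X-SAMPA rendering of bottle is an illustrative assumption. Verify both against the supported symbol lists.

```xml
<!-- IPA: two pronunciations of "pecan" -->
You say, <phoneme alphabet="ipa" ph="pɪˈkɑːn">pecan</phoneme>.
I say, <phoneme alphabet="ipa" ph="ˈpi.kæn">pecan</phoneme>.
<!-- X-SAMPA: the stress mark is a double quote, so ph is wrapped in single quotes -->
<phoneme alphabet="x-sampa" ph='"bA.t@l'>bottle</phoneme>
```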
==========================================
PROSODY
==========================================
14. prosody - rate
***********************
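A sketch of a few prosody rate variations, assuming the word-based values and the percentage form described in the Amazon SSML reference. The values shown are illustrative picks, not the worksheet's full original list.

```xml
<prosody rate="x-slow">*</prosody>
<prosody rate="slow">*</prosody>
<prosody rate="fast">*</prosody>
<!-- Percentage of the normal rate: 100% means no change -->
<prosody rate="120%">*</prosody>
```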
__________________________________________
15. prosody - pitch
// Check for incompatible tags in the reference documentation.
****************
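A sketch of a few prosody pitch variations, assuming the word-based values and the signed percentage form described in the Amazon SSML reference. The values shown are illustrative picks.

```xml
<prosody pitch="low">*</prosody>
<prosody pitch="high">*</prosody>
<!-- Relative change: +n% raises the pitch, -n% lowers it -->
<prosody pitch="+10%">*</prosody>
<prosody pitch="-10%">*</prosody>
```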
__________________________________________
16. prosody - volume
*******************
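A sketch of a few prosody volume variations, assuming the word-based values and the signed decibel form described in the Amazon SSML reference. The values shown are illustrative picks.

```xml
<prosody volume="soft">*</prosody>
<prosody volume="loud">*</prosody>
<!-- Relative change in decibels: +ndB is louder, -ndB is quieter -->
<prosody volume="+3dB">*</prosody>
<prosody volume="-6dB">*</prosody>
```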
__________________________________________
17. prosody Valid Parameter Values
Numerical parameter values are possible for all 3 prosody types. However, due to the variety of possible combinations when combining prosody attributes in the next several sections, only the word-based values are listed here. If you need more refinement, check for valid values in the SSML reference document or see the examples in the above sections.
Copy and paste the desired parameter value into the formulas in the next sections as part of configuring your voices. You can test and adjust the values using the ADC Voice & Tone Simulator.
For Pitch, check for incompatible tags in the reference documentation.
A. Pitch
x-low
low
medium
high
x-high
B. Rate
x-slow
slow
medium
fast
x-fast
C. Volume
silent
x-soft
soft
medium
loud
x-loud
__________________________________________
18. prosody - three prosody attributes (pitch, rate, volume)
*
// Example:
hello
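A possible form of the three-attribute tag, with one filled-in example. The attribute values are illustrative choices from the word-based lists in section 17, not the worksheet's original values.

```xml
<!-- Template: paste a value from section 17 into each attribute -->
<prosody pitch="" rate="" volume="">*</prosody>
<!-- Example -->
<prosody pitch="high" rate="slow" volume="loud">hello</prosody>
```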
__________________________________________
19. prosody - two attributes (pitch, rate)
*
// Example:
hello
__________________________________________
20. prosody - two attributes (pitch, volume)
*
// Example:
hello
__________________________________________
21. prosody - two attributes (rate, volume)
*
// Example:
hello
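The two-attribute combinations in items 19 to 21 could be written as in the sketch below. The attribute values are illustrative choices from the word-based lists in section 17.

```xml
<!-- 19. pitch and rate -->
<prosody pitch="low" rate="fast">hello</prosody>
<!-- 20. pitch and volume -->
<prosody pitch="high" volume="soft">hello</prosody>
<!-- 21. rate and volume -->
<prosody rate="slow" volume="loud">hello</prosody>
```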
__________________________________________
22. nested Polly voice, lang and prosody combinations
Separate worksheet:
https://voices.app/?p=2054
==========================================
23. s (sentence)
// Equivalent to break strength = strong before and after the tag or ending a sentence with a period.
<s>*</s>
==========================================
SAY-AS / INTERPRET AS
==========================================
24. say-as characters (spell-out)
<say-as interpret-as="characters">*</say-as>
// Recite each individual letter
// Example: "h-e-l-l-o"
<say-as interpret-as="characters">hello</say-as>
__________________________________________
25. say-as cardinal (number)
<say-as interpret-as="cardinal">*</say-as>
// Recite the value as a cardinal number
// Example: "Twelve thousand three hundred and forty five"
<say-as interpret-as="cardinal">12345</say-as>
__________________________________________
26. say-as ordinal (number)
<say-as interpret-as="ordinal">*</say-as>
// Recite the value as an ordinal number.
// An ordinal number is a position in a series. For example, first, second, third, etc.
// Example: "You are now third in line"
You are now <say-as interpret-as="ordinal">3</say-as> in line
// Example: "Twelve thousand three hundred and forty fifth"
<say-as interpret-as="ordinal">12345</say-as>
__________________________________________
27. say-as digits (number)
<say-as interpret-as="digits">*</say-as>
// Recite each digit of a number individually.
// Example: "one two three four five"
<say-as interpret-as="digits">12345</say-as>
__________________________________________
28. say-as fraction
<say-as interpret-as="fraction">*</say-as>
// Recite a numerical value as a fraction. Alexa can recite both common fractions and mixed fractions.
// Common fractions. Examples:
// "1/2" will be recited as "half"
// "2/3" will be recited as "two thirds"
// "11/16" will be recited as "eleven sixteenths."
<say-as interpret-as="fraction">1/2</say-as>
// Mixed fractions. Examples:
// "2+1/2" will be recited as "two and a half"
<say-as interpret-as="fraction">2+1/2</say-as>
// "2+2/3" will be recited as "two and two thirds."
<say-as interpret-as="fraction">2+2/3</say-as>
__________________________________________
29. say-as unit
<say-as interpret-as="unit">*</say-as>
// Recite the full name of an abbreviated unit value. Examples:
// "lb" and "lb." are recited as "pound"
// "lbs" and "lbs." are recited as "pounds."
<say-as interpret-as="unit">lb</say-as>
__________________________________________
30. say-as date
// This tag can configure how dates are recited based on a variety of abbreviated configurations. Review the blog post for examples.
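<say-as interpret-as="date">*</say-as>
<say-as interpret-as="date" format="mdy">*</say-as>
<say-as interpret-as="date" format="dmy">*</say-as>
<say-as interpret-as="date" format="ymd">*</say-as>
<say-as interpret-as="date" format="md">*</say-as>
<say-as interpret-as="date" format="dm">*</say-as>
<say-as interpret-as="date" format="ym">*</say-as>
<say-as interpret-as="date" format="my">*</say-as>
<say-as interpret-as="date" format="d">*</say-as>
<say-as interpret-as="date" format="m">*</say-as>
<say-as interpret-as="date" format="y">*</say-as>
// NOTE: Format YYYYMMDD is not supported.
// Use of question marks will bypass recitation of that part of the date:
<say-as interpret-as="date">????0922</say-as>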
// Interpreted as: "September twenty second."
__________________________________________
31. say-as time
<say-as interpret-as="time">*</say-as>
// Recite duration in minutes and seconds. Example:
<say-as interpret-as="time">1'21"</say-as>
// This will be recited as: "One minute and twenty one seconds."
__________________________________________
32. say-as telephone (number)
// Supports 7 and 10 digit phone numbers, and extensions
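<say-as interpret-as="telephone">*</say-as>
// Examples:
<say-as interpret-as="telephone">5551212</say-as>
<say-as interpret-as="telephone">555-1212</say-as>
<say-as interpret-as="telephone">2025551212</say-as>
<say-as interpret-as="telephone">2025551212x345</say-as>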
// The last one will be recited as "two oh two, five five five, one two one two, extension three four five."
// NOTE: The say-as tag is not needed if the phone number is formatted with dashes.
// For example, "202-555-1212" will be recited with a pause after each dash.
// However if the text format is "2025551212" the say-as tag is needed in order to recite it as a telephone number.
__________________________________________
33. say-as address
<say-as interpret-as="address">*</say-as>
// Example:
<say-as interpret-as="address">410 Terry Ave. N, Seattle WA, 98109</say-as>
// This will be recited as: "Four ten Terry Avenue North, Seattle Washington, nine eight zero one nine."
__________________________________________
34. say-as interjection (Speechcons)
Separate worksheet:
https://voices.app/?p=856
__________________________________________
35. say-as expletive (bleep)
<say-as interpret-as="expletive">*</say-as>
// Recite a "bleep"
// Example: "bad word" is recited as "bleep"
<say-as interpret-as="expletive">bad word</say-as>
__________________________________________
36. speak
// Root element of SSML documents.
// Not required for Voiceflow.
<speak>*</speak>
__________________________________________
37. sub (substitute)
<sub alias="*">
*</sub>
// Example: <sub alias="aluminum">Al</sub>
__________________________________________
38. voice
Separate worksheet:
https://voices.app/?p=2054
==========================================
W role - Word pronunciation (similar to Say-As)
==========================================
39. w amazon:VB - pronounce word as a present simple verb
<w role="amazon:VB">*</w>
// Examples:
<w role="amazon:VB">read</w>
// As in: "I am going to read a book" and not "I have read the book."
<w role="amazon:VB">object</w>
// As in: "I object, your honor!" and not "The object is an apple."
__________________________________________
40. w amazon:VBD - pronounce word as a verb, past participle
<w role="amazon:VBD">*</w>
// Example:
<w role="amazon:VBD">read</w> // verb, past participle
// As in: "I have read the book" and not "I am going to read the book."
__________________________________________
41. w amazon:NN - pronounce word as a noun
<w role="amazon:NN">*</w>
// Example:
<w role="amazon:NN">object</w>
// As in: "The object is an apple" and not "I object, your honor!"
__________________________________________
42. w amazon:SENSE_1 - non-default word pronunciation
<w role="amazon:SENSE_1">*</w>
// Example:
<w role="amazon:SENSE_1">bass</w>
// "I play the bass guitar" uses the primary pronunciation of "bass", the musical instrument.
// This SSML tag changes to the secondary pronunciation of "bass", the freshwater fish.
// As in: "Let's go bass fishing" and not "I play the bass guitar."
==========================================
// End of document.