Eckher
Apr 16, 2020

SSML

What is Speech Synthesis Markup Language (SSML), and how do you use it?

Speech Synthesis Markup Language (SSML) is an XML-based markup language used with text-to-speech (TTS) applications. Specifically, SSML gives you more control over how the text is read aloud and how the TTS engine pronounces proper names, acronyms, numbers, and so on.

Much like HTML is an alternative to plain text on the web, SSML is an alternative to submitting plain text to the speech synthesizer. SSML elements mark up the input text with details such as pauses and pronunciations. The following is an example of an SSML document:

<?xml version="1.0"?>
<speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-GB">
    Hi!
    <break time="2s" />
    Would you like some <phoneme alphabet="ipa" ph="ləˈsænjə">lasagne</phoneme>?
</speak>
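
If the SSML is generated by a program rather than written by hand, the same document can be assembled with an XML library so that escaping and well-formedness are handled for you. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `build_greeting` is an assumption made for illustration and is not part of SSML or of any TTS engine's API.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_greeting() -> str:
    """Return the example SSML document from above as a string."""
    # The namespace is written as a plain xmlns attribute to keep the sketch
    # short; a parser will still see the elements as namespaced.
    speak = ET.Element(
        "speak",
        {"version": "1.1", "xmlns": SSML_NS, "xml:lang": "en-GB"},
    )
    speak.text = "Hi! "

    # <break time="2s" /> followed by the text that comes after the pause.
    pause = ET.SubElement(speak, "break", {"time": "2s"})
    pause.tail = " Would you like some "

    # <phoneme> carries the IPA pronunciation of the contained word.
    phoneme = ET.SubElement(
        speak, "phoneme", {"alphabet": "ipa", "ph": "ləˈsænjə"}
    )
    phoneme.text = "lasagne"
    phoneme.tail = "?"

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_greeting())
```

The resulting string is what you would submit to a speech synthesizer in place of plain text. How it is submitted depends on the engine and is not shown here.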

SSML elements

The following table summarizes the most commonly used SSML elements:

Element       Description
<speak>       The root element of an SSML document.
<break>       Inserts a pause of a specified duration.
<say-as>      Specifies how numbers, dates, and times are to be pronounced.
<audio>       Inserts an audio clip into the synthesized audio.
<sub>         Substitutes the input text with an alternative utterance.
<prosody>     Customizes prosodic features (such as pitch, rate, and volume) of the contained text.
<emphasis>    Requests that the contained text be spoken with emphasis.
<phoneme>     Provides a phonetic pronunciation for the contained text.
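
To see several of these elements together, the sketch below embeds a sample SSML document in a Python string and parses it to confirm that it is well-formed. The sample text, the `interpret-as` and `format` values, and the helper name `elements_used` are assumptions made for illustration; engines differ in which `<say-as>` values they accept, so check your engine's documentation before relying on them. `<audio>` is left out because it needs a real audio URL.

```python
import xml.etree.ElementTree as ET

# Illustrative document only: the attribute values below are assumptions,
# and support for them varies between TTS engines.
SAMPLE = """<?xml version="1.0"?>
<speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-GB">
    The meeting is on <say-as interpret-as="date" format="dmy">16/04/2020</say-as>.
    <break time="500ms" />
    The <sub alias="World Wide Web Consortium">W3C</sub> publishes the specification.
    <prosody rate="slow" pitch="low">This sentence is spoken slowly.</prosody>
    <emphasis level="strong">Please</emphasis> reply soon.
</speak>"""


def elements_used(ssml: str) -> list:
    """Return the element names in document order, without duplicates.

    ET.fromstring raises xml.etree.ElementTree.ParseError if the input is
    not well-formed XML. This does not validate against the SSML schema.
    """
    root = ET.fromstring(ssml)
    names = []
    for element in root.iter():
        # Parsed tags look like "{namespace}name"; keep only the local name.
        local_name = element.tag.split("}", 1)[-1]
        if local_name not in names:
            names.append(local_name)
    return names


if __name__ == "__main__":
    print(elements_used(SAMPLE))
```

A well-formedness check like this catches unclosed tags and unescaped characters before the document reaches the synthesizer, but it does not tell you whether a given engine supports each element or attribute value.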