Text To Speech

Various companies (AT&T, IBM, and smaller companies) have been working to produce more natural-sounding voices. Their work improves prosody: the rhythm, intonation, and lexical stress of speech that the orthography (written representation) of text does not capture.

Microsoft Text to Speech

YOUTUBE: Change Default Voice in Text to Speech

Microsoft Windows has a built-in voice engine.

In the Windows 8 Control Panel, the Text to Speech control lets you select voices.

I prefer the UK_EN British accent. This is not so much because I think British accents are elegant (which I do), but because I am used to hearing American accents. Since British accents are unfamiliar to my brain's autonomic language processing, unnatural intonation and stress bother me less than when I listen to a "fake" American computer voice.

Microsoft Windows comes with a speech program accessible through its Speech API (SAPI 4 or SAPI 5.1). The "Sam" voice that Microsoft supplies sounds like a robot, with unnatural pauses and emphasis.

TTS Add-on to Windows

Voices on Android.

Ken Fallon's blog post on a shell script that turns Wikipedia text into speech.

Speech for Windows Phone 8

Phonemes

A computer may take several seconds to produce speech because it has to look up each word in its database of phonemes, which tells the speech engine how to pronounce that word. The phonemes are then converted into a sound file. All this takes considerable computational power.
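The lookup step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular engine works: LEXICON and to_phonemes are hypothetical names, and the handful of entries use darpabet transcriptions such as "b iy t" for "beat".

```python
# Hypothetical miniature lexicon: word -> list of darpabet phonemes.
LEXICON = {
    "beat": ["b", "iy", "t"],
    "boot": ["b", "uw", "t"],
    "book": ["b", "uh", "k"],
    "zoo": ["z", "uw"],
}

def to_phonemes(text):
    """Look up each word of text in LEXICON.

    Unknown words map to None so the caller can decide how to handle them
    (a real engine would fall back to letter-to-sound rules).
    """
    result = []
    for word in text.lower().split():
        word = word.strip(".,!?;:")  # drop surrounding punctuation
        result.append((word, LEXICON.get(word)))
    return result

print(to_phonemes("Boot the book."))
# [('boot', ['b', 'uw', 't']), ('the', None), ('book', ['b', 'uh', 'k'])]
```

Every word costs at least one lookup, which is part of why long passages take noticeable time before any sound is produced.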

A phoneme string consists of one or more phoneme symbols and stress marks, optionally separated by whitespace.
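Because the whitespace is optional, a parser has to decide where one symbol ends and the next begins. The Python sketch below assumes a greedy longest-match rule, which is my assumption and not something any engine documents here. The names SYMBOLS, STRESS, and tokenize are hypothetical; the symbols are the darpabet ones listed on this page, and the sample strings with stress digits are illustrative only.

```python
# Darpabet symbols from this page, plus the silence symbol and stress marks.
SYMBOLS = {
    "ey", "ae", "iy", "eh", "ay", "ih", "ow", "aa", "ao", "aw", "oy",
    "ah", "ax", "uw", "uh", "er", "b", "ch", "d", "dx", "f", "g", "hh",
    "jh", "k", "l", "em", "m", "en", "n", "ng", "p", "q", "r", "s",
    "sh", "t", "dh", "th", "v", "w", "y", "z", "zh", "pau",
}
STRESS = {"0", "1", "2"}
TOKENS = SYMBOLS | STRESS
MAX_LEN = max(len(token) for token in TOKENS)

def tokenize(phoneme_string):
    """Split a phoneme string into symbols and stress marks.

    Whitespace is honored where present; inside each run of characters
    the longest matching token is taken first (an assumed rule).
    """
    tokens = []
    for chunk in phoneme_string.split():
        i = 0
        while i < len(chunk):
            # Try the longest candidate first, then shorter ones.
            for size in range(min(MAX_LEN, len(chunk) - i), 0, -1):
                candidate = chunk[i:i + size]
                if candidate in TOKENS:
                    tokens.append(candidate)
                    i += size
                    break
            else:
                raise ValueError(f"unrecognized symbol at {chunk[i:]!r}")
    return tokens

print(tokenize("b iy1 t"))    # ['b', 'iy', '1', 't']
print(tokenize("meh1zher0"))  # ['m', 'eh', '1', 'zh', 'er', '0']
```

Greedy matching can still guess wrong on run-together input, so separating symbols with spaces is the safer habit.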

The elapsed time from when the client first sends text to the server until the client receives the first audio buffer back is reported as a metric named TTFA (the time at which the first audio is received).
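A client could measure TTFA roughly as in this Python sketch. The names measure_ttfa and fake_server are hypothetical, and the generator merely stands in for a real TTS server by sleeping before it yields silent buffers; a real client would read from a network stream instead.

```python
import time

def measure_ttfa(synthesize, text):
    """Return (seconds until the first audio buffer arrives, that buffer)."""
    start = time.perf_counter()   # the clock starts when the text is sent
    stream = synthesize(text)
    first_buffer = next(stream)   # blocks until the first buffer is received
    return time.perf_counter() - start, first_buffer

def fake_server(text):
    """Hypothetical stand-in for a TTS server."""
    time.sleep(0.05)              # pretend processing latency
    for _ in text.split():
        yield bytes(320)          # one silent audio buffer per word

ttfa, buf = measure_ttfa(fake_server, "hello world")
print(f"TTFA: {ttfa * 1000:.0f} ms, first buffer: {len(buf)} bytes")
```

Only the wait for the first buffer is timed; the rest of the audio can keep streaming while playback has already begun.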

There are several phoneme dictionaries.
The SAMPA phoneme set is used internationally for German, French, etc.
The DARPA phoneme set (nicknamed the darpabet) is used by US English voices to represent the sounds in the English language.
The IPA (International Phonetic Alphabet) [W] was devised by the International Phonetic Association (established in 1886 in Paris) to represent the sounds of all languages. So, unlike the darpabet, it uses non-Latin characters: 107 distinct letters plus 56 diacritics and suprasegmentals, visible in a font of their own design.
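To make the difference between the sets concrete, here is a small Python sketch that rewrites a darpabet transcription in IPA characters. The mapping is deliberately partial and uses the conventional equivalents for a few symbols; DARPABET_TO_IPA and to_ipa are hypothetical names, and symbols missing from the table are passed through unchanged, which is only right for consonants such as b, k, or t that look the same in both sets.

```python
# Partial mapping of darpabet symbols to their usual IPA characters.
DARPABET_TO_IPA = {
    "iy": "i", "ih": "ɪ", "eh": "ɛ", "ae": "æ", "uw": "u",
    "sh": "ʃ", "zh": "ʒ", "th": "θ", "dh": "ð", "ng": "ŋ",
    "ch": "tʃ", "jh": "dʒ", "y": "j", "hh": "h",
}

def to_ipa(transcription):
    """Convert a space-separated darpabet transcription to an IPA string."""
    return "".join(DARPABET_TO_IPA.get(symbol, symbol)
                   for symbol in transcription.split())

print(to_ipa("th ih ng"))  # θɪŋ
print(to_ipa("b iy t"))    # bit
```

The darpabet side stays within plain ASCII, which is why it is convenient to type, while the IPA side needs a font that carries the extra characters.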

Apple Macintosh computers come with a MacinTalk text-to-speech embedded voice synthesizer that turns ASCII text into speech through its speaker.

Apple's North American phoneme text symbols represent vowels as pairs of uppercase letters and consonants as single letters. However, the DARPA phoneme set (the "darpabet" used by AT&T for English) does not capitalize vowels:

Phoneme  Example Word  Example Transcription
ey       bait          b ey t
ae       bat           b ae t
iy       beat          b iy t
eh       bet           b eh t
ay       bite          b ay t
ih       bit           b ih t
ow       boat          b ow t
aa       bob           b aa b
ao       bought        b ao t
aw       brown         b r aw n
oy       boy           b oy
ah       but           b ah t
ax       about         ax b aw t
uw       boot          b uw t
uh       book          b uh k
er       bird          b er d
b        bet           b eh t
ch       church        ch er ch
d        dog           d ao g
dx       butter        b ah dx er
f        fog           f ao g
g        got           g aa t
hh       hot           hh aa t
jh       jump          jh ah m p
k        kit           k ih t
l        lot           l aa t
em       Chatham       ch ae t em
m        Mom           m aa m
en       satin         s ae t en
n        nod           n aa d
ng       thing         th ih ng
p        pot           p aa t
q        button        b ah q en
r        rat           r ae t
s        sat           s ae t
sh       shut          sh ah t
t        top           t aa p
dh       that          dh ae t
th       thick         th ih k
v        vat           v ae t
w        won           w ah n
y        you           y uw
z        zoo           z uw
zh       measure       m eh zh er

Modifiers (also called prosodic control symbols) are used to specify emphasis in the DARPA phoneme set:

Description       darpabet  Apple  SAMPA
Silence           pau       %
No stress         0
Breath intake               @
Primary stress    1
Secondary stress  2

Nextup's $30 TextAloud

Natural Voice's free, $40, and $60 versions offer fewer voices than Nextup.

NeoSpeech voices

TTS engines support the Java Speech Markup Language (JSML) and the W3C Speech Synthesis Markup Language (SSML), which is also used by VoiceXML. These standards let client applications include special instructions within the input text that change the default behavior of the text synthesizer.
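As an illustration of such instructions, the Python sketch below builds a small SSML document using the standard speak, emphasis, and break elements. The function name build_ssml and its parameters are hypothetical, and whether a given engine honors each element is something to check against that engine's own documentation.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"

def build_ssml(before, emphasized, after, pause_ms=300, lang="en-GB"):
    """Return SSML that speaks `before`, stresses `emphasized`,
    pauses for pause_ms milliseconds, then speaks `after`."""
    speak = ET.Element("speak", {
        "version": "1.0",
        "xml:lang": lang,
        "xmlns": SSML_NS,
    })
    speak.text = before
    emphasis = ET.SubElement(speak, "emphasis", {"level": "strong"})
    emphasis.text = emphasized
    pause = ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})
    pause.tail = after  # text that follows the pause
    return ET.tostring(speak, encoding="unicode")

print(build_ssml("I said ", "now", ", please."))
```

The markup travels inside the same text that is sent for synthesis, so a client needs no separate control channel to adjust emphasis or pauses.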
