A chunk of text to be spoken, along with parameters that affect its speech.


An AVSpeechUtterance object is the basic unit of speech synthesis.

To synthesize speech, you must:

  1. Create an AVSpeechUtterance instance containing the text to be spoken. (See Creating an Utterance.)

  2. (Optional) Change its voice (including the language used), rate, or other parameters. (See Configuring Utterance Speech.)

  3. Pass the utterance to an AVSpeechSynthesizer instance to begin speech (or enqueue the utterance to be spoken later if the synthesizer is already speaking).
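The three steps above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper names `MakeConfiguredUtterance` and `SpeakText` are hypothetical, and the `en-US` language code and the specific parameter values are assumptions chosen only for the example.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper (not part of AVFoundation): covers steps 1 and 2.
static AVSpeechUtterance *MakeConfiguredUtterance(NSString *text) {
    // Step 1: create the utterance with the text to be spoken.
    AVSpeechUtterance *utterance = [AVSpeechUtterance speechUtteranceWithString:text];

    // Step 2 (optional): adjust the voice, rate, and other parameters.
    // "en-US" is an assumed language code; voiceWithLanguage: may return nil
    // if no matching voice is available.
    utterance.voice = [AVSpeechSynthesisVoice voiceWithLanguage:@"en-US"];
    utterance.rate = AVSpeechUtteranceDefaultSpeechRate;
    utterance.pitchMultiplier = 1.0f;
    utterance.volume = 1.0f;
    return utterance;
}

// Hypothetical helper: covers step 3. If the synthesizer is already speaking,
// the utterance is enqueued and spoken later.
static void SpeakText(AVSpeechSynthesizer *synthesizer, NSString *text) {
    [synthesizer speakUtterance:MakeConfiguredUtterance(text)];
}
```

In practice the caller should probably keep a strong reference to the synthesizer (for example, in a property) for as long as speech is expected to continue.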

You may choose whether and how to split a body of text into multiple utterances for speech. Because an utterance can control speech parameters, you can split text into sections that require different parameters. For example, you can emphasize a sentence by increasing the pitch and decreasing the rate of that utterance relative to others, or you can introduce pauses between sentences by putting each one into an utterance with a leading or trailing delay. Because the speech synthesizer sends messages to its delegate as it starts or finishes speaking an utterance, you can create an utterance for each meaningful unit in a longer text in order to be notified as its speech progresses.
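As a rough sketch of the splitting strategy described above, the following code builds one utterance per sentence, adds a trailing delay to each, emphasizes one sentence with a higher pitch and lower rate, and logs delegate callbacks as speech progresses. The class `SpeechProgressLogger`, the functions `MakeUtterancesForSentences` and `EnqueueUtterances`, and the values 0.5, 1.2, and 0.8 are all assumptions made for illustration.

```objc
#import <AVFoundation/AVFoundation.h>
#include <math.h>

// Hypothetical delegate that logs when each utterance starts and finishes.
@interface SpeechProgressLogger : NSObject <AVSpeechSynthesizerDelegate>
@end

@implementation SpeechProgressLogger
- (void)speechSynthesizer:(AVSpeechSynthesizer *)synthesizer
  didStartSpeechUtterance:(AVSpeechUtterance *)utterance {
    NSLog(@"Started: %@", utterance.speechString);
}
- (void)speechSynthesizer:(AVSpeechSynthesizer *)synthesizer
 didFinishSpeechUtterance:(AVSpeechUtterance *)utterance {
    NSLog(@"Finished: %@", utterance.speechString);
}
@end

// Hypothetical helper: one utterance per sentence, with one sentence emphasized.
static NSArray<AVSpeechUtterance *> *MakeUtterancesForSentences(NSArray<NSString *> *sentences,
                                                                NSUInteger emphasizedIndex) {
    NSMutableArray<AVSpeechUtterance *> *utterances = [NSMutableArray array];
    for (NSUInteger i = 0; i < sentences.count; i++) {
        AVSpeechUtterance *utterance = [[AVSpeechUtterance alloc] initWithString:sentences[i]];
        // Pause after each sentence (assumed value, in seconds).
        utterance.postUtteranceDelay = 0.5;
        if (i == emphasizedIndex) {
            // Emphasize: raise the pitch and slow the rate relative to the others,
            // without going below the minimum allowed rate.
            utterance.pitchMultiplier = 1.2f;
            utterance.rate = fmaxf(AVSpeechUtteranceMinimumSpeechRate,
                                   AVSpeechUtteranceDefaultSpeechRate * 0.8f);
        }
        [utterances addObject:utterance];
    }
    return utterances;
}

// Hypothetical helper: enqueue the utterances in order on one synthesizer.
static void EnqueueUtterances(AVSpeechSynthesizer *synthesizer,
                              id<AVSpeechSynthesizerDelegate> delegate,
                              NSArray<AVSpeechUtterance *> *utterances) {
    synthesizer.delegate = delegate; // the caller must keep the delegate alive
    for (AVSpeechUtterance *utterance in utterances) {
        [synthesizer speakUtterance:utterance];
    }
}
```

Because each sentence is its own utterance, the delegate receives a start and a finish message per sentence, which is one way to track progress through a longer text.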


Creating an Utterance

- initWithString:

Initializes an utterance object with text to be spoken.

+ speechUtteranceWithString:

Creates an utterance object with text to be spoken.

Configuring Utterance Speech


pitchMultiplier

The baseline pitch at which the utterance will be spoken.

postUtteranceDelay

The amount of time a speech synthesizer will wait after the utterance is spoken before handling the next queued utterance.

preUtteranceDelay

The amount of time a speech synthesizer will wait, once it begins handling the utterance, before speaking it.

rate

The rate at which the utterance will be spoken.

voice

The voice used to speak the utterance.

volume

The volume used when speaking the utterance.

Accessing Utterance Text


speechString

The text to be spoken in the utterance.

Speech Rate Constants

Allowed rates for synthesized speech.


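AVSpeechUtteranceMinimumSpeechRate

The minimum allowed speech rate.

AVSpeechUtteranceMaximumSpeechRate

The maximum allowed speech rate.

AVSpeechUtteranceDefaultSpeechRate

The default rate at which an utterance is spoken unless its rate property is changed.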
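Since these constants bound the allowed rates, a caller that accepts a rate from user input may want to clamp it before assigning it to an utterance. The following is a small sketch; `ClampedSpeechRate` is a hypothetical helper, not part of AVFoundation.

```objc
#import <AVFoundation/AVFoundation.h>
#include <math.h>

// Hypothetical helper: restricts a requested rate to the allowed range.
static float ClampedSpeechRate(float requestedRate) {
    return fmaxf(AVSpeechUtteranceMinimumSpeechRate,
                 fminf(requestedRate, AVSpeechUtteranceMaximumSpeechRate));
}
```

For example, `utterance.rate = ClampedSpeechRate(userRate);` keeps an out-of-range `userRate` (an assumed variable) within the minimum and maximum.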


Spoken Text Attributes


AVSpeechSynthesisVoice

A distinct voice for use in speech synthesis.