Synthesizing Speech

By using an NSSpeechSynthesizer object, you can make your application speak a word, phrase, or sentence to the user. Synthesized speech is an essential aid to users with attention or vision disabilities. It is also useful for drawing the user’s attention to something important when they might be distracted by something else on the screen.

Using an NSSpeechSynthesizer object to “pronounce” text is easy. You initialize the object with a voice and send a startSpeakingString: message to it, passing in an NSString object representing the text to speak. Optionally, you can implement one of several delegation methods either to accompany the words pronounced in some interactive fashion or to do something application-specific when speaking concludes.

Voices and Initialization

The essential attribute of an NSSpeechSynthesizer object is a voice. Speech Synthesis defines a number of voices for OS X, each with its own recognizable speech characteristics (such as gender and age). You can view the list of system voices, and set the default voice, in the Default Voice pane of the Speech system preferences.

If you initialize an NSSpeechSynthesizer instance using the init method, the default voice is used. If for some reason you want another voice, initialize the instance with the NSSpeechSynthesizer method initWithVoice:. You can change the voice at any time with the setVoice: method.
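
The following sketch shows one way the three methods mentioned above might fit together. The helper names MakeSynthesizer and SwitchVoice are hypothetical and exist only for this illustration; the code assumes it is compiled with ARC and linked against AppKit.

```objc
#import <AppKit/AppKit.h>

// Hypothetical helper: returns a synthesizer that uses voiceID, falling back
// to the default voice when voiceID is nil or the voice is not available.
static NSSpeechSynthesizer *MakeSynthesizer(NSString *voiceID)
{
    NSSpeechSynthesizer *synth = nil;
    if (voiceID != nil) {
        // initWithVoice: returns nil when the requested voice cannot be used.
        synth = [[NSSpeechSynthesizer alloc] initWithVoice:voiceID];
    }
    if (synth == nil) {
        // Plain init selects the current default voice.
        synth = [[NSSpeechSynthesizer alloc] init];
    }
    return synth;
}

// Hypothetical helper: changes the voice of an existing synthesizer.
// setVoice: returns NO when the voice could not be set.
static BOOL SwitchVoice(NSSpeechSynthesizer *synth, NSString *voiceID)
{
    return [synth setVoice:voiceID];
}
```

Checking the return values matters because a voice identifier saved in an earlier session may refer to a voice that is no longer installed.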

By invoking the class methods availableVoices and defaultVoice, you can get the list of system voices and the current default voice, respectively. Each voice has multiple attributes, including name, age, gender, and language. By invoking the class method attributesForVoice:, you can get an NSDictionary object for a specific voice which contains these attributes. (The argument of this method is a voice identifier, a string of the form com.apple.speech.synthesis.voice.voiceName.) See the reference documentation for NSSpeechSynthesizer for the valid dictionary keys.
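
As a rough illustration of these class methods, the sketch below builds a one-line description of each installed voice. The function name DescribeVoices is hypothetical; the dictionary keys used (NSVoiceName, NSVoiceGender, NSVoiceAge, NSVoiceLocaleIdentifier) are a subset of the documented keys, and a key that a given voice does not supply simply prints as (null).

```objc
#import <AppKit/AppKit.h>

// Hypothetical helper: returns one descriptive string per installed voice,
// marking the current default voice.
static NSArray *DescribeVoices(void)
{
    NSMutableArray *lines = [NSMutableArray array];
    NSString *defaultID = [NSSpeechSynthesizer defaultVoice];

    for (NSString *voiceID in [NSSpeechSynthesizer availableVoices]) {
        // The attributes dictionary is looked up by voice identifier.
        NSDictionary *attrs = [NSSpeechSynthesizer attributesForVoice:voiceID];
        NSString *marker = [voiceID isEqualToString:defaultID] ? @" [default]" : @"";
        NSString *line = [NSString stringWithFormat:@"%@ (%@, age %@, %@)%@",
                          [attrs objectForKey:NSVoiceName],
                          [attrs objectForKey:NSVoiceGender],
                          [attrs objectForKey:NSVoiceAge],
                          [attrs objectForKey:NSVoiceLocaleIdentifier],
                          marker];
        [lines addObject:line];
    }
    return lines;
}
```

An application could use the same loop to fill a pop-up list with voice names while keeping the identifiers for later calls to setVoice:.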

Speaking Text

Once you have initialized the NSSpeechSynthesizer object, send the startSpeakingString: message to it, passing it the text to speak. Listing 1 initializes the object with the default voice. Then, before speaking a string fetched from a text field (phraseField), it sets the voice that the user selected in a pop-up list (voiceList).

Listing 1  Using an NSSpeechSynthesizer object

- (id)init {
    self = [super init];
    if (self) {
        // Start with the default voice; synth is an instance variable.
        synth = [[NSSpeechSynthesizer alloc] init];
        [synth setDelegate:self];
    }
    return self;
}
 
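- (IBAction)speak:(id)sender
{
    NSString *text = [phraseField stringValue];
    NSArray *voices = [NSSpeechSynthesizer availableVoices];
    NSInteger index = [voiceList indexOfSelectedItem];
    // Change the voice only if the pop-up list has a valid selection.
    if (index >= 0 && (NSUInteger)index < [voices count]) {
        [synth setVoice:[voices objectAtIndex:index]];
    }
    [synth startSpeakingString:text];
}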

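Note that Listing 1 sets a delegate for the NSSpeechSynthesizer object in the init method. NSSpeechSynthesizer defines three methods that let its delegate take part while a string of text is being spoken and after speaking ends. The speechSynthesizer:willSpeakWord:ofString: and speechSynthesizer:willSpeakPhoneme: messages are sent during speech, and the speechSynthesizer:didFinishSpeaking: message is sent when speech concludes.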

Listing 2  An implementation of speechSynthesizer:didFinishSpeaking:

- (void)speechSynthesizer:(NSSpeechSynthesizer *)sender didFinishSpeaking:(BOOL)finishedSpeaking
{
    [_textToSpeechExampleTextView setSelectedRange:_orgSelectionRange]; // Set selection length to zero.
    [_textToSpeechExampleSpeakButton setTitle:@"Start Speaking"];
    [_saveButton setTitle:@"Save As File..."];
    [_textToSpeechExampleSpeakButton setEnabled:YES];
    [_saveButton setEnabled:YES];
    [_voicePop setEnabled:YES];
}
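
To accompany the spoken words interactively, a delegate can also implement speechSynthesizer:willSpeakWord:ofString:. The sketch below is only an illustration: the WordTracker class and its words property are hypothetical, the code assumes ARC, and it records each word rather than updating a user interface.

```objc
#import <AppKit/AppKit.h>

// Hypothetical delegate that records each word just before it is spoken.
@interface WordTracker : NSObject <NSSpeechSynthesizerDelegate>
@property (nonatomic, strong) NSMutableArray *words;
@end

@implementation WordTracker

- (instancetype)init
{
    self = [super init];
    if (self) {
        _words = [NSMutableArray array];
    }
    return self;
}

- (void)speechSynthesizer:(NSSpeechSynthesizer *)sender
            willSpeakWord:(NSRange)characterRange
                 ofString:(NSString *)string
{
    // Ignore ranges that fall outside the string being spoken.
    if (NSMaxRange(characterRange) <= [string length]) {
        [self.words addObject:[string substringWithRange:characterRange]];
    }
}

@end
```

In an application with a text view, the same method could instead pass characterRange to setSelectedRange: so that each word is highlighted as it is pronounced.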