Mac OS X contains speech technologies that enable software to recognize and speak U.S. English. These technologies provide benefits for all users and present the possibility of a new paradigm for human-computer interaction.
Speech recognition is the ability for the computer to recognize and respond to a person’s speech. Using speech recognition, users can accomplish tasks comprising multiple steps—for example, “Schedule a meeting next Friday at 3 p.m. with John, Paul, and George” or “Create a 3-by-3 table”—with one spoken command. Mac OS X users can control the computer by voice rather than be limited to the mouse or keyboard; consequently, speech-recognition technology is very important for people with special needs and also for general users. Developers can take advantage of the speech engine and API included with Mac OS X, as well as the built-in user interface.
Speech synthesis also called text-to-speech (TTS), converts text into audible speech. It provides a way to deliver information to users without forcing them to shift attention from their current task. For example, the computer could, in the background, deliver such messages as “Your download is complete; one of the files has been corrupted” and “You have email from your boss; would you like to read it now?” TTS is also crucial for users with vision or attention disabilities. As with speech recognition, Mac OS X TTS provides both an API and several user interface features.
For information about implementing speech synthesis and recognition, see the documents in Guides > User Experience > Speech Technologies.
Last updated: 2008-06-09