AVSpeechSynthesizer - how to run c… | Apple Developer Forums

AVSpeechSynthesizer - how to run callback onError

I use AVSpeechSynthesizer to pronounce some text in German. Sometimes it works just fine, and sometimes it fails for a reason I cannot determine. There is no error to catch, because the speak() method doesn't throw, and the only thing I can observe is the following message logged in the console:

_BeginSpeaking: couldn't begin playback

I tried to find an API in AVSpeechSynthesizerDelegate for registering a callback when an error occurs, but I found none.

The closest match was this (but it appears to be only available for macOS, not iOS): https://developer.apple.com/documentation/appkit/nsspeechsynthesizerdelegate/1448407-speechsynthesizer?changes=_10

Below you can find how I initialize and use the speech synthesizer in my app:

import AVFoundation

class Speaker: NSObject, AVSpeechSynthesizerDelegate {
  // Shared instance; a static stored property is initialized lazily and thread-safely.
  static let sharedInstance = Speaker()
   
  let audioSession = AVAudioSession.sharedInstance()
  let synth = AVSpeechSynthesizer()
   
  override init() {
    super.init()
    synth.delegate = self
  }
   
  func initializeAudioSession() {
    do {
      try audioSession.setCategory(.playback, mode: .spokenAudio, options: .duckOthers)
      try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
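    } catch {
      // Log the failure instead of silently swallowing it.
      print("Failed to configure audio session: \(error)")
    }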
  }
   
  func speak(text: String, language: String = "de-DE") {
    guard !self.synth.isSpeaking else { return }

    let utterance = AVSpeechUtterance(string: text)
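    // Avoid force-unwrapping: no installed voice may match the requested language.
    guard let voice = AVSpeechSynthesisVoice.speechVoices().first(where: { $0.language == language }) else {
      print("No voice available for \(language)")
      return
    }

    utterance.voice = voice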
    self.synth.speak(utterance)
  }
}

The audio session is initialized just once, during app startup.

Afterwards, speech is synthesized by running the following code:

Speaker.sharedInstance.speak(text: "Lederhosen")

The problem is that I have no way of knowing whether the speech synthesis succeeded: the UI shows a "speaking" state, but nothing is actually being spoken.
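
One possible workaround, since the delegate has no error callback, is to treat a missing "did start" callback as a failure. The sketch below is only an illustration under stated assumptions: SpeechStartTracker, WatchedSpeaker, onError, and the 2-second timeout are hypothetical names and values, not part of AVFoundation. It also assumes that speechSynthesizer(_:didStart:) is not delivered when playback fails to begin, and that delegate callbacks arrive on the main thread. I have not verified either assumption.

```swift
import Foundation
#if canImport(AVFoundation)
import AVFoundation
#endif

/// Hypothetical helper (not an AVFoundation type): records when an utterance
/// was queued and whether a "did start" callback arrived in time.
struct SpeechStartTracker {
    /// Grace period to wait for the "did start" callback.
    let timeout: TimeInterval
    private var queuedAt: Date?
    private var started = false

    init(timeout: TimeInterval) {
        self.timeout = timeout
    }

    /// Call right before handing an utterance to the synthesizer.
    mutating func utteranceQueued(at date: Date = Date()) {
        queuedAt = date
        started = false
    }

    /// Call from the delegate's didStart callback.
    mutating func utteranceStarted() {
        started = true
    }

    /// True when an utterance was queued, never started, and the timeout elapsed.
    func looksStalled(at date: Date = Date()) -> Bool {
        guard let queuedAt = queuedAt, !started else { return false }
        return date.timeIntervalSince(queuedAt) >= timeout
    }
}

#if canImport(AVFoundation)
/// Hypothetical wrapper that calls onError when speech never seems to start.
final class WatchedSpeaker: NSObject, AVSpeechSynthesizerDelegate {
    private let synth = AVSpeechSynthesizer()
    private var tracker = SpeechStartTracker(timeout: 2.0) // assumed value
    var onError: (() -> Void)?

    override init() {
        super.init()
        synth.delegate = self
    }

    func speak(_ text: String, language: String = "de-DE") {
        let utterance = AVSpeechUtterance(string: text)
        utterance.voice = AVSpeechSynthesisVoice(language: language)
        tracker.utteranceQueued()
        synth.speak(utterance)
        // Check slightly after the timeout so the elapsed-time test cannot miss.
        DispatchQueue.main.asyncAfter(deadline: .now() + tracker.timeout + 0.1) { [weak self] in
            guard let self = self, self.tracker.looksStalled() else { return }
            self.synth.stopSpeaking(at: .immediate)
            self.onError?()
        }
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
                           didStart utterance: AVSpeechUtterance) {
        tracker.utteranceStarted()
    }
}
#endif
```

With something like this, the UI could reset its "speaking" state inside the onError closure. The timeout is a heuristic, so a slow but successful start could be misreported as a failure.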