I'm making a Safari extension for learning languages. I need speech synthesis for any language the user chooses to learn.
I initially tried to do this in JavaScript, but Safari 18 doesn't reliably list voices for all languages through the Web Speech API's SpeechSynthesis interface, as described here: https://stackoverflow.com/questions/79179072/how-do-you-use-a-japanese-voice-with-speechsynthesis-in-safari-ios-18
As a workaround, I've been using AVSpeechSynthesizer in SafariWebExtensionHandler (the extension's NSExtensionRequestHandling implementation); a trimmed-down version of the handler is shown below. This works in the simulator but not on a real device. I've also found this note from Apple in a StackOverflow reply:
"Safari extensions are very short-lived, hence not fit for audio playback or speech synthesis. Not being able to validate an app extension in Xcode with a manually-added plist entry for background audio is the designed behavior. The general recommendation is to synthesize speech using JavaScript in conjunction with the Web Speech API."
Unfortunately, the suggestion to use the Web Speech API is unsuitable, for the reason I explained above.
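For reference, this is roughly what my handler does at the moment (trimmed down; the "text"/"lang" message keys are just my own naming):

```swift
import SafariServices
import AVFoundation

class SafariWebExtensionHandler: NSObject, NSExtensionRequestHandling {

    // Kept as a property so the synthesizer isn't deallocated mid-utterance.
    private let synthesizer = AVSpeechSynthesizer()

    func beginRequest(with context: NSExtensionContext) {
        let item = context.inputItems.first as? NSExtensionItem
        let message = item?.userInfo?[SFExtensionMessageKey] as? [String: Any]

        // "text" and "lang" are the keys my content script sends.
        if let text = message?["text"] as? String,
           let lang = message?["lang"] as? String {
            let utterance = AVSpeechUtterance(string: text)
            utterance.voice = AVSpeechSynthesisVoice(language: lang) // e.g. "ja-JP"
            synthesizer.speak(utterance)
        }

        // Acknowledge the message so the content script's promise resolves.
        let response = NSExtensionItem()
        response.userInfo = [SFExtensionMessageKey: ["status": "ok"]]
        context.completeRequest(returningItems: [response], completionHandler: nil)
    }
}
```

On the simulator this speaks as expected; on a device the handler returns the response but no audio is produced.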
Is there a way to set up a background process in the host app that can do speech synthesis? The app extension would need a way to communicate with this process, and start it if it's not running. Is that possible?
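To make the question more concrete, this is roughly the hand-off I'm imagining. Assume both targets share an App Group (the group identifier, notification name, and keys below are placeholders I've made up), and assume the host app is already running; whether the app can actually be started or kept alive for this is exactly the part I don't know how to do.

```swift
import AVFoundation
import Foundation

let appGroupID = "group.com.example.languageapp"            // placeholder
let notificationName = "com.example.languageapp.speak" as CFString // placeholder

// Extension side: stash the request in the shared container and ping the host app.
func requestSpeech(text: String, lang: String) {
    let defaults = UserDefaults(suiteName: appGroupID)
    defaults?.set(["text": text, "lang": lang], forKey: "pendingUtterance")
    CFNotificationCenterPostNotification(
        CFNotificationCenterGetDarwinNotifyCenter(),
        CFNotificationName(rawValue: notificationName),
        nil, nil, true)
}

// Host app side: listen for the ping and speak whatever is pending.
// The app would need to keep a SpeechListener instance alive somewhere.
final class SpeechListener {
    private let synthesizer = AVSpeechSynthesizer()

    func start() {
        let center = CFNotificationCenterGetDarwinNotifyCenter()
        let observer = UnsafeRawPointer(Unmanaged.passUnretained(self).toOpaque())
        CFNotificationCenterAddObserver(center, observer, { _, observer, _, _, _ in
            guard let observer = observer else { return }
            let listener = Unmanaged<SpeechListener>.fromOpaque(observer).takeUnretainedValue()
            listener.speakPending()
        }, notificationName, nil, .deliverImmediately)
    }

    private func speakPending() {
        let defaults = UserDefaults(suiteName: appGroupID)
        guard let pending = defaults?.dictionary(forKey: "pendingUtterance"),
              let text = pending["text"] as? String,
              let lang = pending["lang"] as? String else { return }
        let utterance = AVSpeechUtterance(string: text)
        utterance.voice = AVSpeechSynthesisVoice(language: lang)
        synthesizer.speak(utterance)
    }
}
```

If something along these lines can be made to work reliably (including launching or waking the host app's process when needed), that would solve my problem.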