Record microphone in a Keyboard app (in the background)

I'm currently working on an iOS app with a custom keyboard extension, and I’m hoping to get some insight into how best to architect a workflow where the keyboard acts as a remote trigger for dictation, while the main app handles the actual microphone recording and transcription.

I know that third-party keyboards are sandboxed and can’t access the microphone directly, so the pattern I’m following is similar to what Wispr Flow appears to be doing:

What I'm Trying to Build

The user taps a mic button in the custom keyboard (installed system-wide).

This triggers the main app to open, start a recording session, and send the audio to my transcription endpoint (not using Speech.framework).

Once the transcription result is ready, it's stored in an App Group shared container.

The keyboard extension polls for or receives the transcribed text and inserts it into the current input field via textDocumentProxy.insertText(...).
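For concreteness, the polling half of this handoff could be sketched roughly as below. This is a minimal illustration, not a confirmed implementation: the `TranscriptionPoller` type, the App Group identifier `group.com.example.dictation`, and the key `latestTranscription` are all placeholder names.

```swift
import Foundation

/// Polls an App Group UserDefaults suite for a finished transcription.
/// All names here are hypothetical placeholders.
final class TranscriptionPoller {
    private let defaults: UserDefaults
    private let key: String
    private let onText: (String) -> Void
    private var timer: Timer?

    init?(suiteName: String, key: String, onText: @escaping (String) -> Void) {
        guard let defaults = UserDefaults(suiteName: suiteName) else { return nil }
        self.defaults = defaults
        self.key = key
        self.onText = onText
    }

    /// Starts polling on the current run loop (call from the main thread).
    func start(interval: TimeInterval = 0.5) {
        stop()
        timer = Timer.scheduledTimer(withTimeInterval: interval, repeats: true) { [weak self] _ in
            self?.pollOnce()
        }
    }

    func stop() {
        timer?.invalidate()
        timer = nil
    }

    /// Checks once; returns true if a result was found and delivered.
    @discardableResult
    func pollOnce() -> Bool {
        guard let text = defaults.string(forKey: key), !text.isEmpty else { return false }
        defaults.removeObject(forKey: key) // consume it so it is not inserted twice
        stop()
        onText(text)
        return true
    }
}

// Possible use inside the keyboard's UIInputViewController subclass:
//
// poller = TranscriptionPoller(suiteName: "group.com.example.dictation",
//                              key: "latestTranscription",
//                              onText: { [weak self] text in
//                                  self?.textDocumentProxy.insertText(text)
//                              })
// poller?.start()
```

Keeping the polling logic separate from the view controller means the handoff can be exercised without a running keyboard.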

Key Questions

  1. Triggering App Dictation from Keyboard

Is there a clean system-native way to transition from the keyboard to the main app and back?

Are there best practices around:

  • Preventing jarring transitions (keyboard disappearing)?

  • Letting the app record in the background once triggered (see below)?

  2. Keeping the App Alive During Audio Recording

Once the main app is opened to handle the dictation:

I want it to record audio continuously (sometimes for up to a minute or two), send it to an external transcription API, and return the result.

However, from what I have read, iOS aggressively suspends or kills apps that are not in the foreground or haven’t requested the correct background modes - especially if there are background tasks running for longer than 30 seconds.

Even with audio enabled in Background Modes and a live AVAudioEngine session, I find that the app is sometimes paused or killed after a few seconds, especially when the user switches back to the app where they want to type.
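The baseline setup this paragraph refers to looks roughly like the sketch below. It assumes `UIBackgroundModes` in Info.plist contains `audio` and that microphone permission has already been granted. The `DictationRecorder` name and the buffer handling are hypothetical. Whether this alone is enough to keep the app running after the user switches away is exactly the open question.

```swift
import AVFoundation

/// Minimal recording setup; a sketch, not a verified fix for the suspension issue.
final class DictationRecorder {
    private let engine = AVAudioEngine()

    func start() throws {
        let session = AVAudioSession.sharedInstance()
        // .playAndRecord with .mixWithOthers so other apps' audio is not interrupted.
        try session.setCategory(.playAndRecord, mode: .default, options: [.mixWithOthers])
        try session.setActive(true)

        let input = engine.inputNode
        let format = input.outputFormat(forBus: 0)
        input.installTap(onBus: 0, bufferSize: 4096, format: format) { buffer, _ in
            // Hand `buffer` to whatever uploads audio to the transcription endpoint.
            _ = buffer
        }
        engine.prepare()
        try engine.start()
    }

    func stop() {
        engine.inputNode.removeTap(onBus: 0)
        engine.stop()
        try? AVAudioSession.sharedInstance()
            .setActive(false, options: .notifyOthersOnDeactivation)
    }
}
```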

But apps like Wispr Flow seem to manage this well. Their flow allows the user to record voice and insert it seamlessly without the app being terminated mid-recording.

So:

✅ How do I prevent my app from being killed/suspended while it's recording, especially when the user switches back to the original app?

Do I need:

  • A background AVAudioSession hack?

  • Audio playback tricks (e.g. silent audio)?

  • A workaround using CallKit (some apps seem to use it for persistent audio sessions)?

  • Something else Apple allows but doesn’t document clearly (or I am just a bad searcher)?

  3. Returning Text to the Keyboard Extension

I’m using UserDefaults(suiteName:) in the App Group to pass the transcription result. Is that still the recommended approach?

Would it be better to use a shared file (for larger data or richer metadata)? Are there any timing issues I should be aware of, e.g. race conditions or stale reads?
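The shared-file alternative could be sketched as follows, under some assumptions: the result is written atomically so the keyboard never sees a half-written file, and each result carries a request ID so a leftover result from an earlier dictation is ignored. `TranscriptionResult`, `SharedTranscriptionStore`, and the field names are hypothetical. On device the file URL would come from `FileManager.default.containerURL(forSecurityApplicationGroupIdentifier:)`.

```swift
import Foundation

/// Payload written by the main app and read by the keyboard (hypothetical shape).
struct TranscriptionResult: Codable {
    let requestID: UUID   // generated by the keyboard when it triggers a recording
    let text: String
    let createdAt: Date
}

enum SharedTranscriptionStore {
    /// Writes the result atomically: the data goes to a temporary file first and
    /// is then moved into place, so readers see the old or the new file, never a partial one.
    static func write(_ result: TranscriptionResult, to url: URL) throws {
        let data = try JSONEncoder().encode(result)
        try data.write(to: url, options: .atomic)
    }

    /// Returns the stored result only if it answers the given request;
    /// a missing, unreadable, or stale file yields nil.
    static func read(from url: URL, expecting requestID: UUID) -> TranscriptionResult? {
        guard let data = try? Data(contentsOf: url),
              let result = try? JSONDecoder().decode(TranscriptionResult.self, from: data),
              result.requestID == requestID
        else { return nil }
        return result
    }
}
```

The request ID addresses stale reads. It does not by itself coordinate concurrent writers, which would need something like `NSFileCoordinator` or a single-writer convention.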
