Speech

[26] audioTimeRange would still be interesting for .volatileResults in SpeechTranscriber

So experimenting with the new SpeechTranscriber, if I do: let transcriber = SpeechTranscriber( locale: locale, transcriptionOptions: [], reportingOptions: [.volatileResults], attributeOptions: [.audioTimeRange] ) only the final result has audio time ranges, not the volatile results. Is this a performance consideration? If there is no performance problem, it would be nice to have the option to also get speech time ranges for volatile responses. I'm not presenting the volatile text at all in the UI, I was just trying to keep statistics about the non-speech and the speech noise level, this way I can determine when the noise level falls under the noisefloor for a while. The goal here was to finalize the recording automatically, when the noise level indicate that the user has finished speaking.

Media Technologies Audio Speech

6

0

650

3d

SpeechTranscriber not supported

I've tried SpeechTranscriber with a lot of my devices (from iPhone 12 series ~ iPhone 17 series) without issues. However, SpeechTranscriber.isAvailable value is false for my iPhone 11 Pro. https://developer.apple.com/documentation/speech/speechtranscriber/isavailable I'am curious why the iPhone 11 Pro device is not supported. Are all iPhone 11 series not supported intentionally? Or is there any problem with my specific device? I've also checked the supportedLocales, and the value is an empty array. https://developer.apple.com/documentation/speech/speechtranscriber/supportedlocales

Media Technologies Audio Speech

4

0

540

5d

SpeechTranscriber supported Devices

I have the new iOS 26 SpeechTranscriber working in my application. The issue I am facing is how to determine if the device I am running on supports SpeechTranscriber. I was able to create code that tests if the device supports transcription but it takes a bit of time to run and thus the results are not available when the app launches. What I am looking for is a list of what iOS 26 devices it doesn't run on. I think its safe to assume any new devices will support it so if we can just have a list of what devices that can run iOS 26 and not able to do transcription it would be much faster for the app. I have determined it doesn't work on a SE 2nd Gen, it works on iPhone 12, SE 3rd Gen, iPhone 14 Pro, 15 Pro. As the SpeechTranscriber doesn't work in the simulator I can't determine that way. I have checked the docs and it doesn't list the devices it doesn't work on.

Media Technologies Audio Speech

1

0

289

5d

SpeechAnalyzer speech to text wwdc sample app

I am using the sample app from: https://developer.apple.com/videos/play/wwdc2025/277/?time=763 I installed this on an Iphone 15 Pro with iOS 26 beta 1. I was able to get good transcription with it. The app did crash sometimes when transcribing and I was going to post here with the details. I then installed iOS beta 2 and uninstalled the sample app. Now every time I try to run the sample app on the 15 Pro I get this message: SpeechAnalyzer: Input loop ending with error: Error Domain=SFSpeechErrorDomain Code=10 "Cannot use modules with unallocated locales [en_US (fixed en_US)]" UserInfo={NSLocalizedDescription=Cannot use modules with unallocated locales [en_US (fixed en_US)]} I can't continue our our work towards using SpeechAnalyzer now with this error. I have set breakpoints on all the catch handlers and it doesn't catch this error. My phone region is "United States"

Media Technologies Audio Speech

21

8

1.6k

1w

What iPhone and iPad models under iOS 26 support SpeechTranscriber

For what iPhone and iPad models under iOS 26 SpeechTranscriber.isAvailable is true

App & System Services Hardware iPad iOS iPhone Speech

3

0

172

1w

Question about SpeechTranscriber availability on specific iPad models

Hello, I am a developer planning to build an application using Apple's new SpeechTranscriber technology. I am facing an issue where SpeechTranscriber is not available on my iPad Pro (11-inch, 2nd generation, model number: MXDC2J/A), even though I have updated it to iPadOS 26. I was under the impression that SpeechTranscriber would be available on any device running iPadOS 26. Could you please clarify if this is incorrect? Furthermore, I am planning to purchase a new iPad with an A16 chip for the development and deployment of this application. Can you confirm if SpeechTranscriber will be fully functional on an iPad equipped with the A16 chip? Thank you for your assistance.

App & System Services Hardware Speech

3

0

136

1w

SpeechTranscriber on Simulator

I am trying to use SpeechTranscriber from Speech framework. Is it possible to use it on Simulator of iOS 26 (Mac OS Tahoe)? Function "supportedLocales" returns an empty array.

Media Technologies Audio Speech

3

2

699

1w

SpeechAnalyzer error "asset not found after attempted download" for certain languages

I am trying to use the new SpeechAnalyzer framework in my Mac app, and am running into an issue for some languages. When I call AssetInstallationRequest.downloadAndInstall() for some languages, it throws an error: Error Domain=SFSpeechErrorDomain Code=1 "transcription.ar asset not found after attempted download." The ".ar" appears to be the language code, which in this case was Arabic. When I call AssetInventory.status(forModules:) before attempting the download, it is giving me a status of "downloading" (perhaps from an earlier attempt?). If this language was completely unsupported, I would expect it to return a status of "unsupported", so I'm not sure what's going on here. For other languages (Polish, for example) SpeechTranscriber.supportedLocale(equivalentTo:) is returning nil, so that seems like a clearly unsupported language. But I can't tell if the languages I'm trying, like Arabic, are supported and something is going wrong, or if this error represents something I can work around. Here's the relevant section of code. The error is thrown from downloadAndInstall(), so I never even get as far as setting up the SpeechAnalyzer itself. private func setUpAnalyzer() async throws { guard let sourceLanguage else { throw Error.languageNotSpecified } guard let locale = await SpeechTranscriber.supportedLocale(equivalentTo: Locale(identifier: sourceLanguage.rawValue)) else { throw Error.unsupportedLanguage } let transcriber = SpeechTranscriber(locale: locale, preset: .progressiveTranscription) self.transcriber = transcriber let reservedLocales = await AssetInventory.reservedLocales if !reservedLocales.contains(locale) && reservedLocales.count == AssetInventory.maximumReservedLocales { if let oldest = reservedLocales.last { await AssetInventory.release(reservedLocale: oldest) } } do { let status = await AssetInventory.status(forModules: [transcriber]) print("status: \(status)") if let installationRequest = try await AssetInventory.assetInstallationRequest(supporting: [transcriber]) { try await installationRequest.downloadAndInstall() } } ...

Media Technologies Audio Speech

6

0

434

2w

Video Audio + Speech To Text

Hello, I am wondering if it is possible to have audio from my AirPods be sent to my speech to text service and at the same time have the built in mic audio input be sent to recording a video? I ask because I want my users to be able to say "CAPTURE" and I start recording a video (with audio from the built in mic) and then when the user says "STOP" I stop the recording.

Media Technologies Audio Speech AVAudioSession AVAudioEngine AVFoundation

0

80

3w

Speech Recognition Problem in iOS 18.0

It looks like Apple has added some new API(s) to SFSpeechRecognition My app, which is currently listed on App Store does feature speech recognition. Yet, trying to use it under iOS 18.0 throws errors: -[SFSpeechRecognitionTask localSpeechRecognitionClient:speechRecordingDidFail:]_block_invoke Ignoring subsequent local speech recording error: Error Domain=kAFAssistantErrorDomain Code=1101 "(null)" What happens is that after several words are transcribed and displayed, the next sentence results in previous words disappearance. That's probably what that portion of the error text - "Ignoring subsequent local speech recording error: Error Domain=kAFAssistantErrorDomain Code=1101 "(null)" means. The problem occurs ONLY when the app is running under iOS 18.0 Even when it's compiled in Xcode 16.0 using iOS 17.5 everything works fine. Any suggestions?

App & System Services General Speech

40

9

7.7k

Oct ’25

How to record voice, auto-transcribe, translate (auto-detect input language), and play back translated audio on same device in iOS Swift?

Hi everyone 👋 I’m building an iOS app in Swift where I want to do the following: Record the user’s voice Transcribe the spoken sentence (speech-to-text) Auto-detect the spoken language Translate it to another language selected by the user (e.g., English → Spanish or Hindi → English) Speak back (text-to-speech) the translated text on the same device Is this possible to record via phone mic and play the transcribe voice into headphone's audio?

Media Technologies Audio Speech Siri and Voice Localization Live Text

0

171

Oct ’25

Improving Speech Analyzer Transcription for technical terms

I am developing an app with transcription and I am exploring ways to improve the transcription from the SpeechAnalyzer/Transcriber for technical terms. SFSpeech... recognition had the capability of being augmented by contextualStrings. Does something similar exist for SpeechAnalyzer/Transcriber? If so please point me towards the documentation and any sample code that may exist for this. If there are other options, please let me know.

Media Technologies Audio Speech

1

0

215

Sep ’25

Clarification on SFSpeechRecognizer system alert message and service URLs for whitelisting

Hello Apple Engineers, I am developing a feature related to SpeechRecognizer (import Speech), and I have two questions: After adding the NSSpeechRecognitionUsageDescription key in my Info.plist, when I initialize an SFSpeechRecognizer instance, the system shows an authorization alert. The alert contains a pre-defined message from Apple: “Speech data from this app will be sent to Apple to process your requests. This will also help Apple improve its speech recognition technology.” Is it possible to remove or customize this message? My app runs in a network environment with a whitelist. I need to know which URL the SFSpeechRecognizer instance connects to, and which port it uses, so that I can add it to the whitelist. Thank you very much for your support! Best regards, Yu Cheng

Media Technologies General Speech

0

56

Sep ’25

Failed to change the TTS language to CN or TW

I have some question about the TTS My device default language is zh-HK. (cantonese) my device is iPhone 16 Pro, IOS 18.6 I create a function speakMandarin I want the device to speak the zh-CN (putonghua) however the device only can speak zh-HK (cantonese). I already set the AVSpeechSynthesisVoice language as zh-CN func speakMandarin(text: String) { print("speakMandarin, \(text)") lastError = nil // Reset error // Stop any ongoing speech before starting new if synthesizer.isSpeaking { synthesizer.stopSpeaking(at: .immediate) } // Configure speech utterance let utterance = AVSpeechUtterance(ssmlRepresentation: text)! utterance.rate = 0.5 // Natural speaking speed utterance.pitchMultiplier = 1.0 utterance.volume = 1.0 utterance.voice = AVSpeechSynthesisVoice(language: "zh-CN") let preferredLanguages = [ "zh-CN" , "zh-TW"] var selectedVoice: AVSpeechSynthesisVoice? for lang in preferredLanguages { if let voice = AVSpeechSynthesisVoice(language: lang) { print(lang) selectedVoice = voice utterance.voice = voice break } } // If no Mandarin voice found, use system default if selectedVoice == nil { selectedVoice = AVSpeechSynthesisVoice(language: nil) lastError = "未偵測到普通話語音包，將使用系統預設語音" print (lastError) } utterance.voice = selectedVoice print(utterance) synthesizer.speak(utterance) } here is my log speakMandarin, <speak>你好！我們來聊聊喜歡的動物吧。你喜歡什麼動物呢？</speak> zh-CN [AVSpeechUtterance 0x1194efb80] String: 你好！我們來聊聊喜歡的動物吧。你喜歡什麼動物呢？ Voice: [AVSpeechSynthesisVoice 0x104ceff90] Language: zh-CN, Name: Tingting, Quality: Default [com.apple.voice.compact.zh-CN.Tingting] Rate: 0.50 Volume: 1.00 Pitch Multiplier: 1.00 Delays: Pre: 0.00(s) Post: 0.00(s)

Media Technologies General Speech Localization

0

222

Sep ’25

Japanese “Hattori” TTS voice missing from Settings > General > Read & Speak > Voices > Japanese on iOS 26

Japanese “Hattori” TTS voice missing from Settings > General > Read & Speak > Voices > Japanese on iOS 26 Steps: Open the path above → “Hattori” is not listed and cannot be downloaded Expected: Hattori is available to download and select Actual: Hattori is absent from the catalog Regression: Was available on iOS 18.x on the same device

Accessibility & Inclusion General Speech Accessibility

0

362

Sep ’25

Play Audio and Recognize Speech in Car

Hello, I'm trying to determine the best/recommended AVAudioSession configuration (i.e category, mode, and options) for the following use-case. Essentially, I'd like to switch between periods of playing an audio file and then recognizing speech. The audio file is typically speech and I don't intend for playback and speech recognition to occur simultaneously. I'd like for the user to sill be able to interact with Siri and I'd like for it to work with CarPlay where navigation prompts can occur. I would assume the category to use is 'playAndRecord', but I'm not sure if it's better to just set that once for the entire lifecycle, or set to 'playback' for audio file playback and then switch to 'playAndRecord' for speech recognition . I'm also not sure on the best 'mode' and 'options' to set. Any suggestions would be appreciated. Thanks.

Media Technologies Audio CarPlay Speech AVAudioSession

0

475

Sep ’25

Crash when testing Speech sample app with FoundationModels on macOS 26.0 beta and iOS 26.0 beta

Hello, I am testing the sample project provided here: Bringing advanced speech-to-text capabilities to your app. On both macOS 26.0 beta and iOS 26.0 beta, the app crashes immediately on launch with a dyld "Symbol not found" error related to FoundationModels.framework. It feels like this may be related to testing primarily on newer Apple Silicon devices, as I am seeing consistent crashes on an Intel MacBook and on an older iPhone device. I would appreciate any insight, confirmation, or guidance on whether this is a known limitation or if there is a workaround. Is it planned to be resolved soon? Environment macOS: Device: MacBook Pro (Intel) Processor: 2 GHz Quad-Core Intel Core i5 Graphics: Intel Iris Plus Graphics 1536 MB Memory: 16 GB 3733 MHz LPDDR4X OS: macOS Tahoe Version 26.0 Beta (25A5338b) iOS: Device: iPhone 11 Model Number: MHDD3HN/A OS: iOS 26.0 Xcode: Version: 26.0 beta 3 (17A5276g) Crash (macOS) Abort signal received. Excerpt from crash dump: dyld`__abort_with_payload: 0x7ff80e3ad4a0 <+0>: movl $0x2000209, %eax 0x7ff80e3ad4a5 <+5>: movq %rcx, %r10 0x7ff80e3ad4a8 <+8>: syscall -> 0x7ff80e3ad4aa <+10>: jae 0x7ff80e3ad4b4 Console: dyld[9819]: Symbol not found: _$s16FoundationModels20LanguageModelSessionC5model10guardrails5tools12instructionsAcA06SystemcD0C_AC10GuardrailsVSayAA4Tool_pGAA12InstructionsVSgtcfC Referenced from: /Users/userx/Library/Developer/Xcode/DerivedData/SwiftTranscriptionSampleApp-*/Build/Products/Debug/SwiftTranscriptionSampleApp.app/Contents/MacOS/SwiftTranscriptionSampleApp.debug.dylib Expected in: /System/Library/Frameworks/FoundationModels.framework/Versions/A/FoundationModels Crash (iOS) Abort signal received. Excerpt from crash dump: dyld`__abort_with_payload: 0x18f22b4b0 <+0>: mov x16, #0x209 0x18f22b4b4 <+4>: svc #0x80 -> 0x18f22b4b8 <+8>: b.lo 0x18f22b4d8 Console dyld[2080]: Symbol not found: _$s16FoundationModels20LanguageModelSessionC5model10guardrails5tools12instructionsAcA06SystemcD0C_AC10GuardrailsVSayAA4Tool_pGAA12InstructionsVSgtcfC Referenced from: /private/var/containers/Bundle/Application/.../SwiftTranscriptionSampleApp.app/SwiftTranscriptionSampleApp.debug.dylib Expected in: /System/Library/Frameworks/FoundationModels.framework/FoundationModels Question Is this crash expected on Intel Macs and older iPhone models with the beta SDKs? Is there an official statement on whether macOS 26.x releases support Intel, or it exists only until macOS 26.1? Any suggested workarounds for testing this sample project on current hardware? Is this a known limitation for the 26.0 beta, and if so, should we expect a fix in 26.0 or only in subsequent releases? Attaching screenshots for reference. Thank you in advance.

Machine Learning & AI Foundation Models Speech

4

0

510

Aug ’25

How to use the SpeechDetector Module

I am trying to use SpeechDetector Module in Speech framework along with SpeechTranscriber. and it is giving me an error Cannot convert value of type 'SpeechDetector' to expected element type 'Array.ArrayLiteralElement' (aka 'any SpeechModule') Below is how I am using it let speechDetector = Speech.SpeechDetector() let transcriber = SpeechTranscriber(locale: Locale.current, transcriptionOptions: [], reportingOptions: [.volatileResults], attributeOptions: [.audioTimeRange]) speechAnalyzer = try SpeechAnalyzer(modules: [transcriber,speechDetector])

Media Technologies Audio Speech

4

2

366

Aug ’25

SensorKit Speech data question

We are a research team conducting a study collecting subject's SensorKit speech data, and we've encountered some questions we couldn't resolve ourselves or by looking up the online SensorKit documentation: Microphone Activation: In general, how is the microphone being turned on to capture a speech session? And how was each session determined to be an independent session? Negative Values: In the speech classification data, there are entries where some of the start and end values are negative (see screenshot below). How should we interpret and handle these values? Is it safe to filter them out? Duplicated sessions: From the same screenshot you can see there are multiple session identifiers linked to the same subject with the same timestamp - what does this represent? Another Negative Values: The same question for speech recognition data's average pause duration, what does the -1 mean and should we remove them as well? (Note that these screenshot got rid of subject IDs for privacy purposes but each screenshot was from one subject.) We greatly appreciate your time and help.

App & System Services General SensorKit Speech

3

0

136

Aug ’25

SpeechTranscriber/SpeechAnalyzer being relatively slow compared to FoundationModel and TTS

So, I've been wondering how fast a an offline STT -> ML Prompt -> TTS roundtrip would be. Interestingly, for many tests, the SpeechTranscriber (STT) takes the bulk of the time, compared to generating a FoundationModel response and creating the Audio using TTS. E.g. InteractionStatistics: - listeningStarted: 21:24:23 4480 2423 - timeTillFirstAboveNoiseFloor: 01.794 - timeTillLastNoiseAboveFloor: 02.383 - timeTillFirstSpeechDetected: 02.399 - timeTillTranscriptFinalized: 04.510 - timeTillFirstMLModelResponse: 04.938 - timeTillMLModelResponse: 05.379 - timeTillTTSStarted: 04.962 - timeTillTTSFinished: 11.016 - speechLength: 06.054 - timeToResponse: 02.578 - transcript: This is a test. - mlModelResponse: Sure! I'm ready to help with your test. What do you need help with? Here, between my audio input ending and the Text-2-Speech starting top play (using AVSpeechUtterance) the total response time was 2.5s. Of that time, it took the SpeechAnalyzer 2.1s to get the transcript finalized, FoundationModel only took 0.4s to respond (and TTS started playing nearly instantly). I'm already using reportingOptions: [.volatileResults, .fastResults] so it's probably as fast as possible right now? I'm just surprised the STT takes so much longer compared to the other parts (all being CoreML based, aren't they?)

Media Technologies Audio Speech

2

0

540

Aug ’25

Post

Replies

Boosts

Views

Activity

Speech

Posts under Speech tag

Post

Replies

Boosts

Views

Activity