Recognize spoken words in recorded or live audio using Speech.

Posts under Speech tag

56 Posts

Audio Recognition and Live captioning
Hi Apple Team, we have a technical query regarding one feature: audio recognition and live captioning. We are developing an app for the deaf community to remove communication barriers. We want to know whether it is possible to recognize the sound playing from other applications on an iPhone and show live captions in our application (based on iOS).
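A hedged note on the within-app half of this: the Speech framework transcribes audio that your own process captures, so live captioning of the microphone looks roughly like the sketch below. The class and handler names are illustrative, and whether another app's audio can be captured at all is a separate platform question this sketch does not answer.

    import AVFAudio
    import Speech

    final class LiveCaptioner {
        private let audioEngine = AVAudioEngine()
        private let recognizer = SFSpeechRecognizer()
        private var task: SFSpeechRecognitionTask?

        // Streams partial transcriptions of microphone audio to the caption handler.
        func start(onCaption: @escaping (String) -> Void) throws {
            let request = SFSpeechAudioBufferRecognitionRequest()
            request.shouldReportPartialResults = true

            task = recognizer?.recognitionTask(with: request) { result, _ in
                if let result {
                    onCaption(result.bestTranscription.formattedString)
                }
            }

            // Tap the app's own microphone input; audio played by other
            // apps never flows through this engine.
            let input = audioEngine.inputNode
            let format = input.outputFormat(forBus: 0)
            input.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
                request.append(buffer)
            }
            audioEngine.prepare()
            try audioEngine.start()
        }
    }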
0 replies · 0 boosts · 370 views · Dec ’23
IsFormatSampleRateAndChannelCountValid false when audio is playing in another app
My app listens for the verbal commands "Roll" and "Skip". It was working well until I used it while listening to a podcast in another app. I now get a crash with the error:

Thread 1: "required condition is false: IsFormatSampleRateAndChannelCountValid(format)"

It crashes when audio is playing in Snipd (a podcast app) or the Apple Podcasts app, but not when audio is playing in YouTube or Apple Music. This is the code that runs when I start listening for the commands:

    // MARK: - Speech Recognition
    func startListening() {
        do {
            try configureAudioSession()
            createRecognitionRequest()
            try prepareAudioEngine()
        } catch {
            print("Audio Engine error: \(error.localizedDescription)")
        }
    }

    private func configureAudioSession() throws {
        let audioSession = AVAudioSession.sharedInstance()
        try audioSession.setCategory(.playAndRecord, mode: .measurement,
                                     options: [.interruptSpokenAudioAndMixWithOthers, .duckOthers])
        try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
    }

    private func createRecognitionRequest() {
        recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
        guard let recognitionRequest = recognitionRequest else { return }
        recognitionRequest.shouldReportPartialResults = true
        recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest,
                                                            resultHandler: handleRecognitionResult)
    }

    private func prepareAudioEngine() throws {
        let inputNode = audioEngine.inputNode
        inputNode.removeTap(onBus: 0)
        let inputFormat = inputNode.inputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: inputFormat) { [weak self] (buffer, _) in
            self?.recognitionRequest?.append(buffer)
        }
        audioEngine.prepare()
        try audioEngine.start()
        isActuallyListening = true
    }

Thanks
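One mitigation sketch, an assumption about the failure mode rather than a confirmed fix: while another app (e.g. a podcast player) holds the audio route, the input node can report a format with a zero sample rate or channel count, and installing a tap with that format raises exactly this exception. A defensive version of prepareAudioEngine() that validates the format first:

    enum ListeningError: Error { case inputFormatNotReady }

    private func prepareAudioEngine() throws {
        let inputNode = audioEngine.inputNode
        inputNode.removeTap(onBus: 0)
        let inputFormat = inputNode.inputFormat(forBus: 0)

        // A 0 Hz / 0-channel format triggers
        // IsFormatSampleRateAndChannelCountValid; skip (or retry later)
        // instead of installing the tap with it.
        guard inputFormat.sampleRate > 0, inputFormat.channelCount > 0 else {
            throw ListeningError.inputFormatNotReady
        }

        inputNode.installTap(onBus: 0, bufferSize: 1024, format: inputFormat) { [weak self] buffer, _ in
            self?.recognitionRequest?.append(buffer)
        }
        audioEngine.prepare()
        try audioEngine.start()
        isActuallyListening = true
    }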
2 replies · 1 boost · 1.2k views · Jan ’24
AVAudioEngine & AVAudioPlayer Voice Processing Volume
As the title suggests, I am using AVAudioEngine for speech recognition input and AVAudioPlayer for sound output. Apple says in this talk (https://developer.apple.com/videos/play/wwdc2019/510) that the setVoiceProcessingEnabled function very usefully cancels the output from the speaker to the mic. I set voice processing on the input and output nodes. It seems to work; however, the volume is low, even when the system volume is turned up. Any solution to this would be much appreciated.
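For reference, a minimal sketch of enabling voice processing on both I/O nodes, assuming an AVAudioEngine named audioEngine (the poster's exact setup isn't shown in the post):

    import AVFAudio

    let audioEngine = AVAudioEngine()

    do {
        // Voice processing is enabled per I/O node, before the engine starts;
        // it cancels the device's own output from the microphone signal.
        try audioEngine.inputNode.setVoiceProcessingEnabled(true)
        try audioEngine.outputNode.setVoiceProcessingEnabled(true)
    } catch {
        print("Could not enable voice processing: \(error)")
    }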
0 replies · 0 boosts · 652 views · Dec ’23
Microphone not working in iOS simulators under macOS Sonoma 14.1.2
Hello, I am trying to test a speech-to-text feature in several iPhone simulators, but the microphones don't seem to work. The microphone and speech recognition permissions are correctly requested for the feature. My internal and external microphones are detected in the simulators' I/O options, but nothing happens when I launch the recognition. Recognition also doesn't work for speech-to-text in the native Messages keyboard or in Siri. The problem is the same in all the simulators, so I believe the issue is that Xcode has no permission to access the microphone. In Settings > Privacy & Security > Microphone I can't see Xcode (and, regarding another issue, I can't see Xcode Source Editor in Extensions either). I've already tried uninstalling and reinstalling Xcode. I use Xcode 15.0.1 under Sonoma 14.1.2. Any help is welcome.
4 replies · 3 boosts · 1.3k views · Dec ’23
NSSpeechRecognitionUsageDescription not working
I got an error stating, "This app has crashed because it attempted to access privacy-sensitive data without a usage description. The app's Info.plist must contain an NSSpeechRecognitionUsageDescription key with a string value explaining to the user how the app uses this data." But I have already added NSSpeechRecognitionUsageDescription to my Info.plist and the error is still occurring. Does anyone have a solution to this?
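For anyone hitting the same wall, one check worth doing (my suggestion, not from the original post) is to read the key back from the bundle at runtime; if it comes back nil, the key was most likely added to a different target's Info.plist than the one actually being built:

    import Speech

    // Verify the usage description actually made it into the compiled bundle.
    // If this prints nil, check which target's Info.plist got the key.
    let usage = Bundle.main.object(forInfoDictionaryKey: "NSSpeechRecognitionUsageDescription") as? String
    print("Usage description:", usage ?? "nil")

    // The permission prompt (and the crash, when the key is missing)
    // is triggered by this call.
    SFSpeechRecognizer.requestAuthorization { status in
        print("Speech recognition authorization:", status.rawValue)
    }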
1 reply · 0 boosts · 659 views · Nov ’23
SpeechSynthesis rate is not adjustable
In my app I have:

    let utterance = AVSpeechUtterance(string: recognizedTextView.text)
    utterance.voice = AVSpeechSynthesisVoice(language: "zh-HK")
    utterance.rate = 0.3
    synthesizer.speak(utterance)

However, after I upgraded my iOS to 17.1.1 the speed no longer gets adjusted. I tried both 0.01 and 0.3 for the rate, and there is no difference.
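For context (an addition, not from the original post): rate is documented to range from AVSpeechUtteranceMinimumSpeechRate (0.0) to AVSpeechUtteranceMaximumSpeechRate (1.0), with AVSpeechUtteranceDefaultSpeechRate at 0.5, so 0.01 and 0.3 are both valid slow rates and should sound clearly different. A minimal sketch using the constants:

    import AVFAudio

    let synthesizer = AVSpeechSynthesizer()
    let utterance = AVSpeechUtterance(string: "你好")
    utterance.voice = AVSpeechSynthesisVoice(language: "zh-HK")

    // Clamp the requested rate into the documented valid range.
    utterance.rate = min(max(0.3, AVSpeechUtteranceMinimumSpeechRate),
                         AVSpeechUtteranceMaximumSpeechRate)
    synthesizer.speak(utterance)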
0 replies · 0 boosts · 350 views · Nov ’23
Recognizing Speech in Live Audio Sample Project
I'm developing a project where I want to transcribe live speech from the user on iOS devices. I wanted to test the Speech framework by downloading the sample code from https://developer.apple.com/documentation/speech/recognizing_speech_in_live_audio. I'm using Xcode 15 and running it on an iPad with iOS 17 installed. I run the app and manage to approve the permissions for the microphone and live speech transcription, but as soon as I press 'start recording', I get the following error in Xcode, and nothing happens on the iPad screen:

+[SFUtilities issueReadSandboxExtensionForFilePath:error:] issueReadSandboxExtensionForFilePath:error:: Inaccessible file (/var/mobile/Containers/Data/Application/1F1AB092-95F2-4E5F-A369-475E15114F26/Library/Caches/Vocab) : error=Error Domain=kAFAssistantErrorDomain Code=203 "Failed to access path: /var/mobile/Containers/Data/Application/1F1AB092-95F2-4E5F-A369-475E15114F26/Library/Caches/Vocab method:issueReadSandboxExtensionForFilePath:error:" UserInfo={NSLocalizedDescription=Failed to access path: /var/mobile/Containers/Data/Application/1F1AB092-95F2-4E5F-A369-475E15114F26/Library/Caches/Vocab method:issueReadSandboxExtensionForFilePath:error:}

Can someone guide me in the right direction to fix this?
1 reply · 0 boosts · 473 views · Oct ’23
libiconv fails to convert UTF-8 to GBK when the text contains an ellipsis (……), only on iOS 17
libiconv fails to convert UTF-8 to GBK when the text contains an ellipsis (……), and only on iOS 17. In Xcode 15 I updated libiconv because the bundled one was old; I have tested both libiconv.tbd and libiconv.2.tbd with the same result. Conditions:

1. Only iOS 17; iOS 16 and earlier are OK.
2. The text contains an ellipsis (……), such as ……测试字符串.
3. Converting to gbk or gb18030 fails and returns -1, but gb2312 succeeds.

    #include <iconv.h>
    #include <cstdlib>
    #include <cstring>
    #include <iostream>
    #include <string>

    int code_convert(const char *from_charset, const char *to_charset,
                     char *inbuf, size_t inlen, char *outbuf, size_t outlen) {
        // iconv_open reports failure with (iconv_t)-1, not 0.
        iconv_t cd = iconv_open(to_charset, from_charset);
        if (cd == (iconv_t)-1)
            return -1;
        char **pin = &inbuf;
        char **pout = &outbuf;
        memset(outbuf, 0, outlen);
        if (iconv(cd, pin, &inlen, pout, &outlen) == (size_t)-1) {
            iconv_close(cd);
            std::cout << "conversion failed" << std::endl;
            return -1;
        }
        iconv_close(cd);
        return 0;
    }

    int u2g(char *inbuf, size_t inlen, char *outbuf, size_t outlen) {
        // gb18030, gb2312
        return code_convert("utf-8", "gb2312", inbuf, inlen, outbuf, outlen);
    }

    std::string UTFtoGBK(const char *utf8) {
        size_t length = strlen(utf8);
        // +1 so the converted result is always NUL-terminated.
        char *temp = (char *)malloc(length + 1);
        if (u2g((char *)utf8, length, temp, length + 1) >= 0) {
            std::string str_result(temp);
            free(temp);
            return str_result;
        } else {
            free(temp);
            return "";
        }
    }
2 replies · 0 boosts · 839 views · Nov ’23
SFSpeechRecognizer.isAvailable returns wrong values
As of iOS 17, SFSpeechRecognizer.isAvailable returns true even when recognition tasks cannot be fulfilled and immediately fail with the error "Siri and Dictation are disabled". The same speech recognition code works as expected on iOS 16. On iOS 16, neither Siri nor Dictation needed to be enabled for speech recognition to be available, and it worked as expected. In the past, once permissions were given, only an active network connection was required for functional speech recognition. There seem to be two issues in play:

1. On iOS 17, SFSpeechRecognizer.isAvailable incorrectly returns true when it can't fulfil requests.
2. On iOS 17, Siri or Dictation must be enabled to handle speech recognition tasks, while on iOS 16 this isn't the case.

If issue 2 is expected behaviour (I surely hope not), there is no way to query whether Siri or Dictation is enabled, so those cases cannot be handled properly in code to inform the user why speech recognition doesn't work.

Expected behaviour: speech recognition is available when Siri and Dictation are disabled, and SFSpeechRecognizer.isAvailable correctly returns false when no speech recognition requests can be handled.

iOS Version 17.0 (21A329), Xcode Version 15.0 (15A240d). Anyone else experiencing the same issues, or have a solution? I have reported this to Apple as well -> FB13235751
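A workaround sketch, purely my assumption rather than a confirmed fix: since isAvailable can't be trusted here, start a throwaway recognition task and treat an immediate failure as the real availability signal. The probe function and its one-second timeout are illustrative:

    import Speech

    // Probe actual availability by starting a request and watching for the
    // immediate "Siri and Dictation are disabled" failure, instead of
    // trusting SFSpeechRecognizer.isAvailable on iOS 17.
    func probeSpeechRecognition(completion: @escaping (Bool) -> Void) {
        guard let recognizer = SFSpeechRecognizer() else {
            return completion(false)
        }
        var reported = false
        let request = SFSpeechAudioBufferRecognitionRequest()
        let task = recognizer.recognitionTask(with: request) { _, error in
            if let error, !reported {
                reported = true
                print("Recognition unavailable: \(error.localizedDescription)")
                completion(false)
            }
        }
        // If no immediate failure arrives, assume recognition is usable.
        DispatchQueue.main.asyncAfter(deadline: .now() + 1) {
            if !reported {
                reported = true
                task.cancel()
                completion(true)
            }
        }
    }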
1 reply · 0 boosts · 905 views · Apr ’24
How can I use the speak() API of AVSpeechSynthesizer when my iPhone's screen is off?
Hello, I have been struggling to resolve the question above. The utterance is spoken while my iPhone's screen is on, but when my iPhone goes into the background (screen off), it no longer speaks. I think it should be possible to play audio or speak an utterance in the background, because YouTube can play music in the background. Any help, please?
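A minimal sketch of the standard approach, assuming the target's Background Modes capability has audio playback enabled (a setting not mentioned in the original post): configure the shared audio session for playback before speaking.

    import AVFAudio

    let synthesizer = AVSpeechSynthesizer()

    func speakInBackground(_ text: String) {
        do {
            // A .playback session keeps audio alive when the app leaves the
            // foreground, provided the "Audio" background mode is enabled.
            try AVAudioSession.sharedInstance().setCategory(.playback, mode: .spokenAudio)
            try AVAudioSession.sharedInstance().setActive(true)
        } catch {
            print("Audio session error: \(error)")
        }
        synthesizer.speak(AVSpeechUtterance(string: text))
    }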
1 reply · 0 boosts · 608 views · Sep ’23
How to Make a Personal Voice Recording in My Language
I've been deaf and blind for 15 years. I'm not good at English pronunciation, since I don't hear what I say, much less hear it from others. When I went to read the phrases to record my Personal Voice under Accessibility > Personal Voice, the 150 phrases to read were in English. How do I record phrases in Brazilian Portuguese? I speak Portuguese well; my English pronunciation is very bad, and deafness contributed to that. Help me.
1 reply · 0 boosts · 644 views · Aug ’23
Unable to use Personal Voice in background playback
Hi, when attempting to use my Personal Voice with AVSpeechSynthesizer while the application is in the background, I receive the message below:

> Cannot use AVSpeechSynthesizerBufferCallback with Personal Voices, defaulting to output channel.

Other voices can be used without issue. Is this a published limitation of Personal Voice within applications, i.e. no background playback?
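For reference, a minimal sketch of how a Personal Voice is typically requested and selected on iOS 17 (foreground case; the utterance text is illustrative):

    import AVFAudio

    let synthesizer = AVSpeechSynthesizer()

    // Request access to Personal Voices, then pick one for synthesis.
    AVSpeechSynthesizer.requestPersonalVoiceAuthorization { status in
        guard status == .authorized else { return }
        // Personal Voices are flagged via voiceTraits on iOS 17.
        guard let voice = AVSpeechSynthesisVoice.speechVoices()
            .first(where: { $0.voiceTraits.contains(.isPersonalVoice) }) else { return }

        let utterance = AVSpeechUtterance(string: "Hello from my Personal Voice")
        utterance.voice = voice
        synthesizer.speak(utterance)
    }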
1 reply · 0 boosts · 621 views · Aug ’23
AVSpeechSynthesisVoice.speechVoices() Includes Voices That Aren't Available after Upgrading iOS
AVSpeechSynthesisVoice.speechVoices() returns voices that are no longer available after upgrading from iOS 16 to iOS 17 (although I think this has been an issue for a long time). To reproduce:

1. On iOS 16, download one or more enhanced voices under Accessibility > Spoken Content > Voices.
2. Upgrade to iOS 17.
3. Call AVSpeechSynthesisVoice.speechVoices() and note that the voices installed in step 1 are still listed, yet they are no longer downloaded and therefore don't work.

There is no property on AVSpeechSynthesisVoice to indicate whether a voice is still available. This is a problem for apps that let users choose among the available system voices: I receive many support emails about this issue around iOS upgrades, and I have to tell users to re-download the voices, which is not obvious to them. I've created a feedback item for this as well (FB12994908).
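To illustrate the enumeration in question, a short snippet listing every reported voice with its quality; as noted above, nothing in this output reveals whether a voice's assets are actually still downloaded:

    import AVFAudio

    // List every voice the API reports, with language and quality.
    // After an iOS upgrade this can include enhanced voices whose assets
    // are gone from the device, with no property exposing that fact.
    for voice in AVSpeechSynthesisVoice.speechVoices() {
        print(voice.identifier, voice.language, voice.quality.rawValue)
    }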
1 reply · 1 boost · 745 views · Aug ’23