Recognize spoken words in recorded or live audio using Speech.

Posts under Speech tag

56 Posts
Post · Replies · Boosts · Views · Activity

How to Make a Personal Voice Recording in My Language
I've been deaf and blind for 15 years. I'm not good at English pronunciation, since I can't hear what I say, much less hear it from others. When I went to read the phrases to record my personal voice in Accessibility > Personal Voice, the 150 phrases to read were in English. How do I record the phrases in Brazilian Portuguese? I speak Portuguese well; my English pronunciation is very poor, and my deafness contributed to that. Please help me.
1
0
645
Aug ’23
How can I use the speak() API of AVSpeechSynthesizer when my iPhone's screen is off?
Hello, I have been struggling with the issue described in the title. Utterances are spoken while the app is in the foreground, but when my iPhone goes into the background (screen turned off), it no longer speaks. I think playing audio or speaking an utterance should still be possible in that state, because YouTube can keep playing music in the background. Any help, please?
1
0
608
Sep ’23
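A minimal sketch for the background-playback question above. It assumes the app has the "audio" background mode (UIBackgroundModes) enabled and uses a playback audio session, which is the usual prerequisite for audio continuing after the screen turns off; this is a general pattern rather than a confirmed fix from this thread.

import AVFoundation

// Assumes the target's capabilities include the "audio" background mode.
final class BackgroundSpeaker {
    private let synthesizer = AVSpeechSynthesizer()

    func speak(_ text: String) {
        do {
            // A playback-category session keeps audio running when the app
            // moves to the background or the screen is locked.
            let session = AVAudioSession.sharedInstance()
            try session.setCategory(.playback, mode: .spokenAudio, options: [])
            try session.setActive(true)
        } catch {
            print("Audio session error: \(error)")
        }
        synthesizer.speak(AVSpeechUtterance(string: text))
    }
}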
SFSpeechRecognizer.isAvailable returns wrong values
As of iOS 17, SFSpeechRecognizer.isAvailable returns true even when recognition tasks cannot be fulfilled and immediately fail with the error "Siri and Dictation are disabled". The same speech recognition code works as expected in iOS 16: there, neither Siri nor Dictation needed to be enabled for speech recognition to be available, and it worked as expected. In the past, once permissions were given, only an active network connection was required for functional speech recognition. There seem to be two issues at play:
1. In iOS 17, SFSpeechRecognizer.isAvailable incorrectly returns true when it can't fulfil requests.
2. In iOS 17, Dictation or Siri must be enabled to handle speech recognition tasks, while in iOS 16 this isn't the case.
If issue 2 is expected behaviour (I surely hope not), there is no way to actually query whether Siri or Dictation is enabled in order to handle those cases in code and inform the user why speech recognition doesn't work.
Expected behaviour: speech recognition is available when Siri and Dictation are disabled, and SFSpeechRecognizer.isAvailable correctly returns false when no speech recognition requests can be handled.
iOS version 17.0 (21A329), Xcode version 15.0 (15A240d). Anyone else experiencing the same issues or have a solution? Reported this to Apple as well -> FB13235751
1
0
906
Apr ’24
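Since there is no public API to query whether Siri or Dictation is enabled, one defensive pattern is to treat isAvailable as a hint and surface the recognition task's error to the user. A minimal sketch under that assumption (function name and messages are illustrative):

import Speech

// Treat isAvailable as a hint; the task's error handler is the source of truth,
// since failures like "Siri and Dictation are disabled" only surface there.
func startRecognition(with recognizer: SFSpeechRecognizer,
                      request: SFSpeechAudioBufferRecognitionRequest,
                      onFailure: @escaping (String) -> Void) -> SFSpeechRecognitionTask? {
    guard recognizer.isAvailable else {
        onFailure("Speech recognition is reported as unavailable.")
        return nil
    }
    return recognizer.recognitionTask(with: request) { result, error in
        if let error {
            onFailure("Recognition failed: \(error.localizedDescription)")
            return
        }
        if let result, result.isFinal {
            print(result.bestTranscription.formattedString)
        }
    }
}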
libiconv conversion from UTF-8 to GBK fails when the text contains an ellipsis (……), only in iOS 17
libiconv conversion from UTF-8 to GBK fails when the text contains an ellipsis (……), but only on iOS 17; in Xcode 15 I updated libiconv because the bundled one was old. I have tested both libiconv.tbd and libiconv.2.tbd with the same result. Conditions: only iOS 17 (iOS 16 and earlier are OK); the text contains an ellipsis (……), such as ……测试字符串; converting to gbk or gb18030 fails and returns -1, but gb2312 is OK.

#include <iconv.h>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <string>

int code_convert(const char *from_charset, const char *to_charset,
                 char *inbuf, size_t inlen, char *outbuf, size_t outlen) {
    // iconv_open reports failure with (iconv_t)-1, not 0
    iconv_t cd = iconv_open(to_charset, from_charset);
    if (cd == (iconv_t)-1)
        return -1;
    char **pin = &inbuf;
    char **pout = &outbuf;
    memset(outbuf, 0, outlen);
    if (iconv(cd, pin, &inlen, pout, &outlen) == (size_t)-1) {
        iconv_close(cd);
        std::cout << "iconv conversion failed" << std::endl;
        return -1;
    }
    iconv_close(cd);
    return 0;
}

int u2g(char *inbuf, size_t inlen, char *outbuf, size_t outlen) {
    // gb18030 and gbk fail on iOS 17; gb2312 succeeds
    return code_convert("utf-8", "gb2312", inbuf, inlen, outbuf, outlen);
}

std::string UTFtoGBK(const char *utf8) {
    size_t length = strlen(utf8);
    // The extra zeroed byte guarantees the converted buffer is null-terminated
    char *temp = (char *)calloc(length + 1, sizeof(char));
    if (u2g((char *)utf8, length, temp, length) >= 0) {
        std::string str_result(temp);
        free(temp);
        return str_result;
    } else {
        free(temp);
        return "";
    }
}
2
0
839
Nov ’23
Recognizing Speech in Live Audio Sample Project
I'm developing a project where I want to transcribe live speech from the user on iOS devices. I wanted to test out the Speech framework by downloading the sample code from https://developer.apple.com/documentation/speech/recognizing_speech_in_live_audio. I'm using Xcode 15 and running it on an iPad with iOS 17 installed. I run the app and manage to approve the permissions to use the microphone and live speech transcription, but as soon as I press 'start recording', I get the following error in Xcode, and nothing happens on the iPad screen:

+[SFUtilities issueReadSandboxExtensionForFilePath:error:] issueReadSandboxExtensionForFilePath:error:: Inaccessible file (/var/mobile/Containers/Data/Application/1F1AB092-95F2-4E5F-A369-475E15114F26/Library/Caches/Vocab) : error=Error Domain=kAFAssistantErrorDomain Code=203 "Failed to access path: /var/mobile/Containers/Data/Application/1F1AB092-95F2-4E5F-A369-475E15114F26/Library/Caches/Vocab method:issueReadSandboxExtensionForFilePath:error:" UserInfo={NSLocalizedDescription=Failed to access path: /var/mobile/Containers/Data/Application/1F1AB092-95F2-4E5F-A369-475E15114F26/Library/Caches/Vocab method:issueReadSandboxExtensionForFilePath:error:}

Can someone guide me in the right direction to fix this?
1
0
473
Oct ’23
SpeechSynthesis rate is not adjustable
In my app I have:

let utterance = AVSpeechUtterance(string: recognizedTextView.text)
utterance.voice = AVSpeechSynthesisVoice(language: "zh-HK")
utterance.rate = 0.3
synthesizer.speak(utterance)

However, after I upgraded my iOS to 17.1.1 the speed doesn't get adjusted. I tried 0.01 and 0.3 for rate, and there is no difference.
0
0
350
Nov ’23
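For the rate question above, one thing worth ruling out is an out-of-range value: AVSpeechUtterance rates are expected to fall between AVSpeechUtteranceMinimumSpeechRate and AVSpeechUtteranceMaximumSpeechRate. A small sketch that clamps and logs the effective rate; whether this changes the iOS 17.1.1 behaviour is not confirmed.

import AVFoundation

func makeUtterance(text: String, requestedRate: Float) -> AVSpeechUtterance {
    let utterance = AVSpeechUtterance(string: text)
    utterance.voice = AVSpeechSynthesisVoice(language: "zh-HK")
    // Clamp to the documented range; AVSpeechUtteranceDefaultSpeechRate is the default.
    let clamped = min(max(requestedRate, AVSpeechUtteranceMinimumSpeechRate),
                      AVSpeechUtteranceMaximumSpeechRate)
    utterance.rate = clamped
    print("Requested rate \(requestedRate), using \(clamped) " +
          "(valid range \(AVSpeechUtteranceMinimumSpeechRate)...\(AVSpeechUtteranceMaximumSpeechRate))")
    return utterance
}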
NSSpeechRecognitionUsageDescription not working
I have gotten an error stating, "This app has crashed because it attempted to access privacy-sensitive data without a usage description. The app's Info.plist must contain an NSSpeechRecognitionUsageDescription key with a string value explaining to the user how the app uses this data." But I have already added NSSpeechRecognitionUsageDescription to my Info.plist and the error is still occurring. Does anyone have a solution to this?
1
0
659
Nov ’23
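One way to narrow down the question above is to check at runtime whether the key is actually present in the bundle the app was built with; if it is missing there, the entry was most likely added to a different target's Info.plist than the one being run. A minimal sketch:

import Foundation

// Prints the usage string the running app actually carries, if any.
// nil means the key never made it into this target's built Info.plist.
let key = "NSSpeechRecognitionUsageDescription"
if let usage = Bundle.main.object(forInfoDictionaryKey: key) as? String {
    print("\(key) = \(usage)")
} else {
    print("\(key) is missing from the built Info.plist")
}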
Microphone not working in iOS simulators under macOS Sonoma 14.1.2
Hello, I am trying to test a speech-to-text feature in several iPhone simulators, but the microphones don't seem to work. The microphone and speech recognition permissions are correctly requested for the feature, and my internal and external microphones are detected in the simulators' I/O options, but nothing happens when I launch the recognition. Recognition also doesn't work for speech-to-text in the native Messages keyboard or in Siri. The problem is the same in all the simulators, so I believe the issue is that Xcode doesn't have permission to access the microphone: in System Settings > Privacy & Security > Microphone I can't see Xcode (and, possibly related, I can't see Xcode Source Editor under Extensions either). I've already tried uninstalling and reinstalling Xcode. I use Xcode 15.0.1 under Sonoma 14.1.2. Any help is welcome.
4
3
1.3k
Dec ’23
AVAudioEngine & AVAudioPlayer Voice Processing Volume.
As the title suggests, I am using AVAudioEngine for speech recognition input and AVAudioPlayer for sound output. Apple says in this talk https://developer.apple.com/videos/play/wwdc2019/510 that the setVoiceProcessingEnabled function very usefully cancels the output from the speaker to the mic. I set voice processing on the input and output nodes. It seems to work, however the volume is low, even when the system volume is turned up. Any solution to this would be much appreciated.
0
0
652
Dec ’23
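For reference on the setup described above, a short sketch of enabling voice processing on an AVAudioEngine (available since iOS 13). Voice processing applies its own automatic gain control and ducking, so somewhat reduced playback volume can be a side effect of the mode itself; that interpretation is an assumption, not something confirmed in the thread.

import AVFoundation

let audioEngine = AVAudioEngine()

func enableVoiceProcessing() throws {
    // Echo cancellation keeps speaker output out of the microphone tap.
    // Configure this while the engine is stopped, before calling start().
    try audioEngine.inputNode.setVoiceProcessingEnabled(true)
    try audioEngine.outputNode.setVoiceProcessingEnabled(true)
}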
IsFormatSampleRateAndChannelCountValid is false when other audio is playing
My app listens for the verbal commands "Roll" and "Skip". It was working well until I used it while listening to a podcast in another app. I am getting a crash with the error: Thread 1: "required condition is false: IsFormatSampleRateAndChannelCountValid(format)". It crashes when I am playing audio from the Snipd app (a podcast player) or the Apple Podcasts app; when I am playing audio from YouTube or Apple Music it does not crash. This is the code for when I start listening for the commands:

// MARK: - Speech Recognition
func startListening() {
    do {
        try configureAudioSession()
        createRecognitionRequest()
        try prepareAudioEngine()
    } catch {
        print("Audio Engine error: \(error.localizedDescription)")
    }
}

private func configureAudioSession() throws {
    let audioSession = AVAudioSession.sharedInstance()
    try audioSession.setCategory(.playAndRecord, mode: .measurement,
                                 options: [.interruptSpokenAudioAndMixWithOthers, .duckOthers])
    try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
}

private func createRecognitionRequest() {
    recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
    guard let recognitionRequest = recognitionRequest else { return }
    recognitionRequest.shouldReportPartialResults = true
    recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest,
                                                        resultHandler: handleRecognitionResult)
}

private func prepareAudioEngine() throws {
    let inputNode = audioEngine.inputNode
    inputNode.removeTap(onBus: 0)
    let inputFormat = inputNode.inputFormat(forBus: 0)
    inputNode.installTap(onBus: 0, bufferSize: 1024, format: inputFormat) { [weak self] (buffer, _) in
        self?.recognitionRequest?.append(buffer)
    }
    audioEngine.prepare()
    try audioEngine.start()
    isActuallyListening = true
}

Thanks
2
1
1.2k
Jan ’24
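A hedged sketch related to the crash above: the IsFormatSampleRateAndChannelCountValid assertion typically fires when the input node reports a 0 Hz or zero-channel format, which can happen while another app holds the audio session. Checking the format before installing the tap at least turns the crash into a recoverable error; this is a defensive pattern, not a confirmed root cause for the Snipd/Podcasts case.

import AVFoundation

enum ListeningError: Error {
    case invalidInputFormat
}

func installRecognitionTap(on audioEngine: AVAudioEngine,
                           append: @escaping (AVAudioPCMBuffer) -> Void) throws {
    let inputNode = audioEngine.inputNode
    inputNode.removeTap(onBus: 0)

    let format = inputNode.inputFormat(forBus: 0)
    // A 0 Hz / 0-channel format would trip the assertion inside installTap,
    // so fail early and let the caller retry or inform the user.
    guard format.sampleRate > 0, format.channelCount > 0 else {
        throw ListeningError.invalidInputFormat
    }

    inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
        append(buffer)
    }
    audioEngine.prepare()
    try audioEngine.start()
}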
Audio Recognition and Live captioning
Hi Apple Team, we have a technical query regarding one feature: audio recognition and live captioning. We are developing an app for the deaf community to avoid communication barriers. We want to know whether there is any way to recognize the sound from other applications on an iPhone and show live captions in our (iOS-based) application.
0
0
370
Dec ’23
Crashed: AXSpeech EXC_BAD_ACCESS KERN_INVALID_ADDRESS 0x000056f023efbeb0
The application is crashing with: Crashed: AXSpeech EXC_BAD_ACCESS KERN_INVALID_ADDRESS 0x000056f023efbeb0

Crashed: AXSpeech
0  libobjc.A.dylib 0x4820 objc_msgSend + 32
1  libsystem_trace.dylib 0x6c34 _os_log_fmt_flatten_object + 116
2  libsystem_trace.dylib 0x5344 _os_log_impl_flatten_and_send + 1884
3  libsystem_trace.dylib 0x4bd0 _os_log + 152
4  libsystem_trace.dylib 0x9c48 _os_log_error_impl + 24
5  TextToSpeech 0xd0a8c _pcre2_xclass_8
6  TextToSpeech 0x3bc04 TTSSpeechUnitTestingMode
7  TextToSpeech 0x3f128 TTSSpeechUnitTestingMode
8  AXCoreUtilities 0xad38 -[NSArray(AXExtras) ax_flatMappedArrayUsingBlock:] + 204
9  TextToSpeech 0x3eb18 TTSSpeechUnitTestingMode
10 TextToSpeech 0x3c948 TTSSpeechUnitTestingMode
11 TextToSpeech 0x48824 AXAVSpeechSynthesisVoiceFromTTSSpeechVoice
12 TextToSpeech 0x49804 AXAVSpeechSynthesisVoiceFromTTSSpeechVoice
13 Foundation 0xf6064 __NSThreadPerformPerform + 264
14 CoreFoundation 0x37acc CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION + 28
15 CoreFoundation 0x36d48 __CFRunLoopDoSource0 + 176
16 CoreFoundation 0x354fc __CFRunLoopDoSources0 + 244
17 CoreFoundation 0x34238 __CFRunLoopRun + 828
18 CoreFoundation 0x33e18 CFRunLoopRunSpecific + 608
19 Foundation 0x2d4cc -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 212
20 TextToSpeech 0x24b88 TTSCFAttributedStringCreateStringByBracketingAttributeWithString
21 Foundation 0xb3154 NSThread__start + 732
22 libsystem_pthread.dylib 0x24d4 _pthread_start + 136
23 libsystem_pthread.dylib 0x1a10 thread_start + 8

(attached crash report: com.livingMedia.AajTakiPhone_issue_3ceba855a8ad2d1af83655803dc13f70_crash_session_9081fa41ced440ae9a57c22cb432f312_DNE_0_v2_stacktrace.txt)
3
0
660
Jan ’24
iOS speech recognition: webkitSpeechRecognition in a WKWebview vs. native SFSpeechRecognizer
I have a prototype web view (in a WKWebView) that uses webkitSpeechRecognition for getting short snippets of text from speech. I'm not thrilled with the quality of the "recognition" - the text generally isn't very accurate. I'm wondering if I'll get any more accuracy by using the "native" SFSpeechRecognizer. It seems to me that webkitSpeechRecognition is likely just a JavaScript wrapper interface for SFSpeechRecognizer, and the quality of the speech recognition won't improve. Does anyone know for sure if this is the case? Does webkitSpeechRecognition on iOS use SFSpeechRecognizer under the hood? Or are they two completely different recognition systems, and one could be more accurate than the other?
0
0
628
Jan ’24
Recognizing Speech in Live Audio: impossible to run the data generator with the 'fr' locale identifier
Hi, I'm a French developer and I downloaded the Recognizing Speech in Live Audio sample code from the Apple Developer website. I tried to execute the data generator command after changing the locale identifier from 'en_US' to 'fr' in the data generator's main file, but when I ran the command in Xcode, I got this error message: "Identifier 'fr' does not parse into two elements." I checked the XML files associated with the bin archive file and the identifiers are not correct (they keep the 'en-US' value). Thanks for your help!
1
0
430
Jan ’24
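The error text above suggests the data generator expects a two-element identifier of the form language_REGION (for example "fr_FR") rather than a bare language code. A small sketch of that check; the exact expectation of the sample's generator is an assumption based on the error message, not documented behaviour.

import Foundation

// Hedged guess: "fr" has one element and reproduces the error, while a
// language_REGION pair such as "fr_FR" parses into two elements.
func isTwoElementIdentifier(_ identifier: String) -> Bool {
    let parts = identifier.split(whereSeparator: { $0 == "_" || $0 == "-" })
    return parts.count == 2
}

print(isTwoElementIdentifier("fr"))     // false
print(isTwoElementIdentifier("fr_FR"))  // true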
URLs in iOS 17's New Speech Recognition API: prepareCustomLanguageModel vs Configuration URL
I'm working with the new speech recognition APIs in iOS 17 and have encountered some confusion regarding the use of URLs in SFSpeechLanguageModel.prepareCustomLanguageModel and the SFSpeechLanguageModel.Configuration. In the SFSpeechLanguageModel.Configuration initializer, I provide a URL that points to a custom language model .bin file. However, there's also a URL parameter in the prepareCustomLanguageModel method. I'm unclear about the purpose of this second URL and how it differs from the one in the configuration. To add to the confusion, the documentation for these new APIs is not fully fleshed out at this point. I've tried injecting both .bin files (for the custom language model and the one for prepareCustomLanguageModel) into the same URL, but the results haven't clarified their distinct roles. In experiments I conducted, I checked the confidence level of recognized phrases from the same audio file with and without the custom language model .bin file. Surprisingly, the confidence levels remained the same in both scenarios, leading me to question if the custom model is being utilized correctly. Has anyone else worked with these new APIs and can provide clarity on:
1. The distinct roles of the URLs in SFSpeechLanguageModel.Configuration and prepareCustomLanguageModel.
2. Why there might be no noticeable difference in confidence levels when using a custom language model.
Any insights or experiences with these new aspects of the iOS 17 speech recognition API would be greatly appreciated.
0
1
589
Jan ’24
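For the questions above, a sketch of my understanding of the intended flow, based on the WWDC23 "Customize on-device speech recognition" sample, so treat the details as assumptions rather than documented behaviour: the URL passed to prepareCustomLanguageModel points at the .bin asset exported by the data generator, while the URL inside SFSpeechLanguageModel.Configuration points at the app-container location where the prepared model is written and later read by the request. The sample also sets requiresOnDeviceRecognition to true, which could explain unchanged confidence scores if recognition was otherwise running server-side. Paths and identifiers below are illustrative.

import Speech

// CustomLM.bin is the data-generator output bundled with the app (hypothetical name);
// the configuration URL is where the prepared model will live in the app container.
let assetURL = Bundle.main.url(forResource: "CustomLM", withExtension: "bin")!
let preparedModelURL = FileManager.default
    .urls(for: .cachesDirectory, in: .userDomainMask)[0]
    .appendingPathComponent("CustomLM")

let lmConfiguration = SFSpeechLanguageModel.Configuration(languageModel: preparedModelURL)

func prepareAndRecognize(recognizer: SFSpeechRecognizer,
                         request: SFSpeechAudioBufferRecognitionRequest) async throws {
    // Step 1: prepare (compile) the custom model from the generated asset.
    try await SFSpeechLanguageModel.prepareCustomLanguageModel(
        for: assetURL,
        clientIdentifier: "com.example.myapp",   // hypothetical identifier
        configuration: lmConfiguration)

    // Step 2: attach the same configuration to the recognition request.
    request.requiresOnDeviceRecognition = true
    request.customizedLanguageModel = lmConfiguration

    _ = recognizer.recognitionTask(with: request) { result, error in
        if let result { print(result.bestTranscription.formattedString) }
        if let error { print("Recognition error: \(error)") }
    }
}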