Recognize spoken words in recorded or live audio using Speech.

Speech Documentation

Posts under Speech tag

51 Posts
Post not yet marked as solved
0 Replies
29 Views
Hello, please help. In our application we use the speech recognizer, and I have two questions. First, how can I turn the speech recognizer on with a timer while the application is in the background? Second, how can I use the speech recognizer without the time limit? Thanks in advance for your help.
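Apple historically caps a single recognition task at roughly one minute of audio, so the usual workaround for "unlimited" recognition is to finish and restart the task on a timer; running while backgrounded additionally requires the audio background mode in the app's capabilities. A minimal sketch under those assumptions; RestartingRecognizer is a hypothetical wrapper of my own, not an Apple API, and the 55-second interval is a guess tuned to stay under the limit:

```swift
import Foundation
#if canImport(Speech)
import Speech
#endif

// Assumption: restart just under the platform's per-task audio limit.
let restartInterval: TimeInterval = 55

#if canImport(Speech)
// Hypothetical wrapper (not an Apple API): ends and restarts the
// recognition task before the system cancels it for us.
final class RestartingRecognizer {
    private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))
    private var task: SFSpeechRecognitionTask?
    private var timer: Timer?

    func start(makeRequest: @escaping () -> SFSpeechAudioBufferRecognitionRequest) {
        beginTask(with: makeRequest())
        timer = Timer.scheduledTimer(withTimeInterval: restartInterval, repeats: true) { [weak self] _ in
            self?.task?.finish()                  // end the current task cleanly
            self?.beginTask(with: makeRequest())  // then immediately start a fresh one
        }
    }

    private func beginTask(with request: SFSpeechAudioBufferRecognitionRequest) {
        task = recognizer?.recognitionTask(with: request) { result, error in
            _ = result  // accumulate result?.bestTranscription here
            _ = error
        }
    }
}
#endif
```

This is a sketch, not a guarantee: Apple does not document a way to lift the per-task limit itself, only to chain tasks like this.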
Post not yet marked as solved
4 Replies
837 Views
Hi, I am facing a strange issue in my app: there is an intermittent crash. I am using AVSpeechSynthesizer for speech, and I'm not sure if that is causing the problem. The crash log has the information below.

Firebase crash log:

Crashed: AXSpeech
0  CoreFoundation           0x197325d00 _CFAssertMismatchedTypeID + 112
1  CoreFoundation           0x197229188 CFRunLoopSourceIsSignalled + 314
2  Foundation               0x198686ca0 performQueueDequeue + 440
3  Foundation               0x19868641c __NSThreadPerformPerform + 112
4  CoreFoundation           0x19722c990 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 28
5  CoreFoundation           0x19722c88c __CFRunLoopDoSource0 + 208
6  CoreFoundation           0x19722bbfc __CFRunLoopDoSources0 + 376
7  CoreFoundation           0x197225b70 __CFRunLoopRun + 820
8  CoreFoundation           0x197225308 CFRunLoopRunSpecific + 600
9  Foundation               0x198514d8c -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 232
10 libAXSpeechManager.dylib 0x1c3ad0bbc -[AXSpeechThread main]
11 Foundation               0x19868630c __NSThread__start__ + 864
12 libsystem_pthread.dylib  0x1e2f20bfc _pthread_start + 320
13 libsystem_pthread.dylib  0x1e2f29758 thread_start + 8

An Apple crash log is also attached.
Post not yet marked as solved
4 Replies
2.8k Views
Hi, I'm looking for the Repeat After Me application that used to be included in /Developer/Applications/Utilities/Speech. I can find mention of it (https://developer.apple.com/library/prerelease/content/documentation/UserExperience/Conceptual/SpeechSynthesisProgrammingGuide/FineTuning/FineTuning.html), but can't locate it in the most recent install of Xcode. Do I need to download something else? Has it been moved or removed? Thanks!
Posted by bputman.
Post not yet marked as solved
0 Replies
56 Views
When we used SFSpeechRecognizer last week, the returned results were normal. However, this week we found that the returned results contain punctuation marks. For example, we say "yes", and the result returned is "yes?".
Post not yet marked as solved
0 Replies
99 Views
Hi, I have a question regarding the integration of the speech-to-text library SFSpeechRecognizer. I need SFSpeechRecognizer to recognize terms that are not present in the iOS dictionary, like medication names, chemistry terms, etc. I would have to add them somehow for SFSpeechRecognizer to be able to recognize them. Is this possible? Thanks
Posted by bcm1.
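The Speech framework does expose a hook for this: every SFSpeechRecognitionRequest has a contextualStrings array that biases recognition toward out-of-vocabulary phrases such as drug names. It improves the odds of recognition rather than guaranteeing it. A minimal sketch; the term list is purely illustrative:

```swift
import Foundation
#if canImport(Speech)
import Speech
#endif

// Illustrative domain terms the default language model is unlikely to know.
let customTerms = ["metoprolol", "acetylcysteine", "benzaldehyde"]

#if canImport(Speech)
func makeRequest() -> SFSpeechAudioBufferRecognitionRequest {
    let request = SFSpeechAudioBufferRecognitionRequest()
    // contextualStrings biases recognition toward these phrases;
    // it does not guarantee they are recognized.
    request.contextualStrings = customTerms
    return request
}
#endif
```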
Post not yet marked as solved
0 Replies
79 Views
Hi, I am trying to use the speech recognizer from Apple's official documentation in my application. I also wrapped the SFSpeechRecognizer call in a do/try/catch, but if the user triggers Siri at runtime, the whole application immediately crashes when SFSpeechRecognizer is called again. Has anyone encountered similar problems? Here's the code from my application:

func transcribe() {
    DispatchQueue(label: "Speech Recognizer Queue", qos: .background).async { [weak self] in
        guard let self = self, let recognizer = self.recognizer, recognizer.isAvailable else {
            self?.speakError(RecognizerError.recognizerIsUnavailable)
            return
        }
        do {
            let (audioEngine, request) = try Self.prepareEngine()
            self.audioEngine = audioEngine
            self.request = request
            self.task = recognizer.recognitionTask(with: request, resultHandler: self.recognitionHandler(result:error:))
        } catch {
            self.reset()
            self.speakError(error)
        }
    }
}

private static func prepareEngine() throws -> (AVAudioEngine, SFSpeechAudioBufferRecognitionRequest) {
    let audioEngine = AVAudioEngine()
    let request = SFSpeechAudioBufferRecognitionRequest()
    request.shouldReportPartialResults = true
    let audioSession = AVAudioSession.sharedInstance()
    try audioSession.setCategory(.record, mode: .measurement, options: .duckOthers)
    try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
    let inputNode = audioEngine.inputNode
    let recordingFormat = inputNode.outputFormat(forBus: 0)
    inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
        request.append(buffer)
    }
    audioEngine.prepare()
    try audioEngine.start()
    return (audioEngine, request)
}
Posted by PterRim.
Post not yet marked as solved
0 Replies
205 Views
I'm testing my App in the Xcode 14 beta (released with WWDC22) on iOS 16, and it seems that AVSpeechSynthesisVoice is not working correctly. The following code always returns an empty array: AVSpeechSynthesisVoice.speechVoices() Additionally, attempting to initialize AVSpeechSynthesisVoice returns nil for all of the following: AVSpeechSynthesisVoice(language: AVSpeechSynthesisVoice.currentLanguageCode()) AVSpeechSynthesisVoice(language: "en") AVSpeechSynthesisVoice(language: "en-US") AVSpeechSynthesisVoice(identifier: AVSpeechSynthesisVoiceIdentifierAlex) AVSpeechSynthesisVoice.speechVoices().first
Posted by zabelc.
Post not yet marked as solved
0 Replies
98 Views
I’m already a member of the beta program, have downloaded the profile through settings, have no pending software updates, restarted several times. Still, I can’t get live transcribe to appear as a feature under accessibility. Any ideas what to try?
Posted by Okstabbe.
Post not yet marked as solved
4 Replies
938 Views
Here is a simple app to demonstrate the problem:

import SwiftUI
import AVFoundation

struct ContentView: View {
    var synthVM = SpeakerViewModel()
    var body: some View {
        VStack {
            Text("Hello, world!")
                .padding()
            HStack {
                Button("Speak") {
                    if self.synthVM.speaker.isPaused {
                        self.synthVM.speaker.continueSpeaking()
                    } else {
                        self.synthVM.speak(text: "Привет на корабле! Кто это пришел к нам, чтобы посмотреть на это произведение?")
                    }
                }
                Button("Pause") {
                    if self.synthVM.speaker.isSpeaking {
                        self.synthVM.speaker.pauseSpeaking(at: .word)
                    }
                }
                Button("Stop") {
                    self.synthVM.speaker.stopSpeaking(at: .word)
                }
            }
        }
    }
}

struct ContentView_Previews: PreviewProvider {
    static var previews: some View {
        ContentView()
    }
}

class SpeakerViewModel: NSObject {
    var speaker = AVSpeechSynthesizer()

    override init() {
        super.init()
        self.speaker.delegate = self
    }

    func speak(text: String) {
        let utterance = AVSpeechUtterance(string: text)
        utterance.voice = AVSpeechSynthesisVoice(language: "ru")
        speaker.speak(utterance)
    }
}

extension SpeakerViewModel: AVSpeechSynthesizerDelegate {
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didStart utterance: AVSpeechUtterance) {
        print("started")
    }
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didPause utterance: AVSpeechUtterance) {
        print("paused")
    }
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didContinue utterance: AVSpeechUtterance) {}
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didCancel utterance: AVSpeechUtterance) {}
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, willSpeakRangeOfSpeechString characterRange: NSRange, utterance: AVSpeechUtterance) {
        guard let rangeInString = Range(characterRange, in: utterance.speechString) else { return }
        print("Will speak: \(utterance.speechString[rangeInString])")
    }
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
        print("finished")
    }
}

On the simulator everything works fine, but on a real device many strange words appear in the synthesized speech, and the willSpeakRangeOfSpeechString output differs between simulator and device.

Simulator output:

started
Will speak: Привет
Will speak: на
Will speak: корабле!
Will speak: Кто
Will speak: это
Will speak: пришел
Will speak: к
Will speak: нам,
Will speak: чтобы
Will speak: посмотреть
Will speak: на
Will speak: это
Will speak: произведение?
finished

iPhone output includes errors:

2021-10-12 17:09:32.613273+0300 VoiceTest[9027:203522] [AXTTSCommon] Broken user rule: \b([234567890]+)2 (мили|кварты|чашки|{столовых }ложки)(?=$|\s|[[:punct:]»\xa0]) > Error Domain=NSCocoaErrorDomain Code=2048 "The value “\b([234567890]+)2 (мили|кварты|чашки|{столовых }ложки)(?=$|\s|[[:punct:]»\xa0])” is invalid." UserInfo={NSInvalidValue=\b([234567890]+)2 (мили|кварты|чашки|{столовых }ложки)(?=$|\s|[[:punct:]»\xa0])}
2021-10-12 17:09:32.613548+0300 VoiceTest[9027:203522] [AXTTSCommon] Broken user rule: \b(1\d+)2 (мили|кварты|чашки|{столовых }ложки)(?=$|\s|[[:punct:]»\xa0]) > Error Domain=NSCocoaErrorDomain Code=2048 "The value “\b(1\d+)2 (мили|кварты|чашки|{столовых }ложки)(?=$|\s|[[:punct:]»\xa0])” is invalid." UserInfo={NSInvalidValue=\b(1\d+)2 (мили|кварты|чашки|{столовых }ложки)(?=$|\s|[[:punct:]»\xa0])}
2021-10-12 17:09:32.613725+0300 VoiceTest[9027:203522] [AXTTSCommon] Broken user rule: \b2 (мили|кварты|чашки|{столовых }ложки)(?=$|\s|[[:punct:]»\xa0]) > Error Domain=NSCocoaErrorDomain Code=2048 "The value “\b2 (мили|кварты|чашки|{столовых }ложки)(?=$|\s|[[:punct:]»\xa0])” is invalid." UserInfo={NSInvalidValue=\b2 (мили|кварты|чашки|{столовых }ложки)(?=$|\s|[[:punct:]»\xa0])}
started
Will speak: Привет
Will speak: на
Will speak: ивет на корабле!
Will speak: Кто
Will speak: это
Will speak: Кто это пришел
Will speak: к
Will speak: нам,
Will speak: чтобы
Will speak: посмотреть
Will speak: на
Will speak: реть на это
Will speak: на это произведение?
finished

The error appears on iOS/iPadOS 15.0, 15.0.1, 15.0.2, and 14.7, but everything works fine on 14.8. It looks like an engine error. How can this issue be fixed?
Posted by sanctor.
Post not yet marked as solved
2 Replies
242 Views
Hello to all devs. I am currently finishing teaching myself Swift, and I was thinking that once I'm done I would get a 14" MacBook Pro to start working with, but I don't know which configuration would be most advisable: an M1 Pro with 10-core CPU, 16-core GPU, 16-core Neural Engine, 16 GB RAM and 512 GB SSD, or do you recommend increasing the RAM to 32 GB or the SSD to 1 TB? I would like to make an investment that lasts me at least 5 years, without having to buy another machine after a few years because I fell short on the configuration. What do you recommend? Thanks for your answers and your time. Best regards.
Post not yet marked as solved
0 Replies
152 Views
Problem: AVSpeechSynthesiser sometimes describes words rather than just speaking them as a real person would. When speaking English, AVSpeechSynthesiser pronounces the word "A" on its own as "Capital A", while the phrase "A little test" is pronounced correctly. A workaround of lowercasing the speech string (so "A" becomes "a") fixes this specific example. (I'm not yet sure if lowercasing sentences could affect pronunciation badly in some instances.) A more serious example: when speaking French, the word "allé" on its own is pronounced by AVSpeechSynthesiser as "allé - e accent aigu" (accent aigu = acute accent). And here the problem exists even when the word is part of a sentence! With "Je suis allé au cinéma" (I went to the cinema), AVSpeechSynthesiser says "Je suis allé e accent aigu au cinéma", which is clearly wrong and unhelpful. Is there a way to fix this?
Posted by boop.
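One workaround along the lines the poster describes is a preprocessing pass that lowercases standalone single capital letters before speech. That fixes the "Capital A" case but not the French "allé" case, which likely needs an IPA pronunciation attribute or a bug report. A pure-string sketch; preprocessForSpeech is a hypothetical helper name:

```swift
import Foundation

// Workaround sketch: lowercase standalone one-letter capitals such as "A"
// before handing text to AVSpeechSynthesizer, so "A" is not read as "Capital A".
// Note this also lowercases "I", which English voices still pronounce correctly.
func preprocessForSpeech(_ text: String) -> String {
    text.split(separator: " ")
        .map { word -> String in
            (word.count == 1 && word.first!.isUppercase) ? word.lowercased() : String(word)
        }
        .joined(separator: " ")
}
```

Whether blanket lowercasing hurts pronunciation elsewhere (proper nouns, acronyms) is the open question the poster raises, so a shipping version would probably restrict this to single-letter words as shown.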
Post not yet marked as solved
1 Reply
4.3k Views
I am implementing accessibility in our current project and have an issue with setting focus on an element in a previous view. An example: a user taps a table view cell at index 3. In didSelectRowAt, we present a UIActionSheet where the user makes a selection. Once the user makes a selection and the presented UIActionSheet is dismissed, the accessibility focus should return to the table view cell at index 3, but instead the focus is set back to the initial element in the view, in my case the top-left element. I have used UIAccessibility.Notification.screenChanged and UIView.accessibilityViewIsModal to set a new focus when presenting a modal view, but there doesn't seem to be a good way to point back to a previous element when dismissing a view. Any insight on how to track the previously focused accessibility element would be greatly appreciated.
Post not yet marked as solved
1 Reply
689 Views
I am attempting to use alternative pronunciations via IPA notation with AVSpeechSynthesizer on macOS (Big Sur 11.4). The attributed string is being ignored, so the functionality does not work. I tried this on the iOS simulator and it works properly. The India English voice pronounces the word "shame" as "shy-em", so I applied the correct pronunciation, but no change was heard. I then substituted the pronunciation of a completely different word, but there was still no change. Is there something else that must be done to make this work?

Output:

Attributed String: It's a '{ }shame{ AVSpeechSynthesisIPANotationAttribute = "\U0283\U02c8e\U0361\U026am"; }' it didn't work out.{ }
Target Range: {8, 5}
Target String: shame, Substitution: ʃˈe͡ɪm

Attributed String: It's a '{ }shame{ AVSpeechSynthesisIPANotationAttribute = "\U0283\U02c8e\U0361\U026am"; }' it didn't work out.{ }
Target Range: {8, 5}
Target String: shame, Substitution: ʃˈe͡ɪm

Attributed String: It's a '{ }shame{ AVSpeechSynthesisIPANotationAttribute = "t\U0259.\U02c8me\U0361\U026a.do\U0361\U028a"; }' it didn't work out.{ }
Target Range: {8, 5}
Target String: shame, Substitution: tə.ˈme͡ɪ.do͡ʊ

Attributed String: It's a '{ }shame{ AVSpeechSynthesisIPANotationAttribute = "t\U0259.\U02c8me\U0361\U026a.do\U0361\U028a"; }' it didn't work out.{ }
Target Range: {8, 5}
Target String: shame, Substitution: tə.ˈme͡ɪ.do͡ʊ

class SpeakerTest: NSObject, AVSpeechSynthesizerDelegate {
    let synth = AVSpeechSynthesizer()

    func speakIPA_Substitution(subst: String, voice: AVSpeechSynthesisVoice) {
        let text = "It's a 'shame' it didn't work out."
        let mutAttrStr = NSMutableAttributedString(string: text)
        let range = NSString(string: text).range(of: "shame")
        let pronounceKey = NSAttributedString.Key(rawValue: AVSpeechSynthesisIPANotationAttribute)
        mutAttrStr.setAttributes([pronounceKey: subst], range: range)
        let utterance = AVSpeechUtterance(attributedString: mutAttrStr)
        utterance.voice = voice
        utterance.postUtteranceDelay = 1.0
        let swiftRange = Range(range, in: text)!
        print("Attributed String: \(mutAttrStr)")
        print("Target Range: \(range)")
        print("Target String: \(text[swiftRange]), Substitution: \(subst)\n")
        synth.speak(utterance)
    }

    func customPronunciation() {
        let shame = "ʃˈe͡ɪm" // substitute correct pronunciation
        let tomato = "tə.ˈme͡ɪ.do͡ʊ" // completely different word's pronunciation
        let britishVoice = AVSpeechSynthesisVoice(language: "en-GB")!
        let indiaVoice = AVSpeechSynthesisVoice(language: "en-IN")!
        speakIPA_Substitution(subst: shame, voice: britishVoice) // already correct, no substitute needed
        // pronounced incorrectly, ignoring the corrected pronunciation from the IPA notation
        speakIPA_Substitution(subst: shame, voice: indiaVoice)
        speakIPA_Substitution(subst: tomato, voice: britishVoice) // ignores substitution
        speakIPA_Substitution(subst: tomato, voice: indiaVoice) // ignores substitution
    }
}
Posted by MisterE.
Post not yet marked as solved
1 Reply
164 Views
@interface MineViewController ()
@property (nonatomic, strong) AVSpeechSynthesizer *speechSynthesizer;
@end

@implementation MineViewController

- (void)speak {
    // version 1
    self.speechSynthesizer = [[AVSpeechSynthesizer alloc] init];
    self.speechSynthesizer.delegate = self;
    AVSpeechUtterance *utterance = [[AVSpeechUtterance alloc] initWithString:@"12345678"];
    AVSpeechSynthesisVoice *voice = [AVSpeechSynthesisVoice voiceWithLanguage:@"zh-CN"];
    [utterance setVoice:voice];
    // worked: speakUtterance succeeded
    [self.speechSynthesizer speakUtterance:utterance];

    // version 2
    AVSpeechSynthesizer *speechSynthesizer = [[AVSpeechSynthesizer alloc] init];
    speechSynthesizer.delegate = self;
    AVSpeechUtterance *utterance2 = [[AVSpeechUtterance alloc] initWithString:@"12345678"];
    AVSpeechSynthesisVoice *voice2 = [AVSpeechSynthesisVoice voiceWithLanguage:@"zh-CN"];
    [utterance2 setVoice:voice2];
    // did not work: speakUtterance gave no response
    [speechSynthesizer speakUtterance:utterance2];
}

@end
Posted by seppuu.
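The likely difference between the two versions is object lifetime: in version 2 the synthesizer is a local variable, so ARC releases it as soon as the method returns, before any audio is produced, while the strong property in version 1 keeps it alive for the duration of the speech. A pure-Swift sketch of the same lifetime behavior, using a hypothetical Worker class as a stand-in for AVSpeechSynthesizer:

```swift
// Stand-in for AVSpeechSynthesizer: any class doing asynchronous work.
final class Worker {}

final class Holder {
    var worker: Worker?   // strong reference, like the @property in version 1
}

// Version 2's pattern: the only strong reference is a local variable.
weak var localRef: Worker?
do {
    let worker = Worker()
    localRef = worker
}   // worker is deallocated here; in-flight speech would be cut off

let localSurvived = (localRef != nil)   // false

// Version 1's pattern: a strong property keeps the object alive.
let holder = Holder()
holder.worker = Worker()
weak var heldRef: Worker? = holder.worker
let heldSurvived = (heldRef != nil)     // true
```

The fix matches what version 1 already does: store the synthesizer in a strong property (or another owner that outlives the utterance) instead of a local variable.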
Post not yet marked as solved
0 Replies
157 Views
I installed the last update and now my only phone is useless! I can't even make a phone call. No apps will open and I can't even restart it. I tried the suggestions and nothing worked. If you don't update, you are constantly reminded, which I tried to ignore since my phone was working fine; now it is useless!
Posted by LindaSK54.
Post not yet marked as solved
0 Replies
183 Views
I have updated to macOS Monterey and my code for SFSpeechRecognizer just broke. I get this error if I try to configure an offline speech recognizer on macOS:

Error Domain=kLSRErrorDomain Code=102 "Failed to access assets" UserInfo={NSLocalizedDescription=Failed to access assets, NSUnderlyingError=0x6000003c5710 {Error Domain=kLSRErrorDomain Code=102 "No asset installed for language=es-ES" UserInfo={NSLocalizedDescription=No asset installed for language=es-ES}}}

Here is a code snippet from a demo project:

private func process(url: URL) throws {
    speech = SFSpeechRecognizer(locale: Locale(identifier: "es-ES"))
    speech.supportsOnDeviceRecognition = true
    let request = SFSpeechURLRecognitionRequest(url: url)
    request.requiresOnDeviceRecognition = true
    request.shouldReportPartialResults = false
    speech.recognitionTask(with: request) { result, error in
        guard let result = result else {
            if let error = error {
                print(error)
            }
            return
        }
        if let error = error {
            print(error)
            return
        }
        if result.isFinal {
            print(result.bestTranscription.formattedString)
        }
    }
}

I have tried different languages (es-ES, en-US) and it reports the same error each time. Any idea how to install these assets or how to fix this?
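kLSRErrorDomain code 102 ("No asset installed for language") indicates the on-device model for that locale is missing on the machine, so it can help to check availability before forcing requiresOnDeviceRecognition and to fall back to server recognition when it is absent. A hedged sketch; canRecognizeOffline is my own helper name, not an Apple API:

```swift
import Foundation
#if canImport(Speech)
import Speech
#endif

// Locale under test; es-ES is the one from the error message above.
let targetLocaleID = "es-ES"

#if canImport(Speech)
// Returns true only if the locale is supported on this system AND the
// recognizer reports on-device support, which requires installed assets.
func canRecognizeOffline(localeID: String) -> Bool {
    guard let recognizer = SFSpeechRecognizer(locale: Locale(identifier: localeID)) else {
        return false   // locale not supported at all on this system
    }
    return recognizer.supportsOnDeviceRecognition
}
#endif
```

Whether the assets can be forced to install programmatically is not something the public API documents, as far as I can tell; adding the language to the system's dictation/keyboard settings has been reported to pull them down, but that is anecdotal.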
Post not yet marked as solved
7 Replies
1.4k Views
I updated Xcode to Xcode 13 and iPadOS to 15.0. Now my previously working application using SFSpeechRecognizer fails to start, regardless of whether I'm using on-device mode or not. I use the delegate approach, and it looks like although the plist is set up correctly (the authorization is successful and I get the orange circle indicating the microphone is on), the delegate method speechRecognitionTask(_:didFinishSuccessfully:) always returns false, with no particular error message to go along with it. I also downloaded the official example from Apple's documentation pages: the SpokenWord SFSpeechRecognition example project. Unfortunately, it also no longer works. I'm working on a time-sensitive project and don't know where to go from here. How can we troubleshoot this? If it's an issue with Apple's API update or something has changed in the initial setup, I really need to know as soon as possible. Thanks.
Post not yet marked as solved
0 Replies
203 Views
Hello, my application has functionality to record speech and convert the recorded speech to text. The application also tells the user what action to perform using TTS (text-to-speech). When I start screen recording from Control Centre and the app starts recording voice, this works. But as soon as the TTS voice is played, the recorder stops recording both my voice and the TTS audio. Please let me know what additional information is required from my side to debug this issue.
Posted by amanoj.
Post not yet marked as solved
0 Replies
264 Views
Hi, I'm trying to get this example working on macOS now that SFSpeechRecognizer is available for the platform. A few questions:

Do I need to make an authorization request of the user if I intend to use on-device recognition?

When I ask for authorization to use speech recognition, the dialog that pops up contains text that's not in my speech recognition usage description, indicating that recordings will be sent to Apple's servers. But that is not accurate if I am using on-device recognition (as far as I can tell). Is there a way to suppress that language if I am not using online speech recognition?

Is there an updated version of the article I linked to that describes how to accomplish the same thing on macOS instead of iOS? My compiler complains that AVAudioSession is not available on macOS, and I'm not sure how to set things up to pass audio from the microphone to the speech recognizer.

Thanks :-D Brian Duffy
Posted by brduffy.
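On macOS there is indeed no AVAudioSession; a common approach is to tap AVAudioEngine's input node directly and append the buffers to the recognition request. Authorization via SFSpeechRecognizer.requestAuthorization still appears to be required even for purely on-device recognition. A sketch under those assumptions; startRecognition is my own helper name and error handling is minimal:

```swift
import Foundation
#if canImport(Speech) && os(macOS)
import Speech
import AVFoundation
#endif

// Tap buffer size used below; 1024 frames is a conventional choice.
let tapBufferSize: UInt32 = 1024

#if canImport(Speech) && os(macOS)
// macOS has no AVAudioSession: configure AVAudioEngine directly and
// feed microphone buffers straight into the recognition request.
func startRecognition(with recognizer: SFSpeechRecognizer) throws
    -> (AVAudioEngine, SFSpeechRecognitionTask) {
    let engine = AVAudioEngine()
    let request = SFSpeechAudioBufferRecognitionRequest()
    request.requiresOnDeviceRecognition = true   // never contact the server

    let input = engine.inputNode
    let format = input.outputFormat(forBus: 0)
    input.installTap(onBus: 0, bufferSize: AVAudioFrameCount(tapBufferSize), format: format) { buffer, _ in
        request.append(buffer)
    }
    engine.prepare()
    try engine.start()

    let task = recognizer.recognitionTask(with: request) { result, error in
        if let result = result {
            print(result.bestTranscription.formattedString)
        } else if let error = error {
            print(error)
        }
    }
    return (engine, task)   // caller must keep both alive
}
#endif
```

As for the privacy dialog text, that wording is system-provided; I am not aware of a supported way to suppress it when recognition stays on device.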