Hi, I'm looking for the Repeat After Me application that used to be included in /Developer/Applications/Utilities/Speech. I can find mention of it (https://developer.apple.com/library/prerelease/content/documentation/UserExperience/Conceptual/SpeechSynthesisProgrammingGuide/FineTuning/FineTuning.html), but I can't locate it in the most recent install of Xcode. Do I need to download something else? Has it been moved or removed? Thanks!
Hello to all kind developers out there,
I'm currently developing an audio-messaging app that lets users send short audio clips to each other. To make the communication experience as smooth and natural as possible, we are using the Speech framework to transcribe user input live. Since this feature is in high demand among some of our users, we are worried about unexpected quotas and limits.
(1) We know that individual recordings should be less than one minute.
(2) The answer here mentions about 1000 requests/hour per device: https://developer.apple.com/library/archive/qa/qa1951/_index.html#//apple_ref/doc/uid/DTS40017662
(3) The documentation says: "Individual devices may be limited in the number of recognitions that can be performed per day and an individual app may be throttled globally, based on the number of requests it makes per day."
We are well under the limits for (1) and (2), but there is no specific documentation on per-app limits, and being "throttled globally" sounds scary. Can anyone give us information about per-app limits, or any other kind of limit that might potentially put an end to our app?
Thank you
I am implementing accessibility in our current project and have an issue with setting focus on an element in a previous view. An example:
A user taps a table view cell at index 3. In didSelectRowAt, we present a UIActionSheet where the user makes a selection. Once the selection is made and the presented UIActionSheet is dismissed, accessibility focus should return to the table view cell at index 3, but instead it jumps back to the initial element in the view (in my case, the top-left element). I have used UIAccessibility.Notification.screenChanged and UIView.accessibilityViewIsModal to set a new focus when presenting a modal view, but neither offers a good way to point back to a previous element when a view is dismissed. Any insight on how to track the previous accessibility focus would be greatly appreciated.
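For what it's worth, one approach that has worked for me is to remember the element that had focus and re-post it as the argument of a layoutChanged notification after the action sheet is dismissed. A minimal sketch; the `lastFocusedElement` property and the dismissal timing are my own assumptions, only the `UIAccessibility.post(notification:argument:)` call is standard API:

```swift
import UIKit

final class MyTableViewController: UITableViewController {
    // Hypothetical property: the element that should regain VoiceOver focus.
    private weak var lastFocusedElement: NSObject?

    override func tableView(_ tableView: UITableView, didSelectRowAt indexPath: IndexPath) {
        lastFocusedElement = tableView.cellForRow(at: indexPath)

        let sheet = UIAlertController(title: "Options", message: nil, preferredStyle: .actionSheet)
        sheet.addAction(UIAlertAction(title: "OK", style: .default) { [weak self] _ in
            // After dismissal, ask VoiceOver to move focus back to the cell.
            UIAccessibility.post(notification: .layoutChanged,
                                 argument: self?.lastFocusedElement)
        })
        present(sheet, animated: true)
    }
}
```

Passing the target element as the notification argument is what steers focus; whether the system honors it can still depend on the presentation style.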
Hi!
Great to see this forum!
I’m new to developing watchOS apps, and I have a question with regard to dictation.
As far as I know, I can develop an independent Apple Watch app with dictation capabilities that connects to Apple services for speech recognition through WiFi, 4G, etc.
I’ve seen with another app that after about 30 seconds, live dictation cuts off and the “Done” button at the top disappears, leaving only the “Cancel” button. I’m not sure if this is an app-specific issue, but it results in loss of input, since you are forced to press the “Cancel” button.
Apart from that, I like to prepare speeches while I take a jog, so I want to develop a practical app where I can keep speaking with dictation.
My questions are:
Is there a way to increase live dictation timeout?
Can we expect offline dictation anytime soon?
Hello everybody!
In my app I allow the user to change TTS voices, and the English Alex voice is one of the options. However, in some cases it's treated as available when it actually isn't, which results in the utterance being pronounced by another voice.
To prepare the list of available voices I use the following code:
NSMutableArray *voices = [NSMutableArray new];
for (AVSpeechSynthesisVoice *voice in [AVSpeechSynthesisVoice speechVoices]) {
    [voices addObject:@{
        @"id": voice.identifier,
        @"name": voice.name,
        @"language": voice.language,
        @"quality": (voice.quality == AVSpeechSynthesisVoiceQualityEnhanced) ? @500 : @300
    }];
}
To start playback I use the following code (simplified a bit; speakUtterance: is an instance method, so it is called on a retained AVSpeechSynthesizer instance):
AVSpeechUtterance *utterance = [[AVSpeechUtterance alloc] initWithString:text];
utterance.voice = [AVSpeechSynthesisVoice voiceWithIdentifier:voice];
[self.synthesizer speakUtterance:utterance];
Cases when AVSpeechSynthesisVoice returns Alex as available when it's not:
The easiest way to reproduce this is in the simulator. When I download Alex from iOS settings, the download button disappears, but when I tap the voice nothing happens. As a result it appears to be downloaded, yet it can't be deleted.
In some cases Alex is downloaded correctly and is actually available in the app, but when I try to delete it, it looks like it isn't fully deleted. As a result it's treated as available in my app, but iOS settings shows it as not downloaded.
If iPhone storage is nearly full and the Alex voice hasn't been used recently, the voice appears to get offloaded: it's shown as available both in iOS settings and in my app, but the utterance is actually pronounced by another voice.
In all the cases above, Alex looks available in my app, but when I pass it to the utterance, a different voice speaks. Note that this happens only with this voice; I haven't seen such a case with any other. Maybe this voice should be treated separately somehow?
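In case it helps others hitting the same problem, here is a minimal defensive check in Swift (the same logic works in Objective-C). The fallback behavior is my own assumption; there is no documented way to detect the half-installed Alex state, so this only catches identifiers that fail to resolve at all:

```swift
import AVFoundation

/// Hypothetical helper: resolve a voice identifier, falling back to a
/// language default when the identifier no longer resolves.
func resolvedVoice(forIdentifier identifier: String,
                   fallbackLanguage: String = "en-US") -> AVSpeechSynthesisVoice? {
    // The failable initializer returns nil for unknown identifiers, and the
    // speechVoices() cross-check guards against stale cached entries. Neither
    // check is guaranteed to catch a voice whose assets were offloaded.
    if let voice = AVSpeechSynthesisVoice(identifier: identifier),
       AVSpeechSynthesisVoice.speechVoices().contains(where: { $0.identifier == identifier }) {
        return voice
    }
    return AVSpeechSynthesisVoice(language: fallbackLanguage)
}
```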
Obviously I’m aware there’s no tab key on the iPhone keyboard, but I swear that on my previous iPhone - which was definitely not on the latest iOS because of storage shortages - I could use the dictate/mic button, say “tab key”, and it would indent the paragraph in Google Docs etc. as if I’d pressed the tab key on my computer. However, I just tried on my new iPhone and it just types out the words “tab key”. I'm super confused as to why Apple would get rid of this, and I checked the list of dictation commands and it’s not there. Is there any other way to indent paragraphs now?
I am unable to get AVSpeechSynthesizer to write, or to acknowledge the delegate actions.
I was informed this was resolved in macOS 11.
I thought it was a lot to ask, but I am now running macOS 11.4 (Big Sur).
My goal is to output speech faster than real time and drive the output through AVAudioEngine.
First, I need to know why the write doesn't occur, and why none of the delegates get called, whether I am using write or simply uttering to the default speakers in "func speak(_ string: String)".
What am I missing?
Is there a workaround?
Reference: https://developer.apple.com/forums/thread/678287
let sentenceToSpeak = "This should write to buffer and also call 'didFinish' and 'willSpeakRangeOfSpeechString' delegates."
// Keep a strong reference: a temporary like SpeakerTest().speak(...) can be
// deallocated before the synthesizer finishes, so the delegate never fires.
let speaker = SpeakerTest()
speaker.writeToBuffer(sentenceToSpeak)
speaker.speak(sentenceToSpeak)
class SpeakerTest: NSObject, AVSpeechSynthesizerDelegate {
    let synth = AVSpeechSynthesizer()

    override init() {
        super.init()
        synth.delegate = self
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
        print("Utterance didFinish")
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
                           willSpeakRangeOfSpeechString characterRange: NSRange,
                           utterance: AVSpeechUtterance) {
        print("speaking range: \(characterRange)")
    }

    func speak(_ string: String) {
        let utterance = AVSpeechUtterance(string: string)
        var usedVoice = AVSpeechSynthesisVoice(language: "en") // should be the default voice
        let voices = AVSpeechSynthesisVoice.speechVoices()
        let targetVoice = "Allison"
        for voice in voices {
            // print("\(voice.identifier) \(voice.name) \(voice.quality) \(voice.language)")
            if voice.name.lowercased() == targetVoice.lowercased() {
                usedVoice = AVSpeechSynthesisVoice(identifier: voice.identifier)
                break
            }
        }
        utterance.voice = usedVoice
        print("utterance.voice: \(String(describing: utterance.voice))")
        synth.speak(utterance)
    }

    func writeToBuffer(_ string: String) {
        print("entering writeToBuffer")
        let utterance = AVSpeechUtterance(string: string)
        synth.write(utterance) { (buffer: AVAudioBuffer) in
            print("executing synth.write")
            guard let pcmBuffer = buffer as? AVAudioPCMBuffer else {
                fatalError("unknown buffer type: \(buffer)")
            }
            if pcmBuffer.frameLength == 0 {
                print("buffer is empty")
            } else {
                print("buffer has content \(buffer)")
            }
        }
    }
}
Is the format description that AVSpeechSynthesizer reports for the speech buffer correct? When I attempt to convert the buffer, I get back noise from two different conversion methods.
I am trying to convert the speech buffer provided by the AVSpeechSynthesizer "func write(_ utterance: AVSpeechUtterance...)" method. The goal is to convert the sample type, change the sample rate, and change from a mono to a stereo buffer. I later manipulate the buffer data and pass it through AVAudioEngine. For testing purposes, I have kept the sample rate at the original 22050.0.
What have I tried? I have a method named "resampleBuffer" that I've been using for years to do exactly this. When I apply it to the speech buffer, I get back noise. When I attempt to manually convert the format to stereo with "convertSpeechBufferToFloatStereo", I get back clipped output. I tested flipping the samples to address the big-endian signed integers, but that didn't work.
The speech buffer description is:
inBuffer description: <AVAudioFormat 0x6000012862b0: 1 ch, 22050 Hz, 'lpcm' (0x0000000E) 32-bit big-endian signed integer>
import Cocoa
import AVFoundation

class SpeakerTest: NSObject, AVSpeechSynthesizerDelegate {
    let synth = AVSpeechSynthesizer()

    override init() {
        super.init()
    }

    func resampleBuffer(inSource: AVAudioPCMBuffer, newSampleRate: Double) -> AVAudioPCMBuffer? {
        // resample and convert mono to stereo
        var error: NSError?
        let kChannelStereo = AVAudioChannelCount(2)
        let convertRate = newSampleRate / inSource.format.sampleRate
        let outFrameCount = AVAudioFrameCount(Double(inSource.frameLength) * convertRate)
        let outFormat = AVAudioFormat(standardFormatWithSampleRate: newSampleRate, channels: kChannelStereo)!
        let avConverter = AVAudioConverter(from: inSource.format, to: outFormat)
        let outBuffer = AVAudioPCMBuffer(pcmFormat: outFormat, frameCapacity: outFrameCount)!
        let inputBlock: AVAudioConverterInputBlock = { (inNumPackets, outStatus) -> AVAudioBuffer? in
            outStatus.pointee = AVAudioConverterInputStatus.haveData // very important, must have
            return inSource
        }
        avConverter?.sampleRateConverterAlgorithm = AVSampleRateConverterAlgorithm_Mastering
        avConverter?.sampleRateConverterQuality = .max
        if let converter = avConverter {
            let status = converter.convert(to: outBuffer, error: &error, withInputFrom: inputBlock)
            if (status != .haveData) || (error != nil) {
                print("\(status): \(status.rawValue), error: \(String(describing: error))")
                return nil // conversion error
            }
        } else {
            return nil // converter not created
        }
        return outBuffer
    }

    func writeToFile(_ stringToSpeak: String, speaker: String) {
        var output: AVAudioFile?
        let utterance = AVSpeechUtterance(string: stringToSpeak)
        let desktop = "~/Desktop"
        let fileName = "Utterance_Test.caf" // not in sandbox
        var tempPath = desktop + "/" + fileName
        tempPath = (tempPath as NSString).expandingTildeInPath
        let usingSampleRate = 22050.0 // 44100.0
        let outSettings: [String: Any] = [
            AVFormatIDKey: kAudioFormatLinearPCM, // kAudioFormatAppleLossless
            AVSampleRateKey: usingSampleRate,
            AVNumberOfChannelsKey: 2,
            AVEncoderAudioQualityKey: AVAudioQuality.max.rawValue
        ]
        // temporarily ignore the speaker and use the default voice
        let curLangCode = AVSpeechSynthesisVoice.currentLanguageCode()
        utterance.voice = AVSpeechSynthesisVoice(language: curLangCode)
        // utterance.volume = 1.0
        print("Int32.max: \(Int32.max), Int32.min: \(Int32.min)")
        synth.write(utterance) { (buffer: AVAudioBuffer) in
            guard let pcmBuffer = buffer as? AVAudioPCMBuffer else {
                fatalError("unknown buffer type: \(buffer)")
            }
            if pcmBuffer.frameLength == 0 {
                // done
            } else {
                // append buffer to file
                var outBuffer: AVAudioPCMBuffer
                outBuffer = self.resampleBuffer(inSource: pcmBuffer, newSampleRate: usingSampleRate)! // doesn't work
                // outBuffer = self.convertSpeechBufferToFloatStereo(pcmBuffer) // doesn't work
                // outBuffer = pcmBuffer // original format does work
                if output == nil {
                    // var bufferSettings = utterance.voice?.audioFileSettings
                    // Audio files cannot be non-interleaved.
                    var outSettings = outBuffer.format.settings
                    outSettings["AVLinearPCMIsNonInterleaved"] = false
                    let inFormat = pcmBuffer.format
                    print("inBuffer description: \(inFormat.description)")
                    print("inBuffer settings: \(inFormat.settings)")
                    print("inBuffer format: \(inFormat.formatDescription)")
                    print("outBuffer settings: \(outSettings)\n")
                    print("outBuffer format: \(outBuffer.format.formatDescription)")
                    output = try! AVAudioFile(forWriting: URL(fileURLWithPath: tempPath), settings: outSettings)
                }
                try! output?.write(from: outBuffer)
                print("done")
            }
        }
    }
}

class ViewController: NSViewController {
    let speechDelivery = SpeakerTest()

    override func viewDidLoad() {
        super.viewDidLoad()
        let targetSpeaker = "Allison"
        var sentenceToSpeak = ""
        for indx in 1...10 {
            sentenceToSpeak += "This is sentence number \(indx). [[slnc 3000]] \n"
        }
        speechDelivery.writeToFile(sentenceToSpeak, speaker: targetSpeaker)
    }
}
Three tests can be performed; the only one that works is writing the original buffer directly to disk.
Is this really "32-bit big-endian signed integer"?
Am I addressing this correctly, or is this a bug?
I'm on macOS 11.4
I am attempting to use an alternative pronunciation via the IPA notation attribute (AVSpeechSynthesisIPANotationAttribute) for AVSpeechSynthesizer on macOS (Big Sur 11.4). The attributed string is being ignored, so the functionality is not working. I tried this on the iOS simulator and it works properly.
The Indian English voice pronounces the word "shame" as shy-em, so I applied the correct pronunciation, but no change was heard. I then substituted the pronunciation of a completely different word, but there was still no change.
Is there something else that must be done to make this work?
Attributed String: It's a '{
}shame{
AVSpeechSynthesisIPANotationAttribute = "\U0283\U02c8e\U0361\U026am";
}' it didn't work out.{
}
Target Range: {8, 5}
Target String: shame, Substitution: ʃˈe͡ɪm
Attributed String: It's a '{
}shame{
AVSpeechSynthesisIPANotationAttribute = "\U0283\U02c8e\U0361\U026am";
}' it didn't work out.{
}
Target Range: {8, 5}
Target String: shame, Substitution: ʃˈe͡ɪm
Attributed String: It's a '{
}shame{
AVSpeechSynthesisIPANotationAttribute = "t\U0259.\U02c8me\U0361\U026a.do\U0361\U028a";
}' it didn't work out.{
}
Target Range: {8, 5}
Target String: shame, Substitution: tə.ˈme͡ɪ.do͡ʊ
Attributed String: It's a '{
}shame{
AVSpeechSynthesisIPANotationAttribute = "t\U0259.\U02c8me\U0361\U026a.do\U0361\U028a";
}' it didn't work out.{
}
Target Range: {8, 5}
Target String: shame, Substitution: tə.ˈme͡ɪ.do͡ʊ
class SpeakerTest: NSObject, AVSpeechSynthesizerDelegate {
    let synth = AVSpeechSynthesizer()

    func speakIPA_Substitution(subst: String, voice: AVSpeechSynthesisVoice) {
        let text = "It's a 'shame' it didn't work out."
        let mutAttrStr = NSMutableAttributedString(string: text)
        let range = NSString(string: text).range(of: "shame")
        let pronounceKey = NSAttributedString.Key(rawValue: AVSpeechSynthesisIPANotationAttribute)
        mutAttrStr.setAttributes([pronounceKey: subst], range: range)
        let utterance = AVSpeechUtterance(attributedString: mutAttrStr)
        utterance.voice = voice
        utterance.postUtteranceDelay = 1.0
        let swiftRange = Range(range, in: text)!
        print("Attributed String: \(mutAttrStr)")
        print("Target Range: \(range)")
        print("Target String: \(text[swiftRange]), Substitution: \(subst)\n")
        synth.speak(utterance)
    }

    func customPronunciation() {
        let shame = "ʃˈe͡ɪm"           // substitute correct pronunciation
        let tomato = "tə.ˈme͡ɪ.do͡ʊ"   // completely different word's pronunciation
        let britishVoice = AVSpeechSynthesisVoice(language: "en-GB")!
        let indiaVoice = AVSpeechSynthesisVoice(language: "en-IN")!
        speakIPA_Substitution(subst: shame, voice: britishVoice)  // already correct, no substitution needed
        // pronounced incorrectly, ignoring the corrected pronunciation from the IPA notation
        speakIPA_Substitution(subst: shame, voice: indiaVoice)    // ignores substitution
        speakIPA_Substitution(subst: tomato, voice: britishVoice) // ignores substitution
        speakIPA_Substitution(subst: tomato, voice: indiaVoice)   // ignores substitution
    }
}
To whom it may concern,
I am Nakata from Japan, a software developer, writing for the first time. I am contacting you to enquire about offline speech recognition.
[Questions]
1. What does speech recognition do when it connects to the network?
2. Please tell me how to use it without connecting to the network.
[What I want to do]
I want to perform voice recognition offline, in Japanese.
[Current status]
I'm using SFSpeechRecognizer. When voice recognition is performed on a tablet that is not connected to the network, speech recognition is not available because SFSpeechRecognitionResult is null. It works when connected to the network while the application that implements voice recognition is running.
[Development environment]
Xamarin, Xcode, C#, iPad mini with iOS 13.4.1.
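I can't speak to the Xamarin bindings, but in native Swift the offline path is opted into per request, and you can check up front whether an on-device model exists for the locale. A sketch under that assumption; whether a ja-JP on-device model is actually present depends on the OS version and downloaded assets:

```swift
import Speech

// Returns a recognizer/request pair configured to never touch the network,
// or nil when no on-device model is available for Japanese.
func makeOfflineRequest(for url: URL) -> (SFSpeechRecognizer, SFSpeechURLRecognitionRequest)? {
    guard let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "ja-JP")),
          recognizer.supportsOnDeviceRecognition else {
        return nil // no offline model for this locale on this device/OS
    }
    let request = SFSpeechURLRecognitionRequest(url: url)
    request.requiresOnDeviceRecognition = true // audio is never sent to the server
    return (recognizer, request)
}
```

Without `requiresOnDeviceRecognition = true`, the framework silently prefers the server, which would explain null results as soon as the network disappears.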
Hi,
I am facing a strange issue in my app: there is an intermittent crash. I am using AVSpeechSynthesizer for speech discovery, and I'm not sure if that is causing the problem. The crash log has the following information:
Firebase Crash log
Crashed: AXSpeech
0 CoreFoundation 0x197325d00 _CFAssertMismatchedTypeID + 112
1 CoreFoundation 0x197229188 CFRunLoopSourceIsSignalled + 314
2 Foundation 0x198686ca0 performQueueDequeue + 440
3 Foundation 0x19868641c __NSThreadPerformPerform + 112
4 CoreFoundation 0x19722c990 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 28
5 CoreFoundation 0x19722c88c __CFRunLoopDoSource0 + 208
6 CoreFoundation 0x19722bbfc __CFRunLoopDoSources0 + 376
7 CoreFoundation 0x197225b70 __CFRunLoopRun + 820
8 CoreFoundation 0x197225308 CFRunLoopRunSpecific + 600
9 Foundation 0x198514d8c -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 232
10 libAXSpeechManager.dylib 0x1c3ad0bbc -[AXSpeechThread main]
11 Foundation 0x19868630c __NSThread__start__ + 864
12 libsystem_pthread.dylib 0x1e2f20bfc _pthread_start + 320
13 libsystem_pthread.dylib 0x1e2f29758 thread_start + 8
Apple crash log:
Can you perform two or more OFFLINE speech recognition tasks simultaneously?
Is this an SFSpeechRecognizer / SFSpeechURLRecognitionRequest offline limitation?
Running on macOS Big Sur 11.5.2.
I would like to perform two or more offline speech recognition tasks simultaneously. I've executed two tasks in the same application AND executed two different applications, both using offline recognition. Once I initiate the second thread or second application, the first recognition stops. Since the computer supports multiple threads, I planned to make use of the concurrency.
Use cases:
#1: multiple audio or video files that I wish to transcribe -- cuts down on the wait time.
#2: split a single large file into multiple sections and stitch the results together -- again, cuts down on the wait time.
I set on-device recognition to TRUE because my target files can be up to two hours in length. My test files are 15-30 minutes long and I have a number of them, so recognition must be done on the device.
func recognizeFile_Compact(url: NSURL) {
    let language = "en-US" // "en-GB"
    let recognizer = SFSpeechRecognizer(locale: Locale(identifier: language))!
    let recogRequest = SFSpeechURLRecognitionRequest(url: url as URL)
    // Note: supportsOnDeviceRecognition is read-only; check it rather than assigning to it.
    recognizer.defaultTaskHint = .dictation          // give a hint that this is dictation
    recogRequest.requiresOnDeviceRecognition = true  // ensure the DEVICE does the work -- don't send to the cloud
    recogRequest.shouldReportPartialResults = false  // we don't want partial results
    var strCount = 0
    let recogTask = recognizer.recognitionTask(with: recogRequest, resultHandler: { (result, error) in
        guard let result = result else {
            print("Recognition failed, \(error!)")
            return
        }
        let text = result.bestTranscription.formattedString
        strCount += 1
        print("#\(strCount), Best: \(text)\n")
        if result.isFinal { print("WE ARE FINALIZED") }
    })
    _ = recogTask
}
I use AVSpeechSynthesizer to pronounce some text in German. Sometimes it works just fine, and sometimes it doesn't, for some reason unknown to me (there is no error, because the speak() method doesn't throw; the only thing I can observe is the following message logged in the console):
_BeginSpeaking: couldn't begin playback
I tried to find an API in AVSpeechSynthesizerDelegate to register a callback for when an error occurs, but I found none.
The closest match was this (but it appears to be available only on macOS, not iOS):
https://developer.apple.com/documentation/appkit/nsspeechsynthesizerdelegate/1448407-speechsynthesizer?changes=_10
Below you can find how I initialize and use the speech synthesizer in my app:
class Speaker: NSObject, AVSpeechSynthesizerDelegate {
    class func sharedInstance() -> Speaker {
        struct Singleton {
            static var sharedInstance = Speaker()
        }
        return Singleton.sharedInstance
    }

    let audioSession = AVAudioSession.sharedInstance()
    let synth = AVSpeechSynthesizer()

    override init() {
        super.init()
        synth.delegate = self
    }

    func initializeAudioSession() {
        do {
            try audioSession.setCategory(.playback, mode: .spokenAudio, options: .duckOthers)
            try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
        } catch {
        }
    }

    func speak(text: String, language: String = "de-DE") {
        guard !self.synth.isSpeaking else { return }
        let utterance = AVSpeechUtterance(string: text)
        let voice = AVSpeechSynthesisVoice.speechVoices().filter { $0.language == language }.first!
        utterance.voice = voice
        self.synth.speak(utterance)
    }
}
The audio session initialization is run just once, during app startup. Afterwards, speech is synthesized by running the following code:
Speaker.sharedInstance().speak(text: "Lederhosen")
The problem is that I have no way of knowing whether the speech synthesis succeeded: the UI shows a "speaking" state, but nothing is actually spoken.
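Since there is no error callback, a watchdog is one workaround: assume the utterance failed unless didStart arrives within a short window. A sketch; the one-second timeout and the `onFailure` hook are my own assumptions, not framework API:

```swift
import AVFoundation

final class WatchdogSpeaker: NSObject, AVSpeechSynthesizerDelegate {
    private let synth = AVSpeechSynthesizer()
    private var watchdog: Timer?
    var onFailure: (() -> Void)?   // hypothetical hook for the UI to react to

    override init() {
        super.init()
        synth.delegate = self
    }

    func speak(_ text: String, language: String = "de-DE") {
        let utterance = AVSpeechUtterance(string: text)
        utterance.voice = AVSpeechSynthesisVoice(language: language)
        // If didStart never fires, treat the utterance as failed.
        watchdog = Timer.scheduledTimer(withTimeInterval: 1.0, repeats: false) { [weak self] _ in
            self?.onFailure?()
        }
        synth.speak(utterance)
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didStart utterance: AVSpeechUtterance) {
        watchdog?.invalidate() // playback actually began
    }
}
```

This at least lets the UI leave its "speaking" state when the `_BeginSpeaking: couldn't begin playback` failure occurs silently.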
I'm trying to find specific information on how Apple transfers & stores the voice data that's transferred for speech recognition in Safari as part of WebSpeechAPI.
All I keep seeing are generic privacy documents that do not provide any detail. Is anyone able to point me in the right direction of an explanation of how customer data is used?
I updated Xcode to Xcode 13 and iPadOS to 15.0.
Now my previously working application using SFSpeechRecognizer fails to start, regardless of whether I'm using on-device mode or not.
I use the delegate approach, and although the plist is set up correctly (authorization succeeds and I get the orange circle indicating the microphone is on), the delegate method speechRecognitionTask(_:didFinishSuccessfully:) always returns false, with no particular error message to go along with it.
I also downloaded the official example from Apple's documentation pages:
SpokenWord SFSpeechRecognition example project page
Unfortunately, it also does not work anymore.
I'm working on a time-sensitive project and don't know where to go from here. How can we troubleshoot this? If it's an issue with Apple's API update or something has changed in the initial setup, I really need to know as soon as possible.
Thanks.
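One way to surface more diagnostics when the delegate only reports didFinishSuccessfully: false is to inspect the task's error property inside that callback; a minimal sketch:

```swift
import Speech

final class RecognitionLogger: NSObject, SFSpeechRecognitionTaskDelegate {
    func speechRecognitionTask(_ task: SFSpeechRecognitionTask,
                               didFinishSuccessfully successfully: Bool) {
        if !successfully {
            // task.error often carries the underlying domain/code
            // even when no error is delivered to the result handler.
            print("recognition failed: \(String(describing: task.error))")
        }
    }
}
```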
I’m getting a flood of these errors in a shipping speech recognition app since users started upgrading to iOS15. It’s usually being returned by the speech recogniser a few seconds after recognition begins.
I can’t find any reference to it anywhere in Apple’s documentation. What is it?
Code: 301
Domain: kLSRErrorDomain
Description: Recognition request was canceled
Hi,
I use device-local speech recognition for speech input.
Now some devices upgraded to iOS 15 return the new error domain/code
kLSRErrorDomain, code 201
(previously the errors were mostly in kAFAssistantErrorDomain). Does anybody have an idea what it means and how to fix it?
Thanks!
In iOS 15, when calling stopSpeaking on AVSpeechSynthesizer, the didFinish delegate method gets called instead of didCancel, which worked correctly on iOS 14 and below.
Since version 14.2 we have been having issues with STT. In the past we were using Azure and it worked fine. Since you partially implemented the Speech Recognition API, things have gotten worse on iOS (no problem on macOS).
It seems the recording we send to STT has very poor quality, with parts of sentences missing. When I implement it on its own it works fine, but as soon as I play audio before opening the microphone it doesn't work anymore (or only partially).
This brings me to the question: is there a workaround while we wait for you to deploy a working Speech Recognition API?
Here is a simple app to demonstrate the problem:
import SwiftUI
import AVFoundation

struct ContentView: View {
    var synthVM = SpeakerViewModel()

    var body: some View {
        VStack {
            Text("Hello, world!")
                .padding()
            HStack {
                Button("Speak") {
                    if self.synthVM.speaker.isPaused {
                        self.synthVM.speaker.continueSpeaking()
                    } else {
                        self.synthVM.speak(text: "Привет на корабле! Кто это пришел к нам, чтобы посмотреть на это произведение?")
                    }
                }
                Button("Pause") {
                    if self.synthVM.speaker.isSpeaking {
                        self.synthVM.speaker.pauseSpeaking(at: .word)
                    }
                }
                Button("Stop") {
                    self.synthVM.speaker.stopSpeaking(at: .word)
                }
            }
        }
    }
}

struct ContentView_Previews: PreviewProvider {
    static var previews: some View {
        ContentView()
    }
}

class SpeakerViewModel: NSObject {
    var speaker = AVSpeechSynthesizer()

    override init() {
        super.init()
        self.speaker.delegate = self
    }

    func speak(text: String) {
        let utterance = AVSpeechUtterance(string: text)
        utterance.voice = AVSpeechSynthesisVoice(language: "ru")
        speaker.speak(utterance)
    }
}

extension SpeakerViewModel: AVSpeechSynthesizerDelegate {
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didStart utterance: AVSpeechUtterance) {
        print("started")
    }
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didPause utterance: AVSpeechUtterance) {
        print("paused")
    }
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didContinue utterance: AVSpeechUtterance) {}
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didCancel utterance: AVSpeechUtterance) {}
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, willSpeakRangeOfSpeechString characterRange: NSRange, utterance: AVSpeechUtterance) {
        guard let rangeInString = Range(characterRange, in: utterance.speechString) else { return }
        print("Will speak: \(utterance.speechString[rangeInString])")
    }
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
        print("finished")
    }
}
On the simulator everything works fine, but on a real device many strange words appear in the synthesized speech.
The willSpeakRangeOfSpeechString output also differs between the simulator and a real device.
Simulator:
started
Will speak: Привет
Will speak: на
Will speak: корабле!
Will speak: Кто
Will speak: это
Will speak: пришел
Will speak: к
Will speak: нам,
Will speak: чтобы
Will speak: посмотреть
Will speak: на
Will speak: это
Will speak: произведение?
finished
The iPhone output has errors:
2021-10-12 17:09:32.613273+0300 VoiceTest[9027:203522] [AXTTSCommon] Broken user rule: \b([234567890]+)2 (мили|кварты|чашки|{столовых }ложки)(?=$|\s|[[:punct:]»\xa0]) > Error Domain=NSCocoaErrorDomain Code=2048 "The value “\b([234567890]+)2 (мили|кварты|чашки|{столовых }ложки)(?=$|\s|[[:punct:]»\xa0])” is invalid." UserInfo={NSInvalidValue=\b([234567890]+)2 (мили|кварты|чашки|{столовых }ложки)(?=$|\s|[[:punct:]»\xa0])}
2021-10-12 17:09:32.613548+0300 VoiceTest[9027:203522] [AXTTSCommon] Broken user rule: \b(1\d+)2 (мили|кварты|чашки|{столовых }ложки)(?=$|\s|[[:punct:]»\xa0]) > Error Domain=NSCocoaErrorDomain Code=2048 "The value “\b(1\d+)2 (мили|кварты|чашки|{столовых }ложки)(?=$|\s|[[:punct:]»\xa0])” is invalid." UserInfo={NSInvalidValue=\b(1\d+)2 (мили|кварты|чашки|{столовых }ложки)(?=$|\s|[[:punct:]»\xa0])}
2021-10-12 17:09:32.613725+0300 VoiceTest[9027:203522] [AXTTSCommon] Broken user rule: \b2 (мили|кварты|чашки|{столовых }ложки)(?=$|\s|[[:punct:]»\xa0]) > Error Domain=NSCocoaErrorDomain Code=2048 "The value “\b2 (мили|кварты|чашки|{столовых }ложки)(?=$|\s|[[:punct:]»\xa0])” is invalid." UserInfo={NSInvalidValue=\b2 (мили|кварты|чашки|{столовых }ложки)(?=$|\s|[[:punct:]»\xa0])}
started
Will speak: Привет
Will speak: на
Will speak: ивет на корабле!
Will speak: Кто
Will speak: это
Will speak: Кто это пришел
Will speak: к
Will speak: нам,
Will speak: чтобы
Will speak: посмотреть
Will speak: на
Will speak: реть на это
Will speak: на это произведение?
finished
The error appears on iOS/iPadOS 15.0, 15.0.1, 15.0.2, and 14.7, but everything works fine on 14.8.
It looks like an engine error. How can this issue be fixed?