Apple Intelligence

RSS for tag

Apple Intelligence is the personal intelligence system that puts powerful generative models right at the core of your iPhone, iPad, and Mac and powers incredible new features to help users communicate, work, and express themselves.

Posts under Apple Intelligence subtopic

Post

Replies

Boosts

Views

Activity

Accessibility & Inclusion
We are developing Apple AI for foreign markets and adapting it for iPhone models 17 and above. When the system language and Siri language are not the same—for example, if the system is in English and Siri is in Chinese—it can cause a situation where Apple AI cannot be used. So, may I ask if there are any other reasons that could cause Apple AI to be unavailable within the app, even if it has been enabled?
0
0
576
Dec ’25
Threading issues when using debugger
Hi, I am modifying the sample camera app that is here: https://developer.apple.com/tutorials/sample-apps/capturingphotos-camerapreview ... In the processPreviewImages, I am using the Vision APIs to generate a segmentation mask for a person/object, then compositing that person onto a different background (with some other filtering). The filtering and compositing is done via CoreImage. At the end, I convert the CIImage to a CGImage then to a SwiftUI Image. When I run it on my iPhone, it works fine, and has not crashed. When I run it on the iPhone with the debugger, it crashes within a few seconds with: EXC_BAD_ACCESS in libRPAC.dylib`std::__1::__hash_table<std::__1::__hash_value_type<long, qos_info_t>, std::__1::__unordered_map_hasher<long, std::__1::__hash_value_type<long, qos_info_t>, std::__1::hash, std::__1::equal_to, true>, std::__1::__unordered_map_equal<long, std::__1::__hash_value_type<long, qos_info_t>, std::__1::equal_to, std::__1::hash, true>, std::__1::allocator<std::__1::__hash_value_type<long, qos_info_t>>>::__emplace_unique_key_args<long, std::__1::piecewise_construct_t const&, std::__1::tuple<long const&>, std::__1::tuple<>>: It had previously been working fine with the debugger, so I'm not sure what has changed. Is there a difference in how the Vision APIs are executed if the debugger is attached vs. not?
1
0
554
Jan ’26
Image Playground files suddenly not available
My app lets you create images with Image Playground. When the user approves an image I move it to the documents dir from the temp storage. With over a year of usage I’ve created a lot of images over time. Out of nowhere the app stopped loading my custom creations from Image Playground saying it couldn’t find the files. It still had my VoiceOver strings I had added for each image and still had the custom categories I assigned them. Debug code to look in the docs dir doesn’t find them. I downloaded the app’s container and only see the images I created as a test after the problem started. But my ~70MB app is still taking up 300MB on my iPhone so it feels like they’re there but not accessible. Is there anything else I can try?
2
0
1.2k
Jan ’26
CoreML Instrument Testing Native Clawbot using FM.SyML & OAIC & Diffusion
After running performance test on my CoreML qwen3 vision, I appreciated the update where results were viewable... ON Mac it mentions Ios18 and im not sure if or how to change.. that bottle neck lead to rebuilding CoreML view. I woke up and realized I have all the pieces together... and ended up with a swift package working demo of Clawbot.. the current issue is Im trying to use gguf 3b to code it.. I have become well aware that everything I create using the big models, they soon become the default themes /layouts for everyone else simply asking for this or that (I appoligise) so here I am asking (while looking to schedule meet with dev) if its possible to speak with anyone about th 1000s of Apple Intelligence PCC, Xcode, and vision reports and feedback ive sent , in terms of just general ways I can work more efficiently without the crash... ive already build a TUI for MLX but the tools for coreML while seems promising are not intuitive, but the vision format instruction was nice to see. Anyway my question is:
0
0
335
Feb ’26
Parallel/Steam processing of Apple Intelligence
I have built a MAC-OS machine intelligence application that uses Apple Intelligence. A part of the application is to preprocess text. For longer text content I have implemented chunking to get around the token limit. However the application performance is now limited by the fact that Apple Intelligence is sequential in operation. This has a large impact on the application performance. Is there any approach to operate Apple Intelligence in a parallel mode or even a streaming interface. As Apple Intelligence has Private Cloud Services I was hoping to be able to send multiple chunks in parallel as that would significantly improve performance. Any suggestions would be welcome. This could also be considered a request for a future enhancement.
2
0
465
Feb ’26
Best approach for animating a speaking avatar in a macOS/iOS SwiftUI application
I am developing a macOS application using SwiftUI (with an iOS version as well). One feature we are exploring is displaying an avatar that reads or speaks dynamically generated text produced by an AI service. The basic flow would be: Text generated by an AI service Text converted to speech using a TTS engine An avatar (2D or 3D) rendered in the app that animates lip movement synchronized with the speech Ideally the avatar would render locally on the device. Questions: What Apple frameworks would be most appropriate for implementing a speaking avatar? SceneKit RealityKit SpriteKit (for 2D avatars) Is there any recommended way to drive lip-sync animation from speech audio using Apple frameworks? Does AVSpeechSynthesizer expose phoneme or viseme timing information that could be used for avatar animation? If such timing information is not available, what is the recommended approach for synchronizing character mouth animation with speech audio on macOS/iOS? Are there examples of real-time character animation synchronized with speech on macOS/iOS? Any architectural guidance or references would be greatly appreciated.
0
0
932
Mar ’26
Building Real-Time Voice Input on macOS 26 with SpeechAnalyzer + ScreenCaptureKit
We built an open-source macOS menu bar app that turns speech into text and pastes it into the active app — using SpeechAnalyzer for on-device transcription, ScreenCaptureKit + Vision for screen-aware context, and FluidAudio for speaker diarization in meeting mode. Here's what we learned shipping it on macOS 26. GitHub: github.com/Marvinngg/ambient-voice Architecture The app has two modes: hotkey dictation (press to talk, release to inject) and meeting recording (continuous transcription with a floating panel). Dictation Mode Audio capture uses AVCaptureSession (more on why below). The captured audio feeds into SpeechAnalyzer via an AsyncStream: let transcriber = SpeechTranscriber( locale: locale, transcriptionOptions: [], reportingOptions: [.volatileResults, .alternativeTranscriptions], attributeOptions: [.audioTimeRange, .transcriptionConfidence] ) let analyzer = SpeechAnalyzer(modules: [transcriber]) let (inputSequence, inputBuilder) = AsyncStream.makeStream() try await analyzer.start(inputSequence: inputSequence) While recording, we capture a screenshot of the focused window using ScreenCaptureKit, run Vision OCR (VNRecognizeTextRequest), extract keywords, and inject them into SpeechAnalyzer as contextual bias: let context = AnalysisContext() context.contextualStrings[.general] = ocrKeywords try await analyzer.setContext(context) This improves accuracy for technical terms and proper nouns visible on screen. If your screen shows "SpeechAnalyzer", saying it out loud is more likely to be transcribed correctly. After transcription, an optional L2 step sends the text through a local LLM (ollama) for spoken-to-written cleanup, then CGEvent simulates Cmd+V to paste into the active app. Meeting Mode Meeting mode forks the same audio stream to two consumers: SpeechAnalyzer — real-time streaming transcription, displayed in a floating NSPanel FluidAudio buffer — accumulates 16kHz Float32 mono samples for batch speaker diarization after recording stops When the user ends the meeting, FluidAudio's performCompleteDiarization() runs on the accumulated audio. We align transcription segments with speaker segments using audioTimeRange overlap matching — each transcription segment gets assigned the speaker ID with the most time overlap. Results export to Markdown. Pitfalls We Hit on macOS 26 1. AVAudioEngine installTap doesn't fire with Bluetooth devices We started with AVAudioEngine.inputNode.installTap() for audio capture. It worked fine with built-in mics but the tap callback never fired with Bluetooth devices (tested with vivo TWS 4 Hi-Fi). Fix: switched to AVCaptureSession. The delegate callback captureOutput(_:didOutput:from:) fires reliably regardless of audio device. The tradeoff is you get CMSampleBuffer instead of AVAudioPCMBuffer, so you need a conversion step. 2. NSEvent addGlobalMonitorForEvents crashes Our global hotkey listener used NSEvent.addGlobalMonitorForEvents. On macOS 26, this crashes with a Bus error inside GlobalObserverHandler — appears to be a Swift actor runtime issue. Fix: switched to CGEventTap. Works reliably, but the callback runs on a CFRunLoop context, which Swift doesn't recognize as MainActor. 3. CGEventTap callbacks aren't on MainActor If your CGEventTap callback touches any @MainActor state, you'll get concurrency violations. The callback runs on whatever thread owns the CFRunLoop. Fix: bridge with DispatchQueue.main.async {} inside the tap callback before touching any MainActor state. 4. CGPreflightScreenCaptureAccess doesn't request permission We used CGPreflightScreenCaptureAccess() as a guard before calling ScreenCaptureKit. If it returned false, we'd bail out. The problem: this function only checks — it never triggers macOS to add your app to the Screen Recording permission list. Chicken-and-egg: you can't get permission because you never ask for it. Fix: call CGRequestScreenCaptureAccess() at app startup. This adds your app to System Settings → Screen Recording. Then let ScreenCaptureKit calls proceed without the preflight guard — SCShareableContent will also trigger the permission prompt on first use. 5. Ad-hoc signing breaks TCC permissions on every rebuild During development, codesign --sign - (ad-hoc) generates a different code directory hash on every build. macOS TCC tracks permissions by this hash, so every rebuild = new app identity = all permissions reset. Fix: sign with a stable certificate. If you have an Apple Development certificate, use that. The TeamIdentifier stays constant across rebuilds, so TCC permissions persist. We also discovered that launching via open WE.app (LaunchServices) instead of directly executing the binary is required — otherwise macOS attributes TCC permissions to Terminal, not your app. Benchmarks We ran end-to-end benchmarks on public datasets (Mac Mini M4 16GB, macOS 26): Transcription (SpeechAnalyzer, AliMeeting Chinese): • Near-field CER 34% (excluding outliers ~25%) • Far-field CER 40% (single channel, no beamforming, >30% overlap) • Processing speed 74-89x real-time Speaker diarization (FluidAudio offline): • AMI English 16 meetings: avg DER 23.2% (collar=0.25s, ignoreOverlap=True) • AliMeeting Chinese 8 meetings: DER 48.5% (including overlap regions) • Memory: RSS ~500MB, peak 730-930MB Full evaluation methodology, scripts, and raw results are in the repo. Open Source The project is MIT licensed: github.com/Marvinngg/ambient-voice It includes the macOS client (Swift 6.2, SPM), server-side distillation/training scripts (Python), and a complete evaluation framework with reproducible benchmarks. Feedback and contributions welcome.
0
0
805
Mar ’26
Programmatic image creation using ImageCreator
Hello, Could you please provide details for maximum string length of the prompt and the title when using ImageCreator and the method extracted(from:title:)? static func extracted( from text: String, title: String? = nil ) -> ImagePlaygroundConcept Any additional details or example of prompt and title would help. Additionally, are ImagePlaygroundStyle.animation, ImagePlaygroundStyle.illustration and ImagePlaygroundStyle.sketch all available when using extracted(from:title:)? I am trying to generate images programmatically and would appreciate your guidance. Thank you.
1
0
586
Mar ’26
Genmoji API — are local edits and non-text usage (e.g., widgets) allowed?
Hi, I'm integrating the Genmoji API (NSAdaptiveImageGlyph) into my app and would like to confirm two things: Local editing of generated Genmoji. After the user creates a Genmoji, can the app apply edits to the resulting image (e.g., pixelation)? The edited image would be stored only on-device within the app and never shared externally. Use outside text contexts. Can a generated Genmoji be used in other parts of the app, such as a home screen widget? Apple's documentation and the WWDC24 session focus on inline text, stickers, and Tapbacks, but I couldn't find explicit guidance on widget or other UI usage. I checked the Human Interface Guidelines, WWDC24 session "Bring expression to your app with Genmoji," and the App Store Review Guidelines, but couldn't find clear answers. Any guidance or pointers would be appreciated. Thanks!
0
0
1.7k
Apr ’26
Visual Intelligence API SemanticContentDescriptor labels are empty
I'm trying to use Apple's new Visual Intelligence API for recommending content through screenshot image search. The problem I encountered is that the SemanticContentDescriptor labels are either completely empty or super misleading, making it impossible to query for similar content on my app. Even the closest matching example was inaccurate, returning a single label ["cardigan"] for a Supreme T-Shirt. I see other apps using this API like Etsy for example, and I'm wondering if they're using the input pixel buffer to query for similar content rather than using the labels? If anyone has a similar experience or something that wasn't called out in the documentation please lmk! Thanks.
2
0
1.3k
2w
Cloud based Siri AI for older devices
If some Siri AI features work on cloud why not for older device? Apple Devs mentioned that the Siri AI is available for iPhone 15 and later devices and we can use iCloud+ Subscription to use it more on Apple’s Private Cloud Compute (PCC). So if it can run some functions on Private Cloud why not for people with older device? Or will Apple make it compatible with older devices on upcoming beta updates? Also, I noticed that other Phone companies have these features in phones that are way less powerful than iPhone 11 (Which is a oldest device support iOS 27) I tested some ai models from 3rd party apps that are less than 300MB and it worked very well than the old siri so can We add some Siri features to older iPhone? So we can avoid people from switching to other phones. Because I don’t want my friends leaving iPhone.
0
0
75
5d
Siri Ai waitlist.
I’ve been waiting since 2:30 PM EST yesterday. It’s been well over 24 hours. It’s never taken this long before my phones finish indexing, so I know it’s not that. Also signed up for it right away. And still not accepted, which is very weird because I’ve seen it on Reddit pages where people have signed up earlier today and have been accepted already. No clue what’s going on. Wish we can get some type of answer
0
0
20
4d
Nearly 70 hours on Siri waitlist
Downloaded the moment WWDC presentation ended at 11a PT. Phone 17 Air, English (US) Plugged in over (3) nights Still indexing Rebooted 5+ times Still waitlisted. The only feature anyone cares about is gated. So much for “available immediately“. Losing trust with Apple fast here.
0
0
175
3d
S5 - Specific Siri Security Situation in Slovakia
Dears, I have reached out to Apple Research and Apple Security but this is NOT really for them. This is a developer topic !! Apple Research and Security are trying to find a malicious code, bugs ect, but what I am whitnessing is different and much much deeper into the code. Apple Intelligence in Slovakia is much more limited then in other countries. A specific security configuration due to EU regulations in combination with Siri NOT able to speak or understand Slovak. At low level this combination with a small PUSH with good timing, makes the devices to completely strip themselves off, of all security and trust certifications. What follows is a blank completely from scratch processed reinstall, where the attacker only prepares the "CORRECT" files and information and all the work is done by Apple system itself !!! The result is a complete domination of hardware using the NPU (ANE) chip, which does all the job. And I mean each pixel, sound, connection ect... What is the MOST ALARMING is that due to the proud declaration of customer data privacy this is the exact spot where if something like this happens, Apple will NOT be able to see it. The Customer is then in an extreme situation, where he knows that the devices, accounts, keychain, bank account, each app, each picture or sound.... Everything is compromised, but online help and the retailers are too short for this and further to this Apple DONT HAVE AN OFFICE in Slovakia. Only thing left are the contracted service (repair) shops, which are capable to perform a DFU Restore, which does NOT help. I have requested DFU Restore approx 15x in the last 9 months. Once you turn on and you only pick the language, there is a GLITCH and you know this is back again. A very quick and not too detailed process: It is a very silent and extremely sophisticated takeover without an ovious crash at the beginning. Using various tools, which I can describe and present examples. One variation is a HTML code a DOM which is recursive, calling functions and cancelling. Too many functions with offset which results in a graphics freeze, overload or similiar.. The object itself is not frozen and it is carefully prepared !! It will mostly copy and clone the target and NEST inside without knowing. What happens here is that this recursive DOM was applied and therefore the SHUTDOWN MONITOR LOG occured. This froze also mds index which blocked the mounting and unmounting of Volumes.... This is ofcourse carefully instrumented not to raise any attention. Same structure can be used in any code, any language, pdf, it can be nested in a wallpaper or a standard image, library, anywhere ... I can provide a proof and a functional script... The install log is showing - Untracked client connected - RemoteManagement which REINSTALLED the OS. After that Launchd skipps almost all tasks on the next run .... After this mounting volumes block, the system will not restart as standard, insted forced to early boot as possible which starts with PKI TRUST and SIRI UNDERSTANDING ... The PKI TRUST is manipulated and prepared and Siri is not called by the system as Apple Inteligence. So with reinstalled and carefully prepared OS, Launchd who skipped most tasks at the start and without proper encryption. There is a direct open path to Siri and her ASR HAMMERING.... I have personally checked almost 10 different electronics shops and checked the console on each Macbook that was free to try. In each of them these four Protolol logs were the exact same !!!! But after that a brutal iphone reinstall and even over lockdown mode reinstall will follow... Can also provide logs and information... And there is a SIMPLE LOGIC PARADOX with HUGE impact. Any document can be signed by Apple in a second. That is how the PKI TRUST was manipulated without any problem. That is also extremely important ... I can present this, but I must know that somebody is listening.... otherwise the only way is press... Apple Research and Security is blind here and I simply cannot get any answer.... If you know anybody in Slovakia, tell them to go to check this out !!! Get this information to Somebody who could just check it please .... This is probably the largest Supply Chain Attack ever ... And all it takes is a phone call to iStores to Slovakia so they can check for you ... From what I can see, now an update is prepared for Siri. It is based on Ruby but mostly Nokogiri and Gumbo. It will be presented as a 8 bit range training for local LLM, as super fast, but really it will be a combination of Hohner Electric Piano from the 70s with 8 bit sound which will use DTrace and its ROOT privileges. The sound is a square frequency which can be used to hide communication or something we dont know yet. And it does not matter anymore... With a direct connection to GitHub or just the internet ... Any code can be signed and stored anywhere .... The codename is ELECTRA, from what I know this tag was used for jailbreak of Siri in the past. So I belive this will be the final act ... Is there somebody to whom I can speak to about this ?? No generic mails ... THX Mike
0
0
20
3d
siri waitlist
How are some people getting access within 10 minutes while I've been waiting for over a week? In fact, it's been 48 hrs since I joined the waitlist. This is honestly ridiculous. I've met all the requirements, enabled every necessary setting, and still have no access. Meanwhile, others are getting approved almost instantly. If the rollout is based on a waitlist, then it should be handled fairly and consistently. Waiting this long for a feature update is extremely frustrating, especially when there has been little to no communication about the delay.
0
0
27
3d
New Siri AI and indexing stuck
New siri AI and indexing stuck for over 65 hours i’ve tried everything. Hard restarting my phone, putting an airplane mode doing the diagnostics, everything, but nothing still helps. I also downloaded the iOS 27 beta on my iPad Air 13 inch M3 and the same thing happened to it I’m waiting over 24 hours now. Does anybody know how to resolve this because I am tired of waiting and waiting
0
0
121
3d
Accessibility & Inclusion
We are developing Apple AI for foreign markets and adapting it for iPhone models 17 and above. When the system language and Siri language are not the same—for example, if the system is in English and Siri is in Chinese—it can cause a situation where Apple AI cannot be used. So, may I ask if there are any other reasons that could cause Apple AI to be unavailable within the app, even if it has been enabled?
Replies
0
Boosts
0
Views
576
Activity
Dec ’25
Image playground stuck
Got new iPhone Boxing Day all works bar image playground uninstalled/reinstalled turns ai on/off still stuck
Replies
1
Boosts
0
Views
601
Activity
Dec ’25
Threading issues when using debugger
Hi, I am modifying the sample camera app that is here: https://developer.apple.com/tutorials/sample-apps/capturingphotos-camerapreview ... In the processPreviewImages, I am using the Vision APIs to generate a segmentation mask for a person/object, then compositing that person onto a different background (with some other filtering). The filtering and compositing is done via CoreImage. At the end, I convert the CIImage to a CGImage then to a SwiftUI Image. When I run it on my iPhone, it works fine, and has not crashed. When I run it on the iPhone with the debugger, it crashes within a few seconds with: EXC_BAD_ACCESS in libRPAC.dylib`std::__1::__hash_table<std::__1::__hash_value_type<long, qos_info_t>, std::__1::__unordered_map_hasher<long, std::__1::__hash_value_type<long, qos_info_t>, std::__1::hash, std::__1::equal_to, true>, std::__1::__unordered_map_equal<long, std::__1::__hash_value_type<long, qos_info_t>, std::__1::equal_to, std::__1::hash, true>, std::__1::allocator<std::__1::__hash_value_type<long, qos_info_t>>>::__emplace_unique_key_args<long, std::__1::piecewise_construct_t const&, std::__1::tuple<long const&>, std::__1::tuple<>>: It had previously been working fine with the debugger, so I'm not sure what has changed. Is there a difference in how the Vision APIs are executed if the debugger is attached vs. not?
Replies
1
Boosts
0
Views
554
Activity
Jan ’26
Image Playground files suddenly not available
My app lets you create images with Image Playground. When the user approves an image I move it to the documents dir from the temp storage. With over a year of usage I’ve created a lot of images over time. Out of nowhere the app stopped loading my custom creations from Image Playground saying it couldn’t find the files. It still had my VoiceOver strings I had added for each image and still had the custom categories I assigned them. Debug code to look in the docs dir doesn’t find them. I downloaded the app’s container and only see the images I created as a test after the problem started. But my ~70MB app is still taking up 300MB on my iPhone so it feels like they’re there but not accessible. Is there anything else I can try?
Replies
2
Boosts
0
Views
1.2k
Activity
Jan ’26
CoreML Instrument Testing Native Clawbot using FM.SyML & OAIC & Diffusion
After running performance test on my CoreML qwen3 vision, I appreciated the update where results were viewable... ON Mac it mentions Ios18 and im not sure if or how to change.. that bottle neck lead to rebuilding CoreML view. I woke up and realized I have all the pieces together... and ended up with a swift package working demo of Clawbot.. the current issue is Im trying to use gguf 3b to code it.. I have become well aware that everything I create using the big models, they soon become the default themes /layouts for everyone else simply asking for this or that (I appoligise) so here I am asking (while looking to schedule meet with dev) if its possible to speak with anyone about th 1000s of Apple Intelligence PCC, Xcode, and vision reports and feedback ive sent , in terms of just general ways I can work more efficiently without the crash... ive already build a TUI for MLX but the tools for coreML while seems promising are not intuitive, but the vision format instruction was nice to see. Anyway my question is:
Replies
0
Boosts
0
Views
335
Activity
Feb ’26
Parallel/Steam processing of Apple Intelligence
I have built a MAC-OS machine intelligence application that uses Apple Intelligence. A part of the application is to preprocess text. For longer text content I have implemented chunking to get around the token limit. However the application performance is now limited by the fact that Apple Intelligence is sequential in operation. This has a large impact on the application performance. Is there any approach to operate Apple Intelligence in a parallel mode or even a streaming interface. As Apple Intelligence has Private Cloud Services I was hoping to be able to send multiple chunks in parallel as that would significantly improve performance. Any suggestions would be welcome. This could also be considered a request for a future enhancement.
Replies
2
Boosts
0
Views
465
Activity
Feb ’26
Best approach for animating a speaking avatar in a macOS/iOS SwiftUI application
I am developing a macOS application using SwiftUI (with an iOS version as well). One feature we are exploring is displaying an avatar that reads or speaks dynamically generated text produced by an AI service. The basic flow would be: Text generated by an AI service Text converted to speech using a TTS engine An avatar (2D or 3D) rendered in the app that animates lip movement synchronized with the speech Ideally the avatar would render locally on the device. Questions: What Apple frameworks would be most appropriate for implementing a speaking avatar? SceneKit RealityKit SpriteKit (for 2D avatars) Is there any recommended way to drive lip-sync animation from speech audio using Apple frameworks? Does AVSpeechSynthesizer expose phoneme or viseme timing information that could be used for avatar animation? If such timing information is not available, what is the recommended approach for synchronizing character mouth animation with speech audio on macOS/iOS? Are there examples of real-time character animation synchronized with speech on macOS/iOS? Any architectural guidance or references would be greatly appreciated.
Replies
0
Boosts
0
Views
932
Activity
Mar ’26
Building Real-Time Voice Input on macOS 26 with SpeechAnalyzer + ScreenCaptureKit
We built an open-source macOS menu bar app that turns speech into text and pastes it into the active app — using SpeechAnalyzer for on-device transcription, ScreenCaptureKit + Vision for screen-aware context, and FluidAudio for speaker diarization in meeting mode. Here's what we learned shipping it on macOS 26. GitHub: github.com/Marvinngg/ambient-voice Architecture The app has two modes: hotkey dictation (press to talk, release to inject) and meeting recording (continuous transcription with a floating panel). Dictation Mode Audio capture uses AVCaptureSession (more on why below). The captured audio feeds into SpeechAnalyzer via an AsyncStream: let transcriber = SpeechTranscriber( locale: locale, transcriptionOptions: [], reportingOptions: [.volatileResults, .alternativeTranscriptions], attributeOptions: [.audioTimeRange, .transcriptionConfidence] ) let analyzer = SpeechAnalyzer(modules: [transcriber]) let (inputSequence, inputBuilder) = AsyncStream.makeStream() try await analyzer.start(inputSequence: inputSequence) While recording, we capture a screenshot of the focused window using ScreenCaptureKit, run Vision OCR (VNRecognizeTextRequest), extract keywords, and inject them into SpeechAnalyzer as contextual bias: let context = AnalysisContext() context.contextualStrings[.general] = ocrKeywords try await analyzer.setContext(context) This improves accuracy for technical terms and proper nouns visible on screen. If your screen shows "SpeechAnalyzer", saying it out loud is more likely to be transcribed correctly. After transcription, an optional L2 step sends the text through a local LLM (ollama) for spoken-to-written cleanup, then CGEvent simulates Cmd+V to paste into the active app. Meeting Mode Meeting mode forks the same audio stream to two consumers: SpeechAnalyzer — real-time streaming transcription, displayed in a floating NSPanel FluidAudio buffer — accumulates 16kHz Float32 mono samples for batch speaker diarization after recording stops When the user ends the meeting, FluidAudio's performCompleteDiarization() runs on the accumulated audio. We align transcription segments with speaker segments using audioTimeRange overlap matching — each transcription segment gets assigned the speaker ID with the most time overlap. Results export to Markdown. Pitfalls We Hit on macOS 26 1. AVAudioEngine installTap doesn't fire with Bluetooth devices We started with AVAudioEngine.inputNode.installTap() for audio capture. It worked fine with built-in mics but the tap callback never fired with Bluetooth devices (tested with vivo TWS 4 Hi-Fi). Fix: switched to AVCaptureSession. The delegate callback captureOutput(_:didOutput:from:) fires reliably regardless of audio device. The tradeoff is you get CMSampleBuffer instead of AVAudioPCMBuffer, so you need a conversion step. 2. NSEvent addGlobalMonitorForEvents crashes Our global hotkey listener used NSEvent.addGlobalMonitorForEvents. On macOS 26, this crashes with a Bus error inside GlobalObserverHandler — appears to be a Swift actor runtime issue. Fix: switched to CGEventTap. Works reliably, but the callback runs on a CFRunLoop context, which Swift doesn't recognize as MainActor. 3. CGEventTap callbacks aren't on MainActor If your CGEventTap callback touches any @MainActor state, you'll get concurrency violations. The callback runs on whatever thread owns the CFRunLoop. Fix: bridge with DispatchQueue.main.async {} inside the tap callback before touching any MainActor state. 4. CGPreflightScreenCaptureAccess doesn't request permission We used CGPreflightScreenCaptureAccess() as a guard before calling ScreenCaptureKit. If it returned false, we'd bail out. The problem: this function only checks — it never triggers macOS to add your app to the Screen Recording permission list. Chicken-and-egg: you can't get permission because you never ask for it. Fix: call CGRequestScreenCaptureAccess() at app startup. This adds your app to System Settings → Screen Recording. Then let ScreenCaptureKit calls proceed without the preflight guard — SCShareableContent will also trigger the permission prompt on first use. 5. Ad-hoc signing breaks TCC permissions on every rebuild During development, codesign --sign - (ad-hoc) generates a different code directory hash on every build. macOS TCC tracks permissions by this hash, so every rebuild = new app identity = all permissions reset. Fix: sign with a stable certificate. If you have an Apple Development certificate, use that. The TeamIdentifier stays constant across rebuilds, so TCC permissions persist. We also discovered that launching via open WE.app (LaunchServices) instead of directly executing the binary is required — otherwise macOS attributes TCC permissions to Terminal, not your app. Benchmarks We ran end-to-end benchmarks on public datasets (Mac Mini M4 16GB, macOS 26): Transcription (SpeechAnalyzer, AliMeeting Chinese): • Near-field CER 34% (excluding outliers ~25%) • Far-field CER 40% (single channel, no beamforming, >30% overlap) • Processing speed 74-89x real-time Speaker diarization (FluidAudio offline): • AMI English 16 meetings: avg DER 23.2% (collar=0.25s, ignoreOverlap=True) • AliMeeting Chinese 8 meetings: DER 48.5% (including overlap regions) • Memory: RSS ~500MB, peak 730-930MB Full evaluation methodology, scripts, and raw results are in the repo. Open Source The project is MIT licensed: github.com/Marvinngg/ambient-voice It includes the macOS client (Swift 6.2, SPM), server-side distillation/training scripts (Python), and a complete evaluation framework with reproducible benchmarks. Feedback and contributions welcome.
Replies
0
Boosts
0
Views
805
Activity
Mar ’26
Programmatic image creation using ImageCreator
Hello, Could you please provide details for maximum string length of the prompt and the title when using ImageCreator and the method extracted(from:title:)? static func extracted( from text: String, title: String? = nil ) -> ImagePlaygroundConcept Any additional details or example of prompt and title would help. Additionally, are ImagePlaygroundStyle.animation, ImagePlaygroundStyle.illustration and ImagePlaygroundStyle.sketch all available when using extracted(from:title:)? I am trying to generate images programmatically and would appreciate your guidance. Thank you.
Replies
1
Boosts
0
Views
586
Activity
Mar ’26
Genmoji API — are local edits and non-text usage (e.g., widgets) allowed?
Hi, I'm integrating the Genmoji API (NSAdaptiveImageGlyph) into my app and would like to confirm two things: Local editing of generated Genmoji. After the user creates a Genmoji, can the app apply edits to the resulting image (e.g., pixelation)? The edited image would be stored only on-device within the app and never shared externally. Use outside text contexts. Can a generated Genmoji be used in other parts of the app, such as a home screen widget? Apple's documentation and the WWDC24 session focus on inline text, stickers, and Tapbacks, but I couldn't find explicit guidance on widget or other UI usage. I checked the Human Interface Guidelines, WWDC24 session "Bring expression to your app with Genmoji," and the App Store Review Guidelines, but couldn't find clear answers. Any guidance or pointers would be appreciated. Thanks!
Replies
0
Boosts
0
Views
1.7k
Activity
Apr ’26
Visual Intelligence API SemanticContentDescriptor labels are empty
I'm trying to use Apple's new Visual Intelligence API for recommending content through screenshot image search. The problem I encountered is that the SemanticContentDescriptor labels are either completely empty or super misleading, making it impossible to query for similar content on my app. Even the closest matching example was inaccurate, returning a single label ["cardigan"] for a Supreme T-Shirt. I see other apps using this API like Etsy for example, and I'm wondering if they're using the input pixel buffer to query for similar content rather than using the labels? If anyone has a similar experience or something that wasn't called out in the documentation please lmk! Thanks.
Replies
2
Boosts
0
Views
1.3k
Activity
2w
Cloud based Siri AI for older devices
If some Siri AI features work on cloud why not for older device? Apple Devs mentioned that the Siri AI is available for iPhone 15 and later devices and we can use iCloud+ Subscription to use it more on Apple’s Private Cloud Compute (PCC). So if it can run some functions on Private Cloud why not for people with older device? Or will Apple make it compatible with older devices on upcoming beta updates? Also, I noticed that other Phone companies have these features in phones that are way less powerful than iPhone 11 (Which is a oldest device support iOS 27) I tested some ai models from 3rd party apps that are less than 300MB and it worked very well than the old siri so can We add some Siri features to older iPhone? So we can avoid people from switching to other phones. Because I don’t want my friends leaving iPhone.
Replies
0
Boosts
0
Views
75
Activity
5d
A really long waitlist
Since i got the iPadOS27 update signed in the waitlist and haven’t got any new yet
Replies
2
Boosts
0
Views
61
Activity
5d
Waiting for indexing and Siri setup
I have been waiting for both of these going on 24 hours with nothing happening
Replies
0
Boosts
0
Views
54
Activity
5d
Siri Ai waitlist.
I’ve been waiting since 2:30 PM EST yesterday. It’s been well over 24 hours. It’s never taken this long before my phones finish indexing, so I know it’s not that. Also signed up for it right away. And still not accepted, which is very weird because I’ve seen it on Reddit pages where people have signed up earlier today and have been accepted already. No clue what’s going on. Wish we can get some type of answer
Replies
0
Boosts
0
Views
20
Activity
4d
Nearly 70 hours on Siri waitlist
Downloaded the moment WWDC presentation ended at 11a PT. Phone 17 Air, English (US) Plugged in over (3) nights Still indexing Rebooted 5+ times Still waitlisted. The only feature anyone cares about is gated. So much for “available immediately“. Losing trust with Apple fast here.
Replies
0
Boosts
0
Views
175
Activity
3d
S5 - Specific Siri Security Situation in Slovakia
Dears, I have reached out to Apple Research and Apple Security but this is NOT really for them. This is a developer topic !! Apple Research and Security are trying to find a malicious code, bugs ect, but what I am whitnessing is different and much much deeper into the code. Apple Intelligence in Slovakia is much more limited then in other countries. A specific security configuration due to EU regulations in combination with Siri NOT able to speak or understand Slovak. At low level this combination with a small PUSH with good timing, makes the devices to completely strip themselves off, of all security and trust certifications. What follows is a blank completely from scratch processed reinstall, where the attacker only prepares the "CORRECT" files and information and all the work is done by Apple system itself !!! The result is a complete domination of hardware using the NPU (ANE) chip, which does all the job. And I mean each pixel, sound, connection ect... What is the MOST ALARMING is that due to the proud declaration of customer data privacy this is the exact spot where if something like this happens, Apple will NOT be able to see it. The Customer is then in an extreme situation, where he knows that the devices, accounts, keychain, bank account, each app, each picture or sound.... Everything is compromised, but online help and the retailers are too short for this and further to this Apple DONT HAVE AN OFFICE in Slovakia. Only thing left are the contracted service (repair) shops, which are capable to perform a DFU Restore, which does NOT help. I have requested DFU Restore approx 15x in the last 9 months. Once you turn on and you only pick the language, there is a GLITCH and you know this is back again. A very quick and not too detailed process: It is a very silent and extremely sophisticated takeover without an ovious crash at the beginning. Using various tools, which I can describe and present examples. One variation is a HTML code a DOM which is recursive, calling functions and cancelling. Too many functions with offset which results in a graphics freeze, overload or similiar.. The object itself is not frozen and it is carefully prepared !! It will mostly copy and clone the target and NEST inside without knowing. What happens here is that this recursive DOM was applied and therefore the SHUTDOWN MONITOR LOG occured. This froze also mds index which blocked the mounting and unmounting of Volumes.... This is ofcourse carefully instrumented not to raise any attention. Same structure can be used in any code, any language, pdf, it can be nested in a wallpaper or a standard image, library, anywhere ... I can provide a proof and a functional script... The install log is showing - Untracked client connected - RemoteManagement which REINSTALLED the OS. After that Launchd skipps almost all tasks on the next run .... After this mounting volumes block, the system will not restart as standard, insted forced to early boot as possible which starts with PKI TRUST and SIRI UNDERSTANDING ... The PKI TRUST is manipulated and prepared and Siri is not called by the system as Apple Inteligence. So with reinstalled and carefully prepared OS, Launchd who skipped most tasks at the start and without proper encryption. There is a direct open path to Siri and her ASR HAMMERING.... I have personally checked almost 10 different electronics shops and checked the console on each Macbook that was free to try. In each of them these four Protolol logs were the exact same !!!! But after that a brutal iphone reinstall and even over lockdown mode reinstall will follow... Can also provide logs and information... And there is a SIMPLE LOGIC PARADOX with HUGE impact. Any document can be signed by Apple in a second. That is how the PKI TRUST was manipulated without any problem. That is also extremely important ... I can present this, but I must know that somebody is listening.... otherwise the only way is press... Apple Research and Security is blind here and I simply cannot get any answer.... If you know anybody in Slovakia, tell them to go to check this out !!! Get this information to Somebody who could just check it please .... This is probably the largest Supply Chain Attack ever ... And all it takes is a phone call to iStores to Slovakia so they can check for you ... From what I can see, now an update is prepared for Siri. It is based on Ruby but mostly Nokogiri and Gumbo. It will be presented as a 8 bit range training for local LLM, as super fast, but really it will be a combination of Hohner Electric Piano from the 70s with 8 bit sound which will use DTrace and its ROOT privileges. The sound is a square frequency which can be used to hide communication or something we dont know yet. And it does not matter anymore... With a direct connection to GitHub or just the internet ... Any code can be signed and stored anywhere .... The codename is ELECTRA, from what I know this tag was used for jailbreak of Siri in the past. So I belive this will be the final act ... Is there somebody to whom I can speak to about this ?? No generic mails ... THX Mike
Replies
0
Boosts
0
Views
20
Activity
3d
Siri Ai
I don’t think this Siri waitlist is normal, I am on iPhone 16 Pro Max in the US set to English and I’ve been on the new Siri waitlist for >48 hours. Is this a bug?
Replies
0
Boosts
0
Views
48
Activity
3d
siri waitlist
How are some people getting access within 10 minutes while I've been waiting for over a week? In fact, it's been 48 hrs since I joined the waitlist. This is honestly ridiculous. I've met all the requirements, enabled every necessary setting, and still have no access. Meanwhile, others are getting approved almost instantly. If the rollout is based on a waitlist, then it should be handled fairly and consistently. Waiting this long for a feature update is extremely frustrating, especially when there has been little to no communication about the delay.
Replies
0
Boosts
0
Views
27
Activity
3d
New Siri AI and indexing stuck
New siri AI and indexing stuck for over 65 hours i’ve tried everything. Hard restarting my phone, putting an airplane mode doing the diagnostics, everything, but nothing still helps. I also downloaded the iOS 27 beta on my iPad Air 13 inch M3 and the same thing happened to it I’m waiting over 24 hours now. Does anybody know how to resolve this because I am tired of waiting and waiting
Replies
0
Boosts
0
Views
121
Activity
3d