Explore the integration of media technologies within your app. Discuss working with audio, video, the camera, and other media functionality.

All subtopics
Posts under the Media Technologies topic. Each post below is listed with its reply count, boost count, view count, and time since last activity.
PDFKit doesn't return the correct page
Hello, We are occasionally seeing incorrect behavior with the PDFDocument method func page(at index: Int) -> PDFPage?. With certain PDF files, this method returns the wrong PDFPage. This occurs on iOS 18.3, 18.5, and 18.6.2 (and maybe on other versions). Try this PDF for instance (page 81 is returned when index = 2): https://drive.google.com/open?id=1MHm2wjfsbWB8OiRmARUMmvODYxp4DIqP&usp=drive_fs Also, this doesn't occur systematically with this PDF: when we make a copy of the file we don't observe the issue. Could this be linked to some kind of internal cache issue?
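A minimal diagnostic sketch for this kind of mismatch: it cross-checks page(at:) against index(for:) across the whole document, so a discrepancy can be logged the moment it appears. The bundled file name "sample.pdf" is only a placeholder.

import PDFKit

// Cross-check page(at:) against index(for:) for every page in the document.
// A mismatch would confirm the behavior described in the post above.
func auditPageIndices(in document: PDFDocument) {
    for index in 0..<document.pageCount {
        guard let page = document.page(at: index) else {
            print("No page returned for index \(index)")
            continue
        }
        let reportedIndex = document.index(for: page)
        if reportedIndex != index {
            print("Mismatch: requested index \(index), document reports index \(reportedIndex)")
        }
    }
}

// Placeholder resource name; use the problematic PDF from the post instead.
if let url = Bundle.main.url(forResource: "sample", withExtension: "pdf"),
   let document = PDFDocument(url: url) {
    auditPageIndices(in: document)
}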
Replies: 1 · Boosts: 0 · Views: 273 · Activity: 1w
ProRAW: Demystify 48MP vs 12MP binning based on lighting?
Hi everyone, does anybody have any resources I could check out regarding the 48 MP → 12 MP binning behavior on supported sensors? I know the 48 MP sensor on iPhone can automatically bin pixels for better low-light performance, but I'm not sure how to reliably make this happen in practice. On iPhone 14 Pro+ with a 48 MP sensor, I want the best of both worlds for ProRAW:
∙ Bright light: 48 MP full resolution
∙ Low light: 12 MP pixel-binned for better noise
photoOutput.maxPhotoDimensions = CMVideoDimensions(width: 8064, height: 6048)
let settings = AVCapturePhotoSettings(rawPixelFormatType: proRawFormat, processedFormat: [...])
settings.photoQualityPrioritization = .quality
// NOT setting settings.maxPhotoDimensions: always get 12MP
When I omit maxPhotoDimensions, iOS always returns 12 MP regardless of lighting. When I set it to 48 MP, I always get 48 MP. Is there an API to let iOS automatically choose the optimal resolution based on conditions, or should I detect low light myself (via device.iso / exposureDuration) and set maxPhotoDimensions accordingly? Any help or direction would be much appreciated!
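A sketch of the manual fallback the post mentions at the end: read device.iso and exposureDuration, decide whether the scene looks dark, and pick the smallest or largest supported photo dimensions accordingly. The ISO 800 and 1/20 s thresholds are illustrative values that would need tuning, not Apple guidance.

import AVFoundation

// Choose photo dimensions based on a rough low-light heuristic.
// Thresholds are placeholders; tune them against real captures.
func preferredPhotoDimensions(for device: AVCaptureDevice,
                              format: AVCaptureDevice.Format) -> CMVideoDimensions {
    let exposureSeconds = CMTimeGetSeconds(device.exposureDuration)
    let looksDark = device.iso > 800 || exposureSeconds > 0.05

    // Smallest supported dimensions (binned) in low light, largest (full 48 MP) otherwise.
    let sorted = format.supportedMaxPhotoDimensions
        .sorted { $0.width * $0.height < $1.width * $1.height }
    return (looksDark ? sorted.first : sorted.last)
        ?? CMVideoDimensions(width: 4032, height: 3024)
}

You would then assign the result to settings.maxPhotoDimensions just before calling capturePhoto(with:delegate:).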
Replies: 0 · Boosts: 0 · Views: 208 · Activity: 1w
AVCaptureDevice.isFaceDrivenAutoFocusEnabled not working with continuousAutoFocus
Hi everyone, I'm developing a camera application that requires precise, predictable control over the focus system, and I'm encountering unexpected behavior with face-driven autofocus in continuous autofocus mode. Issue: when using AVCaptureDevice.FocusMode.continuousAutoFocus, the system continues to prioritize faces for focus even after attempting to disable face-driven autofocus with:
device.automaticallyAdjustsFaceDrivenAutoFocusEnabled = false
device.isFaceDrivenAutoFocusEnabled = false
Observations: the behavior is inconsistent across different scenes. In well-lit, properly exposed scenes, focus persistently locks onto faces, ignoring my configuration. In underexposed scenes, the intended focus behavior is more consistently respected.
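For comparison, a sketch of the configuration sequence with everything done under one configuration lock, disabling automatic adjustment of the face-driven flag before clearing the flag itself and before setting the focus mode; whether this ordering changes the behavior described above is something to verify, not a documented fix.

import AVFoundation

// Disable face-driven AF, then select continuous autofocus, all inside one lock.
func disableFaceDrivenFocus(on device: AVCaptureDevice) throws {
    try device.lockForConfiguration()
    defer { device.unlockForConfiguration() }

    // Stop the system from managing the flag before setting it manually.
    device.automaticallyAdjustsFaceDrivenAutoFocusEnabled = false
    device.isFaceDrivenAutoFocusEnabled = false

    if device.isFocusModeSupported(.continuousAutoFocus) {
        device.focusMode = .continuousAutoFocus
    }
}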
Replies: 0 · Boosts: 0 · Views: 263 · Activity: 1w
CIRAWFilter.outputImage first-time cost is huge (~3s), subsequent calls are ~3ms. Any official way to pre-initialize RAW pipeline (without taking a real photo)?
Hi Apple Developer Forums, I’m developing an iOS camera app that processes RAW captures using Core Image. I’m seeing a large “first use” performance penalty specifically when creating the CIImage from CIRAWFilter.outputImage.
What’s slow (important detail): I’m measuring the time for:
let rawFilter = CIRAWFilter(imageData: rawData, identifierHint: hint)
let ciImage = rawFilter.outputImage
This is not CIContext.render(...) / createCGImage(...). It’s just the time to access outputImage (i.e., building the Core Image graph / RAW pipeline setup).
Observed behavior: the first access to CIRAWFilter.outputImage takes ~3 seconds; the second access (same app session, similar RAW) takes ~3 milliseconds. So something heavy is happening only on first use (decoder initialization, pipeline setup, shader/library compilation, caching, etc.). Using Metal System Trace, I also noticed that during the slow first call there are many “Create MTLLibrary” events, while the second call doesn’t show this pattern.
Warm-up attempts using a bundled DNG: I tried to “warm up” early (e.g., on camera screen entry) by loading a bundled DNG and accessing CIRAWFilter.outputImage before taking a photo:
∙ Warm-up with a ~247 KB DNG → first real RAW outputImage cost drops to ~1.42 s
∙ Warm-up with a ~25 MB DNG → first real RAW outputImage cost drops to ~843 ms
This helps, but it’s still far from the steady-state ~3 ms.
Warm-up by capturing a real RAW (works, but concerns): the only method that fully eliminates the delay is to trigger a real RAW capture programmatically before the user’s first photo, then use that captured rawData to warm up the CIRAWFilter.outputImage path. This brings the first user-facing capture close to the steady-state timing. However, in some regions the camera shutter sound cannot be suppressed, so a hidden warm-up capture is unacceptable UX. I’m also unsure whether triggering a real capture without an explicit user action could raise compliance/privacy concerns, even if the image is immediately discarded and never saved or uploaded.
Questions:
1. Is the large first-time cost of CIRAWFilter.outputImage expected (RAW pipeline initialization / shader compilation)?
2. Is there an Apple-recommended way to pre-initialize the Core Image RAW pipeline / Metal resources so the first outputImage is fast, without taking a real photo?
3. Are there any best practices (e.g. CIContext creation timing, prepareRender(...), specific options) that reliably reduce this first-use overhead for CIRAWFilter?
Attachments: Figure 1 shows the first RAW capture with no warm-up (~3 s outputImage time); Figure 2 shows the first RAW capture after warm-up with a bundled DNG (improved, but still hundreds of ms).
Thanks for any guidance or experience sharing!
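A sketch of the bundled-DNG warm-up described above, extended to also render a small tile through a shared CIContext so that any shader compilation happens during warm-up rather than on the first real capture. The file name "warmup.dng" is a placeholder, and whether this fully reaches steady-state timing is exactly the open question in the post.

import CoreImage

// Shared context reused for real captures later; creating it early is part of the warm-up.
let sharedContext = CIContext()

func warmUpRAWPipeline() {
    guard let url = Bundle.main.url(forResource: "warmup", withExtension: "dng"),
          let data = try? Data(contentsOf: url),
          let rawFilter = CIRAWFilter(imageData: data, identifierHint: nil),
          let image = rawFilter.outputImage else { return }

    // Render a tiny crop and discard it; only the side effect (pipeline setup) matters.
    let tile = CGRect(x: 0, y: 0, width: 64, height: 64)
    _ = sharedContext.createCGImage(image, from: tile)
}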
Replies: 3 · Boosts: 2 · Views: 517 · Activity: 1w
SpeechAnalyzer error "asset not found after attempted download" for certain languages
I am trying to use the new SpeechAnalyzer framework in my Mac app, and am running into an issue for some languages. When I call AssetInstallationRequest.downloadAndInstall() for some languages, it throws an error: Error Domain=SFSpeechErrorDomain Code=1 "transcription.ar asset not found after attempted download." The ".ar" appears to be the language code, which in this case was Arabic. When I call AssetInventory.status(forModules:) before attempting the download, it is giving me a status of "downloading" (perhaps from an earlier attempt?). If this language were completely unsupported, I would expect it to return a status of "unsupported", so I'm not sure what's going on here. For other languages (Polish, for example) SpeechTranscriber.supportedLocale(equivalentTo:) is returning nil, so that seems like a clearly unsupported language. But I can't tell if the languages I'm trying, like Arabic, are supported and something is going wrong, or if this error represents something I can work around. Here's the relevant section of code; the error is thrown from downloadAndInstall(), so I never even get as far as setting up the SpeechAnalyzer itself.

private func setUpAnalyzer() async throws {
    guard let sourceLanguage else {
        throw Error.languageNotSpecified
    }
    guard let locale = await SpeechTranscriber.supportedLocale(
        equivalentTo: Locale(identifier: sourceLanguage.rawValue)) else {
        throw Error.unsupportedLanguage
    }

    let transcriber = SpeechTranscriber(locale: locale, preset: .progressiveTranscription)
    self.transcriber = transcriber

    let reservedLocales = await AssetInventory.reservedLocales
    if !reservedLocales.contains(locale) && reservedLocales.count == AssetInventory.maximumReservedLocales {
        if let oldest = reservedLocales.last {
            await AssetInventory.release(reservedLocale: oldest)
        }
    }

    do {
        let status = await AssetInventory.status(forModules: [transcriber])
        print("status: \(status)")
        if let installationRequest = try await AssetInventory.assetInstallationRequest(supporting: [transcriber]) {
            try await installationRequest.downloadAndInstall()
        }
    }
    ...
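A pre-flight sketch built only from the calls already quoted above, to separate a locale that SpeechTranscriber simply doesn't support from one whose asset download appears stuck in a stale state; the import name and the usefulness of logging the status this way are assumptions.

import Speech

// Log what the asset inventory reports for a locale before attempting the download.
func checkAssetState(for identifier: String) async throws {
    guard let locale = await SpeechTranscriber.supportedLocale(
        equivalentTo: Locale(identifier: identifier)) else {
        print("\(identifier): unsupported by SpeechTranscriber")
        return
    }

    let transcriber = SpeechTranscriber(locale: locale, preset: .progressiveTranscription)
    let status = await AssetInventory.status(forModules: [transcriber])
    print("\(identifier): status before download = \(status)")

    if let request = try await AssetInventory.assetInstallationRequest(supporting: [transcriber]) {
        try await request.downloadAndInstall()
        print("\(identifier): download completed")
    } else {
        print("\(identifier): nothing to install")
    }
}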
Replies: 8 · Boosts: 0 · Views: 830 · Activity: 1w
Live streaming issue with RTSP
We have the application 'ADS Smart', a companion application for our ADS Dashcam. We offer a feature that lets users stream the live footage of the dashcam cameras through the app. Currently we are experiencing a delay of 30+ seconds before the live stream appears, i.e. the first frame of the live footage takes around 30+ seconds to display in the app. We are using the MobileVLCKit library to stream the videos in the app. The current flow of the code:
1. Flutter triggers the native playback via a method channel. The Dart side calls the iOS method channel <identifier_name>/ts_player with method playTSFromURL, passing: url (e.g. rtsp://... for live), playerId, viewId (a stable ID used to host the native UI), showControls, and an optional localIp.
2. AppDelegate receives the call and prepares networking. Entry point: the AppDelegate.tsChannel handler for "playTSFromURL" in AppDelegate.swift. It resolves the Wi-Fi interface and local IP if possible, sets VLC_SOURCE_ADDRESS to the Wi-Fi IP (when available) to prefer Wi-Fi for the stream, uses NWPathMonitor and direct interface inspection to find the Wi-Fi interface (e.g. en0) and its IP, and kicks off best-effort route priming to the dashcam IP/ports (non-blocking); see establishWiFiRoutePriority.
3. AppDelegate chooses the right player implementation. createPlayerForURL(_:) decides: RTSP (rtsp://...) → use the VLCKit-backed player (class TSStreamPlayer, which provides a VLC-backed video player for iOS, handling playback of Transport Stream (TS) URLs with strict main-thread UI updates, view safety, and stream management, using MobileVLCKit); .ts files → use the VLCKit-based player for playing already-recorded videos in the app. If the selected player supports extras (e.g. TSStreamPlayerExtras), it sets the local IP (if resolved) and the Wi-Fi interface name.
4. AppDelegate creates the native 'platform view' container and overlay. platformView(for:parent:showControls:) creates a container UIView attached to the Flutter root view and adds a dedicated child videoHost[viewId], the host UIView for rendering video. If showControls == true, it adds TSPlayerControlsOverlay over the video and wires overlay callbacks back to Flutter via controlChannel (/player_controls). If showControls == false, it adds a minimal back button and wires it to onGalleryBack.
5. The player starts playback inside the host view. It calls player.playTSFromURL(urlString, in: host) { success, error in ... } on the main thread. For RTSP/TS streams this is handled by TSStreamPlayer (VLCKit).
6. Success/failure is reported back to Flutter. The completion closure invoked in step 5 returns true on first real playback or an error message on failure. The method channel result responds: true → Flutter knows playback started; FlutterError → Flutter can show an error.
7. Stopping and cleanup. "stop" on tsChannel stops and disposes the player(s). "removePlatformView" removes the overlay, back button, host, and container, and disposes any remaining players.
I am attaching the logs of the app while it is running. The actual issue is that when the iOS device is connected to the dashcam's Wi-Fi, iOS uses mobile data for the app's live-streaming traffic even though Wi-Fi is supposed to be the main communication channel. The device takes approximately 30 seconds to display the first frame of live footage in the app: despite being connected to the dashcam's Wi-Fi, it only resolves the Wi-Fi interface (en0) after multiple attempts, which causes the live footage to appear only after this delay.
So how can we configure things so that the live footage from the dashcam cameras is displayed on the iOS device within 2 to 3 seconds? ios.txt
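Two latency-related ideas worth trying on the VLC side, both assumptions to verify against your MobileVLCKit version: shrink the network cache so the first frame isn't held back by buffering, and force RTSP over TCP so stream setup doesn't wait for UDP attempts to time out. The option strings are standard libVLC flags; they don't address the Wi-Fi vs. mobile-data routing question, which is a separate interface-binding issue.

import UIKit
import MobileVLCKit

// Build a player tuned for a low-latency local RTSP stream.
func makeLowLatencyPlayer(for url: URL, in hostView: UIView) -> VLCMediaPlayer {
    let media = VLCMedia(url: url)
    media.addOption(":network-caching=300")   // milliseconds of buffering
    media.addOption(":rtsp-tcp")              // interleave RTP over the RTSP TCP connection

    let player = VLCMediaPlayer()
    player.media = media
    player.drawable = hostView
    return player
}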
Replies: 0 · Boosts: 0 · Views: 165 · Activity: 1w
iOS Speech Error on Mobile Simulator (Error fetching voices)
I'm writing a simple app for iOS and I'd like to be able to do some text to speech in it. I have a basic audio manager class with a "speak" function:

import Foundation
import AVFoundation

class AudioManager {
    static let shared = AudioManager()
    var audioPlayer: AVAudioPlayer?
    var isPlaying: Bool {
        return audioPlayer?.isPlaying ?? false
    }
    var playbackPosition: TimeInterval = 0

    func playSound(named name: String) {
        guard let url = Bundle.main.url(forResource: name, withExtension: "mp3") else {
            print("Sound file not found")
            return
        }
        do {
            if audioPlayer == nil || !isPlaying {
                audioPlayer = try AVAudioPlayer(contentsOf: url)
                audioPlayer?.currentTime = playbackPosition
                audioPlayer?.prepareToPlay()
                audioPlayer?.play()
            } else {
                print("Sound is already playing")
            }
        } catch {
            print("Error playing sound: \(error.localizedDescription)")
        }
    }

    func stopSound() {
        if let player = audioPlayer {
            playbackPosition = player.currentTime
            player.stop()
        }
    }

    func speak(text: String) {
        let synthesizer = AVSpeechSynthesizer()
        let utterance = AVSpeechUtterance(string: text)
        utterance.voice = AVSpeechSynthesisVoice(language: "en-GB")
        synthesizer.speak(utterance)
    }
}

And my app shows text in a ScrollView:

ScrollView {
    Text(self.description)
        .padding()
        .foregroundColor(.black)
        .font(.headline)
        .background(Color.gray.opacity(0))
}.onAppear {
    AudioManager.shared.speak(text: self.description)
}

However, the text doesn't get read out (in the simulator). I see some output in the console: Error fetching voices: Swift.DecodingError.dataCorrupted(Swift.DecodingError.Context(codingPath: [], debugDescription: "Invalid container metadata for _UnkeyedDecodingContainer, found keyedGraphEncodingNodeID", underlyingError: nil)). Using fallback voices. I'm probably doing something wrong here, but not sure what.
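One thing to check, independent of the simulator's voice warning: in the code above the AVSpeechSynthesizer is created as a local variable inside speak(text:), so it can be deallocated before the utterance is actually spoken. A minimal sketch that keeps the synthesizer alive:

import AVFoundation

class SpeechManager {
    static let shared = SpeechManager()

    // Keep a strong reference; a locally scoped synthesizer may be released
    // before it ever starts (or finishes) speaking.
    private let synthesizer = AVSpeechSynthesizer()

    func speak(_ text: String) {
        let utterance = AVSpeechUtterance(string: text)
        utterance.voice = AVSpeechSynthesisVoice(language: "en-GB")
        synthesizer.speak(utterance)
    }
}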
Replies: 5 · Boosts: 1 · Views: 642 · Activity: 1w
Broadcast Upload Extension does not work on Apple Vision Pro, throws "getMXSessionProperty unsupported"
I am working on a screen recording feature on Apple Vision Pro using a Broadcast Upload Extension. After I tap the record button, the Xcode console shows the exception: <<<< FigAudioSession(AV) >>>> audioSessionAVAudioSession_CopyMXSessionProperty signalled err=-19224 (kFigAudioSessionError_UnsupportedOperation) (getMXSessionProperty unsupported) at FigAudioSession_AVAudioSession.m:606
We created and configured the project as follows:
1. Create an Apple Vision project.
2. Create a Broadcast Upload Extension target.
3. Add an App Group for the project target and the extension target, both using the same identifier.
4. Add the "Main Camera Access" and "Passthrough in Screen Capture" capabilities for all targets.
5. Add "NSScreenCaptureUsageDescription" and "NSMicrophoneUsageDescription" to the Info.plist.
6. Add a record button in the view.
7. Run a debug build on the Apple Vision Pro device; after tapping the record button, the exception above is thrown.
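For reference, a bare-bones upload extension handler using the standard ReplayKit API looks like the sketch below; it only classifies incoming buffers and deliberately does nothing with the audio session, since the logged error comes from that layer. Whether the kFigAudioSessionError message is fatal on visionOS or just noise is not something this sketch answers.

import ReplayKit

class SampleHandler: RPBroadcastSampleHandler {

    override func broadcastStarted(withSetupInfo setupInfo: [String: NSObject]?) {
        // Called once when the broadcast begins.
    }

    override func processSampleBuffer(_ sampleBuffer: CMSampleBuffer,
                                      with sampleBufferType: RPSampleBufferType) {
        switch sampleBufferType {
        case .video:
            break // forward video frames to your uploader here
        case .audioApp, .audioMic:
            break // audio buffers may be absent depending on platform support
        @unknown default:
            break
        }
    }
}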
Replies: 1 · Boosts: 0 · Views: 509 · Activity: 2w
Photos & Imaging regression: Photos app folders no longer function as folders in iOS 26
I’m writing to report a serious usability regression in the iOS 26 Photos app. Folders can still be created and albums can still be assigned to them, but folders can no longer be opened to view the albums they contain. A container that cannot be opened is not a container, and this breaks a fundamental information architecture model that has existed in Photos for well over a decade. This change disproportionately harms users who maintain large, intentional photo libraries—travel archives, projects, professional work, or long-term personal documentation—where hierarchy and ordering are essential. Search and automated surfacing are not substitutes for deliberate structure. Removing the ability to browse folder → album hierarchy on iOS strips users of control while still exposing the UI for folder creation, which is internally inconsistent. If this behavior is intentional, it should be clearly documented and the folder UI removed to avoid misleading users. If it is not intentional, it needs urgent correction. At minimum, iOS should retain parity with macOS Photos for basic navigation of folders and albums. This is not a niche request; it is a regression in core functionality.
Replies: 1 · Boosts: 0 · Views: 571 · Activity: 2w
sysEx struct in CoreMIDI/MIDIMessages.h
The sysEx struct in the MIDIUniversalMessage struct has a channel member, but the System Exclusive (7-Bit) message doesn't have a channel field. Conversely, the System Exclusive (7-Bit) message has a "number of bytes" field, but the sysEx struct doesn't have an nrOfBytes, byteCount, or bytesUsed member. It looks like the channel member of the sysEx struct actually contains the number of used bytes. Is this a mistake in the header, or did I misunderstand something?
Replies: 1 · Boosts: 0 · Views: 551 · Activity: 2w
Indexing of Music App
Recently, after updating to macOS 26.3 (Tahoe), the ordering in my Music app has been horrible at best: music disappearing and tracks not aligning with their albums (even when the albums are from different years). It's created quite a problem, because the disappearing-tracks issue seems to be replicating to my iOS devices as well (although track numbering and album association seem to be stable there). Has anyone else heard of this issue?
Replies: 0 · Boosts: 0 · Views: 212 · Activity: 2w
AVAudioEngine obtains channel audio data
Currently, I have successfully used ChannelMap to map hardware input channels and obtained audio data from the hardware device's MIC and OTG inputs. Additionally, I have used ChannelMap to map output channels to freely feed data for playback to each output channel. However, I now have a problem. I have a hardware device that only has output channels (no input channels), and the system has set this hardware device as the default playback device. In this case, how can I obtain the audio data being played to the output channels for modification?
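If the audio being played comes from your own AVAudioEngine, a tap on the main mixer node lets you observe the buffers headed for the output-only device, as in the sketch below; it does not capture audio played by other apps or by the system, and for true in-place modification an effect node inserted before the output is the better fit.

import AVFoundation

let engine = AVAudioEngine()

// Observe the mixed audio just before it reaches the output device.
func startObservingOutput() throws {
    // Attach and connect your player/source nodes before starting in a real app.
    let mixer = engine.mainMixerNode
    let format = mixer.outputFormat(forBus: 0)

    mixer.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
        // Inspect or copy the PCM samples here.
        _ = buffer.frameLength
    }

    try engine.start()
}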
Replies: 0 · Boosts: 0 · Views: 202 · Activity: 2w
_MPRemoteCommandEventDispatch crashes on iOS 26.x devices.
I'm seeing crashes in _MPRemoteCommandEventDispatch on iOS 26.x devices in 3 apps. According to Bugsnag logs they are:
NSInternalInconsistencyException: event dispatch <_MPRemoteCommandEventDispatch: <MPRemoteCommandEvent: 0x11c049500 commandID=THV0 command=<MPRemoteCommand: 0x109ad1ea0 type=Play (0) enabled=YES handlers=[0x109b6a310]> sourceID=(null) ([HostedRoutingSessionDataSource] handleControlSendingCommand<2W5E>)> state:201> deallocated without calling continuation
I attached a log from the Xcode Organizer matching the Bugsnag crash: mpr_remote_command_event.crash
When I set a breakpoint on -[_MPRemoteCommandEventDispatch dealloc], I can see it's hit every time I tap play or pause on the Lock Screen play button.
Thread 0 Crashed:
0 libsystem_kernel.dylib 0x00000002370420cc __pthread_kill + 8 (:-1)
1 libsystem_pthread.dylib 0x00000001e975c810 pthread_kill + 268 (pthread.c:1721)
2 libsystem_c.dylib 0x0000000198f8ff64 abort + 124 (abort.c:122)
3 libc++abi.dylib 0x000000018a7cf808 __abort_message + 132 (abort_message.cpp:66)
4 libc++abi.dylib 0x000000018a7be484 demangling_terminate_handler() + 304 (cxa_default_handlers.cpp:76)
5 libobjc.A.dylib 0x000000018a6cff78 _objc_terminate() + 156 (objc-exception.mm:496)
6 xxxxxxxxxxxxxx 0x00000001003a7db8 CPPExceptionTerminate() + 416 (BSG_KSCrashSentry_CPPException.mm:156)
7 libc++abi.dylib 0x000000018a7cebdc std::__terminate(void (*)()) + 16 (cxa_handlers.cpp:59)
8 libc++abi.dylib 0x000000018a7ceb80 std::terminate() + 108 (cxa_handlers.cpp:88)
9 CoreFoundation 0x000000018d7341c4 __CFRunLoopPerCalloutARPEnd + 256 (CFRunLoop.c:769)
10 CoreFoundation 0x000000018d70bb5c __CFRunLoopRun + 1976 (CFRunLoop.c:3179)
11 CoreFoundation 0x000000018d70aa6c _CFRunLoopRunSpecificWithOptions + 532 (CFRunLoop.c:3462)
12 GraphicsServices 0x000000022e31c498 GSEventRunModal + 120 (GSEvent.c:2049)
13 UIKitCore 0x00000001930ceba4 -[UIApplication _run] + 792 (UIApplication.m:3902)
14 UIKitCore 0x0000000193077a78 UIApplicationMain + 336 (UIApplication.m:5577)
15 xxxxxxxxxxxxxx 0x00000001000c0134 main + 308 (main.swift:15)
16 dyld 0x000000018a722e28 start + 7116 (dyldMain.cpp:1477)
Is the crash happening when the app is being terminated? Thank you!
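The exception text ("deallocated without calling continuation") reads as if a command event never received a result, so one thing to audit is that every registered handler returns an MPRemoteCommandHandlerStatus on every code path, as in the sketch below; whether that avoids the iOS 26 crash is an assumption to verify, and the player parameter is illustrative.

import AVFoundation
import MediaPlayer

// Register Lock Screen play/pause handlers that always report a status.
func configureRemoteCommands(for player: AVPlayer) {
    let center = MPRemoteCommandCenter.shared()

    center.playCommand.addTarget { _ in
        player.play()
        return .success // return a status on every path, even early-outs
    }

    center.pauseCommand.addTarget { _ in
        player.pause()
        return .success
    }
}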
Replies: 0 · Boosts: 0 · Views: 262 · Activity: 3w
SpeechAnalyzer > AnalysisContext lack of documentation
I'm using the new SpeechAnalyzer framework to detect certain commands and want to improve accuracy by giving context. AnalysisContext seems to be the solution for this, but I couldn't find any usage example, so I want to make sure whether I'm doing it right.

let context = AnalysisContext()
context.contextualStrings = [
    AnalysisContext.ContextualStringsTag("commands"): [
        "set speed level",
        "set jump level",
        "increase speed",
        "decrease speed",
        ...
    ],
    AnalysisContext.ContextualStringsTag("vocabulary"): [
        "speed",
        "jump",
        ...
    ]
]
try await analyzer.setContext(context)

With this implementation, it still gives outputs like "Set some speed level", "It's speed level", etc. Also, is it possible to make it expect a number after those commands, in order to eliminate results like "set some speed level to" (instead of "two")?
Replies: 0 · Boosts: 0 · Views: 265 · Activity: 3w
CoreMediaErrorDomain -15628 playback failure in iOS 26 (React Native / AVPlayer, HLS stream)
Hi, After updating to iOS 26, our app is facing playback failures with AVPlayer. The same code and streams work fine on iOS 18 and earlier.
Error: Domain[CoreMediaErrorDomain]:Code[-15628]:Desc[The operation couldn’t be completed.]:Underlying Error Domain[(null)]:Code[0]:Desc[(null)]
Environment:
∙ iOS version: iOS 26
∙ React Native: 0.69
∙ Video library: react-native-video (AVPlayer under the hood)
∙ Stream type: HLS (m3u8) with segment (.ts) files
Observed behaviour:
∙ Playback works initially on iOS 26.
∙ The stream fails at runtime after a few seconds/minutes (not on first load).
∙ Network logs show 307 redirects on some segment requests. After this, AVPlayer throws the above error.
∙ Playback fails intermittently on slow/unstable networks.
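A diagnostic sketch for the native side: dump the player item's error log whenever a new entry is recorded, to see which segment URI and HTTP status precede the -15628 failure (for example, whether it follows one of the 307 redirects). This only gathers evidence; it does not fix the error.

import AVFoundation

// Print HLS error-log events as they are recorded for a player item.
func observeErrorLog(for item: AVPlayerItem) {
    NotificationCenter.default.addObserver(
        forName: .AVPlayerItemNewErrorLogEntry,
        object: item,
        queue: .main
    ) { _ in
        guard let events = item.errorLog()?.events else { return }
        for event in events {
            print("HLS error:",
                  event.errorStatusCode,
                  event.errorComment ?? "no comment",
                  event.uri ?? "no URI")
        }
    }
}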
Replies: 6 · Boosts: 18 · Views: 1.4k · Activity: 3w
CoreMediaErrorDomain error -42681
We are trying to port our code to Apple TV on tvOS 17.6, and while running the sample we are getting the error CoreMediaErrorDomain -42681. We understand that this error occurs when the FairPlay license (CKC) returned by the server contains incompatible or malformed version information that the iOS/tvOS FairPlay CDM cannot parse. Can you please specify which FairPlay version number tvOS 17.6 expects, and which fields are mandatory in the FPS version metadata?
Replies: 0 · Boosts: 0 · Views: 244 · Activity: 3w
iOS 26 Beta Personal Voice bug affecting AVSpeechSynthesizer
I have sent in a feedback report (FB18222398) but I have no idea if anyone has looked at it. I know from past experience that Apple devs do look at these forums. This applies to each of the betas: 1, 2, and 3. I have created a new Personal Voice with each beta. I create a personal voice in English. When it's done processing, I tap Preview and it says in English what is expected. But after some time, an hour or a day, the voice file switches languages and no longer works properly. If I press Preview it is no longer intelligible. I have a text-to-speech app, and initially the created voice works, but once the language of the file changes it no longer works. I have run an app on my iPhone through Xcode that prints to the console the voices installed on the device along with their languages. Currently this is the voice file:
Voice Identifier: com.apple.speech.personalvoice.AAA9C6F2-9125-475F-BA2F-22C63274991D
Language: es-MX
and on a second device the same personal voice is in a different language:
Voice Identifier: com.apple.speech.personalvoice.AAA9C6F2-9125-475F-BA2F-22C63274991D
Language: zh-CN
A previous personal voice file that was listed as Spanish (Mexico) still played English with a Spanish accent, and when playing Spanish text it sounded almost perfect. This current personal voice doesn't do that and is unintelligible. Previous attempts have converted to Chinese. I hope someone can look into this.
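A sketch of the diagnostic the post describes: list every installed personal voice with its identifier and reported language, so the language flip can be captured before and after it happens. It assumes the app has already been granted personal-voice authorization.

import AVFoundation

// Print identifier and language for each personal voice on the device.
func dumpPersonalVoices() {
    for voice in AVSpeechSynthesisVoice.speechVoices()
    where voice.voiceTraits.contains(.isPersonalVoice) {
        print("Voice Identifier: \(voice.identifier)")
        print("Language: \(voice.language)")
    }
}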
Replies: 2 · Boosts: 0 · Views: 468 · Activity: 3w
VideoPlayer crashes on Archive build
I have found that the following code runs without issue from Xcode, either in Debug or Release mode, yet crashes when running from the binary produced by archiving, i.e. what will be sent to the App Store.

import SwiftUI
import AVKit

@main
struct tcApp: App {
    var body: some Scene {
        WindowGroup {
            VideoPlayer(player: nil)
        }
    }
}

This is the most stripped-down code that shows the issue. One can try to point the VideoPlayer at a file and the same issue will occur. I've attached the crash log: Crash log. Please note that this was seen with Xcode 26.2 and macOS 26.2.
Replies: 1 · Boosts: 0 · Views: 367 · Activity: 3w
Recurring FigXPCUtilities / FigCaptureSourceRemote err=-17281 logs when using AVCaptureVideoDataOutput on iOS 26.x
Hi everyone, I’m seeing recurring internal AVFoundation camera logs on iOS 26.2 and I’m trying to understand whether this is expected behavior or a regression in the capture pipeline. These logs appear shortly after starting an AVCaptureSession, while video frames are being delivered, and also when the camera is stopped or the capture session is torn down.
<<<< FigXPCUtilities >>>> signalled err=-17281 at <>:302
<<<< FigCaptureSourceRemote >>>> Fig assert: "err == 0 " at bail (FigCaptureSourceRemote.m:569) - (err=-17281)
To rule out issues caused by my own code, GPT created a minimal SwiftUI example from scratch; even in this clean, minimal setup, the same logs appear on iOS 26.2. The exact same logic did not produce these logs on iOS 18.x. My primary interest is to perform real-time processing on the video frames delivered by the camera (via AVCaptureVideoDataOutput), for tasks such as analysis, computer vision, or custom frame handling, while simultaneously displaying the live preview. Thanks in advance for any insight. Example Code
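For context, a minimal sketch of the kind of setup described above: a session with the default back camera and an AVCaptureVideoDataOutput delivering frames to a delegate. This is roughly the configuration under which the Fig* logs reportedly appear; the sketch itself does nothing unusual.

import AVFoundation

final class FrameGrabber: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    let session = AVCaptureSession()
    private let queue = DispatchQueue(label: "camera.frames")

    func start() {
        session.beginConfiguration()
        guard let device = AVCaptureDevice.default(.builtInWideAngleCamera,
                                                   for: .video, position: .back),
              let input = try? AVCaptureDeviceInput(device: device),
              session.canAddInput(input) else {
            session.commitConfiguration()
            return
        }
        session.addInput(input)

        let output = AVCaptureVideoDataOutput()
        output.setSampleBufferDelegate(self, queue: queue)
        if session.canAddOutput(output) { session.addOutput(output) }
        session.commitConfiguration()

        // In production, call startRunning() off the main thread.
        session.startRunning()
    }

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Per-frame processing (analysis, computer vision, etc.) goes here.
    }
}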
Replies: 0 · Boosts: 0 · Views: 328 · Activity: 3w