Hello, I'm using the VideoToolbox VTFrameRateConversionConfiguration to perform frame interpolation: https://developer.apple.com/documentation/videotoolbox/vtframerateconversionconfiguration?language=objc. When using a 640x480 video input, I get this error:
Error ! Invalid configuration
[VEEspressoModel] build failure : flow_adaptation_feature_extractor_rev2.espresso.net. Configuration: landscape640x480
[EpsressoModel] Cannot load Net file flow_adaptation_feature_extractor_rev2.espresso.net. Configuration: landscape640x480
Error: failed to create FRCFlowAdaptationFeatureExtractor for usage 8
Failed to switch (0x12c40e140) [usage:8, 1/4 flow:0, adaptation layer:1, twoStage:0, revision:2, flow size (320x240)].
Could not init FlowAdaptation
initFlowAdaptationWithError fail
I tried a 2048x1080 input and that works fine. Is there a minimum supported dimension for this configuration?
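A minimal sketch of the check I'd like to do first, assuming VTFrameRateConversionConfiguration exposes the same minimumDimensions / maximumDimensions class properties that VTLowLatencyFrameInterpolationConfiguration documents (they may also return nil; this is an assumption, not confirmed API):

import VideoToolbox

// Assumption: these class properties exist on VTFrameRateConversionConfiguration
// and return optional dimensions; nil means no limit is reported.
func sizeWithinReportedLimits(width: Int32, height: Int32) -> Bool {
    if let minDims = VTFrameRateConversionConfiguration.minimumDimensions,
       width < minDims.width || height < minDims.height {
        return false
    }
    if let maxDims = VTFrameRateConversionConfiguration.maximumDimensions,
       width > maxDims.width || height > maxDims.height {
        return false
    }
    return true
}

// e.g. sizeWithinReportedLimits(width: 640, height: 480)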
Hello,
I am wondering if it is possible to have the audio from my AirPods sent to my speech-to-text service while, at the same time, the built-in mic's audio input is used for recording a video.
I ask because I want my users to be able to say "CAPTURE" so that I start recording a video (with audio from the built-in mic), and then, when the user says "STOP", I stop the recording.
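For context, here is a minimal sketch of how one could pin the recording to the built-in mic via AVAudioSession (my own sketch, not from a sample; as far as I can tell this only selects a single input route, which is why I'm asking whether two routes can be used at once):

import AVFoundation

// Sketch: explicitly prefer the built-in mic so recording doesn't follow the
// AirPods input route. Function name is mine.
func preferBuiltInMicForRecording() throws {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playAndRecord, options: [.allowBluetooth])
    if let builtIn = session.availableInputs?.first(where: { $0.portType == .builtInMic }) {
        try session.setPreferredInput(builtIn)
    }
    try session.setActive(true)
}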
Hi all,
I've been working on some personal programming projects and have gotten into using the Apple Music API. I'm currently looking to get a list of recent songs using the /v1/me/recent/played/tracks endpoint and it's working well.
However, I know there are some songs I've listened to multiple times in a row, and those are not showing up as unique tracks when querying this endpoint. I'm only seeing a list of the different songs I've listened to lately, not a true list of the most recent plays on my account.
Is this intended behavior or am I going about something incorrectly here? My query is using that endpoint & specifying the types to be only [songs].
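For reference, a simplified sketch of the request I'm making (the endpoint and headers are the documented Apple Music API ones; the token values and helper name are placeholders):

import Foundation

func fetchRecentlyPlayed(developerToken: String, musicUserToken: String) async throws -> Data {
    var components = URLComponents(string: "https://api.music.apple.com/v1/me/recent/played/tracks")!
    components.queryItems = [URLQueryItem(name: "types", value: "songs")]
    var request = URLRequest(url: components.url!)
    request.setValue("Bearer \(developerToken)", forHTTPHeaderField: "Authorization")
    request.setValue(musicUserToken, forHTTPHeaderField: "Music-User-Token")
    let (data, _) = try await URLSession.shared.data(for: request)
    return data
}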
Thanks in advance for any ideas or insight.
Hi team,
In the Apple Music Feed datasets, we've noticed some unexpected values in the song and album tables.
The primaryartists column from either song or album may contain a "non-default" artist name such as the katakana name shown in the example below:
select id, name, namedefault, primaryartists from amf_song where id = '1698723329'
id | name | namedefault | primaryartists
----------------------------------------
1698723329 | {default=California} | California | [{id=1264818718, name=チャペル・ローン}]
select * from amf_artist where id = '1264818718'
id | name | namedefault | namepronunciation |
----------------------------------------------
1264818718 | {default=Chappell Roan, ja=チャペル・ローン} | Chappell Roan | {ja=チャペルローン} |
Shouldn't the primaryartists column be showing the namedefault instead of the Japanese language version?
When can we expect this bug to be resolved?
Thanks,
PHPhotoLibrary.authorizationStatus(for: .readWrite) == .authorized
Info.plist: Privacy - Photo Library Usage Description is set.
I check authorization before attempting to get photoPickerItem.itemIdentifier, but the return value from itemIdentifier is nil every time. It seems I'm missing some permission, but I'm unsure why the system is still keeping _shouldExposeItemIdentifier set to false.
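For reference, a minimal sketch of what I believe should make the identifiers available (my understanding is that the picker has to be created with an explicit photo library; the view and names here are illustrative):

import SwiftUI
import PhotosUI

struct PickerSketch: View {
    @State private var selection: [PhotosPickerItem] = []

    var body: some View {
        // Without photoLibrary: .shared(), itemIdentifier stays nil in my tests.
        PhotosPicker("Select photos",
                     selection: $selection,
                     matching: .images,
                     photoLibrary: .shared())
    }
}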
Hi everyone, I'm working on an iOS MusicKit app that overlays a metronome on top of Apple Music playback. To line the clicks up perfectly I'd like access to low-level audio analysis data (ideally a waveform / spectrogram or beat grid) while the track is playing.
I've noticed that several approved DJ apps (e.g. djay, Serato, rekordbox) can already:
• Display detailed scrolling waveforms of Apple Music songs
• Scratch, loop or time-stretch those tracks in real time
That implies they receive decoded PCM frames or at least high-resolution analysis data from Apple Music under a special entitlement.
My questions:
1. Does MusicKit (or any public framework) expose real-time audio buffers, FFT bins, or beat markers for streaming Apple Music content?
2. If not, is there an Apple program or entitlement that developers can apply for (similar to the "DJ with Apple Music" initiative) to gain that deeper access?
3. Where can I find official documentation or a point of contact for this kind of request?
I've searched the docs and forums but only see standard MusicKit playback APIs, which don't appear to expose raw audio for DRM-protected songs.
Any guidance, links or insider tips on the proper application process would be hugely appreciated! Thanks in advance.
I'm developing an application that uses a UVC camera (a webcam) with the AVFoundation framework. When I run "[self.mCaptureSession startRunning]", I never receive any buffers, even though I have already set the delegate. Any answer will help.
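For comparison, here is a minimal Swift sketch of the pipeline I expect to work (my own code is Objective-C; the device discovery and names here are illustrative):

import AVFoundation

// Sketch: video data output with a delegate on a dedicated queue.
// The session must be kept alive (strong reference) while running.
final class CameraReader: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    let session = AVCaptureSession()
    private let queue = DispatchQueue(label: "camera.frames")

    func start() throws {
        let discovery = AVCaptureDevice.DiscoverySession(
            deviceTypes: [.external],   // external/UVC cameras; assumption for this sketch
            mediaType: .video,
            position: .unspecified)
        guard let device = discovery.devices.first else { return }

        let input = try AVCaptureDeviceInput(device: device)
        let output = AVCaptureVideoDataOutput()
        output.setSampleBufferDelegate(self, queue: queue)

        session.beginConfiguration()
        if session.canAddInput(input) { session.addInput(input) }
        if session.canAddOutput(output) { session.addOutput(output) }
        session.commitConfiguration()
        session.startRunning()
    }

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Frames should arrive here once the session is running.
    }
}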
Hello,
I've discovered a buffer initialization bug in AVAudioUnitSampler that happens when loading presets with multiple zones referencing different regions in the same audio file (monolith/concatenated samples approach).
Almost all zones output silence (i.e. zeros) at the beginning of playback instead of starting with actual audio data.
The Problem
Setup:
Single audio file (monolith) containing multiple concatenated samples
Multiple zones in an .aupreset, each with different sample start and sample end values pointing to different regions of the same file
All zones load successfully without errors
Expected Behavior:
All zones should play their respective audio regions immediately from the first sample.
Actual Behavior:
Last zone in the zone list: Works perfectly - plays audio immediately
All other zones: Output [0, 0, 0, 0, ..., _audio_data] instead of [real_audio_data]
The number of zeros varies from event to event for each zone. It can be a couple of samples (<30) up to several buffers.
After the initial zeros, the correct audio plays normally, so there is no shift in audio playback, just missing samples at the beginning.
Minimal Reproduction
1. Create Test Monolith Audio File
Create a single WAV file with 3 concatenated 1-second samples (44.1 kHz); a generation sketch follows this list:
Sample 1: frames 0-44099 (constant amplitude 0.3)
Sample 2: frames 44100-88199 (constant amplitude 0.6)
Sample 3: frames 88200-132299 (constant amplitude 0.9)
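A sketch of how the test monolith can be generated (helper name and file URL are mine; the written file format follows the URL extension):

import AVFoundation

// Sketch: write three 1-second constant-amplitude segments into one file.
func writeTestMonolith(to url: URL) throws {
    let sampleRate = 44_100.0
    let format = AVAudioFormat(standardFormatWithSampleRate: sampleRate, channels: 1)!
    let file = try AVAudioFile(forWriting: url, settings: format.settings)

    for amplitude in [Float(0.3), 0.6, 0.9] {
        let frames = AVAudioFrameCount(sampleRate)          // exactly 1 second
        let buffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: frames)!
        buffer.frameLength = frames
        let samples = buffer.floatChannelData![0]
        for i in 0..<Int(frames) { samples[i] = amplitude }
        try file.write(from: buffer)
    }
}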
2. Create Test Preset
Create an .aupreset with 3 zones all referencing the same file:
Pseudo code
<Zone array>
<zone 1> start sample: 0, end sample: 44099, note: 60, waveform: ref_to_monolith.wav;
<zone 2> start sample: 44100, end sample: 88199, note: 62, waveform: ref_to_monolith.wav;
<zone 3> start sample: 88200, end sample: 132299, note: 64, waveform: ref_to_monolith.wav;
</Zone array>
3. Load and Test
// Load preset into AVAudioUnitSampler
let sampler = AVAudioUnitSampler()
try sampler.loadInstrument(at: presetURL)
// Play each zone (MIDI notes C4=60, D4=62, E4=64)
sampler.startNote(60, withVelocity: 64, onChannel: 0) // Zone 1
sampler.startNote(62, withVelocity: 64, onChannel: 0) // Zone 2
sampler.startNote(64, withVelocity: 64, onChannel: 0) // Zone 3
4. Observed Result
Zone 1 (C4): [0, 0, 0, ..., 0.3, 0.3, 0.3] ❌ Zeros at beginning
Zone 2 (D4): [0, 0, 0, ..., 0.6, 0.6, 0.6] ❌ Zeros at beginning
Zone 3 (E4): [0.9, 0.9, 0.9, ...] ✅ Works correctly (last zone)
What I've Extensively Tested
What DOES Work
Separate files per zone:
Each zone references its own individual audio file
All zones play correctly without zeros
Problem: Not viable for iOS apps with 500+ sample libraries due to file handle limitations
What DOESN'T Work (All Tested)
1. Different Audio Formats:
CAF (Float32 PCM, Int16 PCM, both interleaved and non-interleaved)
M4A (AAC compressed)
WAV (uncompressed)
SF2 (SoundFont2)
Bug persists across all formats
2. CAF Region Chunks:
Created CAF files with embedded region chunks defining zone boundaries
Set zones with no sampleStart/sampleEnd in preset (nil values)
AVAudioUnitSampler completely ignores CAF region metadata
Bug persists
3. Unique Waveform IDs:
Gave each zone a unique waveform ID (268435456, 268435457, 268435458)
Each ID has its own file reference entry (all pointing to same physical file)
Hypothesized this might trigger separate buffer initialization
Bug persists - no improvement
4. Different Sample Rates:
Tested: 44.1kHz, 48kHz, 96kHz
Bug occurs at all sample rates
5. Mono vs Stereo:
Bug occurs with both mono and stereo files
Environment
macOS: Sonoma 14.x (tested across multiple minor versions)
iOS: Tested on iOS 17.x with same results
Xcode: 16.x
Frameworks: AVFoundation, AudioToolbox
Reproducibility: 100% reproducible with setup described above
Impact & Use Case
This bug severely impacts professional music applications that need:
Small file sizes: Monolith files allow sharing compressed audio data (AAC/M4A)
iOS file handle limits: Opening 400+ individual sample files is not viable on iOS
Performance: Single file loading is much faster than hundreds of individual files
Standard industry practice: Monolith/concatenated samples are used by EXS24, Kontakt, and most professional samplers
Current Impact:
Cannot use monolith files with AVAudioUnitSampler on iOS
Forced to choose between: unusable audio (zeros at start) OR hitting iOS file limits
No viable workaround exists
Root Cause Hypothesis
The bug appears to be in AVAudioUnitSampler's internal buffer initialization when:
Multiple zones share the same source audio file
Each zone specifies different sampleStart/sampleEnd offsets
Key observation: The last zone in the zone array always works correctly.
This is NOT related to:
File permissions or security-scoped resources (separate files work fine)
Audio codec issues (happens with uncompressed PCM too)
Preset parsing (preset loads correctly, all zones are valid)
Questions
Is this a known issue? I couldn't find any documentation, bug reports, or discussions about this.
Is there ANY workaround that allows monolith files to work with AVAudioUnitSampler?
Alternative APIs? Is there a different API or approach for iOS that properly supports monolith sample files?
I wrote a Swift macOS app to control a PCI audio device. The code switches between the default output and input channels. As soon as I launch the Audio-Midi Setup utility, channel switching stops working. The driver properties allow switching, but the system doesn't respond. I have to delete the contents of /Library/Preferences/Audio and reset Core Audio. What am I missing?
func setDefaultChannelsOutput() {
    guard let deviceID = getDeviceIDByName(deviceName: "PCI-424") else { return }
    let selectedIndex = DefaultChannelsOutput.indexOfSelectedItem
    if selectedIndex < 0 || selectedIndex >= 24 { return }
    let channel1 = UInt32(selectedIndex * 2 + 1)
    let channel2 = UInt32(selectedIndex * 2 + 2)
    var channels: [UInt32] = [channel1, channel2]
    var propertyAddress = AudioObjectPropertyAddress(
        mSelector: kAudioDevicePropertyPreferredChannelsForStereo,
        mScope: kAudioDevicePropertyScopeOutput,
        mElement: kAudioObjectPropertyElementWildcard
    )
    let dataSize = UInt32(MemoryLayout<UInt32>.size * channels.count)
    let status = AudioObjectSetPropertyData(deviceID, &propertyAddress, 0, nil, dataSize, &channels)
    if status != noErr {
        print("Error setting default output channels: \(status)")
    }
}
We’re developing an AVFoundation-based video recording app (4K @ 60 fps required for biomechanical analysis). On most devices this works perfectly (iPhone 12/14/15/16 non-Pro models), but on several iPhone Pro models (12 Pro, 13 Pro, 14 Pro, 15 Pro/Pro Max), we consistently get 4K 30 fps recordings—even when the device should support 4K 60 fps on the wide-angle camera.
What we observe
We configure the session for .hd4K3840x2160.
We iterate through AVCaptureDevice.formats and select formats that:
have 3840×2160 resolution
support ≥60 fps (videoSupportedFrameRateRanges)
On some Pro devices, this format search returns no results, even though:
The Camera app records 4K60 fine.
External references list the wide camera as 4K60 capable.
The fallback becomes the device's default 4K30 format, so final files are 3840×2160 @ 30 fps.
This happens immediately on app launch (not after the device heats up), so it is not thermal-related.
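For reference, this is essentially our format selection logic (simplified sketch; error handling removed, helper name is ours):

import AVFoundation

// Sketch: find a 3840x2160 format that reports roughly 60 fps or more.
func find4K60Format(on device: AVCaptureDevice) -> AVCaptureDevice.Format? {
    device.formats.first { format in
        let dims = CMVideoFormatDescriptionGetDimensions(format.formatDescription)
        guard dims.width == 3840, dims.height == 2160 else { return false }
        return format.videoSupportedFrameRateRanges.contains { $0.maxFrameRate >= 59.9 }
    }
}

// If a format is found, we lock it in:
// try device.lockForConfiguration()
// device.activeFormat = format
// device.activeVideoMinFrameDuration = CMTime(value: 1, timescale: 60)
// device.activeVideoMaxFrameDuration = CMTime(value: 1, timescale: 60)
// device.unlockForConfiguration()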
What we’ve tried
Force selecting .builtInWideAngleCamera instead of dual/triple cameras.
Disabling HDR (videoHDREnabled = false).
Disabling low-light boost.
Allowing 59.94 fps formats (in case exact 60.0 isn’t exposed).
Logging all videoSupportedFrameRateRanges per format.
What we’re seeing in logs
On affected Pro devices, the capture device reports only 4K formats with maxFrameRate ≈ 30 fps, despite the hardware being able to do 4K60.
Main question
Has anyone encountered cases where 4K60 formats are available in the Camera app but not exposed through AVFoundation, especially on Pro models or multi-camera devices?
Could HEVC/HDR capability or multi-camera constraints be preventing certain formats from appearing?
Are there known conditions where 4K60 formats are hidden unless specific device configuration is applied?
Any guidance on reliably locking 4K60 on iPhone Pro models via AVFoundation would be hugely appreciated.
I am developing an iOS application that supports screen mirroring to Google TV (or Chromecast with Google TV). My goal is to mirror the iPhone/iPad screen in real time to a Google TV device.
What I Have Tried So Far
I have explored multiple approaches but haven't found a direct way to achieve low-latency screen mirroring. Here are some of my findings:
Google Cast SDK:
Google Cast SDK is primarily designed for casting media (videos, images, audio) rather than real-time mirroring. It supports custom receiver applications, but there are no direct APIs for full screen mirroring. Casting a recorded video is possible, but it introduces latency and is not real-time.
ReplayKit for Screen Capture:
RPScreenRecorder.shared().startCapture(handler: ...) allows capturing the iPhone screen as a video stream. However, sending this stream to Google TV in real time is a challenge. I could potentially encode the video as HLS and stream it, but the delay is significant.
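Here is roughly the capture side I tried with ReplayKit (a sketch; the encoding and transport to the TV is the part I haven't solved):

import ReplayKit

func startScreenCapture() {
    RPScreenRecorder.shared().startCapture(handler: { sampleBuffer, bufferType, error in
        guard error == nil, bufferType == .video else { return }
        // sampleBuffer is a CMSampleBuffer containing a screen frame.
        // It would still need to be encoded (H.264/HEVC) and streamed somewhere.
    }, completionHandler: { error in
        if let error { print("Capture failed: \(error)") }
    })
}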
RTSP/UDP Streaming:
Some third-party libraries support RTSP/UDP streaming for real-time screen sharing. Google TV does not natively support RTSP, making this approach difficult.
My Questions:
Is it possible to achieve real-time screen mirroring on Google TV using the Google Cast SDK?
Does Google TV support WebRTC or any low-latency streaming protocol that can be used from iOS?
Are there any alternative approaches to mirror an iOS screen to Google TV with minimal latency?
I would appreciate any guidance, code examples, or references to relevant documentation.
Are there limits on the supported dimensions for VTLowLatencyFrameInterpolationConfiguration? Querying VTLowLatencyFrameInterpolationConfiguration.maximumDimensions and VTLowLatencyFrameInterpolationConfiguration.minimumDimensions returns nil.
When I try the WWDC sample project EnhancingYourAppWithMachineLearningBasedVideoEffects with a 4K video, the statement try frameProcessor.startSession(configuration: configuration) executes, but try await frameProcessor.process(parameters: parameters) throws Error Domain=VTFrameProcessorErrorDomain Code=-19730 "Processor is not initialized" UserInfo={NSLocalizedDescription=Processor is not initialized}.
Also, why is VTLowLatencyFrameInterpolationConfiguration able to run while the app is backgrounded, but VTFrameRateConversionParameters can't (due to GPU usage)?
Hi,
I have an app that displays tens of short (<1 MB) MP4 videos, stored on a remote server, in a vertical UICollectionView with horizontally scrollable sections.
I'm caching all MP4 files on disk after downloading, and I also have an in-memory cache that holds a limited number (around 30) of players. The players I'm using are simple views that wrap an AVPlayerLayer and its AVPlayerItem, along with a few additional UI components.
Scrolling performance was good before iOS 26, but with the iOS 26 release I noticed significant stuttering during scrolling while creating players with a file URL. It happens even if I use the same video file cached on disk for every cell for testing.
I also started getting this kind of log messages after the players are deinitialized:
<<<< PlayerRemoteXPC >>>> signalled err=-12785 at <>:1107
<<<< PlayerRemoteXPC >>>> signalled err=-12785 at <>:1095
<<<< PlayerRemoteXPC >>>> signalled err=-12785 at <>:1095
There's also another log message that I see occasionally, but I don't know what triggers it.
<< FigXPC >> signalled err=-16152 at <>:1683
Is there anyone else that experienced this kind of problem with the latest release?
Also, I'm wondering what's the best way to resolve the issue. I could increase the size of the memory cache to something large like 100, but I'm not sure if it is an acceptable solution because:
1- There will be 100 player instances in memory at all times.
2- There will still be stuttering during the initial loading of the videos from the web.
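One mitigation I'm considering (just a sketch, not verified to fix the iOS 26 stutter): loading the asset's properties asynchronously before handing the item to the player, so the creation work happens off the main thread:

import AVFoundation

func makePlayerItem(for fileURL: URL) async throws -> AVPlayerItem {
    let asset = AVURLAsset(url: fileURL)
    _ = try await asset.load(.isPlayable, .duration)   // async property loading
    return AVPlayerItem(asset: asset)
}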
Any help is appreciated!
Our license service is based on version 4.5.4, and we make use of the sample .c/.h files for building the license service.
We are told that version 4.5.4 is going to be deprecated in 2026 and that we should migrate to the latest SDK, version 26.
When we explored the SDK, we noticed that only Python- and Swift-based SDKs are provided.
Does Apple also provide a C/C++-based SDK? That would be easier for us to integrate.
If yes, please share the SDK package and a sample license service solution.
We are experiencing an issue related to DepthData from the TrueDepth camera on a specific device.
On December 1, we tested with the reporting user's device (iPhone 14, iOS 26.0.1) and observed that the depth image is received with empty values.
However, the same implementation works normally on iPhone 17 Pro Max (iOS 26.1) and iPhone 13 Pro Max (iOS 26.0.1), where depth data is delivered correctly.
In the problematic case:
TrueDepth camera is active
Face ID works normally
The app receives a DepthData object, but all values are empty (0), not nil
Because the DepthData object is not nil, this makes it difficult to detect the issue through software fallback handling.
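For detection, this is the kind of check we're considering (a sketch; it assumes a Float32 depth/disparity pixel format and the helper name is ours):

import AVFoundation

func depthMapIsAllZeros(_ depthData: AVDepthData) -> Bool {
    let map = depthData.depthDataMap
    CVPixelBufferLockBaseAddress(map, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(map, .readOnly) }

    guard let base = CVPixelBufferGetBaseAddress(map) else { return true }
    let floatsPerRow = CVPixelBufferGetBytesPerRow(map) / MemoryLayout<Float32>.size
    let values = base.assumingMemoryBound(to: Float32.self)
    for row in 0..<CVPixelBufferGetHeight(map) {
        for col in 0..<CVPixelBufferGetWidth(map) {
            if values[row * floatsPerRow + col] != 0 { return false }
        }
    }
    return true
}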
We developed the feature with reference to the following Apple sample:
https://developer.apple.com/documentation/AVFoundation/streaming-depth-data-from-the-truedepth-camera
We would like to ask:
Are there known cases where Face ID functions normally but DepthData from the TrueDepth camera is returned as empty values?
If so, is there a recommended approach for identifying or handling this situation?
Any guidance from Apple engineers or the community would be greatly appreciated.
Thank you.
On iOS 17 we've had no problem playing Apple FairPlay encrypted content with keys delivered from our key server running on FairPlay Streaming Server SDK 5.1 and subsequently FairPlay Streaming Server SDK 26. It's built and deployed using Xcode Version 26.1.1 (17B100) with no changes to the code, and, as expected, the content continued to be successfully decrypted and played (so far so good). However, as soon as a device was updated to iOS 26, that device would no longer play the encrypted content.
Devices remaining on iOS17 continue to work normally and the debugging logs are a sanity-check that proves that. Is anyone else experiencing this issue?
Here's the code (you should be able to drop it into a fresh iOS Xcode project and provide a server url, content url and certificate).
Hi
Is it possible to have a playlist that starts with an indication of a stream in the clear, then switches to a DRM-encrypted period, and then switches back to the clear?
Can I just do the following (I've removed the video segment parts; I'm just interested in the parts where I want to signal the new DRM region)?
#EXT-X-MAP:URI="video_2_10000000_t17586401730000000_init.mp4"
#EXT-X-KEY:METHOD=NONE
...
#EXT-X-MAP:URI="video_2_10000000_t17587374640000000_init.mp4"
#EXT-X-KEY:METHOD=SAMPLE-AES,URI="skd://5df0b36ac4bb4d0ff954a73b502ac332",KEYFORMAT="com.apple.streamingkeydelivery",KEYFORMATVERSIONS="1"
...
#EXT-X-MAP:URI="video_2_10000000_t17587376740000000_init.mp4?"
#EXT-X-KEY:METHOD=NONE
Should I insert discontinuity tags or something else?
Right now, what I can observe is that I get some audio drops when I try to do this.
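To be concrete, is inserting a discontinuity before each key change what's expected, i.e. something like this (a sketch of what I mean, not a verified fix)?
#EXT-X-DISCONTINUITY
#EXT-X-MAP:URI="video_2_10000000_t17587374640000000_init.mp4"
#EXT-X-KEY:METHOD=SAMPLE-AES,URI="skd://5df0b36ac4bb4d0ff954a73b502ac332",KEYFORMAT="com.apple.streamingkeydelivery",KEYFORMATVERSIONS="1"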
Quotes are displayed incorrectly in subtitles of AVPlayerViewController when streaming VOD content using HLS.
A single quote ' (escaped as &apos;) is displayed as apos;
Double quotes " (escaped as &quot;) are displayed as quot;
The escaping follows the VTT specification.
The same stream works fine in VLC player, showing quotes correctly in subtitles.
The subtitle VTT files use:
Content-Type: text/vtt
WEBVTT
X-TIMESTAMP-MAP=LOCAL:490014:06:04.000,MPEGTS:158764568760056
example line:
490014:05:46.000 --> 490014:05:50.440 align:start line:83% position:14%
lære dig endnu bedre at kende."
and the playlist has:
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",LANGUAGE="da",NAME="Dansk",AUTOSELECT=YES,CHARACTERISTICS="public.accessibility.transcribes-spoken-dialog,public.accessibility.describes-music-and-sound",URI="subs/dan_5/playlist.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=780000,CODECS="mp4a.40.5,avc1.42c01e",RESOLUTION=256x144,AUDIO="audio-aac",SUBTITLES="subs"
Adding 'wvtt' to the CODECS list in the playlist does not make a difference.
Is this a known bug? Is there a workaround?
I guess the AVResourceLoaderDelegate can be used to intercept and parse the subtitle files, but it seems like quite a hack and not really intended to be used for this.
Hello,
I'm investigating an issue with LL-HLS playback using AVPlayer, specifically during DVR Live seeking (seeking to a past time).
I noticed that in certain seeking scenarios, AVPlayer sends a Blocking Playlist Reload request that includes the _HLS_msn parameter but is missing the _HLS_part parameter.
While I understand this is compliant with the HLS spec, I would like to know the specific criteria AVPlayer uses to decide when to drop the _HLS_part parameter. Does AVPlayer intentionally omit the part info when it determines that loading a specific partial segment is unnecessary during a seek operation?
Clarification on this behavior would help us greatly in debugging our stream delivery.
Thanks in advance.
Hi folks - I'm having trouble finding specific documentation about Audio Unit MIDI plug-ins - as in MIDI-only. Any suggestions are welcome, as searches aren't returning much. (Too niche? User error?)