I'm receiving output from avcapturesession and capturing an image using Vision, but the image is output in landscape orientation instead of portrait.
Even when I set the orientation to up in ciimage, cgimage, and uiimage, the image is still output in landscape orientation.
On iPhones 16 and below, the image is output in portrait orientation.
But on iPhones 17 and above, the image is output in landscape orientation.
Please help.
AVFoundation
RSS for tagWork with audiovisual assets, control device cameras, process audio, and configure system audio interactions using AVFoundation.
Posts under AVFoundation tag
200 Posts
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
Hi, as other threads have already discussed, I'd like to record audio from a keyboard extension.
The keyboard has been granted both full access and microphone access. Nonetheless whenever I attempt to start a recording from my keyboard, it fails to start with the following error:
Recording failed to start: Error Domain=com.apple.coreaudio.avfaudio Code=561145187 "(null)" UserInfo={failed call=err = PerformCommand(*ioNode, kAUStartIO, NULL, 0)}
This is the code I am using:
import Foundation
import AVFoundation
protocol AudioRecordingServiceDelegate: AnyObject {
func audioRecordingDidStart()
func audioRecordingDidStop(withAudioData: Data?)
func audioRecordingPermissionDenied()
}
class AudioRecordingService {
weak var delegate: AudioRecordingServiceDelegate?
private var audioEngine: AVAudioEngine?
private var audioSession: AVAudioSession?
private var isRecording = false
private var audioData = Data()
private let targetFormat = AVAudioFormat(commonFormat: .pcmFormatInt16,
sampleRate: 16000,
channels: 1,
interleaved: false)!
private func setupAudioSession() throws {
let session = AVAudioSession.sharedInstance()
try session.setCategory(.playAndRecord, mode: .spokenAudio,
options: [.mixWithOthers, .allowBluetooth, .defaultToSpeaker])
try session.setPreferredIOBufferDuration(0.005)
try session.setActive(true, options: .notifyOthersOnDeactivation)
audioSession = session
}
func checkMicrophonePermission(completion: @escaping (Bool) -> Void) {
switch AVAudioApplication.shared.recordPermission {
case .granted:
completion(true)
case .denied:
delegate?.audioRecordingPermissionDenied()
completion(false)
case .undetermined:
AVAudioApplication.requestRecordPermission { [weak self] granted in
if !granted {
self?.delegate?.audioRecordingPermissionDenied()
}
completion(granted)
}
@unknown default:
delegate?.audioRecordingPermissionDenied()
completion(false)
}
}
func toggleRecording() {
if isRecording {
stopRecording()
} else {
checkMicrophonePermission { [weak self] granted in
if granted {
self?.startRecording()
}
}
}
}
private func startRecording() {
guard !isRecording else { return }
do {
try setupAudioSession()
audioEngine = AVAudioEngine()
guard let engine = audioEngine else { return }
let inputNode = engine.inputNode
let inputFormat = inputNode.inputFormat(forBus: 0)
audioData.removeAll()
guard let converter = AVAudioConverter(from: inputFormat, to: targetFormat) else {
print("Failed to create audio converter")
return
}
inputNode.installTap(onBus: 0, bufferSize: 1024, format: inputFormat) { [weak self] buffer, _ in
guard let self = self else { return }
let frameCount = AVAudioFrameCount(Double(buffer.frameLength) * 16000.0 / buffer.format.sampleRate)
guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: self.targetFormat,
frameCapacity: frameCount) else { return }
outputBuffer.frameLength = frameCount
var error: NSError?
converter.convert(to: outputBuffer, error: &error) { _, outStatus in
outStatus.pointee = .haveData
return buffer
}
if error == nil, let channelData = outputBuffer.int16ChannelData {
let dataLength = Int(outputBuffer.frameLength) * 2
let data = Data(bytes: channelData.pointee, count: dataLength)
self.audioData.append(data)
}
}
engine.prepare()
try engine.start()
isRecording = true
delegate?.audioRecordingDidStart()
} catch {
print("Recording failed to start: \(error)")
stopRecording()
}
}
private func stopRecording() {
audioEngine?.inputNode.removeTap(onBus: 0)
audioEngine?.stop()
isRecording = false
let finalData = audioData
audioData.removeAll()
delegate?.audioRecordingDidStop(withAudioData: finalData)
try? audioSession?.setActive(false, options: .notifyOthersOnDeactivation)
}
deinit {
if isRecording {
stopRecording()
}
}
}
Granting the deprecated "Inter-App Audio" capability did not solve the problem either.
Is recording audio from a keyboard extension even possible in general? If so, how do I fix it?
Related threads:
https://developer.apple.com/forums/thread/108055
https://developer.apple.com/forums/thread/742601
(This only started happening as of Xcode 26.)
I know macOS and watchOS don't support this property, but all other platforms do (did?) up until I upgraded Xcode. Now when I compile I get this:
Value of type 'AVPlayerItem' has no member 'externalMetadata'
Hi, I downloaded and ran https://developer.apple.com/documentation/realitykit/rendering-stereoscopic-video-with-realitykit
and noticed that memory usage grows linearly.
I replaced the sample video with a different 8k side by side video, and the app crashed almost immediately due to memory leak.
it looks like the culprit is from makeMutablePixelBuffer() function and the allocated pixelBuffers are not recycled after being used.
screenshot is from a physical device.
I tried to modify the AVCam sample code by copying the code here https://developer.apple.com/documentation/avfoundation/adopting-smart-framing-in-your-camera-app#Configure-the-smart-framing-monitor
smart framing monitors
I can ensure the activeformat supports smart framing, but the supported frames in monitor is always nil.
In my another project it has supported value, but the observation has never been triggered, then I tried to keep printing the recommended frame, it's always nil.
Could the engineer embed the code into AVCam rather than posting a few code pieces?
Hello everyone,
I'm looking for a definitive clarification on how to completely disable all video stabilization, including the hardware OIS, using AVFoundation. The goal is to achieve a completely raw, unstabilized video feed, which is crucial when using external equipment like gimbals to avoid conflicting stabilization motions.
My research points to using the AVCaptureConnection property preferredVideoStabilizationMode and setting it to AVCaptureVideoStabilizationMode.off.
The documentation for the .off case states:
A mode that doesn’t stabilize video capture.
This description is slightly ambiguous. It's unclear whether this only affects software-level stabilization (EIS, EIS+OIS, etc) or if it guarantees the complete deactivation of the physical OIS module. For professional video applications, this is a critical distinction.
So, I'd like to ask the community:
Has anyone been able to definitively confirm that setting preferredVideoStabilizationMode to .off also disables the hardware OIS? Are there any known tests or documentation that prove this behavior?
Is there an alternative or more direct method to ensure the OIS module is physically inactive during video capture?
What is the community's best practice for ensuring absolutely no stabilization is applied to the video pipeline?
Any insights or shared experiences on this topic would be greatly appreciated.
Thank you!
I'm getting hundreds of the message below in Xcode. I've narrowed it down to when I instantiate the following
AVAudioUnitComponentManager.shared()
Message send exceeds rate-limit threshold and will be dropped. { reporterID=231700600717315, rateLimit=32hz }
Hello,
Environment
macOS 15.6.1 / Xcode 26 beta 7 / iOS 26 Beta 9
In a simple AVFoundation video-playback sample, I’m seeing different behavior between iOS 18 and iOS 26 regarding AVPlayerItem.didPlayToEndTimeNotification.
I’ve attached a minimal sample below. Please replace videoURL with a valid short video URL.
Repro steps
Tap “Play” to start playback and let the video finish.
The AVPlayerItem.didPlayToEndTimeNotification registered with NotificationCenter should fire, and you should see Play finished. in the console.
Without relaunching, tap “Play” again. This is where the issue arises.
Observed behavior
On iOS 18 and earlier: The video does not play again (it does not restart from the beginning), but AVPlayerItem.didPlayToEndTimeNotification is posted and Play finished. appears in the console. The same happens every time you press “Play”.
On iOS 26: Pressing “Play” does not post AVPlayerItem.didPlayToEndTimeNotification. The code path that prints Play finished. is never called (the callback enclosing that line is not invoked again).
Building the same program with Xcode 16.4 and running it on an iOS 26 beta device shows the same phenomenon, which suggests there has been a behavioral change for AVPlayerItem.didPlayToEndTimeNotification on iOS 26. I couldn’t find any mention of this in the release notes or API Reference.
Because the semantics around AVPlayerItem.didPlayToEndTimeNotification appear to differ, we’re forced to adjust our logic. If there is a way to achieve the iOS 18–style behavior on iOS 26, I would appreciate guidance.
Alternatively, if this change is intentional, could you share the reasoning? Is iOS 26 the correct behavior from Apple’s perspective and iOS 18 (and earlier) behavior considered incorrect? Any official clarification would be extremely helpful.
import UIKit
import AVFoundation
final class ViewController: UIViewController {
private let videoURL = URL(string: "https://......mp4")!
private var player: AVPlayer?
private var playerItem: AVPlayerItem?
private var playerLayer: AVPlayerLayer?
private var observeForComplete: NSObjectProtocol?
// UI
private let playerContainerView = UIView()
private let playButton = UIButton(type: .system)
private let stopButton = UIButton(type: .system)
private let replayButton = UIButton(type: .system)
deinit {
if let observeForComplete {
NotificationCenter.default.removeObserver(observeForComplete)
}
}
override func viewDidLoad() {
super.viewDidLoad()
view.backgroundColor = .systemBackground
setupUI()
setupPlayer()
}
override func viewDidLayoutSubviews() {
super.viewDidLayoutSubviews()
playerLayer?.frame = playerContainerView.bounds
}
// MARK: - Setup
private func setupUI() {
playerContainerView.translatesAutoresizingMaskIntoConstraints = false
playerContainerView.backgroundColor = .black
view.addSubview(playerContainerView)
// Buttons
playButton.setTitle("Play", for: .normal)
stopButton.setTitle("Pause", for: .normal)
replayButton.setTitle("RePlay", for: .normal)
[playButton, stopButton, replayButton].forEach {
$0.titleLabel?.font = .systemFont(ofSize: 16, weight: .semibold)
$0.translatesAutoresizingMaskIntoConstraints = false
$0.contentEdgeInsets = UIEdgeInsets(top: 10, left: 16, bottom: 10, right: 16)
}
let stack = UIStackView(arrangedSubviews: [playButton, stopButton, replayButton])
stack.axis = .horizontal
stack.spacing = 16
stack.alignment = .center
stack.distribution = .equalCentering
stack.translatesAutoresizingMaskIntoConstraints = false
view.addSubview(stack)
NSLayoutConstraint.activate([
playerContainerView.topAnchor.constraint(equalTo: view.safeAreaLayoutGuide.topAnchor, constant: 20),
playerContainerView.leadingAnchor.constraint(equalTo: view.leadingAnchor),
playerContainerView.trailingAnchor.constraint(equalTo: view.trailingAnchor),
playerContainerView.heightAnchor.constraint(equalToConstant: 200),
stack.topAnchor.constraint(equalTo: playerContainerView.bottomAnchor, constant: 20),
stack.centerXAnchor.constraint(equalTo: view.centerXAnchor)
])
// Action
playButton.addTarget(self, action: #selector(didTapPlay), for: .touchUpInside)
stopButton.addTarget(self, action: #selector(didTapStop), for: .touchUpInside)
replayButton.addTarget(self, action: #selector(didTapReplayFromStart), for: .touchUpInside)
}
private func setupPlayer() {
// AVURLAsset -> AVPlayerItem → AVPlayer
let asset = AVURLAsset(url: videoURL)
let item = AVPlayerItem(asset: asset)
self.playerItem = item
let player = AVPlayer(playerItem: item)
player.automaticallyWaitsToMinimizeStalling = true
self.player = player
let layer = AVPlayerLayer(player: player)
layer.videoGravity = .resizeAspect
playerContainerView.layer.addSublayer(layer)
layer.frame = playerContainerView.bounds
self.playerLayer = layer
// Notification
if let observeForComplete {
NotificationCenter.default.removeObserver(observeForComplete)
}
if let playerItem {
observeForComplete = NotificationCenter.default.addObserver(
forName: AVPlayerItem.didPlayToEndTimeNotification,
object: playerItem,
queue: .main
) { [weak self] _ in
guard self != nil else { return }
Task { @MainActor in
print("Play finished.")
}
}
}
}
// MARK: - Actions
@objc private func didTapPlay() {
player?.play()
}
@objc private func didTapStop() {
player?.pause()
}
// RePlay
@objc private func didTapReplayFromStart() {
player?.seek(to: .zero, toleranceBefore: .zero, toleranceAfter: .zero) { [weak self] _ in
self?.player?.play()
}
}
}
I would greatly appreciate an official response from Apple engineering on whether this is an intentional change, a regression, or an API contract clarification, and what the recommended approach is going forward. Thank you.
I'm creating an app that uses AVCaptureSession to pass camera input to AVCaptureMetadataOutput type set [metaout setMetadataObjectTypes:@[AVMetadataObjectTypeFace]] and scan Face.
After updating to OS 26 Beta2 and iOS 26 Beta2, an issue has occurred where the delegate method of AVCaptureMetadataOutputObjectsDelegate is not called on some devices.
The following devices are experiencing this issue.
iPad (9th Gen)
iPad air (4th Gen)
iPhone 15
This issue has not occur on any other devices I have.
I tried running the AVFoundation sample code on the Apple Developer site on the above device. The same problem still occurs. [https://developer.apple.com/documentation/avfoundation/capture_setup/avcambarcode_detecting_barcodes_and_faces]
Are any additional settings required after OS 26 beta and iOS 26 beta? Or is there some problem on the OS side?
In iOS 26 (Developer Beta), the AVCaptureMetadataOutputObjectsDelegate no longer receives callbacks when metadataOutput.metadataObjectTypes = [.face] is set. On earlier iOS versions the issue does not occur. Interestingly, face detection works if I set the sessionPreset to .medium, but not with .high — except on the iPhone 16 Pro Max, where it works regardless.
I'm having a crash on an app that plays videos when the users activates close captions.
I was able to replicate the issue on an empty project. The crash happens when the AVPlayerLayer is used to instantiate an AVPictureInPictureController
These are the example project where I tested the crash:
struct ContentView: View {
var body: some View {
VStack {
VideoPlaylistView()
}
.frame(maxWidth: .infinity, maxHeight: .infinity)
.background(Color.black.ignoresSafeArea())
}
}
class VideoPlaylistViewModel: ObservableObject {
// Test with other videos
var player: AVPlayer? = AVPlayer(url: URL(string:"https://d2ufudlfb4rsg4.cloudfront.net/newsnation/WIpkLz23h/adaptive/WIpkLz23h_master.m3u8")!)
}
struct VideoPlaylistView: View {
@StateObject var viewModel = VideoPlaylistViewModel()
var body: some View {
ScrollView {
VideoCellView(player: viewModel.player)
.onAppear {
viewModel.player?.play()
}
}
.scrollTargetBehavior(.paging)
.ignoresSafeArea()
}
}
struct VideoCellView: View {
let player: AVPlayer?
@State var isCCEnabled: Bool = false
var body: some View {
ZStack {
PlayerView(player: player)
.accessibilityIdentifier("Player View")
}
.containerRelativeFrame([.horizontal, .vertical])
.overlay(alignment: .bottom) {
Button {
player?.currentItem?.asset.loadMediaSelectionGroup(for: .legible) { group,error in
if let group {
let option = !isCCEnabled ? group.options.first : nil
player?.currentItem?.select(option, in: group)
isCCEnabled.toggle()
}
}
} label: {
Text("Close Captions")
.font(.subheadline)
.foregroundStyle(isCCEnabled ? .red : .primary)
.buttonStyle(.bordered)
.padding(8)
.background(Color.blue.opacity(0.75))
}
.padding(.bottom, 48)
.accessibilityIdentifier("Button Close Captions")
}
}
}
import Foundation
import UIKit
import SwiftUI
import AVFoundation
import AVKit
struct PlayerView: UIViewRepresentable {
let player: AVPlayer?
func updateUIView(_ uiView: UIView, context: UIViewRepresentableContext<PlayerView>) {
}
func makeUIView(context: Context) -> UIView {
let view = PlayerUIView()
view.playerLayer.player = player
view.layer.addSublayer(view.playerLayer)
view.layer.backgroundColor = UIColor.red.cgColor
view.pipController = AVPictureInPictureController(playerLayer: view.playerLayer)
view.pipController?.requiresLinearPlayback = true
view.pipController?.canStartPictureInPictureAutomaticallyFromInline = true
view.pipController?.delegate = view
return view
}
}
class PlayerUIView: UIView, AVPictureInPictureControllerDelegate {
let playerLayer = AVPlayerLayer()
var pipController: AVPictureInPictureController?
override init(frame: CGRect) {
super.init(frame: frame)
}
required init?(coder: NSCoder) {
fatalError("init(coder:) has not been implemented")
}
override func layoutSubviews() {
super.layoutSubviews()
playerLayer.frame = bounds
playerLayer.backgroundColor = UIColor.green.cgColor
}
func pictureInPictureController(_ pictureInPictureController: AVPictureInPictureController, failedToStartPictureInPictureWithError error: any Error) {
print("Error starting Picture in Picture: \(error.localizedDescription)")
}
}
class AppDelegate: NSObject, UIApplicationDelegate {
func application(_ application: UIApplication, didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey : Any]? = nil) -> Bool {
let audioSession = AVAudioSession.sharedInstance()
do {
try audioSession.setCategory(.playback, mode: .moviePlayback)
try audioSession.setActive(true)
} catch {
print("ERR: \(error.localizedDescription)")
}
return true
}
}
UITest to make the app crash:
final class VideoPlaylistSampleUITests: XCTestCase {
func testCrashiOS26ToggleCloseCaptions() throws {
let app = XCUIApplication()
app.launch()
let videoPlayer = app.otherElements["Player View"]
XCTAssertTrue(videoPlayer.waitForExistence(timeout: 30))
let closeCaptionButton = app.buttons["Button Close Captions"]
for _ in 0..<2000 {
closeCaptionButton.tap()
}
}
}
I tested the accuracy of the depth map on iPhone 12, 13, 14, 15, and 16, and found that the variance of the depth map after iPhone 12 is significantly greater than that of iPhone 12.
Enabling depth filtering will cause the depth data to be affected by the previous frame, adding more unnecessary noise, especially when the phone is moving.
This is not friendly for high-precision reconstruction. I tried to add depth map smoothing in post-processing to solve the problem of large depth map deviation, but the performance is still poor.
Is there any depth map smoothing solutions already announced by Apple?
What options do I have if I don't want to use Blackmagic's Camera ProDock as the external Sync Hardware, but instead I want to create my own USB-C hardware accessory which would show up as an AVExternalSyncDevice on the iPhone 17 Pro?
Which protocol does my USB-C device have to implement to show up as an eligible clock device in AVExternalSyncDevice.DiscoverySession?
On macOS Sequoia, I'm having the hardest time getting this basic audio output to work correctly. I'm compiling in XCode using C99, and when I run this, I get audio for a split second, and then nothing, indefinitely.
Any ideas what could be going wrong?
Here's a minimum code example to demonstrate:
#include <AudioToolbox/AudioToolbox.h>
#include <stdint.h>
#define RENDER_BUFFER_COUNT 2
#define RENDER_FRAMES_PER_BUFFER 128
// mono linear PCM audio data at 48kHz
#define RENDER_SAMPLE_RATE 48000
#define RENDER_CHANNEL_COUNT 1
#define RENDER_BUFFER_BYTE_COUNT (RENDER_FRAMES_PER_BUFFER * RENDER_CHANNEL_COUNT * sizeof(f32))
void RenderAudioSaw(float* outBuffer, uint32_t frameCount, uint32_t channelCount)
{
static bool isInverted = false;
float scalar = isInverted ? -1.f : 1.f;
for (uint32_t frame = 0; frame < frameCount; ++frame)
{
for (uint32_t channel = 0; channel < channelCount; ++channel)
{
// series of ramps, alternating up and down.
outBuffer[frame * channelCount + channel] = 0.1f * scalar * ((float)frame / frameCount);
}
}
isInverted = !isInverted;
}
AudioStreamBasicDescription coreAudioDesc = { 0 };
AudioQueueRef coreAudioQueue = NULL;
AudioQueueBufferRef coreAudioBuffers[RENDER_BUFFER_COUNT] = { NULL };
void coreAudioCallback(void* unused, AudioQueueRef queue, AudioQueueBufferRef buffer)
{
// 0's here indicate no fancy packet magic
AudioQueueEnqueueBuffer(queue, buffer, 0, 0);
}
int main(void)
{
const UInt32 BytesPerSample = sizeof(float);
coreAudioDesc.mSampleRate = RENDER_SAMPLE_RATE;
coreAudioDesc.mFormatID = kAudioFormatLinearPCM;
coreAudioDesc.mFormatFlags = kLinearPCMFormatFlagIsFloat | kLinearPCMFormatFlagIsPacked;
coreAudioDesc.mBytesPerPacket = RENDER_CHANNEL_COUNT * BytesPerSample;
coreAudioDesc.mFramesPerPacket = 1;
coreAudioDesc.mBytesPerFrame = RENDER_CHANNEL_COUNT * BytesPerSample;
coreAudioDesc.mChannelsPerFrame = RENDER_CHANNEL_COUNT;
coreAudioDesc.mBitsPerChannel = BytesPerSample * 8;
coreAudioQueue = NULL;
OSStatus result;
// most of the 0 and NULL params here are for compressed sound formats etc.
result = AudioQueueNewOutput(&coreAudioDesc, &coreAudioCallback, NULL, 0, 0, 0, &coreAudioQueue);
if (result != noErr)
{
assert(false == "AudioQueueNewOutput failed!");
abort();
}
for (int i = 0; i < RENDER_BUFFER_COUNT; ++i)
{
uint32_t bufferSize = coreAudioDesc.mBytesPerFrame * RENDER_FRAMES_PER_BUFFER;
result = AudioQueueAllocateBuffer(coreAudioQueue, bufferSize, &(coreAudioBuffers[i]));
if (result != noErr)
{
assert(false == "AudioQueueAllocateBuffer failed!");
abort();
}
}
for (int i = 0; i < RENDER_BUFFER_COUNT; ++i)
{
RenderAudioSaw(coreAudioBuffers[i]->mAudioData, RENDER_FRAMES_PER_BUFFER, RENDER_CHANNEL_COUNT);
coreAudioBuffers[i]->mAudioDataByteSize = coreAudioBuffers[i]->mAudioDataBytesCapacity;
AudioQueueEnqueueBuffer(coreAudioQueue, coreAudioBuffers[i], 0, 0);
}
AudioQueueStart(coreAudioQueue, NULL);
sleep(10); // some time to hear the audio
AudioQueueStop(coreAudioQueue, true);
AudioQueueDispose(coreAudioQueue, true);
return 0;
}
I'm writing some camera functionality that uses AVCaptureVideoDataOutput.
I've set it up so that it calls my AVCaptureVideoDataOutputSampleBufferDelegate on a background thread, by making my own dispatch_queue and configuring the AVCaptureVideoDataOutput.
My question is then, if I configure my AVCaptureSession differently, or even stop it altogether, is this guaranteed to flush all pending jobs on my background thread? For example, does [AVCaptureSession stopRunning] imply a blocking call until all pending frame-callbacks are done?
I have a more practical example below, showing how I am accessing something from the foreground thread from the background thread, but I wonder when/how it's safe to clean up that resource.
I have setup similar to the following:
// Foreground thread logic
dispatch_queue_t queue = dispatch_queue_create("qt_avf_camera_queue", nullptr);
AVCaptureSession *captureSession = [[AVCaptureSession alloc] init];
setupInputDevice(captureSession); // Connects the AVCaptureDevice...
// Store some arbitrary data to be attached to the frame, stored on the foreground thread
FrameMetaData frameMetaData = ...;
MySampleBufferDelegate *sampleBufferDelegate = [MySampleBufferDelegate alloc];
// Capture frameMetaData by reference in lambda
[sampleBufferDelegate setFrameMetaDataGetter: [&frameMetaData]() { return &frameMetaData; }];
AVCaptureVideoDataOutput *captureVideoDataOutput = [[AVCaptureVideoDataOutput alloc] init];
[captureVideoDataOutput setSampleBufferDelegate:sampleBufferDelegate
queue:queue];
[captureSession addOutput:captureVideoDataOutput];
[captureSession startRunning];
[captureSession stopRunning];
// Is it now safe to destroy frameMetaData, or do we need manual barrier?
And then in MySampleBufferDelegate:
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
fromConnection:(AVCaptureConnection *)connection
{
// Invokes the callback set above
FrameMetaData *frameMetaData = frameMetaDataGetter();
emitSampleBuffer(sampleBuffer, frameMetaData);
}
Hi everyone,
I’m testing audio recording on an iPhone 15 Plus using AVFoundation.
Here’s a simplified version of my setup:
let settings: [String: Any] = [
AVFormatIDKey: Int(kAudioFormatLinearPCM),
AVSampleRateKey: 8000,
AVNumberOfChannelsKey: 1,
AVLinearPCMBitDepthKey: 16,
AVLinearPCMIsFloatKey: false
]
audioRecorder = try AVAudioRecorder(url: fileURL, settings: settings)
audioRecorder?.record()
When I check the recorded file’s sample rate, it logs:
Actual sample rate: 8000.0
However, when I inspect the hardware sample rate:
try session.setCategory(.playAndRecord, mode: .default)
try session.setActive(true)
print("Hardware sample rate:", session.sampleRate)
I consistently get:
`Hardware sample rate: 48000.0
My questions are:
Is the iPhone mic actually capturing at 8 kHz, or is it recording at 48 kHz and then downsampling to 8 kHz internally?
Is there any way to force the hardware to record natively at 8 kHz?
If not, what’s the recommended approach for telephony-quality audio (true 8 kHz) on iOS devices?
Thanks in advance for your guidance!
I'm writing some camera functionality that uses AVCaptureVideoDataOutput.
I've set it up so that it calls my AVCaptureVideoDataOutputSampleBufferDelegate on a background thread, by making my own dispatch_queue and configuring the AVCaptureVideoDataOutput.
My question is then, if I configure my AVCaptureSession differently, or even stop it altogether, is this guaranteed to flush all pending jobs on my background thread?
I have a more practical example below, showing how I am accessing something from the foreground thread from the background thread, but I wonder when/how it's safe to clean up that resource.
I have setup similar to the following:
// Foreground thread logic
dispatch_queue_t queue = dispatch_queue_create("avf_camera_queue", nullptr);
AVCaptureSession *captureSession = [[AVCaptureSession alloc] init];
setupInputDevice(captureSession); // Connects the AVCaptureDevice...
// Store some arbitrary data to be attached to the frame, stored on the foreground thread
FrameMetaData frameMetaData = ...;
MySampleBufferDelegate *sampleBufferDelegate = [MySampleBufferDelegate alloc];
// Capture frameMetaData by reference in lambda
[sampleBufferDelegate setFrameMetaDataGetter: [&frameMetaData]() { return &frameMetaData; }];
AVCaptureVideoDataOutput *captureVideoDataOutput = [[AVCaptureVideoDataOutput alloc] init];
[captureVideoDataOutput setSampleBufferDelegate:sampleBufferDelegate
queue:queue];
[captureSession addOutput:captureVideoDataOutput];
[captureSession startRunning];
[captureSession stopRunning];
// Is it now safe to destroy frameMetaData, or do we need manual barrier?
And then in MySampleBufferDelegate:
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
fromConnection:(AVCaptureConnection *)connection
{
// Invokes the callback set above
FrameMetaData *frameMetaData = frameMetaDataGetter();
emitSampleBuffer(sampleBuffer, frameMetaData);
}
I’m trying to play an Apple Immersive video in the .aivu format using VideoPlayerComponent using the official documentation found here:
https://developer.apple.com/documentation/RealityKit/VideoPlayerComponent
Here is a simplified version of the code I'm running in another application:
import SwiftUI
import RealityKit
import AVFoundation
struct ImmersiveView: View {
var body: some View {
RealityView { content in
let player = AVPlayer(url: Bundle.main.url(forResource: "Apple_Immersive_Video_Beach", withExtension: "aivu")!)
let videoEntity = Entity()
var videoPlayerComponent = VideoPlayerComponent(avPlayer: player)
videoPlayerComponent.desiredImmersiveViewingMode = .full
videoPlayerComponent.desiredViewingMode = .stereo
player.play()
videoEntity.components.set(videoPlayerComponent)
content.add(videoEntity)
}
}
}
Full code is here:
https://github.com/tomkrikorian/AIVU-VideoPlayerComponentIssueSample
But the video does not play in my project even though the file is correct (It can be played in Apple Immersive Video Utility) and I’m getting this error when the app crashes:
App VideoPlayer+Component Caption: onComponentDidUpdate Media Type is invalid
Domain=SpatialAudioServicesErrorDomain Code=2020631397 "xpc error" UserInfo={NSLocalizedDescription=xpc error}
CA_UISoundClient.cpp:436 Got error -4 attempting to SetIntendedSpatialAudioExperience
[0x101257490|InputElement #0|Initialize] Number of channels = 0 in AudioChannelLayout does not match number of channels = 2 in stream format.
Video I’m using is the official sample that can be found here but tried several different files shot from my clients and the same error are displayed so the issue is definitely not the files but on the RealityKit side of things:
https://developer.apple.com/documentation/immersivemediasupport/authoring-apple-immersive-video
Steps to reproduce the issue:
- Open AIVUPlayerSample project and run. Look at the logs.
All code can be found in ImmersiveView.swift
Sample file is included in the project
Expected results:
If I followed the documentation and samples provided, I should see my video played in full immersive mode inside my ImmersiveSpace.
Am i doing something wrong in the code? I'm basically following the documentation here.
Feedback ticket: FB19971306
When changing a camera's exposure, AVFoundation provides a callback which offers the timestamp of the first frame captured with the new exposure duration: AVCaptureDevice.setExposureModeCustom(duration:, iso:, completionHandler:).
I want to get a similar callback when changing frame duration.
After setting AVCaptureDevice.activeVideoMinFrameDuration or AVCaptureDevice.activeVideoMinFrameDuration to a new value, how can I compute the index or the timestamp of the first camera frame which was captured using the newly set frame duration?
I have a feature requirement: to switch the writer for file writing every 5 minutes, and then quickly merge the last two files. How can I ensure that the merged file is seamlessly combined and that the audio and video information remains synchronized? Currently, the merged video has glitches, and the audio is also out of sync. If there are experts who can provide solutions in this area, I would be extremely grateful.