I can't find any way to search for a song by title only. You can search for songs, but any term you provide appears to be applied to any metadata associated with the song. Look at the largely nonsensical results when I search for a song with the letters "de":
In many cases, that string doesn't appear anywhere. I used
MusicCatalogSearchRequest(term: searchTerm, types: [Song.self])
Likewise it stands to reason that people want to search for artist and album names using text strings. How do we do that?
Explore the integration of media technologies within your app. Discuss working with audio, video, camera, and other media functionalities.
Post
Replies
Boosts
Views
Activity
I'm building a custom camera screen that displays the camera image on a preview layer and then captures an image, using AVCaptureSession. When the picture is captured, I immediately load it into a UIImageView in order to display it to the user for approval.
I've actually done this many times before, but this is the first time I've tried to do it in an app that supports interface rotation. If I hold the phone in Portrait mode and capture a picture, everything works as expected.
When the user rotates the phone into Landscape orientation, I detect this and I replace the preview layer (AVCaptureVideoPreviewLayer) with a new one, specifying connection.videoRotationAngle in order to make the image appear in the right orientation. I'm a little surprised that this is necessary, and it's not a smooth transition, but that doesn't matter.
What does matter is that when I capture the image, it is in the wrong orientation. I tried rotating it myself, but this doesn't seem to make any difference. What am I doing wrong?
Our app involves using the camera to scan barcodes or QR codes, with a working distance of about 5 cm. However, we’ve noticed variations in the focus distance of camera lenses across different iPhone models.
Currently, we mainly use two types of lenses: wide-angle and ultra-wide-angle.
• For iPhone 13 and earlier models, we use the wide-angle lens.
• For iPhone 13 Pro and later models, we use the ultra-wide-angle lens.
We are not certain if this setup is correct since we don’t have all iPhone models to test.
There is a users have reported focus issues on his iPhone 15.
We would like to ask if there’s a resource where we can find the minimum focus distance of different cameras in each iPhone model. This is to verify whether our current configuration is accurate.
Alternatively, if such data is not readily available, could apple tam advise which camera should be used on various iPhone models for scenarios with a working distance of approximately 5 cm?
Thank you!
My app currently captures video using an AVCaptureSession set with the AVCaptureSessionPreset1920x1080 preset. However, I'd like to update this behavior, such that video can be recorded at a range of different resolutions.
There isn't a preset aligning to each desired resolution, so I thought I'd instead directly set the AVCaptureDeviceFormat. For any desired resolution, I would find the format that is closest without going under the desired resolution, and then crop it down as a post-processing step.
However, what I've observed is that there can be a range of available formats for a device at each resolution, with various differing settings. Presumably there is logic within AVCaptureSession that selects a reasonable default based on all these different settings, but since I am applying the format directly, I think I don't have a way to make use of that default logic? And it is undocumented?
Does this mean that the only way to select a format is to implement a comparison function that considers all different values of all different properties on AVCaptureDeviceFormat, and then sort the formats according to this comparator?
If so, what if some new property is added to AVCaptureDeviceFormat in the future? The sort would not take this new property into account, and the function might select a format with some new undesired property.
Are there any guarantees about what types for formats will be supported on a device? For example, can I take for granted that a '420v' format will exist at each resolution? If so I could filter the formats down only to those with this setting without risking filtering out all of the supported formats.
I suspect I may be missing something obvious. Any help would be greatly appreciated!
So get a swift file and put this in it
import Foundation
import AVFoundation
let synthesizer = AVSpeechSynthesizer()
let utterance = AVSpeechUtterance(string: "Hello, testing speech synthesis on macOS.")
if let voice = AVSpeechSynthesisVoice(identifier: "com.apple.voice.compact.en-GB.Daniel") {
utterance.voice = voice
print("Using voice: \(voice.name), \(voice.language)")
} else {
print("Daniel voice not found on macOS.")
}
synthesizer.speak(utterance)
I get no speech output and this log output
Error reading languages in for local resources.
Error reading languages in for local resources.
Using voice: Daniel, en-GB
Program ended with exit code: 0
Why? and whats with "Error reading languages in for local resources." ?
I was able to obtain the depth map image using AVCapturePhotoOutput from the delegate method
func photoOutput(_ output: AVCapturePhotoOutput, didFinishProcessingPhoto photo: AVCapturePhoto, error: (any Error)?)
I convert the depth map to kCVPixelFormatType_DepthFloat32 format and get the pixel values of the depth map using the below code
func convertDepthData(depthMap: CVPixelBuffer) -> [[Float32]] {
let width = CVPixelBufferGetWidth(depthMap)
let height = CVPixelBufferGetHeight(depthMap)
var convertedDepthMap: [[Float32]] = Array(
repeating: Array(repeating: 0, count: width),
count: height
)
CVPixelBufferLockBaseAddress(depthMap, CVPixelBufferLockFlags(rawValue: 2))
let floatBuffer = unsafeBitCast(
CVPixelBufferGetBaseAddress(depthMap),
to: UnsafeMutablePointer<Float32>.self
)
for row in 0 ..< height {
for col in 0 ..< width {
if floatBuffer[width * row + col].isFinite{
convertedDepthMap[row][col] = floatBuffer[width * row + col]
}
}
}
CVPixelBufferUnlockBaseAddress(depthMap, CVPixelBufferLockFlags(rawValue: 2))
return convertedDepthMap
}
Is this the right way of accessing the depth float values from a depth map. And what will be the unit for it. Because some times the depth values are in range of 0.7 when I keep the device close to the subject around 15 to 30 cm.
I have a custom app running on a Mac Studio with Ventura that grabs a snapshot image from a network camera. It then adds some extra information into the EXIF "MakerNote" field. However the metadata cannot be read back out of the image when running Ventrua, it can however be read out of the same image file on a Mac that is not running Ventura.
It would appear Apple has removed support for reading MakerNote in Ventura but still supports writing MakerNote in Ventura.
This code is about 7 years old and written in ObjC and has worked with no issue until Ventura came along.
Calls used
CGImageDestinationAddImageFromSource(); // used to write the image to disk with the extra metadata - Works on Ventura
CGImageSourceCopyPropertiesAtIndex(); // used to read the meta data from an image - does not return "MakeNote" data
Is there a new way to read EXIF "MakeNote" data from image files that was introduced with Ventura?
Case-ID: 10075936
PLATFORM AND VERSION
iOS
Development environment: Xcode Xcode15, macOS macOS 14.5
Run-time configuration: iOS iOS18.0.1
DESCRIPTION OF PROBLEM
Our customer experienced an one-way audio issue when switching from the built-in microphone to AirPods Pro (model: A2084, version: 6F21) during a VoIP call. The issue occurred when the customer's voice could not be heard by the other party, but the customer could hear the other party's voice.
STEPS TO REPRODUCE
Here are the details:
After the issue occurred, subsequent VoIP calls experienced the same issue when using AirPods Pro, but the issue did not occur when using the built-in microphone. The issue could only be resolved by restarting the system, and killing the app did not work.
Log and code analysis:
In WebRTC, it listens for AVAudioSessionRouteChangeNotification. In the above scenario, when webrtc receives the route change notification, it will print the audio session configuration information. At this point, the input channel count was 0, which was abnormal:
[Webrtc] (RTCLogging.mm:33): (audio_device_ios.mm:535 HandleValidRouteChange): RTC_OBJC_TYPE(RTCAudioSession):
{
category: AVAudioSessionCategoryPlayAndRecord
categoryOptions: 128
mode: AVAudioSessionModeVoiceChat
isActive: 1
sampleRate: 48000.00
IOBufferDuration: 0.020000
outputNumberOfChannels: 2
inputNumberOfChannels: 0
outputLatency: 0.021500
inputLatency: 0.005000
outputVolume: 0.600000
isPreferredSpeaker: 0
isCallkit: 0
}
If app tries to call API, setPreferredInputNumberOfChannels at this point, it will fail with an error code of -50:
setConfiguration:active:shouldSetActive:error:]): Failed to set preferred input number of channels(1): The operation couldn’t be completed. (OSStatus error -50.)
Our questions:
When AVAudioSession is active, the category and mode are as expected. Why is the input channel count 0?
Assuming that the AVAudioSession state is abnormal at this point, why does killing the app not resolve the issue, and why does the system need to be restarted to resolve the issue?
Is it possible that the category and mode of the AVAudioSession fetched by the app is currently wrong? Does it need to be reset again each time the callkit is started if the category and mode fetched are the same as the values to be set?
HI Guys,
I'm using Shazamkit in my IOS app and successfully capturing the currently playing track details, when using the devices (iPhone) built-in mic.
When I test with AirPods though, my app cannot both send the output to through the AirPods and capture that same output with the AirPods mic, for Shazamkit recognition.
I believe this must be possible, because the Shazamkit widget on IOS can do this.
Is it restricted in some way for third party apps?
If not, I'd appreciate some guidance on how to achieve this in Swift code.
Thanks in advance.
I'm trying to capture the depth map image using true depth camera in iPhone 15 plus. I was able to setup the AVCapture session with AVCaptureDeviceInput as builtInTrueDepthCamera and AVCapturePhotoOutput with isDepthDataDeliveryEnabled set as true. I also manually made the activeDepthDataFormat of AVCapture device to kCVPixelFormatType_DepthFloat16 or kCVPixelFormatType_DepthFloat32. Finally I have enabled isDepthDataDeliveryEnabled, embedsDepthDataInPhoto , embedsPortraitEffectsMatteInPhoto and embedsSemanticSegmentationMattesInPhoto in AVCapturePhotoSettings before capturing the photo using capturePhoto(with: photoSettings, delegate: self) method.
I have checked manually printing the activeDepthDataFormat of AVCapture device. First before setting it by default it is
Optional('dpth'/'hdis' 640x 480, { 2- 30 fps}, photo dims:{}, fov:73.699, system exposure bias range:-2.0-2.0)
After forcing it to kCVPixelFormatType_DepthFloat16 or kCVPixelFormatType_DepthFloat32 the format is
Optional('dpth'/'hdep' 160x 120, { 2- 30 fps}, photo dims:{}, fov:73.699, system exposure bias range:-2.0-2.0)
But when I receive the captured photo in
func photoOutput(_ output: AVCapturePhotoOutput, didFinishProcessingPhoto photo: AVCapturePhoto, error: (any Error)?)
The depth map is
Optional(hdis 640x480 (high/abs) calibration:{intrinsicMatrix: [2723.07 0.00 2016.00 | 0.00 2723.07 1512.00 | 0.00 0.00 1.00], extrinsicMatrix: [1.00 0.00 0.00 0.00 | 0.00 1.00 0.00 0.00 | 0.00 0.00 1.00 0.00] pixelSize:0.001 mm, distortionCenter:{2016.00,1512.00}, ref:{4032x3024}})
Here it shows hdis instead of hdep, why is it capturing disparity map instead of true depth map.
The depth quality is high and depth data accuracy is absolute.
Here is my code
import UIKit
import AVKit
import AVFoundation
class ViewController: UIViewController, AVCapturePhotoCaptureDelegate {
@IBOutlet weak var previewView: UIView!
@IBOutlet weak var resultLbl: UILabel!
private var session = AVCaptureSession()
private var captureDevice: AVCaptureDevice?
private var inputDevice: AVCaptureDeviceInput?
private var photoOutput: AVCapturePhotoOutput?
private var photoSettings: AVCapturePhotoSettings?
private var cameraPreviewLayer: AVCaptureVideoPreviewLayer?
override func viewDidLoad() {
super.viewDidLoad()
// Do any additional setup after loading the view.
self.setupCaptureSession()
}
func setupCaptureSession(){
captureDevice = AVCaptureDevice.default(.builtInTrueDepthCamera, for: .video, position: .unspecified)
guard let captureDevice else{
print("ERROR::UNABLE TO SET TRUE DEPTH CAMERA ")
return }
session.beginConfiguration()
do{
inputDevice = try AVCaptureDeviceInput(device: captureDevice)
guard let inputDevice else{
print("ERROR: UNABLE TO SET UP INPUT DEVICE")
return }
if session.canAddInput(inputDevice){
session.addInput(inputDevice)
}
}
catch{
print(error)
}
photoOutput = AVCapturePhotoOutput()
guard let photoOutput else{
print("ERROR: UNABLE TO SET UP PHOTO OUTPUT")
return }
if session.canAddOutput(photoOutput){
session.addOutput(photoOutput)
}
session.sessionPreset = .photo
photoOutput.isDepthDataDeliveryEnabled = photoOutput.isDepthDataDeliverySupported
print("IS DEPTH ENABLED:: \(photoOutput.isDepthDataDeliveryEnabled)")
session.commitConfiguration()
let availableFormats = captureDevice.activeFormat.supportedDepthDataFormats
let depthFormat = availableFormats.filter { format in
let pixelFormatType =
CMFormatDescriptionGetMediaSubType(format.formatDescription)
return (pixelFormatType == kCVPixelFormatType_DepthFloat16 ||
pixelFormatType == kCVPixelFormatType_DepthFloat32)
}.first
session.beginConfiguration()
try! captureDevice.lockForConfiguration()
captureDevice.activeDepthDataFormat = depthFormat
captureDevice.unlockForConfiguration()
session.commitConfiguration()
self.setupPreviewLayer()
}
func setupPreviewLayer(){
cameraPreviewLayer = AVCaptureVideoPreviewLayer(session: session)
cameraPreviewLayer?.videoGravity = .resizeAspectFill
if let cameraPreviewLayer{
self.previewView.layer.addSublayer(cameraPreviewLayer)
cameraPreviewLayer.frame = self.previewView.bounds
}
DispatchQueue.global(qos: .userInteractive).async {
self.session.startRunning()
}
}
@IBAction func captureBtnPressed(_ sender: Any) {
photoSettings = AVCapturePhotoSettings(format: [AVVideoCodecKey:AVVideoCodecType.jpeg])
guard let photoSettings else{
print("ERROR: UNABLE TO SETUP PHOTO SETTINGS")
return
}
guard let photoOutput else{
print("ERROR: UNABLE TO SET UP PHOTO OUTPUT")
return
}
photoSettings.isDepthDataDeliveryEnabled = photoOutput.isDepthDataDeliverySupported
photoSettings.embedsDepthDataInPhoto = true
photoSettings.embedsPortraitEffectsMatteInPhoto = true
photoSettings.embedsSemanticSegmentationMattesInPhoto = true
photoOutput.capturePhoto(with: photoSettings, delegate: self)
}
func photoOutput(_ output: AVCapturePhotoOutput, didFinishProcessingPhoto photo: AVCapturePhoto, error: (any Error)?) {
print(photo.depthData)
switch photo.depthData?.depthDataQuality {
case .low:
print("Depth quality is low")
case .high:
print("Depth quality is high")
case nil:
print("Depth quality is nil")
}
switch photo.depthData?.depthDataAccuracy {
case .relative:
print("Depth accuarcy is relative")
case .absolute:
print("Depth accuarcy is absolute")
case nil:
print("Depth accuarcy is nil")
}
if let imageData = photo.fileDataRepresentation(){
if let image = UIImage(data: imageData){
UIImageWriteToSavedPhotosAlbum(image, nil, nil, nil)
}
}
}
}
Hi,
I have configured the stream as interleaved, but I am unsure if the function produces interleaved samples. So here my question:
Does AudioDeviceCreateIOProcID produce interleaved samples with microphone input?
Hi everyone, I just upgraded my iPhone to 18.1.1. I noticed the absence of the normal mode on my AirPods. I can't stand the noise cancellation mode for too long and the transparency one is overwhelming everytime a hair is brushing near by my AirPods. Why is that? I don't really see the progression here. Can you this be fixed in the next version? I don't want to buy another brand, I use them everyday for many hours.
When trying to edit some Live Photos, calling PHLivePhotoEditingContext.saveLivePhoto results in the following error:
Error Domain=AVFoundationErrorDomain Code=-11800 "The operation could not be completed" UserInfo={NSLocalizedFailureReason=An unknown error occurred (-12815), NSLocalizedDescription=The operation could not be completed, NSUnderlyingError=0x300d05380 {Error Domain=NSOSStatusErrorDomain Code=-12815 "(null)"}}
I was able to replicate it on my device by taking a new Live Photo. Not sure what's wrong with that one specifically, not all Live Photos replicate the issue.
I've submitted FB15880825 with a sysdiagnose and a Photos Diagnostics as well. Any ideas what's going on here? It's impacting multiple customers. Thanks!
Since iOS/iPadOs/tvOS 18 then we have run into a new problem with streaming of FairPlay encrypted video. On the affected streams then the audio plays perfectly but the video freezes for periods of a few seconds, so it will freeze for 5s or so, then be OK for a few seconds then freeze again.
It is entirely reproducible when all the following are true
the video streams were produced by a particular encoder (or particular settings, not sure on that)
the video must be encrypted
device is running some variety of iOS 18 (or iPadOS or tvOS)
the device is an affected device
Known devices are
AppleTV 4K 2nd Gen
iPad Pro 11" 1st and 2nd gen
Devices known not to show the problem are
all other AppleTV models
iPhone 13 Pro and 16 Pro
If we stream the same content, but unencrypted, then it plays perfectly, or if you play the encrypted stream on, say, tvOS 17.
When the freezing occurs then we can see in the console logs repeating blocks of lines like the following
default 18:08:46.578582+0000 videocodecd AppleAVD: AppleAVDDecodeFrameResponse(): Frame# 5771 DecodeFrame failed with error 0x0000013c
default 18:08:46.578756+0000 videocodecd AppleAVD: AppleAVDDecodeFrameInternal(): failed - error: 316
default 18:08:46.579018+0000 videocodecd AppleAVD: AppleAVDDecodeFrameInternal(): avdDec - Frame# 5771, DecodeFrame failed with error: 0x13c
default 18:08:46.579169+0000 videocodecd AppleAVD: AppleAVDDisplayCallback(): Asking fig to drop frame # 5771 with err -12909 - internalStatus: 315
also more relevant looking lines:
default 18:17:39.122019+0000 kernel AppleAVD: avdOutbox0ISR(): FRM DONE (cid: 2.0, fno: 10970, codecT: 1) FAILED!!
default 18:17:39.122155+0000 videocodecd AppleAVD: AppleAVDDisplayCallback(): Asking fig to drop frame # 10970 with err -12909 - internalStatus: 315
default 18:17:39.122221+0000 kernel AppleAVD: ## client[ 2.0] @ frm 10970, errStatus: 0x10
default 18:17:39.122338+0000 kernel AppleAVD: decodeFailIdentify(): VP error bit 4 has EP3B0 error
default 18:17:39.122401+0000 kernel AppleAVD: processHWResponse(): clientID 2.0 frameNumber 10970 error 315, offsetIndex 10, isHwErr 1
So it would seem to me that one of the following must be happening:
When these particular HLS files are encrypted then the data is being corrupted in some way that played back on iOS 17 and earlier but now won't on 18+, or
There's a regression in iOS 18 that means that this particular format of video data is corrupted on decryption
If anyone has seen similar behaviour, or has any ideas how to identify which of the two scenarios it is, please say.
Unfortunately we don't have control of the servers so can't make changes there unless we can identify they are definitely the cause of the problem.
Thanks, Simon.
I’m a tracking company and have my own tracking platform. Looking for the solution that using tag device for animals like Air Tags but running on my platform.
Is there a way to allow my platform to interface with the Find My Phone to get the location data of my Tags ?
I'm trying to implement anti-spoofing in iOS app using iphone true depth front camera. I have checked the following questions still can't find a proper working solution.
I trained a coreML model using 22000 depth human face images and 22000 non-human face(objects,food etc) images. The accuracy of the model is very less.
When testing out with flat 2d images shown on a smartphone screen I found that I get depth map even for flat 2D images like this. Even though the image is flat how does it give the depth map for the person shown in the flat 2D picture so the model thinks that it is a real face instead of a spoofed one.
I implemented depth capture by following this documentation and I made sure that I get depth map instead of disparity map
https://developer.apple.com/documentation/avfoundation/additional_data_capture/capturing_photos_with_depth
My next approach was to use NCNN framework to implement anti-spoofing by using the model used in the Mini-vision android anti-spoofing sample. I rewrote their library in iOS by using the objective C++ wrapper for C++ as the sample was only available for android app. And I tested by feeding 80x80 UI-Image in a open cv matrix format it's accurracy is less than the android one.
How can I solve this problem.
Description
As of iOS 18, AVAudioSession.setPreferredIOBufferDuration ignores the requested buffer size when Sound Recognition or Vocal Shortcuts is enabled. This results in 1) much larger buffer sizes and 2) mismatched buffer sizes between input and output buffers, which causes ‘glitchy’ audio and increased latency.
Additionally, when this issue occurs AVAudioSession.setPreferredIOBufferDuration continues to return ‘true’ and no error is produced.
Steps to Reproduce:
Enable Vocal Shortcuts on a device running iOS 18. Enable at least one shortcut (e.g. Control Center).
Open or clone the example project (https://github.com/cwalo/SoundRecognitionBug)
Build and install the example project
Attach a headset and launch the application
Observe console logs showing
a requested buffer size of 0.005805 (256 samples @ 48k)
an actual buffer size of 0.023220 (1104 samples @48k - this is regularly the resulting buffer size in all of our tests)
Quit the app and detach the headset. Enable mutesOutput in AudioSystem.mm (to avoid feedback)
Launch the application
Observe
Same result from step 4
Mismatched hardware buffer size of 1104 and recorded frame count of 1024
Mismatched playbackCount and recordCount
Quit the app and disable vocal shortcuts
Launch the app
Observe IOBufferDuration matching the requested duration and matched buffer sizes (expected behavior)
Expected results:
Requested IOBufferDuration is respected or AVAudioSession returns false or error is produced
Input and output buffer sizes match
Device(s): iPhone 11 Pro, iPad Pro
OS: iOS 18.0.1
Environment: Xcode 16.1
FB: FB15715421
Related to: https://forums.developer.apple.com/forums/thread/765477
For an upcoming update of one of my apps, I’m facing an issue:
The .rate parameter of a AVAudioUnitTimePitch allows me to slow down an audio track without any issues: setting .rate to 0.7 or 0.8 results in an almost perfect playback without changing pitch.
However, whenever the .rate parameter is greater than 1 (e.g. 1.1 or 1.15), I’m starting to hear audio artifacts (“flattering”) in the audio output which is not so nice (even at .overlap = 32).
Intuitively, I’d’ve thought that speeding up the file should contain less artifacts than slowing it down??
I’ve tried different sample rates (44.1 kHz and 48 kHz), but same result.
Grateful for any input on this 🙏
Capturing more than one display is no longer working with macOS Sequoia.
We have a product that allows users to capture up to 2 displays/screens. Our application is using gstreamer which in turn is based on AVFoundation.
I found a quick way to replicate the issue by just running 2 captures from separate terminals. Assuming display 1 has device index 0, and display 2 has device index 1, here are the steps:
install gstreamer with
brew install gstreamer
Then open 2 terminal windows and launch the following processes:
terminal 1 (device-index:0):
gst-launch-1.0 avfvideosrc -e device-index=0 capture-screen=true ! queue ! videoscale ! video/x-raw,width=640,height=360 ! videoconvert ! osxvideosink
terminal 2 (device-index:1):
gst-launch-1.0 avfvideosrc -e device-index=1 capture-screen=true ! queue ! videoscale ! video/x-raw,width=640,height=360 ! videoconvert ! osxvideosink
The first process that is launched will show the screen, the second process launched will not.
Testing this on macOS Ventura and Sonoma works as expected, showing both screens.
I submitted the same issue on Feedback Assistant: FB15900976
Issue Description
When playing certain MIDI files using AVMIDIPlayer, the initial volume settings for individual tracks are being ignored during the first playback. This results in all tracks playing at the same volume level, regardless of their specified volume settings in the MIDI file.
Steps to Reproduce
Load a MIDI file that contains different volume settings for multiple tracks
Start playback using AVMIDIPlayer
Observe that all tracks play at the same volume level, ignoring their individual volume settings
Current Behavior
All tracks play at the same volume level during initial playback
Track volume settings specified in the MIDI file are not being respected
This behavior consistently occurs on first playback of affected MIDI files
Expected Behavior
Each track should play at its specified volume level from the beginning
Volume settings in the MIDI file should be respected from the first playback
Workaround
I discovered that the correct volume settings can be restored by:
Starting playback of the MIDI file
Setting the currentPosition property to (current time - 1 second)
After this operation, all tracks play at their intended volume levels
However, this is not an ideal solution as it requires manual intervention and may affect the playback experience.
Questions
Is there a way to ensure the track volume settings are respected during the initial playback?
Is this a known issue with AVMIDIPlayer?
Are there any configuration settings or alternative approaches that could resolve this issue?
Technical Details
iOS Version: 18.1.1 (22B91)
Xcode Version: 16.1 (16B40)