Render advanced 3D graphics and perform data-parallel computations on graphics processors using Metal.

Posts under Metal subtopic

Post

Replies

Boosts

Views

Activity

Metal texture allocated size versus actual image data size
Hello. In the iOS app I'm working on we are very tight on memory budget, and I was looking at ways to reduce our texture memory usage. However, I noticed that when comparing ASTC 8x8 to ASTC 12x12 there is no actual difference in allocated memory for most of our textures, despite ASTC 12x12 having less than half the bpp of 8x8. The difference between the two only becomes apparent for textures 1024x1024 and larger, and even then the actual texture data is sometimes only 60% of the allocation size. I understand there must be some alignment and padding going on, but this seems extreme. For an example scene in my app that uses ASTC 12x12 for most textures, there is over a 100 MB difference between the ASTC size on disk and the size when loaded, so I would love to be able to recover even a portion of that memory.

Here is some test code, with measurements I've taken on an iPhone 11:

    for(int i = 0; i < 11; i++) {
        MTLTextureDescriptor *texDesc = [[MTLTextureDescriptor alloc] init];
        texDesc.pixelFormat = MTLPixelFormatASTC_12x12_LDR;
        int dim = 12;
        int n = 2 << i;
        int mips = i+1;
        texDesc.width = n;
        texDesc.height = n;
        texDesc.mipmapLevelCount = mips;
        texDesc.resourceOptions = MTLResourceStorageModeShared;
        texDesc.usage = MTLTextureUsageShaderRead;
        // Calculate the equivalent astc texture size
        int blocks = 0;
        if(mips == 1) {
            blocks = n/dim + (n%dim>0? 1 : 0);
            blocks *= blocks;
        } else {
            for(int j = 0; j < mips; j++) {
                int a = 2 << j;
                int cur = a/dim + (a%dim>0? 1 : 0);
                blocks += cur*cur;
            }
        }
        auto tex = [objCObj newTextureWithDescriptor:texDesc];
        printf("%dx%d, mips %d, Astc: %d, Metal: %d\n", n, n, mips, blocks*16, (int)tex.allocatedSize);
    }

MTLPixelFormatASTC_12x12_LDR
    128x128, mips 7, Astc: 2768, Metal: 6016
    256x256, mips 8, Astc: 10512, Metal: 32768
    512x512, mips 9, Astc: 40096, Metal: 98304
    1024x1024, mips 10, Astc: 158432, Metal: 262144
    128x128, mips 1, Astc: 1936, Metal: 4096
    256x256, mips 1, Astc: 7744, Metal: 16384
    512x512, mips 1, Astc: 29584, Metal: 65536
    1024x1024, mips 1, Astc: 118336, Metal: 147456

MTLPixelFormatASTC_8x8_LDR
    128x128, mips 7, Astc: 5488, Metal: 6016
    256x256, mips 8, Astc: 21872, Metal: 32768
    512x512, mips 9, Astc: 87408, Metal: 98304
    1024x1024, mips 10, Astc: 349552, Metal: 360448
    128x128, mips 1, Astc: 4096, Metal: 4096
    256x256, mips 1, Astc: 16384, Metal: 16384
    512x512, mips 1, Astc: 65536, Metal: 65536
    1024x1024, mips 1, Astc: 262144, Metal: 262144

I also tried using MTLHeaps (placement and automatic), hoping they might be better, but saw nearly the same numbers. Is there any way to have Metal allocate these textures in a more compact way to save on memory?
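The "Astc" column in the measurements above can be reproduced without a Metal device. The following is a minimal C++ sketch of that tightly packed size calculation, assuming square textures and the 16-byte ASTC block size the post's `blocks*16` relies on. The names `blocksAcross` and `astcPackedBytes` are illustrative and not part of any Metal API, and the sketch says nothing about how Metal actually pads or aligns its allocations.

```cpp
#include <algorithm>
#include <cstdint>

// Number of ASTC blocks needed to cover `texels` along one axis, rounded up.
inline std::uint64_t blocksAcross(std::uint64_t texels, std::uint64_t blockDim) {
    return (texels + blockDim - 1) / blockDim;
}

// Tightly packed byte size of a square ASTC texture with `mipCount` levels.
// Every ASTC block occupies 16 bytes regardless of its texel footprint.
// Hypothetical helper for illustration; it mirrors the post's arithmetic.
inline std::uint64_t astcPackedBytes(std::uint64_t baseSize,
                                     std::uint64_t blockDim,
                                     int mipCount) {
    std::uint64_t bytes = 0;
    for (int level = 0; level < mipCount; ++level) {
        // Each mip level halves the edge length, never dropping below 1 texel.
        std::uint64_t size = std::max<std::uint64_t>(1, baseSize >> level);
        std::uint64_t across = blocksAcross(size, blockDim);
        bytes += across * across * 16;
    }
    return bytes;
}
```

For example, `astcPackedBytes(1024, 12, 10)` gives 158432, matching the 1024x1024 row with 10 mips; dividing by the reported allocation of 262144 gives roughly 0.60, which is the "only 60%" figure mentioned in the post.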
8
0
2.7k
Mar ’25
Metal and Swift Concurrency
Hi, introducing Swift Concurrency to my Metal app has been a bit challenging, as Swift Concurrency is limited by the cooperative thread pool. GPU work is obviously not CPU bound and can block forward progress, especially when using waitUntilCompleted on the command buffer. For concurrent render work this has the potential to underutilize the CPU and even create deadlocks. My question is: what is the Metal team's general recommendation when it comes to concurrency? It seems to me that Dispatch or OperationQueues are still the preferred way to run Metal-bound tasks for maximum performance. To integrate with Swift Concurrency, my idea is to use continuations that kick off render jobs via Dispatch or OperationQueues. Would this be the best way to bridge async tasks with Metal work? Thanks!
5
0
1k
Apr ’25
MTKTextureLoader loading texture error on visionOS 2.0
Hello everyone. I get a texture loading error on visionOS 2.0:
    Can't create texture(Error Domain=MTKTextureLoaderErrorDomain Code=0 "Pixel format(MTLPixelFormatInvalid) is not valid on this device" UserInfo={NSLocalizedDescription=Pixel format(MTLPixelFormatInvalid) is not valid on this device, MTKTextureLoaderErrorKey=Pixel format(MTLPixelFormatInvalid) is not valid on this device}
The same texture loads correctly on visionOS 1.3, and I don't know what changed between visionOS 1.3 and visionOS 2.0. The texture is a KTX file that stores a cubemap encoded as ASTC 6x6 HDR, and its glInternalFormat is GL_COMPRESSED_RGBA_ASTC_6x6. I wonder whether visionOS 2.0 no longer supports the ASTC 6x6 HDR cubemap format, or whether there is something wrong with my assets.
1
0
502
Oct ’24
Can't profile Metal on Apple TV
Hi, I can capture a frame on the Apple TV, but when I try to profile the capture for GPU timing information, I get an "Abort Trap 6" error, with the following in the crash report:

    Exception Type: EXC_CRASH (SIGABRT)
    Exception Codes: 0x0000000000000000, 0x0000000000000000
    Triggered by Thread: 7
    Application Specific Information: abort() called
    Last Exception Backtrace:
    0 CoreFoundation 0x18c0a99d0 __exceptionPreprocess + 160
    1 libobjc.A.dylib 0x18b596d24 objc_exception_throw + 71
    2 CoreFoundation 0x18bfa7308 -[__NSArrayM insertObject:atIndex:] + 1239
    3 MTLReplayController 0x101f5d148 DYMTLReplayFrameProfiler_loadAnalysis + 1140
    4 MTLReplayController 0x101e97f90 GTMTLReplayClient_collectGPUShaderTimelineData + 224
    5 MTLReplayController 0x101e81794 __30-[GTMTLReplayService profile:]_block_invoke_4 + 288
    6 Foundation 0x18eb6072c __NSOPERATION_IS_INVOKING_MAIN__ + 11
    7 Foundation 0x18eb5cc1c -[NSOperation start] + 623
    8 Foundation 0x18eb60edc __NSOPERATIONQUEUE_IS_STARTING_AN_OPERATION__ + 11
    9 Foundation 0x18eb60bc4 __NSOQSchedule_f + 167
    10 libdispatch.dylib 0x18b8d6a84 _dispatch_block_async_invoke2 + 103
    11 libdispatch.dylib 0x18b8c9420 _dispatch_client_callout + 15
    12 libdispatch.dylib 0x18b8cc5d0 _dispatch_continuation_pop + 531
    13 libdispatch.dylib 0x18b8cbcd4 _dispatch_async_redirect_invoke + 635
    14 libdispatch.dylib 0x18b8d9224 _dispatch_root_queue_drain + 335
    15 libdispatch.dylib 0x18b8d9a08 _dispatch_worker_thread2 + 163
    16 libsystem_pthread.dylib 0x18b6e652c _pthread_wqthread + 223
    17 libsystem_pthread.dylib 0x18b6ed8d0 start_wqthread + 7

The setup is Xcode 16.0 + Apple TV 4K (4th Gen) on tvOS 18. Does anyone know the cause of this error, and is there any solution for it? Thank you very much, Kai
0
0
335
Oct ’24
Getting stuck in first frame of renderLoop.
First I get this:
    ar_world_tracking_provider_query_device_anchor_at_timestamp <0x302b9c0a0>: The device_anchor can only be queried when the world tracking provider is running.
This seemed to all break with the auto-update to 2.0.1. Simulator runs the code fine. I seem to see an infinite stall here:
    frameLayer.endUpdate()
    // Pace frames by waiting for the optimal prediction time.
    try await LayerRenderer.Clock().sleep(until: timing.optimalInputTime, tolerance: nil)
    // Start submitting the updated frame.
    frameLayer.startSubmission() <-
0
0
504
Oct ’24
Strange Metal related shader issue
Hi everyone, I encountered a very strange shader bug that seems related to Metal only (not OpenGL). You can find the full description of the issue on the Babylon.js forums here: https://forum.babylonjs.com/t/strange-shader-related-issue-on-macos-with-safari-and-chrome-not-firefox/54289 (sorry, I couldn't post a clickable link here as this seems to be blocked here). I have a workaround to fix the issue (as described in the link above), but this really looks like an issue in Metal itself. Let me know if you need more details or explanations.
0
0
397
Oct ’24
Metal rendering shows incorrect color, gputrace shows fragment function returning correct color
I'm experiencing a strange issue where I'm seeing black in a metal drawable where it should be a different color. When I capture the frame and inspect the returned value from the fragment function, it's correct, but the drawable isn't. This screenshot hopefully illustrates the issue. I've not found any references to similar issues. I saw something about some out of bounds or NaN values being dropped to 0 (which would be black), but the debugger doesn't indicate this is happening.
1
0
474
Oct ’24
Metal Inline Functions
Hi! How do I define and call an inline function in Metal, or just a simple function that returns a value? My case:
    inline uint index4D(constant _4D& shape, constant uint& n, constant uint& c, constant uint& h, constant uint& w) {
        return n * shape.C * shape.H * shape.W + c * shape.H * shape.W + h * shape.W + w;
    }
When I call it in my kernel function I get a "No matching function for call" error. Thanks in advance.
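One plausible cause, offered as an assumption because the calling kernel is not shown: in Metal Shading Language a reference parameter carries an address space, so a `constant uint&` parameter cannot bind to an index held in a local (thread address space) variable, and overload resolution then reports no matching function. Passing the indices by value sidesteps that. The sketch below is plain C++ rather than Metal Shading Language, so that it can be compiled and checked on the host. `Shape4D` is a hypothetical stand-in for the post's `_4D` struct; in a shader the `shape` parameter would keep whatever address-space qualifier matches the buffer it comes from.

```cpp
#include <cstdint>

// Hypothetical stand-in for the post's `_4D` struct: extents in NCHW order.
struct Shape4D {
    std::uint32_t N, C, H, W;
};

// Flattens an (n, c, h, w) coordinate into a linear NCHW offset.
// The indices are taken by value, so any local variable can be passed in.
inline std::uint32_t index4D(const Shape4D& shape,
                             std::uint32_t n, std::uint32_t c,
                             std::uint32_t h, std::uint32_t w) {
    return n * shape.C * shape.H * shape.W
         + c * shape.H * shape.W
         + h * shape.W
         + w;
}
```

With extents `{2, 3, 4, 5}`, the last element `(1, 2, 3, 4)` maps to offset 119, one less than the 120 elements in the tensor, which is a quick sanity check on the formula.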
2
0
714
Nov ’24
Metal Compute Overhead
Hello, we are experimenting with Metal to accelerate some peculiar numerical computation. Our workloads are relatively small, so the ability to avoid moving data to and from the GPU's memory is very appealing. However, we are observing higher overhead compared to CUDA, which negates the benefit of avoiding data transfer. In our tests using an empty kernel, CUDA completes in 0.001 ms (Intel i7 10700K, RTX 3080), while Metal's waitUntilCompleted takes 0.12 ms (M2 Max). As we have no prior experience with Metal, we are wondering whether we are using the APIs correctly and this timing is expected, or whether there is a way to reduce it. Thank you in advance for any comment! test-metal.cpp
0
0
629
Nov ’24
Cannot use Metal graphics overview HUD with multiple CAMetalLayers
I have multiple CAMetalLayers that I render content to, and I noticed that the graphics overview HUD does not function properly when I have more than one CAMetalLayer. The reported values are very strange: for example, FPS may show 999 or some large negative value. Is the HUD simply not designed to work with multiple CAMetalLayers or MTKViews? When I disable all but one of my CAMetalLayers, the HUD works as expected.
1
0
688
Nov ’24
What does CAMetalLayerWantsCompositingDependencies in Info.plist do?
I've noticed a major third-party app has the following flag set to 1/true in its Info.plist: CAMetalLayerWantsCompositingDependencies Does anyone know if it’s recognized by Core Animation / Metal, and what it’s supposed to do? It might obviously have zero relationship to the OS, defined by that app and for that app... but since it looks very much like an unofficial/undocumented environment setting, it might be great to know what problem it solves. I happen to have issues related to compositing other CALayers over a CAMetalLayer in my app... so this definitely stood out as interesting. Thank you!
0
0
425
Nov ’24
Rendering YCbCr input using Metal
I would like to take YCbCr CVPixelBuffers from AVCaptureVideoDataOutput, apply some processing in RGB space, render to an MTKView, and pass to AVAssetWriter for recording. Right now, I'm doing this all manually – deswing the incoming data if necessary, choose the right matrix to convert to RGB, apply processing, etc. I also have to convert back to YCbCr before feeding the frames to AVAssetWriter because encoding performs much better if I do. Is there any efficient, built-in way to achieve the same? I can't use AVCaptureVideoPreviewLayer, since I need to do some further processing before display. I can't use AVCaptureVideoDataOutput's videoSettings to get automatic BGRA conversion because that would lose bit depth for 10 bit video formats (and isn't available on all formats anyway). I see these Accelerate functions, but they seemingly don't use the GPU, nor do they support all the formats and bit depths I'd need. I found reference to some undocumented MTLPixelFormats that seem to do exactly what I want, but I don't want to rely on something like this unless it's explicitly endorsed. This would also incur an RGB/YCbCr conversion on every texture read and write, right? Is there anything I'm missing here?
0
0
542
Nov ’24
D3DMetal unsupported CheckFeatureSupport query 53 while running simple vulkaninfo using Mesa 24.3 Dozen (Vulkan on D3D12) driver
Hi, I wanted to test whether it is possible to use the Mesa3D Dozen driver (Vulkan on D3D12) + D3DMetal 2b3 to get a possibly better Vulkan driver on Wine than the default MoltenVK. This would support Vulkan Windows apps via D3D12Metal. I am using vulkan_dzn.dll, dzn_icd.x86_64.json and dxil.dll from the x64 folder of: https://github.com/pal1000/mesa-dist-win/releases/download/24.3.0-rc1/mesa3d-24.3.0-rc1-release-msvc.7z
Using the simple vulkaninfo app and running it like:
    wine64 vulkaninfo
I get the error:
    [D3DMetal:LOG:2A825] Unsupported API: CheckFeatureSupport, unhandled support query 53
It also seems the D3DMetal Wine integration on Whisky doesn't expose d3d12core.dll and d3d12.dll like the new Agility D3D12 DLLs or VKD3D, so I am getting:
    MESA: error: Failed to retrieve D3D12GetInterface
    MESA: error: Failed to load DXCore
But it still seems to try to load the driver, as:
    WARNING: dzn is not a conformant Vulkan implementation, testing use only.
Full log:
    MESA: error: Failed to retrieve D3D12GetInterface
    MESA: error: Failed to load DXCore
    WARNING: dzn is not a conformant Vulkan implementation, testing use only.
    [D3DMetal:LOG:2A825] Unsupported API: CheckFeatureSupport, unhandled support query 53
    00bc:fixme:dcomp:DCompositionCreateDevice 0000000000000000, {c37ea93a-e7aa-450d-b16f-9746cb0407f3}, 000000000011E328.
    MESA: error: Failed to load DXCore
    WARNING: dzn is not a conformant Vulkan implementation, testing use only.
    [D3DMetal:LOG:2A825] Unsupported API: CheckFeatureSupport, unhandled support query 53
    00bc:fixme:dcomp:DCompositionCreateDevice 0000000000000000, {c37ea93a-e7aa-450d-b16f-9746cb0407f3}, 000000000011E578.
    ERROR: [Loader Message] Code 0 : setup_loader_term_phys_devs: Call to 'vkEnumeratePhysicalDevices' in ICD c:\windows\system32\.\vulkan_dzn.dll failed with error code -3
    ERROR: [Loader Message] Code 0 : setup_loader_term_phys_devs: Failed to detect any valid GPUs in the current config
    ERROR at C:\j\msdk0\build\Khronos-Tools\repo\vulkaninfo\vulkaninfo.h:241:vkEnumeratePhysicalDevices failed with ERROR_INITIALIZATION_FAILED
0
0
776
Nov ’24
MTKView draw method causes EXC_BAD_ACCESS crash
Hello, I am using MTKView to display: camera preview & video playback. I am testing on iPhone 16. App crashes at a random moment whenever MTKView is rendering CIImage.

MetalView:

    public enum MetalActionType {
        case image(CIImage)
        case buffer(CVPixelBuffer)
    }

    public struct MetalView: UIViewRepresentable {
        let mtkView = MTKView()
        public let actionPublisher: any Publisher<MetalActionType, Never>

        public func makeCoordinator() -> Coordinator {
            Coordinator(self)
        }

        public func makeUIView(context: UIViewRepresentableContext<MetalView>) -> MTKView {
            guard let metalDevice = MTLCreateSystemDefaultDevice() else {
                return mtkView
            }
            mtkView.device = metalDevice
            mtkView.framebufferOnly = false
            mtkView.clearColor = MTLClearColor(red: 0, green: 0, blue: 0, alpha: 0)
            mtkView.drawableSize = mtkView.frame.size
            mtkView.delegate = context.coordinator
            mtkView.isPaused = true
            mtkView.enableSetNeedsDisplay = true
            mtkView.preferredFramesPerSecond = 60
            context.coordinator.ciContext = CIContext(
                mtlDevice: metalDevice, options: [.priorityRequestLow: true, .highQualityDownsample: false])
            context.coordinator.metalCommandQueue = metalDevice.makeCommandQueue()
            context.coordinator.actionSubscriber = actionPublisher.sink { type in
                switch type {
                case .buffer(let pixelBuffer):
                    context.coordinator.updateCIImage(pixelBuffer)
                    break
                case .image(let image):
                    context.coordinator.updateCIImage(image)
                    break
                }
            }
            return mtkView
        }

        public func updateUIView(_ nsView: MTKView, context: UIViewRepresentableContext<MetalView>) {
        }

        public class Coordinator: NSObject, MTKViewDelegate {
            var parent: MetalView
            var metalCommandQueue: MTLCommandQueue!
            var ciContext: CIContext!
            private var image: CIImage? {
                didSet {
                    Task { @MainActor in
                        self.parent.mtkView.setNeedsDisplay() //<--- call Draw method
                    }
                }
            }
            var actionSubscriber: (any Combine.Cancellable)?
            private let operationQueue = OperationQueue()

            init(_ parent: MetalView) {
                self.parent = parent
                operationQueue.qualityOfService = .background
                super.init()
            }

            public func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {
            }

            public func draw(in view: MTKView) {
                guard let drawable = view.currentDrawable,
                      let ciImage = image,
                      let commandBuffer = metalCommandQueue.makeCommandBuffer(),
                      let ci = ciContext
                else {
                    return
                }
                //making sure nothing is nil, now we can add the current frame to the operationQueue for processing
                operationQueue.addOperation(
                    MetalOperation(
                        drawable: drawable,
                        drawableSize: view.drawableSize,
                        ciImage: ciImage,
                        commandBuffer: commandBuffer,
                        pixelFormat: view.colorPixelFormat,
                        ciContext: ci))
            }

            //consumed by Subscriber
            func updateCIImage(_ img: CIImage) {
                image = img
            }

            //consumed by Subscriber
            func updateCIImage(_ buffer: CVPixelBuffer) {
                image = CIImage(cvPixelBuffer: buffer)
            }
        }
    }

Now the MetalOperation class:

    private class MetalOperation: Operation, @unchecked Sendable {
        let drawable: CAMetalDrawable
        let drawableSize: CGSize
        let ciImage: CIImage
        let commandBuffer: MTLCommandBuffer
        let pixelFormat: MTLPixelFormat
        let ciContext: CIContext

        init(
            drawable: CAMetalDrawable,
            drawableSize: CGSize,
            ciImage: CIImage,
            commandBuffer: MTLCommandBuffer,
            pixelFormat: MTLPixelFormat,
            ciContext: CIContext
        ) {
            self.drawable = drawable
            self.drawableSize = drawableSize
            self.ciImage = ciImage
            self.commandBuffer = commandBuffer
            self.pixelFormat = pixelFormat
            self.ciContext = ciContext
        }

        override func main() {
            let width = Int(drawableSize.width)
            let height = Int(drawableSize.height)
            let ciWidth = Int(ciImage.extent.width) //<-- Thread 22: EXC_BAD_ACCESS (code=1, address=0x5e71f5490) A bad access to memory terminated the process.
            let ciHeight = Int(ciImage.extent.height)
            let destination = CIRenderDestination(
                width: width, height: height, pixelFormat: pixelFormat, commandBuffer: commandBuffer,
                mtlTextureProvider: { [self] () -> MTLTexture in
                    return drawable.texture
                })
            let transform = CGAffineTransform(
                scaleX: CGFloat(width) / CGFloat(ciWidth),
                y: CGFloat(height) / CGFloat(ciHeight))
            do {
                try ciContext.startTask(toClear: destination)
                try ciContext.startTask(toRender: ciImage.transformed(by: transform), to: destination)
            } catch {
            }
            commandBuffer.present(drawable)
            commandBuffer.commit()
            commandBuffer.waitUntilCompleted()
        }
    }

Now I am no Metal expert, but I believe it's a very simple execution that shouldn't cause memory leak especially after we have already checked for whether CIImage is nil or not. I have also tried running this code without OperationQueue and also tried with @autoreleasepool but none of them has solved this problem. Am I missing something?
1
0
736
Dec ’24
Jurassic World Evolution 2 Likely Fails Due to Missing Tiled Resources Support
I’ve been trying to run Jurassic World Evolution 2 using the Game Porting Toolkit on macOS, but the game doesn’t launch and crashes immediately. Based on the error and research, it seems the issue is related to missing support for D3D12_TILED_RESOURCES_TIER_2 in the Metal API. If this is the case, does anyone know if support for tiled resources is planned for future updates of the toolkit? Or are there any potential workarounds for bypassing this limitation?
1
0
695
Dec ’24