Render advanced 3D graphics and perform data-parallel computations using graphics processors using Metal.

Metal Documentation

Posts under Metal tag

278 Posts
Sort by:
Post not yet marked as solved
0 Replies
10 Views
[UE] Assertion failed: Binding.index < ML_MaxTextures [File:./Runtime/Apple/MetalRHI/Private/MetalPipeline.cpp] [Line: 563] Metal texture index exceeded! How can I solve it?
Posted
by jaycecai.
Last updated
.
Post not yet marked as solved
1 Replies
434 Views
Hi there, I have some existing metal rendering / shader views that I would like to use to present stereoscopic content on the Vision Pro. Is there a metal shader function / variable that lets me know which eye we're currently rendering to inside my shader? Something like Unity's unity_StereoEyeIndex? I know RealityKit has GeometrySwitchCameraIndex, so I want something similar (but outside of a RealityKit context). Many thanks, Rich
Posted
by RichLogan.
Last updated
.
Post not yet marked as solved
5 Replies
1.2k Views
What is the most efficient way to use a MTLTexture (created procedurally at run-time) as a RealityKit TextureResource? I update the MTLTexture per-frame using regular Metal rendering, so it’s not something I can do offline. Is there a way to wrap it without doing a copy? A specific example would be great. Thank you!
Posted Last updated
.
Post not yet marked as solved
1 Replies
51 Views
I'm implementing a bitonic sort in Metal with a Swift app. This requires 100's kernel dispatch calls for each of the swap stages which touch the whole array, the work required by the GPU is small. I haven't been able to get this to run fast enough in Swift and it seems its due to a high overhead for each dispatchThread command. I rewrote the test program in Objective C with a super-simple kernel function and its runs 25x faster from Objective C! Kernel function kernel void fill(device uint8_t *array [[buffer(0)]], const device uint32_t &N [[buffer(1)]], const device uint8_t &value [[buffer(2)]], uint i [[thread_position_in_grid]]) { if (i < N) { array[i] = value; } } The Swift code is: func fill(pso:MTLComputePipelineState, buffer:MTLBuffer, N: Int, passes: Int) { guard let commandBuffer = commandQueue.makeCommandBuffer() else { return } let gridSize = MTLSizeMake(N, 1, 1) var threadGroupSize = pso.maxTotalThreadsPerThreadgroup if (threadGroupSize > N) { threadGroupSize = N; } let threadgroupSize = MTLSizeMake(threadGroupSize, 1, 1); for pass in 0..<passes { guard let computeEncoder = commandBuffer.makeComputeCommandEncoder() else { return } var value:UInt8 = UInt8(pass); var NN:UInt32 = UInt32(N); computeEncoder.setComputePipelineState(pso) computeEncoder.setBuffer(buffer, offset: 0, index: 0) computeEncoder.setBytes(&NN, length: MemoryLayout<UInt32>.size, index: 1) computeEncoder.setBytes(&value, length: MemoryLayout<UInt8>.size, index: 2) computeEncoder.dispatchThreadgroups(gridSize, threadsPerThreadgroup: threadgroupSize) computeEncoder.endEncoding() } commandBuffer.commit() commandBuffer.waitUntilCompleted() } let device = MTLCreateSystemDefaultDevice()! let library = device.makeDefaultLibrary()! let commandQueue = device.makeCommandQueue()! let funcFill = library.makeFunction(name: "fill")! let pso = try? device.makeComputePipelineState(function: funcFill) var N = 16384 let passes = 100 let buffer = device.makeBuffer(length:N, options: [.storageModePrivate])! for _ in 1...10 { let startTime = DispatchTime.now() fill(pso:pso!, buffer:buffer, N:N, passes:passes) let endTime = DispatchTime.now() let elapsedTime = endTime.uptimeNanoseconds - startTime.uptimeNanoseconds print("Elapsed time:", Float(elapsedTime)/1_000_000, "ms"); } and the Objective C code (which should be almost identical) is void fill(id<MTLCommandQueue> commandQueue, id<MTLComputePipelineState> funcPSO, id<MTLBuffer> A, uint32_t N, int passes) { id<MTLCommandBuffer> commandBuffer = [commandQueue commandBuffer]; MTLSize gridSize = MTLSizeMake(N, 1, 1); NSUInteger threadGroupSize = funcPSO.maxTotalThreadsPerThreadgroup; if (threadGroupSize > N) { threadGroupSize = N; } MTLSize threadgroupSize = MTLSizeMake(threadGroupSize, 1, 1); for(uint8_t pass=0; pass<passes; pass++) { id<MTLComputeCommandEncoder> computeEncoder = [commandBuffer computeCommandEncoder]; [computeEncoder setComputePipelineState:funcPSO]; [computeEncoder setBuffer:A offset:0 atIndex:0]; [computeEncoder setBytes:&N length:sizeof(uint32_t) atIndex:1]; [computeEncoder setBytes:&pass length:sizeof(uint8_t) atIndex:2]; [computeEncoder dispatchThreads:gridSize threadsPerThreadgroup:threadgroupSize]; [computeEncoder endEncoding]; } [commandBuffer commit]; [commandBuffer waitUntilCompleted]; } int main() { NSError *error; id<MTLDevice> device = MTLCreateSystemDefaultDevice(); id<MTLLibrary> library = [device newDefaultLibrary]; id<MTLCommandQueue> commandQueue = [device newCommandQueue]; id<MTLFunction> funcFill = [library newFunctionWithName:@"fill"]; id<MTLComputePipelineState> pso = [device newComputePipelineStateWithFunction:funcFill error:&error]; // Prepare data int N = 16384; int passes = 100; id<MTLBuffer> bufferA = [device newBufferWithLength:N options:MTLResourceStorageModePrivate]; for(int it=1; it<=10; it++) { CFTimeInterval startTime = CFAbsoluteTimeGetCurrent(); fill(commandQueue, pso, bufferA, N, passes); CFTimeInterval duration = CFAbsoluteTimeGetCurrent() - startTime; NSLog(@"Elapsed time: %.1f ms", 1000*duration); } } The Swift output is: Elapsed time: 89.35556 ms Elapsed time: 63.243744 ms Elapsed time: 62.39568 ms Elapsed time: 62.183224 ms Elapsed time: 63.741913 ms Elapsed time: 63.59463 ms Elapsed time: 62.378654 ms Elapsed time: 61.746098 ms Elapsed time: 61.530384 ms Elapsed time: 60.88774 ms The objective C output is 2024-04-18 19:27:45.704 compute_test[3489:92754] Elapsed time: 3.6 ms 2024-04-18 19:27:45.706 compute_test[3489:92754] Elapsed time: 2.6 ms 2024-04-18 19:27:45.709 compute_test[3489:92754] Elapsed time: 2.6 ms 2024-04-18 19:27:45.712 compute_test[3489:92754] Elapsed time: 2.6 ms 2024-04-18 19:27:45.714 compute_test[3489:92754] Elapsed time: 2.7 ms 2024-04-18 19:27:45.717 compute_test[3489:92754] Elapsed time: 2.8 ms 2024-04-18 19:27:45.720 compute_test[3489:92754] Elapsed time: 2.8 ms 2024-04-18 19:27:45.723 compute_test[3489:92754] Elapsed time: 2.7 ms 2024-04-18 19:27:45.726 compute_test[3489:92754] Elapsed time: 2.5 ms 2024-04-18 19:27:45.728 compute_test[3489:92754] Elapsed time: 2.5 ms I compile the Swift code for Release, optimised for speed. I can't believe there should be a difference here, so what could be different, and what might I be doing wrong? thanks Adrian
Posted Last updated
.
Post not yet marked as solved
1 Replies
219 Views
The Drawing fully immersive content using Metal guide describes how to use Metal for visionOS immersive experiences, but seemingly requires swift to bring up the required CompositorLayer. @main struct MyApp: App { var body: some Scene { ImmersiveSpace(id: "MyContent") { CompositorLayer { layerRenderer in let renderThread = Thread { let engine = myEngineCreate(layerRenderer) myEngineRenderLoop(engine) } renderThread.name = "Render Thread" renderThread.start() } } } The ImmersiveSpace scene can presumably be replaced with a call to [UIApplication.sharedApplication activateSceneSessionForRequest:[UISceneSessionActivationRequest requestWithRole:UISceneSessionRoleImmersiveSpaceApplication] errorHandler:nil] But is there a replacement for CompositorLayer? Or some other way to produce a cp_layer_renderer? Perhaps it would be possible to write a small swift helper for this, but given the swift interface for CompositorLayer how would that be tied to an existing UIScene as created above? @available(visionOS 1.0, *) public struct CompositorLayer : SwiftUI.ImmersiveSpaceContent { public init(configuration: any _CompositorServices_SwiftUI.CompositorLayerConfiguration = .default, renderer: @escaping (CompositorServices.LayerRenderer) -> Swift.Void) public var body: Swift.Never { get } public typealias Body = Swift.Never }
Posted
by torarnv.
Last updated
.
Post not yet marked as solved
1 Replies
116 Views
as the title said, I want to sample the current pixel in the layer and if the color is the same as one in the parameter, I need to find a way to return that position back to SwiftUI. Do you think that's possible? in Shader.Argument I didn't find a way to directly convert to a pointer.
Posted Last updated
.
Post not yet marked as solved
1 Replies
83 Views
Is it possible to set the header include paths when compiling a metal library from source at runtime? I'm dynamically generating kernel source that I'm compiling at runtime. I want to include some functions I have defined in a header. If I was doing online computation, I can specify include paths using -I /path/to/include using the command. xcrun metal -c -I /path/to/include example.metal -o example.air However when I'm online, it doesn't appear possible to define a header search path. MTLCompileOptions *options = [MTLCompileOptions new]; options.fastMathEnabled = NO; library = [device newLibraryWithSource:kernel_source options:options error:&error]; Is there a way to specify a header search path using newLibraryWithSource?
Posted
by Cianciosa.
Last updated
.
Post not yet marked as solved
0 Replies
76 Views
Where do I start with this error? I am using the Metal Debugger and have.a bunch of stuck command buffers. how do I look at the command buffers to see the errors? My suspicion is that the cause is some sort of memory leak. Not having access to the source for Metal leaves me stuck. The following message shows up in the logging pane of the execution. Execution of the command buffer was aborted due to an error during execution. Ignored (for causing prior/excessive GPU errors) (00000004:kIOGPUCommandBufferCallbackErrorSubmissionsIgnored) Type: Error | Timestamp: 2024-04-11 14:16:13.464336-05:00 | Library: Metal | Subsystem: Metal | Category: Default | TID: 0x2a0b8c I just need some guidance
Posted
by mfstanton.
Last updated
.
Post not yet marked as solved
1 Replies
156 Views
I'm brand new to Metal. I've googled, but can't get the right answer to come up. (Thanks, unhelpful ChatGPT generated answers polluting everything, but I digress...) Ultimately, I'm trying to figure out how to use Metal to render 3D DICOM data on iOS specifically. If you're not familiar with DICOM, let's just say I've got a whole stack of CT image slices. Or to get really simple, I've got a cube of voxel values with differing values at each voxel coordinate. Where do I even start in Metal to render something like this? (I was trying to get the VTK toolkit compiled for iOS, which uses OpenGL, but that appears to be a dead end. And besides, Metal is supposed to be so much better.) Thanks for any tips/leads/suggestions/general pointers.
Posted Last updated
.
Post not yet marked as solved
4 Replies
687 Views
Hi there - Where would a dev go these days to get an initial understanding of SceneKit? The WWDC videos linked in various places seem to be gone?! For example, the SceneKit page at developer.apple.com lists features a session videos link that comes up without any result, https://developer.apple.com/scenekit/ Any advice..? Cheers, Jan
Posted
by JayMcBee.
Last updated
.
Post not yet marked as solved
0 Replies
120 Views
Hello all We would like to use AMD's FidelityFx Downsampler in our custom game engine and we are having difficulties to correctly implement it for Metal due to its use of the globallycoherent keyword. We have done extensive search online but have not succeeded in finding an answer. What we have found is the largely undocumented 'volatile' keyword, so we were hypothesising that marking a texture with 'volatile' (which implies 'device volatile' since it's a texture) could have the same effect but we are far from convinced it would work. Does anyone have insights into this?
Posted Last updated
.
Post not yet marked as solved
1 Replies
117 Views
I have a fbo generated by last frame, which contain a color texture and a depth texture. Then i want blit that fbo to current MTKView, but failed. Thanks a lot for any suggestions. Here is the code: last frame: id <MTLTexture> src_color; id <MTLTexture> src_depth; drawTo(src_color, src_depth); current frame: first, init a depth texture if null: id<MTLTexture> depth_texture_; if(depth_texture_) { depth_texture_ = [device_ newTextureWithDescriptor:desc]; } second, create desc for encoder: current_desc = metal_view_.currentRenderPassDescriptor; current_desc.depthAttachment.texture = depth_texture_; current_desc.depthAttachment.loadAction = MTLLoadActionClear; current_desc.depthAttachment.clearDepth = 1.0; current_desc.stencilAttachment.texture = depth_texture_; current_desc.stencilAttachment.loadAction = MTLLoadActionClear; current_desc.stencilAttachment.clearStencil = 0; current_desc.colorAttachments[0].loadAction = MTLLoadActionClear; current_desc.colorAttachments[0].clearColor = MTLClearColorMake(1.0, 1, 1, 1.0); third, blit to current MTKView: auto dst_color = current_desc.colorAttachments[0].texture; auto dst_depth = current_desc.depthAttachment.texture; id<MTLCommandBuffer> commandBuffer = [command_queue_ commandBuffer]; commandBuffer.label = @"blit"; id <MTLBlitCommandEncoder> blitEncoder = [commandBuffer blitCommandEncoder]; [blitEncoder copyFromTexture:src_color sourceSlice:0 sourceLevel:0 sourceOrigin:MTLOriginMake(srcRect.x, srcRect.y, 0) sourceSize:MTLSizeMake(srcRect.width, srcRect.height, 1) toTexture:dst_color destinationSlice:0 destinationLevel:0 destinationOrigin:MTLOriginMake(dstRect.x, dstRect.y, 0)]; [blitEncoder copyFromTexture:src_depth sourceSlice:0 sourceLevel:0 sourceOrigin:MTLOriginMake(srcRect.x, srcRect.y, 0) sourceSize:MTLSizeMake(srcRect.width, srcRect.height, 1) toTexture:dst_depth destinationSlice:0 destinationLevel:0 destinationOrigin:MTLOriginMake(dstRect.x, dstRect.y, 0)]; [blitEncoder endEncoding]; [commandBuffer commit]; [commandBuffer waitUntilCompleted];
Posted Last updated
.
Post not yet marked as solved
0 Replies
125 Views
I'm trying to follow the metal-cpp tutorials I've found at https://developer.apple.com/metal/sample-code/?q=learn The program seems to be launching correctly (I can see the menu bar and interact with it), but nothing is rendered inside the window. I suppose the culprit is somewhere in the following function (I see it binds the device, the view and the window with the object in charge of drawing stuff in the view) void core::Application::applicationDidFinishLaunching(NS::Notification *pNotification) { CGRect frame = (CGRect){{100.0, 100.0}, {512.0, 512.0}}; m_Window->init(frame, NS::WindowStyleMaskClosable | NS::WindowStyleMaskTitled, NS::BackingStoreBuffered, false); m_Device = MTL::CreateSystemDefaultDevice(); m_View = MTK::View::alloc()->init(frame, m_Device); m_View->setColorPixelFormat(MTL::PixelFormat::PixelFormatBGRA8Unorm); m_View->setClearColor(MTL::ClearColor::Make(1.0, 0.0, 0.0, 1.0)); m_ViewDelegate = new graphics::ViewDelegate(m_Device); m_View->setDelegate(m_ViewDelegate); m_Window->setContentView(m_View); m_Window->setTitle(NS::String::string("Template 1", NS::StringEncoding::UTF8StringEncoding)); m_Window->makeKeyAndOrderFront(nullptr); NS::Application* nsApp = reinterpret_cast<NS::Application*>(pNotification->object()); nsApp->activateIgnoringOtherApps(true); } but, as you can infer from the fact that I'm failing at the very first tutorial of the bunch, I'm quite lost. I've tried debugging the app with the Xcode debugger and I saw that it never enters in this function. void ViewDelegate::drawInMTKView(MTK::View *pView) { m_Renderer->Draw(pView); } Can it be a symptom of some call missing from my code? Thank you in advance for your help
Posted
by p_Each.
Last updated
.
Post not yet marked as solved
0 Replies
164 Views
Hi, I have a CUDA program that I want to convert to Metal Compute so that we can support Apple hardware. When I wrote the CUDA version, I was able to write efficient code because I learned first about the Cuda-core architecture. The way the cores can access memory for instance is very important information so that I could write code that efficiently access the memory. Now I want to do the same for the Metal Compute software. But I can not find any information about the low level architecture and especially the things you should know to be able to write efficient code. Do I miss something? Is there some guide giving hints for the most efficient way to access memory for instance?
Posted
by simmania.
Last updated
.
Post not yet marked as solved
4 Replies
336 Views
I am working on an application where we are planning to use Metal for directly rendering custom content. When user looks at something on the rendered image, I want to get the position or ray of cursor (the point where the user is currently looking at) to render something else like a crosshair. Is it possible to get the cursor position information on VisionOS to accomplish this? How can I know if something is being hovered on by the eyes?
Posted Last updated
.
Post not yet marked as solved
1 Replies
241 Views
Platform 'METAL' is experimental and not all JAX functionality may be correctly supported! 2024-03-23 22:04:38.947506: W pjrt_plugin/src/mps_client.cc:563] WARNING: JAX Apple GPU support is experimental and not all JAX functionality is correctly supported! Metal device set to: Apple M1 Pro systemMemory: 16.00 GB maxCacheSize: 5.33 GB loc("-":0:0): error: current mps dialect version is 1.0.0, can't parse version 1.1.0 /AppleInternal/Library/BuildRoots/495c257e-668e-11ee-93ce-926038f30c31/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphExecutable.mm:1097: failed assertion `Error importing MLIR bytecode. ' zsh: abort python -c 'import jax; print(jax.numpy.arange(10))'
Posted Last updated
.
Post marked as solved
1 Replies
194 Views
I tried to render to two layers using vertex amplification in my mesh shader program, but in the vision pro only the left eye has content and it contains 2 eyes image, and when I switch the mapping0.renderTargetArrayIndexOffset in the encoder, it does not transfer the image to the right eye. Using vertex amplification achieve 2 eyes rendering?
Posted
by Suan.
Last updated
.
Post marked as solved
1 Replies
201 Views
I've been trying to find a C/C++ framework for apps on macos. I couldn't find good docs on metal. Is there a way to write C++ apps without any other library?
Posted Last updated
.
Post not yet marked as solved
0 Replies
168 Views
in Swift languange, CVMetalTextureCacheCreateTextureFromImage return CVMetalTexture, and CVMetalTexture is Swift class, so. it doesn't need to call CVBufferRelease manually. My question is : should I use a variable to keep strong reference until GPU finished (until addCompleteHandler callback ) ? cvmetaltexturecachecreatetexture
Posted
by ZoGo996.
Last updated
.
Post not yet marked as solved
0 Replies
177 Views
Currently (jax-metal 0.0.5) doesn't support matrix decompositions like Cholesky, LU, or Eigen, on Metal. I have a 64GB M1 Max on Sonoma 14.2.1 (23C71). Eigen raises a NotImplemented error, but Cholesky and jax.linalg.inv error out. When using the CPU by setting export JAX_PLATFORM_NAME=CPU then the decompositions work. The error I am getting: XlaRuntimeError Traceback (most recent call last) Cell In[33], line 1 ----> 1 jsl.cho_factor(Sigma_0_inv) File ~/miniconda3/envs/jaxmetal/lib/python3.10/site-packages/jax/_src/scipy/linalg.py:61, in cho_factor(failed resolving arguments) 56 @_wraps(scipy.linalg.cho_factor, 57 lax_description=_no_overwrite_and_chkfinite_doc, skip_params=('overwrite_a', 'check_finite')) 58 def cho_factor(a: ArrayLike, lower: bool = False, overwrite_a: bool = False, 59 check_finite: bool = True) -> tuple[Array, bool]: 60 del overwrite_a, check_finite # Unused ---> 61 return (cholesky(a, lower=lower), lower) File ~/miniconda3/envs/jaxmetal/lib/python3.10/site-packages/jax/_src/scipy/linalg.py:54, in cholesky(failed resolving arguments) 49 @_wraps(scipy.linalg.cholesky, 50 lax_description=_no_overwrite_and_chkfinite_doc, skip_params=('overwrite_a', 'check_finite')) 51 def cholesky(a: ArrayLike, lower: bool = False, overwrite_a: bool = False, 52 check_finite: bool = True) -> Array: 53 del overwrite_a, check_finite # Unused ---> 54 return _cholesky(a, lower) File ~/miniconda3/envs/jaxmetal/lib/python3.10/site-packages/jax/_src/compiler.py:255, in backend_compile(backend, module, options, host_callbacks) 250 return backend.compile(built_c, compile_options=options, 251 host_callbacks=host_callbacks) 252 # Some backends don't have host_callbacks option yet 253 # TODO(sharadmv): remove this fallback when all backends allow compile 254 # to take in host_callbacks --> 255 return backend.compile(built_c, compile_options=options) XlaRuntimeError: UNKNOWN: /var/folders/9d/0035yr7j3bx84h3ghpp_86pc0000gn/T/ipykernel_40684/3046650730.py:1:0: error: failed to legalize operation 'mhlo.cholesky' /var/folders/9d/0035yr7j3bx84h3ghpp_86pc0000gn/T/ipykernel_40684/3046650730.py:1:0: note: see current operation: %5 = "mhlo.cholesky"(%4) {lower = true} : (tensor<200x200xf32>) -> tensor<200x200xf32>
Posted Last updated
.