Post not yet marked as solved
Hi PYNing, it seems likely your kernel is working as expected. Let me explain:
simdgroup_half8x8 is a 64-wide operation (8 x 8 = 64).
M1 uses 32 threads per SIMD-group; this can be determined via threadExecutionWidth as explained here.
Using 32 threads, each thread performs 2 of the 64 operations.
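As a rough sketch of how the execution width mentioned above can be queried at run time, the Swift snippet below builds a compute pipeline from a throwaway kernel and reads its threadExecutionWidth. The kernel name `noop` and the helper `queryExecutionWidth` are hypothetical and exist only for this illustration; the value printed depends on the GPU it runs on.

```swift
import Metal

// Hypothetical no-op kernel, used only so that a pipeline state can be created.
let noopSource = """
#include <metal_stdlib>
using namespace metal;
kernel void noop(device float *out [[buffer(0)]],
                 uint tid [[thread_position_in_grid]]) {
    out[tid] = 0.0f;
}
"""

/// Returns the pipeline's thread execution width, or nil if no Metal device
/// or kernel function is available.
func queryExecutionWidth() throws -> Int? {
    guard let device = MTLCreateSystemDefaultDevice() else { return nil }
    let library = try device.makeLibrary(source: noopSource, options: nil)
    guard let function = library.makeFunction(name: "noop") else { return nil }
    let pipeline = try device.makeComputePipelineState(function: function)
    return pipeline.threadExecutionWidth
}

do {
    if let width = try queryExecutionWidth() {
        print("threadExecutionWidth:", width)
        // An 8 x 8 matrix has 64 elements, shared across `width` threads.
        print("elements per thread for an 8x8 matrix:", 64 / width)
    } else {
        print("No Metal device available")
    }
} catch {
    print("Failed to build pipeline:", error)
}
```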
This if statement:
// test only 1 thread
if(thread_position_in_grid.z != 0 || thread_position_in_grid.y != 0 || thread_position_in_grid.x * N_Pack != 0) return;
is testing whether thread_position_in_grid.xyz != ushort3(0, 0, 0) and thus every thread except thread 0 is masked out (inactive after the if statement). If your input data contains 0 and 1 at indices 0 and 1, then thread 0 writes out exactly these 2 values and your output is as expected.
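To make the masking concrete, here is a small CPU-side sketch in Swift that evaluates the same condition for every position in a grid and lists the threads that survive it. The value of `nPack` and the grid dimensions are assumptions chosen for illustration, since the real N_Pack and dispatch size are not given in this thread.

```swift
// Assumed values for illustration only; not taken from the original kernel.
let nPack = 8
let gridSize = (x: 32, y: 2, z: 2)

/// Mirrors the kernel's early-return condition: true means the thread returns
/// immediately and does no further work.
func isMaskedOut(x: Int, y: Int, z: Int) -> Bool {
    return z != 0 || y != 0 || x * nPack != 0
}

// Collect every grid position that gets past the if statement.
var activeThreads: [(x: Int, y: Int, z: Int)] = []
for z in 0..<gridSize.z {
    for y in 0..<gridSize.y {
        for x in 0..<gridSize.x {
            if !isMaskedOut(x: x, y: y, z: z) {
                activeThreads.append((x: x, y: y, z: z))
            }
        }
    }
}
print("active threads:", activeThreads)

// 64 matrix elements shared by 32 threads gives 2 elements per thread.
let elementsPerMatrix = 8 * 8
let threadsPerSimdgroup = 32
let elementsPerThread = elementsPerMatrix / threadsPerSimdgroup
print("elements written by each active thread:", elementsPerThread)
```

For any non-zero `nPack`, the term `x * nPack != 0` reduces to `x != 0`, so only position (0, 0, 0) remains active and it accounts for just 2 of the 64 elements.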
You mentioned you already captured a GPU frame. I'd recommend having a look at Metal Shader Debugging and Profiling to learn how to debug shaders in Xcode.
Hi guarav2289, I would suggest taking a look at our GPU tooling in Xcode. This WWDC session explains how to use GPU Frame capture. It might give you insight into what is happening in a frame in your application.
If the problem still occurs and it seems to be a Metal issue, please file a Feedback Assistant ticket with:
The macOS and/or iOS version that was used for both development and running the application.
The Xcode version used for development.
If possible, a sample project that reproduces the issue for you.
If you file a ticket, please share the Feedback Assistant ID here.
Hi wasintw, MTKView provides drawableSize and preferredDrawableSize properties. Setting drawableSize on MTKView will trigger drawableSizeChanged to be called. Therefore, you would need to set autoResizeDrawable to false if you want to use a custom drawableSize.
When autoResizeDrawable is set to false on MTKView, preferredDrawableSize will still be updated. The preferredDrawableSize property can be observed using a Key-Value observer to determine whether the window was resized. We have a Swift article that explains how to set up a Key-Value observer.
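A minimal Swift sketch of the approach described above might look like the following. The class name `DrawableSizeObserver` and its `onChange` callback are hypothetical names introduced for this example, and the sketch assumes the view is created and used on the main thread.

```swift
import MetalKit

/// Hypothetical helper: disables automatic drawable resizing and reports
/// changes to preferredDrawableSize through a callback.
@MainActor
final class DrawableSizeObserver {
    // Keeping the token alive keeps the observation active.
    private var observation: NSKeyValueObservation?

    init(view: MTKView, onChange: @escaping @Sendable (CGSize) -> Void) {
        // With this set to false, drawableSize is fully under the app's control.
        view.autoResizeDrawable = false

        // preferredDrawableSize keeps updating, so it can signal window resizes.
        observation = view.observe(\.preferredDrawableSize, options: [.new]) { _, change in
            guard let newSize = change.newValue else { return }
            onChange(newSize)
        }
    }
}

// Usage: keep a reference to the observer for as long as updates are needed.
let view = MTKView(frame: .zero, device: MTLCreateSystemDefaultDevice())
let observer = DrawableSizeObserver(view: view) { size in
    print("preferredDrawableSize changed to", size)
    // A custom drawableSize could be computed from `size` and applied here.
}
```

The observation stops when the observer is deallocated, so it usually belongs on whatever object owns the view.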
Hi Aquiss, would you be able to provide a sample project that reproduces your issue? If so, can you file a Feedback Assistant Report with the following information:
The system configuration: Hardware and software versions of the device reproducing the issue.
The version of Xcode you are using.
A sample project reproducing the issue for you.
We will take a closer look at this on our end. Please share the Feedback Assistant report ID here.
Hi jcookie,
There is currently no way to hint the GPU to ramp up before executing work. Last year, however, we introduced the GPU Performance State inducer in our GPU tooling. The WWDC session "Discover Metal debugging, profiling, and asset creation tools" explains how to use these tools.
Hi iPerKard, I would recommend switching to using CAMetalLayer for your use case. MTKView wraps a CAMetalLayer but does not expose all advanced features. CAMetalLayer gives you more control and allows you to set the colorspace property on both macOS and iOS.
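As a small illustration of the colorspace property mentioned above, the Swift sketch below configures a standalone CAMetalLayer. The pixel format and the choice of Display P3 are assumptions made for the example, not a recommendation for any particular use case.

```swift
import QuartzCore
import Metal
import CoreGraphics

let metalLayer = CAMetalLayer()
metalLayer.device = MTLCreateSystemDefaultDevice()

// Assumed pixel format for this sketch; pick the one your renderer targets.
metalLayer.pixelFormat = .bgra8Unorm

// Assumed color space for this sketch: Display P3.
metalLayer.colorspace = CGColorSpace(name: CGColorSpace.displayP3)

print("layer colorspace:", metalLayer.colorspace as Any)
```

The layer can then back a view (or be added as a sublayer), and drawables are obtained with nextDrawable() as usual.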
Hi KevinRub, Unreal Engine is not an Apple product. I would recommend reaching out to the developers of Unreal Engine.
Hi, can you file a Feedback Assistant Report with the following information:
System config: Hardware and software versions of both your target device (you said iPhone 13 Pro, were there any other devices you tried that were having issues?) and the Mac you're using Xcode from.
Version of Xcode you are using. If you can, please try the latest Xcode version.
Please export the gputrace (in the Summary or from File > Export in the menubar, there is also an Export button in the Summary section of the GPU trace on the top right), zip it, and share it with us.
We will take a closer look at this on our end. Please share the Feedback Assistant report ID here.
Hi lemong, an MTLCommandBuffer has gpuStartTime and gpuEndTime properties that allow you to calculate the number of seconds it took to execute the command buffer.
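A minimal Swift sketch of that calculation is shown below. It commits an empty command buffer purely to demonstrate the timing properties; in a real application the encoding of actual work would go where the placeholder comment is.

```swift
import Metal

guard let device = MTLCreateSystemDefaultDevice(),
      let queue = device.makeCommandQueue(),
      let commandBuffer = queue.makeCommandBuffer() else {
    fatalError("Metal is not available on this system")
}

// The timing properties are valid once the command buffer has completed.
commandBuffer.addCompletedHandler { buffer in
    let seconds = buffer.gpuEndTime - buffer.gpuStartTime
    print("GPU execution time: \(seconds * 1000.0) ms")
}

// Placeholder: encode render, compute, or blit work here.

commandBuffer.commit()
commandBuffer.waitUntilCompleted()
```

Blocking with waitUntilCompleted() is only for the sake of a short example; in a frame loop the completion handler alone is usually enough.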