Delve into the world of graphics and game development. Discuss creating stunning visuals, optimizing game mechanics, and share resources for game developers.

All subtopics
Posts under Graphics & Games topic

Post

Replies

Boosts

Views

Activity

MTLCaptureManager.sharedCaptureManager generates corrupted .gputrace files (0KB, invalid internal structure)
Hello, I am experiencing an issue with programmatically capturing a GPU trace using MTLCaptureManager. The .gputrace file that is generated appears to be corrupted, and I'm looking for guidance or a solution. Description of the Problem: I am using MTLCaptureManager.sharedCaptureManager to capture a Metal frame and save it to disk. The generated .gputrace file is consistently reported as 0 bytes in size by the file system. Crucially, when I compress this 0-byte .gputrace file into a .zip archive, the resulting archive contains the full, expected data. After unzipping, the file can be opened and viewed correctly in Xcode. However,When inspecting the file's contents using NSFileManager in Objective-C (treating it as a directory), the internal structure is different from a .gputrace file captured directly from Xcode's Metal Debugger. capture in xcode capture in file Finally,When capturing multiple frames programmatically, the first captured frame contains valid buffer data. However, for subsequent frames (starting from the second frame), the corresponding buffer contents are all zero-filled. Frame 1: All MTLBuffer data is correctly captured and populated. Frame 2 and onward: The same MTLBuffer objects are present in the trace, but their contents are entirely 0 (i.e., the data is not captured or is corrupted). In this case, the on-screen display is normal, but the captured frame is incorrect. The frame captured directly in Xcode is also correct. Only the frame captured to a file is abnormal.
1
0
520
Aug ’25
Metal triangle strips uniform opacity.
I have this drawing app that I have been working on for the past few years when I have free time. I recently rebuilt the app in Metal to build out other brushes and improve performance, need to render 10000s of lines in realtime. I’m running into this issue trying to create a uniform opacity per path. I have a solution but do not love it - as this is a realtime app and the solution could have some bottlenecks. If I just generate a triangle strip from touch points and do my best to smooth, resample, and handle miters I will always get some overlaps. See: To create a uniform opacity I render to an offscreen texture with blending disabled. I then pre-multiply the color and draw that texture to a composite texture with blending on (I do this per path). This works but gets tricky when you introduce a textured brush, the edges of the texture in the frag shader cut out the line. Pasted Graphic 1.png Solution: I discard below a threshold fragment float4 fragment_line(VertexOut in [[stage_in]], texture2d<float> texture [[ texture(0) ]]) { constexpr sampler s(coord::normalized, address::mirrored_repeat, filter::linear); float2 texCoord = in.texCoord; float4 texColor = texture.sample(s, texCoord); if (texColor.a < 0.01) discard_fragment(); // may be slow (from what I read) return in.color * texColor; } Better but still not perfect. Question: I'm looking for better ways to create a uniform opacity per path. I tried .max blending but that will cause no blending of other paths. Any tips, ideas, much appreciated. If this is too detailed of a question just achieve.
1
0
106
Mar ’25
Metal HUD Logging issue
Hi I've noticed one issue in Metal HUD, but I'm not sure if it is a bug in the Metal HUD or if there is a purpose for this behavior. Metal HUD has an option to send the data to system log in raw format where the numbers are like metal-HUD: ,,,,,..., https://developer.apple.com/documentation/xcode/monitoring-your-metal-apps-graphics-performance/ If the HUD is displayed, it works just fine, but it seems that when the HUD is hidden (with shift-F9), it still send the data to system log, but the numbers are the same all the time and are not updated while is still being updated. I would expect that it should log the data no matter if the HUD is displayed or not, this of course leads to incorrect FPS calculations Here is an example of the system log entries when the HUD is not visible:
0
0
110
May ’25
Error: "CoreImage Metal library does not contain function"
Hey I'm using the CIDepthBlurEffect Core Image Filter in my app. It seems to work ok but I get these errors in the console when calling the class. CoreImage Metal library does not contain function for name: sparserendering_xhlrb_scan CoreImage Metal library does not contain function for name: sparserendering_xhlrb_diffuse CoreImage Metal library does not contain function for name: sparserendering_xhlrb_copy_back CoreImage Metal library does not contain function for name: plain_or_sRGB_copy Am I missing some sort of import to gain these Metal functions? I am using my own custom shaders but I assume you'd be able to use them along side the built in ones.
0
0
516
Dec ’25
Threadgroup configuration for tile shading
Hello! I have a question about how thread groups work with tile shading. When running "traditional" compute, I get to choose both thread group size and the grid size. However, when using tile shading kernel I only have dispatchThreadsPerTile method - this controls how many threads will be ran in each tile. So far so good, but what about thread groups? The examples in video "Tile Shading on A11" seem to suggest that there will be only one thread group per tile. In the video, [[thread_index_in_threadgroup]] is called "local_id" and it is used to access the image block. I assume this is the default configuration. So when one does the following: Creates MTLRenderPassDescriptor with tileWidth set to W and tileHeight set to H Fires up the tile shading kernel using dispatchThreadsPerTile with MTLSize size = { W, H, 1 } I understand that the result is 1-to-1 mapping between the tile "pixels" and kernel threads. Now, what I would like to do is to have more than one thread group there. I want this for performance reasons: I have a certain compute kernel which I know executes very well with small thread group size. In fact, { 32, 1, 1 } seems to be the fastest. My understanding is that even if I set tile size to 16x16, and so I am executing 256 threads there, there will only be one SIMD group active in a thread group. Meaning that this SIMD group has to execute 8 times over the tile. Is it possible somehow? Or perhaps the limitations of the API are pointing at the limitations of hardware itself, and if I want to execute with SIMD group sized thread groups I have to use "traditional" compute encoder? Will be grateful for help. Michał
0
0
76
Mar ’25
How can I assign priorities to my app’s GPU workloads?
My app has a number of heterogeneous GPU workloads that all run concurrently. Some of these should be executed with the highest priority because the app’s responsiveness depends on them, while others are triggered by file imports and the like which should have a low priority. If this was running on the CPU I’d assign the former User Interactive QoS and the latter Utility QoS. Is there an equivalent to this for GPU work?
1
0
907
Jan ’26
iOS Matchmaker ViewController Info Button
I'm updating an existing distributed game to add turn-based matches. When the Matchmaker ViewController Info Button next to a game is pressed, the results vary: iOS 15.x - Button under avatar says "Accept Invite" or "View Game" (depending on if invite has already been accepted) iOS 18.x - Button always says "App Store" - I assume that means it would lead one to the App store to install the game. Both devices (iPad 15.x and iPhone 18.x) have the same version of the game installed. The results are the same when running in the simulator. When the game is released, I assume this button will work properly, no?
0
0
385
Dec ’25
screencapturekit
I have code that captures a window and displays a cropped image. The problem is 2 fold. Kit doesn't seem to allow to modify stop and recapture image in window mode to capture a portion of the screen. So this makes me having to crop and display the cropped image via a published variable. This all works find. But seems to stop after some time. Using an M1 16gig ram. program is taking less than 100meg of mem with 40-70%cpu as the crow flies. printing captured success in debug mode and sometimes frame isn't valid so guarding against it. any ideas on how to improve my strategy?
0
0
136
Jun ’25
Loosing display when zooming UIView on Mac (Designed for IPad) while IPad version works fine with same zoom level
I have a UIView that displays lines, and I zoom in (scale by 2 on the scroll view zoomScale variable containing the UIView). As I zoom in, on the Mac version (Designed for IPad) I loose the graphic after a certain number of zooms (the scrollView maximumZoomScale is set at 10). To ensure that lines are correctly represented, I modify the contentScaleFactor variable on the UIView; otherwise, the line's display is pixelated. On the IPad (simulator and real) I do not loose the graphic when zooming. So the Mac port of the UIView drawing is not working as the IPad version. Everything else of the application works fine except this important details. I already submitted a feedback request (#FB16829106) with the images showing the problem. I need a solution to this problem. Thanks.
1
0
106
Mar ’25
Metal texture allocated size versus actual image data size
Hello. In the iOS app i'm working on we are very tight on memory budget and I was looking at ways to reduce our texture memory usage. However I noticed that comparing ASTC8x8 to ASTC12x12, there is no actual difference in allocated memory for most of our textures despite ASTC12x12 having less than half the bpp of 8x8. The difference between the two only becomes apparent for textures 1024x1024 and larger, and even in that case the actual texture data is sometimes only 60% of the allocation size. I understand there must be some alignment and padding going on, but this seems extreme. For an example scene in my app with astc12x12 for most textures there is over a 100mb difference in astc size on disk versus when loaded, so I would love to be able to recover even a portion of that memory. Here is some test code with some measurements i've taken using an iphone 11: for(int i = 0; i &lt; 11; i++) { MTLTextureDescriptor *texDesc = [[MTLTextureDescriptor alloc] init]; texDesc.pixelFormat = MTLPixelFormatASTC_12x12_LDR; int dim = 12; int n = 2 &lt;&lt; i; int mips = i+1; texDesc.width = n; texDesc.height = n; texDesc.mipmapLevelCount = mips; texDesc.resourceOptions = MTLResourceStorageModeShared; texDesc.usage = MTLTextureUsageShaderRead; // Calculate the equivalent astc texture size int blocks = 0; if(mips == 1) { blocks = n/dim + (n%dim&gt;0? 1 : 0); blocks *= blocks; } else { for(int j = 0; j &lt; mips; j++) { int a = 2 &lt;&lt; j; int cur = a/dim + (a%dim&gt;0? 1 : 0); blocks += cur*cur; } } auto tex = [objCObj newTextureWithDescriptor:texDesc]; printf("%dx%d, mips %d, Astc: %d, Metal: %d\n", n, n, mips, blocks*16, (int)tex.allocatedSize); } MTLPixelFormatASTC_12x12_LDR 128x128, mips 7, Astc: 2768, Metal: 6016 256x256, mips 8, Astc: 10512, Metal: 32768 512x512, mips 9, Astc: 40096, Metal: 98304 1024x1024, mips 10, Astc: 158432, Metal: 262144 128x128, mips 1, Astc: 1936, Metal: 4096 256x256, mips 1, Astc: 7744, Metal: 16384 512x512, mips 1, Astc: 29584, Metal: 65536 1024x1024, mips 1, Astc: 118336, Metal: 147456 MTLPixelFormatASTC_8x8_LDR 128x128, mips 7, Astc: 5488, Metal: 6016 256x256, mips 8, Astc: 21872, Metal: 32768 512x512, mips 9, Astc: 87408, Metal: 98304 1024x1024, mips 10, Astc: 349552, Metal: 360448 128x128, mips 1, Astc: 4096, Metal: 4096 256x256, mips 1, Astc: 16384, Metal: 16384 512x512, mips 1, Astc: 65536, Metal: 65536 1024x1024, mips 1, Astc: 262144, Metal: 262144 I also tried using MTLHeaps (placement and automatic) hoping they might be better, but saw nearly the same numbers. Is there any way to have metal allocate these textures in a more compact way to save on memory?
8
0
2.8k
Mar ’25
How does Game Kit offline leaderboard submission work?
I do not understand how offline leaderboard submission is supposed to work in Game Kit: While the documentation briefly states that offline submission is supported, how is that even possible when you first have to fetch a leaderboard object in order to then call its submitScore function? How can I get the leaderboard object in the first place when offline? Can anyone enlighten me how this works? Or maybe point me to some relevant documentation?
0
0
364
Dec ’25
RealityKit - How to change camera target in response of a touch event?
Hello, I’m porting my UIKit/SceneKit app to SwiftUI/RealityKit and I’m wondering how to change the camera target programmatically. I created a simple scene in Reality Composer Pro with two spheres. My goal is straightforward: when the user taps a sphere, the camera should look at it as the main target. Following Apple’s videos, I implemented the .gesture modifier and it is printing the tapped sphere correctly, but updating my targetEntity state doesn’t change anything, so the camera won't update its target. Is there a way to access the scene content at that level? Or what else should I do? Here’s my current code implementation: Thanks!
1
0
275
Sep ’25
Metal useResource vs. MTLFence
Hello, I'm tracking down a bug where useResource doesn't seem to apply proper synchronization when a resource is produced by the render pass then consumed by the compute pass, but when I use MTLFence between the to signal and wait between the render/compute encoders, the artifact goes away. The resource is created with MTLHazardTrackingModeTracked and useResource is called on the compute encoder after the render pass. Metal API Validation doesn't report any warnings/errors. Am I misunderstanding the difference between the two APIs? I dug through the Metal documentation and it looks like useResource should handle synchronization given the resource has MTLHazardTrackingModeTracked but on the other hand, MTLFence should be used to ensure proper synchronization between command encoders. Can someone can clarify the difference between the two APIs and when to use them.
3
0
158
Jul ’25
Metal 4 Argument Tables
I am puzzled by the setAddress(_:attributeStride:index:) of MTL4ArgumentTable. Can anyone please explain what the attributeStride parameter is for? The doc says that it is "The stride between attributes in the buffer." but why? Who uses this for what? On the C++ side in the shaders the stride is determined by the C++ type, as far as I know. What am I missing here? Thanks!
1
0
1k
Jan ’26
Swift game sometimes runs on efficiency cores then snaps back to performance cores
I've been working on Swift game which is not yet launched or available for preview. The game works in such a way that it has idle CPU while the user is thinking and sustained max CPU and GPU on as many cores as possible when he makes a move. Rarely, due to OS activity or something else outside of my control (for example when dropping the OS curtain even if for just a bit then remove it), the game or some of its threads are moved to efficiency cores which results in major stuttering which persists precisely until the game is idle again at which point the game is moved back on performance cores - but if the player keeps making moves the stuttering simply won't go away and so I guess compuptation is locked onto efficiency cores. The issue does not reproduce on MacCatalyst on Intel. How do I tell Swift to avoid efficiency cores? BTW Swift and SceneKIT have AMAZING performance especially when compared to others.
0
0
194
Mar ’25
Why does game mode not get triggered for my App?
I think I really have tried everything and I did all according to official documentation to support game mode on iOS or iPadOS but it doesn't matter what I do it just doesn't get triggered. Funny enough it works during development when I install it via Xcode but as soon as it is live on the store and when I install it from there game mode doesn't get triggered anymore. What I have atm I have added (even though it is deprecated) <key>GCSupportsGameMode</key> <true/> I have set the (but it seems only supported for macOS) <key>LSApplicationCategoryType</key> <string>public.app-category.games</string> I have added <key>LSSupportsGameMode</key> <true/> It just doesn't work. Is there anything else what needs to be done? Should the flag LSSupportsGameMode not be enough normally? The reason why this is so annoying is that my app is a real time streaming app and I want to profit from minimised background activities for smoother gameplay and more consistent frame rates like mentioned in the documentation.
1
0
606
Nov ’25
Regression: RealityKit spatial audio crackles and pops on iOS 26.0 beta 5 (FB19423059)
RealityKit spatial audio crackles and pops on iOS 26.0 beta 5. It works correctly on iOS 18.6 and visionOS 26.0 beta 5. The APIs used are AudioPlaybackController, Entity.prepareAudio, Entity.play Videos of the expected and observed behavior are attached to the feedback FB19423059. The audio should be a consistent, repeating sound, but it seems oddly abbreviated and the volume varies unexpectedly. Thank you for investigating this issue.
0
0
284
Aug ’25
Unable to find intelgpu_kbl_gt2r0 slice or a compatible one in binary archive
Unable to find intelgpu_kbl_gt2r0 slice or a compatible one in binary archive 'file:///System/Library/PrivateFrameworks/IconRendering.framework/Resources/binary.metallib' available slices: applegpu_g13g, applegpu_g13s, applegpu_g13d, applegpu_g14g, applegpu_g14s, applegpu_g14d, applegpu_g15g, applegpu_g15s, applegpu_g15d, applegpu_g16g, applegpu_g16s, applegpu_g17g, applegpu_g15g, applegpu_g15s, applegpu_g15d, applegpu_g16s Is it related to performance of applications in macOS 26.2 on Intel Macs?
3
0
202
5d