The title is self-exploratory. I wasn't able to find the CAMetalDisplayLink on the most recent metal-cpp release (metal-cpp_macOS15_iOS18-beta). Are there any plans to include it in the next release?
Metal
RSS for tagRender advanced 3D graphics and perform data-parallel computations using graphics processors using Metal.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
We have a production Metal app with a complex multithreaded Metal pipeline.
When everything is operating smoothly, it works great.
Even when extremely overloaded, it does not crash for days at a time.
This isn't good enough for our users.
Unfortunately, when I have zero visibility into id, I have no way of knowing when metal is "done" with an id.
When overloaded, stale metal render passes need to be 'aborted', which results in metal callbacks not being called.
for example, these callbacks may not be called after an aborted pass:
id<MTLCommandBuffer> m_cmdbuf;
[m_cmdbuf addScheduledHandler:^(id <MTLCommandBuffer> cb) {
cpr->scheduled = MachAbsoluteTime();
}];
[m_cmdbuf addCompletedHandler:^(id <MTLCommandBuffer> cb) {
cpr->completed = MachAbsoluteTime();
}];
For the moment, our workaround is a system which waits a few seconds after we "think" a rendering pass should be done with all its (aborted) resources before releasing buffers. This is not ideal, to say the least.
So, in summary, my question is, it would be nice to be able to 'query' an id to know when metal is done with it, so that we know that its safe to release it along with our own internal resources.
Is there any such (undocumented) mechanism? I have exhaustively read all existing Metal documentation many times.
An idea that I've been toying with... it would be nice to have something akin to Zombie detection running all the time for id only.
In OpenGL, it was OK to use a released texture... you may display a bad frame, but not CRASH!. Is there any similar option for id?
Topic:
Graphics & Games
SubTopic:
Metal
How many 32-bit variables can I use concurrently in a single thread of a Metal compute kernel without worrying about the variables getting spilled into the device memory? Alternatively: how many 32-bit registers does a single thread have available for itself?
Let's say that each thread of my compute kernel needs to store and work with its own array of N float variables, where N can be 128, 256, 512 or more. To achieve maximum possible performance, I do not want to the local thread variables to get spilled into the slow device memory. I want all N variables to be stored "on-chip", in the thread memory space.
To make my question more concrete, let's say there is an array thread float localArray[N]. Assuming an unrealistic hypothetical scenario where localArray is the only variable in the whole kernel, what is the maximum value of N for which no portion of localArray would get spilled into the device memory?
I searched in the Metal feature set tables, but I could not find any details.
hello everyone.
I got a texture loading error on visionOS 2.0:
Can't create texture(Error Domain=MTKTextureLoaderErrorDomain Code=0 "Pixel format(MTLPixelFormatInvalid) is not valid on this device" UserInfo={NSLocalizedDescription=Pixel format(MTLPixelFormatInvalid) is not valid on this device, MTKTextureLoaderErrorKey=Pixel format(MTLPixelFormatInvalid) is not valid on this device}
But this texture can load correctly on visionOS1.3.
I don't know what happen between visionOS1.3 and visionOS2.0.
The texture is a ktx file which stores cubemap that encoding in astc6x6hdr. And the ktx texture has a glInternalFormat info: GL_COMPRESSED_RGBA_ASTC_6x6.
I wonder if visionOS2.0 no longer supports astc6x6hdr cubemap format, or there is something wrong with my assets.
Given
I do not understand much at all about how to write shaders
I do not understand the math associated with page-curl effects
I am trying to:
implement a page-curl shader for use on SwiftUI views.
I've lifted a shader from HIROKI IKEUCHI that I believe they lifted from a non-metal shader resource online, and I'm trying to digest it.
One thing I want to do is to paint the "underside" of the view with a given color and maintain the transparency of rounded corners when they are flipped over.
So, if an underside pixel is "clear" then I want to sample the pixel at that position on the original layer instead of the "curl effect" pixel.
There are two comments in the shader below where I check the alpha, and underside flags, and paint the color red as a debug test.
The shader gives this result:
The outside of those rounded corners is appropriately red and the white border pixels are detected as "not-clear". But the "inner" portion of the border is... mistakingly red?
I don't get it. Any help would be appreciated. I feel tapped out and I don't have any IRL resources I can ask.
//
// PageCurl.metal
// ShaderDemo3
//
// Created by HIROKI IKEUCHI on 2023/10/17.
//
#include <metal_stdlib>
#include <SwiftUI/SwiftUI_Metal.h>
using namespace metal;
#define pi float(3.14159265359)
#define blue half4(0.0, 0.0, 1.0, 1.0)
#define red half4(1.0, 0.0, 0.0, 1.0)
#define radius float(0.4)
// そのピクセルの色を返す
[[ stitchable ]] half4 pageCurl
(
float2 _position,
SwiftUI::Layer layer,
float4 bounds,
float2 _clickedPoint,
float2 _mouseCursor
) {
half4 undersideColor = half4(0.5, 0.5, 1.0, 1.0);
float2 originalPosition = _position;
// y座標の補正
float2 position = float2(_position.x, bounds.w - _position.y);
float2 clickedPoint = float2(_clickedPoint.x, bounds.w - _clickedPoint.y);
float2 mouseCursor = float2(_mouseCursor.x, bounds.w - _mouseCursor.y);
float aspect = bounds.z / bounds.w;
float2 uv = position * float2(aspect, 1.) / bounds.zw;
float2 mouse = mouseCursor.xy * float2(aspect, 1.) / bounds.zw;
float2 mouseDir = normalize(abs(clickedPoint.xy) - mouseCursor.xy);
float2 origin = clamp(mouse - mouseDir * mouse.x / mouseDir.x, 0., 1.);
float mouseDist = clamp(length(mouse - origin)
+ (aspect - (abs(clickedPoint.x) / bounds.z) * aspect) / mouseDir.x, 0., aspect / mouseDir.x);
if (mouseDir.x < 0.)
{
mouseDist = distance(mouse, origin);
}
float proj = dot(uv - origin, mouseDir);
float dist = proj - mouseDist;
float2 linePoint = uv - dist * mouseDir;
half4 pixel = layer.sample(position);
if (dist > radius)
{
pixel = half4(0.0, 0.0, 0.0, 0.0); // background behind curling layer (note: 0.0 opacity)
pixel.rgb *= pow(clamp(dist - radius, 0., 1.) * 1.5, .2);
}
else if (dist >= 0.0)
{
// THIS PORTION HANDLES THE CURL SHADED PORTION OF THE RESULT
// map to cylinder point
float theta = asin(dist / radius);
float2 p2 = linePoint + mouseDir * (pi - theta) * radius;
float2 p1 = linePoint + mouseDir * theta * radius;
bool underside = (p2.x <= aspect && p2.y <= 1. && p2.x > 0. && p2.y > 0.);
uv = underside ? p2 : p1;
uv = float2(uv.x, 1.0 - uv.y); // invert y
pixel = layer.sample(uv * float2(1. / aspect, 1.) * float2(bounds[2], bounds[3])); // ME<----
if (underside && pixel.a == 0.0) { //<---- PIXEL.A IS 0.0 WHYYYYY
pixel = red;
}
// Commented out while debugging alpha issues
// if (underside && pixel.a == 0.0) {
// pixel = layer.sample(originalPosition);
// } else if (underside) {
// pixel = undersideColor; // underside
// }
// Shadow the pixel being returned
pixel.rgb *= pow(clamp((radius - dist) / radius, 0., 1.), .2);
}
else
{
// THIS PORTION HANDLES THE NON-CURL-SHADED PORTION OF THE SAMPLING.
float2 p = linePoint + mouseDir * (abs(dist) + pi * radius);
bool underside = (p.x <= aspect && p.y <= 1. && p.x > 0. && p.y > 0.);
uv = underside ? p : uv;
uv = float2(uv.x, 1.0 - uv.y); // invert y
pixel = layer.sample(uv * float2(1. / aspect, 1.) * float2(bounds[2], bounds[3])); // ME
if (underside && pixel.a == 0.0) { //<---- PIXEL.A IS 0.0 WHYYYYY
pixel = red;
}
// Commented out while debugging alpha issues
// if (underside && pixel.a == 0.0) {
// // If the new underside pixel is clear, we should sample the original image's pixel.
// pixel = layer.sample(originalPosition);
// } else if (underside) {
// pixel = undersideColor;
// }
}
return pixel;
}
I am looking to implement CAMetalDisplayLink on a separate thread on a macOS application. I am basing my implementation on the following example project:
Achieving Smooth Frame Rates with Metal Display Link
This project allows you to configure whether a separate thread is used for rendering by setting RENDER_ON_MAIN_THREAD in GameConfig to 0. However, when I set it to use a separate thread nothing is rendered. Stepping through the code shows that a separate thread is created, but a CAMetalDisplayLinkUpdate is never received. Does anyone know why this does not work?
Hi,
A user sent us a crash report that indicates an error occurring just after loading the default Metal library of our app.
Application Specific Information:
Crashing on exception: *** -[__NSArrayM objectAtIndex:]: index 0 beyond bounds for empty array
The report pointed me to these (simplified) lines of codes in the library setup:
_vertexFunctions = [[NSMutableArray alloc] init];
_fragmentFunctions = [[NSMutableArray alloc] init];
id<MTLLibrary> library = [device newDefaultLibrary];
2 vertex shaders and 5 fragment shaders are then loaded and stored in these two arrays using this method:
-(BOOL) addShaderNamed:(NSString *)name library:(id<MTLLibrary>)library isFragment:(BOOL)isFragment {
id shader = [library newFunctionWithName:name];
if (!shader) {
ALOG(@"Error : Unable to find the shader named : “%@”", name);
return NO;
}
[(isFragment ? _fragmentFunctions : _vertexFunctions) addObject:shader];
return YES;
}
As you can see, the arrays are not filled if the method fails... however, a few lines later, they are used without checking if they are really filled, and that causes the crash...
But this coding error doesn't explain why no shader of a certain type (or both types) have been added to the array, meaning: why -newFunctionWithName: returned nil for all given names (since the implied array appears completely empty)?
Clue
This error has only be detected once by a user running the app on macOS 10.13 with a NVIDIA Web Driver instead of the default macOS graphic driver. Moreover, it wasn't possible to reproduce the problem on the same OS using the native macOS driver.
So my question is: is it some known conflicts between NVIDIA drivers and the use of Metal libraries? Or does this case would require some specific options in the Metal implementation?
Any help appreciated, thanks!
When generating large arrays of random numbers, NaNs show up. They also show up at the same indices when using the same seed, leading me to believe that this is a bug with MPSMatrixRandom's normally distributed Float32 random number distribution.
Happens with both Philox and MTGP32.
Is this intentional and how do I work around this?
See the original post for a MWE in Swift and Julia: https://github.com/JuliaGPU/Metal.jl/issues/474
I'm experiencing a strange issue where I'm seeing black in a metal drawable where it should be a different color. When I capture the frame and inspect the returned value from the fragment function, it's correct, but the drawable isn't.
This screenshot hopefully illustrates the issue.
I've not found any references to similar issues. I saw something about some out of bounds or NaN values being dropped to 0 (which would be black), but the debugger doesn't indicate this is happening.
Topic:
Graphics & Games
SubTopic:
Metal
I have multiple CAMetalLayers that I render content to and noticed that the graphics overview HUD does not function properly when I have more than one CAMetalLayer. The values reported will be very strange. For example, FPS may report 999 or some large negative value. It the HUD simply not designed to work with multiple CAMetalLayers or MTKViews? When I disable all but one of my CAMetalLayers, the HUD works as expected.
I want to turn off my ray-tracing conditionally. There's is_null_acceleration_structure but when I don't bind an acceleration structure (or pass nil to setFragmentAccelerationStructure), I get the following API validation error:
-[MTLDebugRenderCommandEncoder validateCommonDrawErrors:]:5782: failed assertion `Draw Errors Validation
Fragment Function(vol_deferred_lighting): missing instanceAccelerationStructure binding at index 6 for accelerationStructure[0].
I can turn off API validation and it works, but it seems like I should be able to use nil for the acceleration structure w/o triggering a validation error. Seems like a bug, right?
I suppose I can work around this by creating a separate pipeline with the ray-tracing disabled via a function constant instead of using is_null_acceleration_structure.
(Can we get a ray-tracing tag for questions?)
Project: I have some data wich could be transformed by shader, result may be kept in rgb channels of image. Great.
But now to mix dozens of those results? Not one by one, image after image, but all at once. Something like „complicated average” color of particular pixel from all delivered images.
Is it possible?
my app use mtkview to render video, but [MTKView initwithFrame:device] takes 2-3s in some some 2019 macbook pro, system macos 15.0.1.
how can I do?
Hello, I am using MTKView to display: camera preview & video playback. I am testing on iPhone 16. App crashes at a random moment whenever MTKView is rendering CIImage.
MetalView:
public enum MetalActionType {
case image(CIImage)
case buffer(CVPixelBuffer)
}
public struct MetalView: UIViewRepresentable {
let mtkView = MTKView()
public let actionPublisher: any Publisher<MetalActionType, Never>
public func makeCoordinator() -> Coordinator {
Coordinator(self)
}
public func makeUIView(context: UIViewRepresentableContext<MetalView>) -> MTKView {
guard let metalDevice = MTLCreateSystemDefaultDevice() else {
return mtkView
}
mtkView.device = metalDevice
mtkView.framebufferOnly = false
mtkView.clearColor = MTLClearColor(red: 0, green: 0, blue: 0, alpha: 0)
mtkView.drawableSize = mtkView.frame.size
mtkView.delegate = context.coordinator
mtkView.isPaused = true
mtkView.enableSetNeedsDisplay = true
mtkView.preferredFramesPerSecond = 60
context.coordinator.ciContext = CIContext(
mtlDevice: metalDevice, options: [.priorityRequestLow: true, .highQualityDownsample: false])
context.coordinator.metalCommandQueue = metalDevice.makeCommandQueue()
context.coordinator.actionSubscriber = actionPublisher.sink { type in
switch type {
case .buffer(let pixelBuffer):
context.coordinator.updateCIImage(pixelBuffer)
break
case .image(let image):
context.coordinator.updateCIImage(image)
break
}
}
return mtkView
}
public func updateUIView(_ nsView: MTKView, context: UIViewRepresentableContext<MetalView>) {
}
public class Coordinator: NSObject, MTKViewDelegate {
var parent: MetalView
var metalCommandQueue: MTLCommandQueue!
var ciContext: CIContext!
private var image: CIImage? {
didSet {
Task { @MainActor in
self.parent.mtkView.setNeedsDisplay() //<--- call Draw method
}
}
}
var actionSubscriber: (any Combine.Cancellable)?
private let operationQueue = OperationQueue()
init(_ parent: MetalView) {
self.parent = parent
operationQueue.qualityOfService = .background
super.init()
}
public func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {
}
public func draw(in view: MTKView) {
guard let drawable = view.currentDrawable, let ciImage = image,
let commandBuffer = metalCommandQueue.makeCommandBuffer(), let ci = ciContext
else {
return
}
//making sure nothing is nil, now we can add the current frame to the operationQueue for processing
operationQueue.addOperation(
MetalOperation(
drawable: drawable, drawableSize: view.drawableSize, ciImage: ciImage,
commandBuffer: commandBuffer, pixelFormat: view.colorPixelFormat, ciContext: ci))
}
//consumed by Subscriber
func updateCIImage(_ img: CIImage) {
image = img
}
//consumed by Subscriber
func updateCIImage(_ buffer: CVPixelBuffer) {
image = CIImage(cvPixelBuffer: buffer)
}
}
}
now the MetalOperation class:
private class MetalOperation: Operation, @unchecked Sendable {
let drawable: CAMetalDrawable
let drawableSize: CGSize
let ciImage: CIImage
let commandBuffer: MTLCommandBuffer
let pixelFormat: MTLPixelFormat
let ciContext: CIContext
init(
drawable: CAMetalDrawable, drawableSize: CGSize, ciImage: CIImage,
commandBuffer: MTLCommandBuffer, pixelFormat: MTLPixelFormat, ciContext: CIContext
) {
self.drawable = drawable
self.drawableSize = drawableSize
self.ciImage = ciImage
self.commandBuffer = commandBuffer
self.pixelFormat = pixelFormat
self.ciContext = ciContext
}
override func main() {
let width = Int(drawableSize.width)
let height = Int(drawableSize.height)
let ciWidth = Int(ciImage.extent.width) //<-- Thread 22: EXC_BAD_ACCESS (code=1, address=0x5e71f5490) A bad access to memory terminated the process.
let ciHeight = Int(ciImage.extent.height)
let destination = CIRenderDestination(
width: width, height: height, pixelFormat: pixelFormat, commandBuffer: commandBuffer,
mtlTextureProvider: { [self] () -> MTLTexture in
return drawable.texture
})
let transform = CGAffineTransform(
scaleX: CGFloat(width) / CGFloat(ciWidth), y: CGFloat(height) / CGFloat(ciHeight))
do {
try ciContext.startTask(toClear: destination)
try ciContext.startTask(toRender: ciImage.transformed(by: transform), to: destination)
} catch {
}
commandBuffer.present(drawable)
commandBuffer.commit()
commandBuffer.waitUntilCompleted()
}
}
Now I am no Metal expert, but I believe it's a very simple execution that shouldn't cause memory leak especially after we have already checked for whether CIImage is nil or not. I have also tried running this code without OperationQueue and also tried with @autoreleasepool but none of them has solved this problem.
Am I missing something?
Currently looking for Metal developers to port Quake 2 RTX to Metal RT in order to give Apple Silicon Macs an amazing Pathtracing demo, This project falls under NightSightProductions who is also working on a Portal 2 with RTX Remaster. if you are interested and want to help further Mac gaming, message me here or on discord at king_vulpes
Hi! I just installed GPTK2 on my new Mac , but the Terminal gave “Error:OpenSSL1.1 has been disabled.”
How should I fix it?Or waiting for the GPTK2 beta4?
Thanks.
I’ve been trying to run Jurassic World Evolution 2 using the Game Porting Toolkit on macOS, but the game doesn’t launch and crashes immediately. Based on the error and research, it seems the issue is related to missing support for D3D12_TILED_RESOURCES_TIER_2 in the Metal API.
If this is the case, does anyone know if support for tiled resources is planned for future updates of the toolkit? Or are there any potential workarounds for bypassing this limitation?
Is there a working example of imageblock_slice with implicit layout somewhere?
I get a compilation error when i write this:
imageblock_slilce color_slice = img_blk.slice(frag->color);
Error:
No matching member function for call to 'slice'
candidate template ignored: couldn't infer template argument 'E'
candidate function template not viable: requires 2 arguments, but 1 was provided
Too few template arguments for class template 'imageblock_slice'
It seems the syntax has changed since the Imageblocks presentation https://developer.apple.com/videos/play/tech-talks/603/
I tried supplying the struct type of the image block between <> but it still does not work.
When inspecting the geometry in Xcode's metal debugger, I noticed that the shown "frustrum box" didn't make sense. Since Metal uses depth range 0,1 in NDC space, I would expect a vertex that is projected to z:0 to be on the front clipping plane of the frustrum shown in the geometry inspector. This is however not the case. A vertex with ndc z:0 is shown halfway inside the frustrum. Vertices with ndc z less than 0 are correctly culled during rendering, while the geometry inspector's frustrum shows that the vertex is stil inside the frustrum.
The image shows vertices that are visually in the middle of the frustrum on z axis, but at the same time the out position shows that they are projected to z:0. How is this possible, unless there's a bug in the geometry inspector?
I used xcode gpu capture to profile render pipeline's bandwidth of my game.Then i found depth buffer and stencil buffer use the same buffer whitch it's format is Depth32Float_Stencil8.
But why in a single pass of pipeline, this buffer was loaded twice, and the Load Attachment Size of Encoder Statistics was double.
Is there any bug with xcode gpu capture?Or the pass really loaded the buffer twice times?
Topic:
Graphics & Games
SubTopic:
Metal