metal-cpp


C++ games and apps can tap into the power of Metal by bridging with metal-cpp.

metal-cpp Documentation

Posts under metal-cpp tag

27 Posts
Post not yet marked as solved
0 Replies
74 Views
Hi, I am very new to this coding gig. I am an art student who happened to develop an interest due to some problems. To the shock and disgust of a lot of developers, I am in fact starting with C/C++, and I am using the humble Xcode for lack of better knowledge. I tried to use Visual Basic, but I had an SVN problem because my Mac is on Catalina. I would just like some insight into what makes Xcode not that great, so that if I run into problems in the future I will know what to look for. Your info will be much appreciated.
Post not yet marked as solved
0 Replies
80 Views
Hi, I am generating a Metal library that I build using the command line tools on macOS for iphoneos, following the instructions here. Then I serialise this to a binary blob that I load at runtime, which seems to work ok as everything renders as expected. When I am doing a frame capture and open up a shader function it tries to load the symbols and fails. I tried pointing it to the directory (and the file) containing the symbols file, but it never resolves those. In the bottom half of the Import External Sources dialogue there is one entry in the Library | Debug Info section: The library name is Library 0x21816b5dc0 and below Debug Info it says Invalid UUID. The validation layer doesn't flag any invalid behaviour so I am a bit lost and not sure what to try next?
Posted by marco-swe.
Post not yet marked as solved
2 Replies
861 Views
Device: iPod 7
iOS version: 14.4.1
GMetalDevice = MTLCreateSystemDefaultDevice();
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
    frame #0: 0x00000001d7a3784c libsystem_kernel.dylib`__pthread_kill + 8
  * frame #1: 0x00000001f39c29e8 libsystem_pthread.dylib`pthread_kill + 212
    frame #2: 0x00000001b4b738f4 libsystem_c.dylib`abort + 100
    frame #3: 0x00000001bb238030 libsystem_malloc.dylib`malloc_vreport + 556
    frame #4: 0x00000001bb2381e8 libsystem_malloc.dylib`malloc_report + 60
    frame #5: 0x00000001bb22e2e8 libsystem_malloc.dylib`free + 432
    frame #6: 0x00000001f48dab10 AGXMetalA10`___lldb_unnamed_symbol1249 + 1644
    frame #7: 0x00000001f49064d4 AGXMetalA10`___lldb_unnamed_symbol1504 + 72
    frame #8: 0x00000001c18ca694 Metal`+[MTLIOAccelDevice registerDevices] + 224
    frame #9: 0x00000001c18ccbf8 Metal`invocation function for block in MTLDeviceArrayInitialize() + 872
    frame #10: 0x0000000119505528 libdispatch.dylib`_dispatch_client_callout + 16
    frame #11: 0x0000000119506e6c libdispatch.dylib`_dispatch_once_callout + 84
    frame #12: 0x00000001c18cc470 Metal`MTLCreateSystemDefaultDevice + 200
    frame #13: 0x00000001054f5f54 MyProject`+[FIOSView layerClass](self=<unavailable>, _cmd=<unavailable>) at IOSView.cpp:134:18 [opt]
    frame #14: 0x00000001aebbd034 UIKitCore`UIViewCommonInitWithFrame + 1040
    frame #15: 0x00000001aebbcbcc UIKitCore`-[UIView initWithFrame:] + 124
    frame #16: 0x00000001054f6274 MyProject`-[FIOSView initWithFrame:](self=<unavailable>, _cmd=<unavailable>, Frame=<unavailable>) at IOSView.cpp:233:14 [opt]
    frame #17: 0x0000000102d237f8 MyProject`invocation function for block in FAppEntry::PlatformInit() [inlined] MainThreadInit() at LaunchIOS.cpp:348:24 [opt]
    frame #18: 0x0000000102d237a8 MyProject`invocation function for block in FAppEntry::PlatformInit(.block_descriptor=<unavailable>) at LaunchIOS.cpp:373:47 [opt]
    frame #19: 0x0000000119503ce4 libdispatch.dylib`_dispatch_call_block_and_release + 24
    frame #20: 0x0000000119505528 libdispatch.dylib`_dispatch_client_callout + 16
    frame #21: 0x0000000119513994 libdispatch.dylib`_dispatch_main_queue_callback_4CF + 972
    frame #22: 0x00000001abdf85e0 CoreFoundation`__CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__ + 12
    frame #23: 0x00000001abdf2a88 CoreFoundation`__CFRunLoopRun + 2480
    frame #24: 0x00000001abdf1ba0 CoreFoundation`CFRunLoopRunSpecific + 572
    frame #25: 0x00000001c2b5a598 GraphicsServices`GSEventRunModal + 160
    frame #26: 0x00000001ae6e32f4 UIKitCore`-[UIApplication _run] + 1052
    frame #27: 0x00000001ae6e8874 UIKitCore`UIApplicationMain + 164
    frame #28: 0x0000000102d312bc MyProject`main(argc=3, argv=0x000000016d4e7700) at LaunchIOS.cpp:584:13 [opt]
    frame #29: 0x00000001abad0568 libdyld.dylib`start + 4
Posted by rbbtsn0w.
Post not yet marked as solved
0 Replies
133 Views
Hey all, I have a few questions about the "Loading textures and models using Metal fast resource loading" project. I'm not experienced in 3D rendering or Mac development in general, so bear with me :) I noticed that the model and texture data for the scene objects had a .dat extension and appeared to be binary data files. This is different from the models and textures in "Rendering a Scene with Deferred Lighting in C++", which contains .obj and .mtl files that seem to be commonly exported from 3D modelling programs like Blender or Maya. First question: I didn't notice any explanation for the difference in format between the two projects in the source code. From reading a bit online, it seems like binary formats are more efficient and generally more representative of what you'd want to actually ship. Is that correct? Second, I was also wondering if the binary format used in "Loading textures and models using Metal fast resource loading" is a standard format, or if it was created by the project's author just for the project. Third, I was wondering what the typical process is for storing assets in a binary format. Are these usually exported directly from 3D modelling programs? Or is intermediate output from 3D modelling programs (such as .obj files) usually parsed by the developer and then written to the binary format? If so, are there commonly used libraries for this, or do people usually just hand-roll parsers? Any recommended learning material would be appreciated. Thank you!
Posted by Eskin.
Post not yet marked as solved
1 Replies
199 Views
I have been working with AR for a while now, sorta learning still, and getting increasingly frustrated. I created complex animations using Reality Composer and now that feels like a joke. The thing is huge. I need to rewrite it in RealityKit, but something tells me that if I go Metal on this **** it's going to decrease the latency and make things run so much faster. I really need my app to be as light as possible because the 3D graphics will function like a 3D UI system. My greatest pain point is the anxiety I feel when I know my application will be HUGE and crash once I publish it. It makes me feel so anxious. Like an earthquake is coming. My goal is to create the lightest thing possible. I am reading a book on Linear Algebra for Machine Learning and I am leaning into this more mathy direction, so I am thinking I might as well just rewrite it using Metal and C++? I have never used Metal, by the way, and never used C++ either. I have some experience with animation and experience with sculpture, so the 3D world IRL is not new to me.
Posted by popbee.
Post not yet marked as solved
0 Replies
304 Views
What is the best source for information/tutorial material on using Metal with C++? metal-cpp?
Posted by g-wright.
Post not yet marked as solved
1 Replies
454 Views
Dear experts, I'm working on adding a UI to my C++-based path tracer renderer. I want to create a metal-cpp device and pass it to the renderer, but I also want to use ImGui and GLFW (for window management and input event handling). I've found a way to mix the Objective-C code required by the GLFW window setup with C++ code: https://github.com/ikryukov/MetalCppImGui
// Here is key thing for integration GLFW with Metal Cpp
// GLFW supports only obj c window handle
NSWindow *nswin = glfwGetCocoaWindow(window);
CA::MetalLayer* layer = CA::MetalLayer::layer();
layer->setDevice(device);
layer->setPixelFormat(MTL::PixelFormatBGRA8Unorm);
// bridge to obj c here because NSWindow expects objc
CAMetalLayer* l = (__bridge CAMetalLayer*)layer;
nswin.contentView.layer = l;
nswin.contentView.wantsLayer = YES;
Is there any official way to handle events in metal-cpp without Objective-C support? Maybe MetalKit will have such features in the future?
Posted by ikryukov.
Post not yet marked as solved
0 Replies
222 Views
Hi experts, I'm working on a path tracer using metal-cpp, and I use per-primitive data as shown in your latest presentation: https://developer.apple.com/videos/play/wwdc2022/10105/ and described here: https://developer.apple.com/documentation/metal/ray_tracing_with_acceleration_structures Currently I want to store something like this structure:
struct Triangle {
    vector_float3 positions[3];
    uint32_t normals[3];
    uint32_t tangent[3];
    uint32_t uv[3];
};
Are there any guidelines on how much of this kind of small per-primitive data I can store in an acceleration structure? If it is stored inside the BVH nodes, that could affect traversal performance. Thanks!
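For a sense of scale, the sketch below estimates the CPU-side size of such a per-primitive struct. `Float3` and `PackedFloat3` are hypothetical stand-ins, not the simd types: the first mimics the 16-byte size and alignment of `vector_float3`, the second a tightly packed 12-byte triple. `perPrimitiveBytes` is also a made-up helper. The numbers say nothing about how an acceleration structure lays the data out internally.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for vector_float3: 16-byte size and alignment.
struct alignas(16) Float3 { float x, y, z; };

// Hypothetical stand-in for a tightly packed float triple (12 bytes).
struct PackedFloat3 { float x, y, z; };

// Same field layout as the struct in the question, using the stand-in type.
struct Triangle {
    Float3 positions[3];
    std::uint32_t normals[3];
    std::uint32_t tangent[3];
    std::uint32_t uv[3];
};

// Variant with packed positions, for comparison.
struct PackedTriangle {
    PackedFloat3 positions[3];
    std::uint32_t normals[3];
    std::uint32_t tangent[3];
    std::uint32_t uv[3];
};

// 48 bytes of positions + 36 bytes of indices = 84, rounded up to 96
// because the struct inherits the 16-byte alignment of Float3.
static_assert(sizeof(Triangle) == 96, "padded to a multiple of 16");
// 36 bytes of positions + 36 bytes of indices, 4-byte alignment.
static_assert(sizeof(PackedTriangle) == 72, "no tail padding needed");

// Total bytes of per-primitive data for a given primitive count.
template <typename T>
constexpr std::size_t perPrimitiveBytes(std::size_t primitiveCount) {
    return primitiveCount * sizeof(T);
}
```

Under these assumptions, packing the positions saves 24 bytes per triangle, which may matter more than the exact field choice once the primitive count is large.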
Posted by ikryukov.
Post marked as solved
1 Replies
298 Views
Hi! I'm currently trying to convert this Objective-C example project ( https://developer.apple.com/documentation/metal/performing_calculations_on_a_gpu?language=objc ) to one using the metal-cpp wrapper. However when I make the MetalAdder class extend NS::Object (just like in the original codebase) it removes my constructor. class MetalAdder : public NS::Object{ ... } is what I have. When I instantiate this MetalAdder class as: MetalAdder adder; adder.initWithDevice(device); or auto adder = NS::TransferPtr(new MetalAdder); I get the error Call to implicitly-deleted default constructor of 'MetalAdder'. Is there something I'm doing wrong? Should I instantiate in a different way or should my MetalAdder class just not extend the NS::Object class? Thanks in advance!
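One way to read that error: if a base class declares its default constructor as deleted, every class derived from it loses its implicit default constructor as well. The sketch below reproduces this with a hypothetical `OpaqueHandle` stand-in (it is not the real `NS::Object`, and whether the metal-cpp wrapper types behave exactly this way is an assumption here). It also shows composition as one possible alternative: a plain C++ class that stores pointers to the wrapped objects instead of inheriting from them.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for an opaque wrapper type that must never be
// constructed directly from C++ (assumed to resemble NS::Object).
struct OpaqueHandle {
    OpaqueHandle() = delete;
    OpaqueHandle(const OpaqueHandle&) = delete;
};

// Inheriting: the implicit default constructor of the derived class is
// deleted too, which matches the compiler error in the question.
class DerivedAdder : public OpaqueHandle {};

// Composing: a plain class that only stores a pointer to the handle.
class ComposedAdder {
public:
    ComposedAdder() = default;
    explicit ComposedAdder(OpaqueHandle* device) : device_(device) {
        assert(device != nullptr);
    }
    bool hasDevice() const { return device_ != nullptr; }

private:
    OpaqueHandle* device_ = nullptr; // not owned in this sketch
};

static_assert(!std::is_default_constructible<DerivedAdder>::value,
              "deleted base constructor propagates to the derived class");
static_assert(std::is_default_constructible<ComposedAdder>::value,
              "a plain holder class can be constructed normally");
```

With composition, `ComposedAdder adder;` compiles, and ownership of the wrapped objects (retain and release) becomes an explicit design decision of the holder class.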
Posted by Jerne.
Post not yet marked as solved
4 Replies
906 Views
Hello all, I have CUDA code in which I create several CUDA streams, run my kernels in parallel, and get a performance boost for my task. I then rewrote the code for Metal and tried to parallelize the task in the same way.
(image: CUDA Streams)
Metal device: Mac Studio with M1 Ultra (the code is written with metal-cpp). I create several MTLCommandBuffers in one MTLCommandQueue, or several MTLCommandQueues with more MTLCommandBuffers. Regarding Metal resources, there are two options:
[1] The buffers (MTLBuffer) are created with the option MTLResourceStorageModeShared. In the profiler, all command buffers are executed sequentially on the Compute timeline.
[2] The buffers (MTLBuffer) are created with the option "MTLResourceStorageModeShared | MTLResourceHazardTrackingModeUntracked". In the profiler, I do see parallelism, but the maximum number of threads in the Compute timeline is never more than 2 (see pictures), which is also weird. The compute commands do not depend on each other.
(image: Metal Compute timeline)
About performance:
[1] In the first variant, the performance is the same for different numbers of MTLCommandQueues and MTLCommandBuffers.
[2] In the second variant, the performance with one MTLCommandBuffer is higher than with 2 or more.
Question: why is this happening? How can I parallelize the compute kernels to get a performance increase?
Additional information: the CUDA code has also been rewritten in OpenCL, and it parallelizes perfectly on Windows (NVIDIA/AMD/Intel) when several OpenCL queues are running. The same code running on the M1 Ultra performs the same with one or with many OpenCL queues. Metal, in turn, is faster than OpenCL, so I am trying to figure out Metal specifically and make the kernels run in parallel on Metal.
Posted by abdyla_v.
Post not yet marked as solved
0 Replies
350 Views
I'm not sure if this is somehow a cadence issue or something. If I attempt to use the stylized M button in Xcode to kick off a GPU capture on iOS it seems to just go on forever capturing command buffers instead of exiting when we swap to the next display surface. We are presenting the current drawable with presentDrawable and invoking nextDrawable on the MetalLayer (but note that we are doing this by extending it to be supported from C++). If I trigger and end the capture myself it works fine, and so that works for now, but I'm curious if I'm doing something wrong that causes it not to recognize the end of frame correctly for the Xcode GUI version.
Posted by scarrow.
Post not yet marked as solved
1 Replies
857 Views
Dear developers, I need support developing a simple computation on the GPU. I would like to perform matrix multiplication; metal-cpp is a good fit because I need to export the result as a C++ library. Following the documentation:

File Multiply.metal:

kernel void multiply(device float *pMatA, device float *pMatB, device float *pMatC, device float *pMatR)
{
    simdgroup_float8x8 sgMatA;
    simdgroup_float8x8 sgMatB;
    simdgroup_float8x8 sgMatR;
    simdgroup_load(sgMatA, pMatA);
    simdgroup_load(sgMatB, pMatB);
    simdgroup_multiply(sgMatR, sgMatA, sgMatB);
    simdgroup_store(sgMatR, pMatR);
}

File Multiply.hpp:

#include <Foundation/Foundation.hpp>
#include <Metal/Metal.hpp>

class Multiply {
public:
    MTL::Device* m_device;
    MTL::ComputePipelineState *m_add_function_pso;
    MTL::CommandQueue *m_command_queue;
    MTL::Buffer *m_buffer_A;
    MTL::Buffer *m_buffer_B;
    MTL::Buffer *m_buffer_result;
    void init_with_device(MTL::Device*);
    void prepare_data();
    void send_compute_command();
private:
    void generate_random_float_data(MTL::Buffer* buffer);
    void encode_dot_command(MTL::ComputeCommandEncoder* compute_encoder);
    void verify_results();
};

File Multiply.cpp:

#include <iostream>
#include "Multiply.hpp"

const unsigned int array_length = 1 << 5;
const unsigned int buffer_size = array_length * sizeof(float);

void Multiply::init_with_device(MTL::Device* device){
    m_device = device;
    NS::Error* error;
    auto default_library = m_device->newDefaultLibrary();
    if(!default_library){
        std::cerr << "Failed to load default library.";
        std::exit(-1);
    }
    auto function_name = NS::String::string("multiply", NS::ASCIIStringEncoding);
    auto dot_function = default_library->newFunction(function_name);
    if(!dot_function){
        std::cerr << "Failed to find the dot function.";
    }
    m_dot_function_pso = m_device->newComputePipelineState(dot_function, &error);
    m_command_queue = m_device->newCommandQueue();
};

void Multiply::prepare_data(){
    m_buffer_A = m_device->newBuffer(buffer_size, MTL::ResourceStorageModeShared);
    m_buffer_B = m_device->newBuffer(buffer_size, MTL::ResourceStorageModeShared);
    m_buffer_result = m_device->newBuffer(buffer_size, MTL::ResourceStorageModeShared);
    generate_random_float_data(m_buffer_A);
    generate_random_float_data(m_buffer_B);
}

void Multiply::generate_random_float_data(MTL::Buffer* buffer) {
    float* data_ptr = (float*)buffer->contents();
    for (unsigned long index = 0; index < array_length; index++) {
        for(unsigned long index2 =0; index2 < array_length; index2++) {
            data_ptr[index][index2] = (float)rand() / (float)(RAND_MAX);
        }
    }

void Multiply::send_compute_command() {
    MTL::CommandBuffer* command_buffer = m_command_queue->commandBuffer();
    // assert(command_buffer != nullptr);
    MTL::ComputeCommandEncoder* compute_encoder = command_buffer->computeCommandEncoder();
    encode_dot_command(compute_encoder);
    compute_encoder->endEncoding();
    // MTL::CommandBufferStatus status = command_buffer->status();
    // std::cout << status << std::endl;
    command_buffer->commit();
    command_buffer->waitUntilCompleted();
    verify_results();
}

void Multiply::encode_dot_command(MTL::ComputeCommandEncoder* compute_encoder){
    compute_encoder->setComputePipelineState(m_dot_function_pso);
    compute_encoder->setBuffer(m_buffer_A, 0, 0);
    compute_encoder->setBuffer(m_buffer_B, 0, 1);
    compute_encoder->setBuffer(m_buffer_result, 0, 2);
    MTL::Size grid_size = MTL::Size(array_length, 1, 1);
    NS::UInteger thread_group_size_ = m_dot_function_pso->maxTotalThreadsPerThreadgroup();
    if(thread_group_size_ > array_length){
        thread_group_size_ = array_length;
    }
    MTL::Size thread_group_size = MTL::Size(thread_group_size_, 1, 1);
    compute_encoder->dispatchThreads(grid_size, thread_group_size);
}

void Multiply::verify_results(){
    auto a = (float*) m_buffer_A->contents();
    auto b = (float*) m_buffer_B->contents();
    auto result = (float*) m_buffer_result->contents();
    for (unsigned long index = 0; index < array_length; index++) {
        for (unsigned long index2 = 0; index < array_length; index2++) {
            if (result[index][index2] != (a[index][index2] * b[index][index2])) {
                std::cout << "Comput ERROR: index=" << index << "result=" << result[index][index2] << "vs " << a[index][index2] + b[index][index2] << "=a*b\n";
                assert(result[index][index2] == (a[index][index2] * b[index][index2]));
            }
        }
        std::cout << "Compute results as expected\n";
    }
}

Is this implementation correct? Can someone kindly give suggestions about speed improvements or other solutions? Thank you in advance.
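Two details in the host code may be worth checking independently of the GPU path. The pointer cast from `contents()` is flat, so a 2D element is usually addressed as `row * n + col` rather than `[row][col]`, and an n x n matrix needs n * n floats of storage. The sketch below is plain C++ with no Metal calls; `index2d`, `multiplyReference`, and `nearlyEqual` are hypothetical helper names, and the tolerance is an arbitrary choice. It shows a CPU reference product that GPU output could be compared against with a tolerance instead of `==`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Flat offset of element (row, col) in a row-major n x n matrix.
inline std::size_t index2d(std::size_t row, std::size_t col, std::size_t n) {
    return row * n + col;
}

// CPU reference: returns a * b for row-major n x n matrices.
inline std::vector<float> multiplyReference(const std::vector<float>& a,
                                            const std::vector<float>& b,
                                            std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    std::vector<float> r(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < n; ++k) {
            const float aik = a[index2d(i, k, n)];
            for (std::size_t j = 0; j < n; ++j) {
                r[index2d(i, j, n)] += aik * b[index2d(k, j, n)];
            }
        }
    }
    return r;
}

// Element-wise comparison with an absolute tolerance (arbitrary default).
inline bool nearlyEqual(const std::vector<float>& x,
                        const std::vector<float>& y,
                        float tolerance = 1e-4f) {
    if (x.size() != y.size()) return false;
    for (std::size_t i = 0; i < x.size(); ++i) {
        if (std::fabs(x[i] - y[i]) > tolerance) return false;
    }
    return true;
}
```

In a Metal program, the GPU result would be copied out of the result buffer into a `std::vector<float>` of n * n elements and passed to `nearlyEqual` together with the reference product.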
Post not yet marked as solved
1 Replies
481 Views
Hi all. I'm trying to get C++ code working with Metal. I get the array of MTL::Device objects by calling NS::Array *device_array = MTL::CopyAllDevices(); Next, I want to get the only element of that array by calling MTL::Device *device = device_array->object(0); I get an error: Cannot initialize a variable of type 'MTL::Device *' with an rvalue of type 'NS::Object *' Question: how do I get an MTL::Device object from an NS::Array?
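The error is a plain C++ one: a pointer to a base class does not convert implicitly to a pointer to a derived class, so an explicit downcast is needed. The sketch below reproduces the situation with hypothetical stand-ins (`Object`, `Device`, and `Array` here are mock types, not the metal-cpp classes) and shows a templated accessor that performs the cast. As far as I know, metal-cpp's `NS::Array` offers a similar templated form, `device_array->object<MTL::Device>(0)`, but treat that as an assumption and check `NSArray.hpp` in your copy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mock base and derived types standing in for NS::Object and MTL::Device.
struct Object { virtual ~Object() = default; };
struct Device : Object {
    const char* name() const { return "mock device"; }
};

// Mock container that stores base pointers, like an untyped array.
class Array {
public:
    void add(Object* item) { items_.push_back(item); }
    std::size_t count() const { return items_.size(); }

    // Returns the element cast to T. The caller is responsible for
    // knowing that the element really is a T.
    template <class T = Object>
    T* object(std::size_t index) const {
        assert(index < items_.size());
        return static_cast<T*>(items_[index]);
    }

private:
    std::vector<Object*> items_; // not owned in this sketch
};
```

With the mock types, `Device* device = array.object<Device>(0);` compiles, while `Device* device = array.object(0);` fails with the same kind of message as in the question.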
Posted by squirtgt.
Post marked as solved
1 Replies
539 Views
Hi. I'm new to Metal (actually, to any kind of software development for Apple products). I have many questions about using MTL::Buffer, dispatch_semaphore, and drawInMTKView(). I read README.md, but I need some more help understanding it. Full code: 03-animation in the metal-cpp sample. This is sample code from Apple's metal-cpp samples (I downloaded it here). In this code, _pFrameData is an array of MTLBuffer, and kMaxFramesInFlight is the size of this array. Its type is static const int, and its value is 3. When the Renderer is created, the _pFrameData buffers are initialized like this:

void Renderer::buildFrameData()
{
    for ( int i = 0; i < Renderer::kMaxFramesInFlight; ++i )
    {
        _pFrameData[ i ]= _pDevice->newBuffer( sizeof( FrameData ), MTL::ResourceStorageModeManaged );
    }
}

The draw method, called by drawInMTKView:

void Renderer::draw( MTK::View* pView )
{
    NS::AutoreleasePool* pPool = NS::AutoreleasePool::alloc()->init();
    _frame = (_frame + 1) % Renderer::kMaxFramesInFlight;
    MTL::Buffer* pFrameDataBuffer = _pFrameData[ _frame ];
    MTL::CommandBuffer* pCmd = _pCommandQueue->commandBuffer();
    dispatch_semaphore_wait( _semaphore, DISPATCH_TIME_FOREVER );
    Renderer* pRenderer = this;
    pCmd->addCompletedHandler( ^void( MTL::CommandBuffer* pCmd ){
        dispatch_semaphore_signal( pRenderer->_semaphore );
    });
    reinterpret_cast< FrameData * >( pFrameDataBuffer->contents() )->angle = (_angle += 0.01f);
    pFrameDataBuffer->didModifyRange( NS::Range::Make( 0, sizeof( FrameData ) ) );
    MTL::RenderPassDescriptor* pRpd = pView->currentRenderPassDescriptor();
    MTL::RenderCommandEncoder* pEnc = pCmd->renderCommandEncoder( pRpd );
    pEnc->setRenderPipelineState( _pPSO );
    pEnc->setVertexBuffer( _pArgBuffer, 0, 0 );
    pEnc->useResource( _pVertexPositionsBuffer, MTL::ResourceUsageRead );
    pEnc->useResource( _pVertexColorsBuffer, MTL::ResourceUsageRead );
    pEnc->setVertexBuffer( pFrameDataBuffer, 0, 1 );
    pEnc->drawPrimitives( MTL::PrimitiveType::PrimitiveTypeTriangle, NS::UInteger(0), NS::UInteger(3) );
    pEnc->endEncoding();
    pCmd->presentDrawable( pView->currentDrawable() );
    pCmd->commit();
    pPool->release();
}

Q1. What is the meaning of kMaxFramesInFlight's name and value?
Q2. How do dispatch_semaphore_wait() and drawInMTKView() work together? At first, I guessed that when the dispatch_semaphore count drops to 0, Renderer::draw is blocked in dispatch_semaphore_wait() until the GPU has read the buffer and dispatch_semaphore_signal is executed. But now I think that understanding is not correct, because I don't know how drawInMTKView works. How many times per second is drawInMTKView called, and when?
Q3. Why use dispatch_semaphore here at all? I tried changing my code to use a single MTLBuffer for the same work. After just a few changes (adding a single buffer and removing the dispatch_semaphore code), the modified code works the same.
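A rough mental model for Q1 and Q3: the CPU may encode up to `kMaxFramesInFlight` frames before the GPU has finished the oldest one, and each in-flight frame gets its own buffer so the CPU does not overwrite data the GPU may still be reading. The semaphore count is then the number of free buffers. The sketch below is a single-threaded, plain C++ model of that bookkeeping, with no Metal or dispatch calls; `FrameSlots` and its methods are hypothetical names. Under this model, a single shared buffer can appear to work while the CPU and GPU may still race on its contents.

```cpp
#include <cassert>

// Models the semaphore plus ring of per-frame buffers from the sample.
class FrameSlots {
public:
    explicit FrameSlots(int maxFramesInFlight)
        : capacity_(maxFramesInFlight), free_(maxFramesInFlight), next_(0) {
        assert(maxFramesInFlight > 0);
    }

    // Models dispatch_semaphore_wait: returns the index of the buffer the
    // CPU may write, or -1 where the real call would block because every
    // buffer is still in flight on the GPU.
    int tryBeginFrame() {
        if (free_ == 0) return -1;
        --free_;
        const int index = next_;
        next_ = (next_ + 1) % capacity_;
        return index;
    }

    // Models the completed handler calling dispatch_semaphore_signal:
    // the GPU finished one frame, so one buffer becomes free again.
    void frameCompleted() {
        assert(free_ < capacity_);
        ++free_;
    }

    int framesInFlight() const { return capacity_ - free_; }

private:
    int capacity_;
    int free_;
    int next_;
};
```

With a capacity of 3, the first three calls to `tryBeginFrame()` hand out buffers 0, 1, and 2; the fourth reports that the CPU would have to wait until `frameCompleted()` is called, after which buffer 0 can be reused.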
Post not yet marked as solved
2 Replies
913 Views
I don't quite understand these. As far as I can tell, the Foundation, QuartzCore and Metal frameworks are included in the link line: -framework Metal -framework QuartzCore -framework Foundation. Technically they are in there a few times; I'm not familiar enough with our project to know why. I'm getting a ton of undefined symbols. metal-cpp is a header-only library and so doesn't have any additional libraries of its own, right?
Undefined symbols for architecture arm64:
"MTL::Private::Selector::s_knewTextureViewWithPixelFormat_textureType_levels_slices_swizzle_", referenced from:
"MTL::Private::Selector::s_knewTextureWithDescriptor_", referenced from:
"NS::Private::Selector::s_kinit", referenced from:
"NS::Private::Selector::s_kautorelease", referenced from:
This is while compiling for iOS (thus the arm64).
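Selector symbols like these are typically left undefined when no translation unit asks metal-cpp to emit its implementation. As I understand the metal-cpp README, exactly one .cpp file should define the `*_PRIVATE_IMPLEMENTATION` macros before including the headers. In the sketch below, the file name, the `SKETCH_HAS_METAL_CPP` macro, and the helper function are hypothetical, and the `__has_include` guard is only there so the sketch also compiles on machines without metal-cpp on the include path.

```cpp
// metal_cpp_impl.cpp (hypothetical file name): compile this in exactly one
// translation unit of the target that links against the frameworks.
#if defined(__has_include)
#if __has_include(<Metal/Metal.hpp>)
// These macros ask the headers to emit the selector and class definitions
// that the linker reports as missing.
#define NS_PRIVATE_IMPLEMENTATION
#define CA_PRIVATE_IMPLEMENTATION
#define MTL_PRIVATE_IMPLEMENTATION
#include <Foundation/Foundation.hpp>
#include <Metal/Metal.hpp>
#include <QuartzCore/QuartzCore.hpp>
#define SKETCH_HAS_METAL_CPP 1
#endif
#endif

#ifndef SKETCH_HAS_METAL_CPP
#define SKETCH_HAS_METAL_CPP 0
#endif

// Reports whether the metal-cpp headers were found and the implementation
// was emitted in this translation unit.
inline bool metalCppImplementationEmitted() {
    return SKETCH_HAS_METAL_CPP != 0;
}
```

Every other source file would include the same headers without the three macros. Defining them in more than one translation unit would likely lead to duplicate-symbol errors instead.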
Posted by scarrow.
Post marked as solved
2 Replies
584 Views
When trying to debug my compute shaders by pressing the ladyBird button inside of GPU trace, the error "Unable to create shader debug session" occurs. This only happens on my M1 MAX mbp, the exact same project does not have this error on my old intel mbp. I have tried reinstalling the past 3 generations of Xcode yet I cannot fix this error. It is making it impossible for me to develop my program as there is no information online about how to fix this error.
Post not yet marked as solved
1 Replies
841 Views
Hi, I am new to Metal and macOS development, and trying to learn Metal with the CPP wrapper for a toy rendering engine. I am mostly following the "Learn Metal with C++" sample code. I am trying to read mouse and keyboard input. It seems like the Objective-C or Swift wrappers allow you to override your own MTK::View class, and then override the respective keyDown(), keyUp() methods. However, when looking at the CPP wrapper, MTK::View doesn't have any virtual functions to override. How can I read mouse and keyboard inputs in my application? Hopefully without having an Objective-C bridge. Thank you, Robin