metal-cpp

C++ games and apps can tap into the power of Metal by bridging with metal-cpp.

Posts under metal-cpp tag

71 Posts

Post

Replies

Boosts

Views

Activity

Metal API supported files for models?
Hello everyone! I have a small question about programming in Metal. There are some models that I wish to use, along with their animations and skins, and their file extension is .gltf. glTF has been used in a number of projects such as Unity, Unreal Engine, Godot, and Blender. I was wondering whether Metal supports this file format or not. Does anyone here know the answer?
3
1
2.1k
Sep ’23
Metal Shader Converter shader debug symbols
Hello, I've started testing the Metal Shader Converter to convert my HLSL shaders to metallib directly, and I was wondering if the option '-frecord-sources' was supported in any way?

Usually I'm compiling my shaders as follows (from Metal):

    xcrun -sdk macosx metal -c -frecord-sources shaders/shaders.metal -o shaders/shaders.air
    xcrun -sdk macosx metallib shaders/shaders.air -o shaders/shaders.metallib

The -frecord-sources option allows me to see the source when debugging and profiling a Metal frame. With DXC we have a similar option; I can compile a typical HLSL shader with embedded debug symbols with:

    dxc -T vs_6_0 -E VSMain shaders/triangle.hlsl -Fo shaders/triangle.dxil -Zi -O0 -Qembed_debug

The important options here are '-Zi' and '-Qembed_debug', as they make sure debug symbols are embedded in the DXIL. It seems that right now Metal Shader Converter doesn't pass through the DXIL debug information, and I was wondering if it was possible. I've looked at all the options in the utility and haven't seen anything that looked like it. Right now, debug symbols in my shaders are a must-have for both profiling and debugging.

An alternative pipeline would be to use SPIRV-Cross instead. For reference, here's what a typical pipeline with dxc and SPIRV-Cross looks like (HLSL -> SPIRV -> AIR -> METALLIB):

    dxc -T ps_6_0 -E PSMain -spirv shaders/triangle.hlsl -Zi -Qembed_debug -O0 -Fo shaders/triangle.frag.spirv
    spirv-cross --msl shaders/triangle.frag.spirv --output shaders/triangle.frag.metal
    xcrun -sdk macosx metal -c -frecord-sources shaders/triangle.frag.metal -o shaders/triangle.frag.air
    xcrun -sdk macosx metallib shaders/triangle.frag.air -o shaders/triangle.frag.metallib

As you can see, it's a lot more steps than Metal Shader Converter, but after all those steps you can get some sort of shader symbols in Xcode when debugging a Metal frame, which is better than nothing.

Please let me know if I can provide files, projects, or anything else that can help with supporting shader symbols directly in Metal Shader Converter. Thank you for your time!
0
0
1.1k
Aug ’23
Metal Shader Library - invalid UUID
Hi, I am generating a Metal library that I build using the command line tools on macOS for iphoneos, following the instructions here. Then I serialise this to a binary blob that I load at runtime, which seems to work ok as everything renders as expected. When I am doing a frame capture and open up a shader function it tries to load the symbols and fails. I tried pointing it to the directory (and the file) containing the symbols file, but it never resolves those. In the bottom half of the Import External Sources dialogue there is one entry in the Library | Debug Info section: The library name is Library 0x21816b5dc0 and below Debug Info it says Invalid UUID. The validation layer doesn't flag any invalid behaviour so I am a bit lost and not sure what to try next?
1
0
1.1k
Jul ’23
Build and execute a Metal app which performs calculations on the GPU without using Xcode
I am following this https://developer.apple.com/documentation/metal/performing_calculations_on_a_gpu on building a metal app for performing a GPU calculation. I am not able to figure out how to build and execute the project from the command line. Any help on how to build a main.m file using xcrun will be useful. I have tried xcrun -sdk macosx clang MetalComputeBasic/main.m but it doesn't work.
0
0
880
Jun ’23
How to compute aspect using metal-cpp
Hi, I am experimenting with the 05-perspective.cpp code in LearnMetalCPP. I have made the window resizable by dragging an edge or corner with the mouse. But this causes the image to distort because the original drawing was square and the aspect set to 1.0. Now I want to compute aspect = width/height to use in simd::float4x4 makePerspective( float fovRadians, float aspect, float znear, float zfar ) in the 05-perspective.cpp app. What is the best way to get width and height, such as from CGSize.width and CGSize.height?
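A minimal sketch of the aspect computation asked about above. It assumes the size comes from the view's drawable size; in metal-cpp that is, as far as I know, available through `MTK::View::drawableSize()` or the `drawableSizeWillChange` callback on `MTK::ViewDelegate`, but check the headers of your metal-cpp version. `DrawableSize` and `aspectRatio` are hypothetical names, with `DrawableSize` standing in for `CGSize` so the snippet compiles without Apple headers.

```cpp
#include <cassert>

// Hypothetical stand-in for CGSize (two doubles: width and height).
struct DrawableSize {
    double width;
    double height;
};

// Returns width / height. Falls back to 1.0f while the view has no area
// (for example during a resize to zero), to avoid dividing by zero.
inline float aspectRatio(DrawableSize size) {
    if (size.width <= 0.0 || size.height <= 0.0) {
        return 1.0f;
    }
    const float aspect = static_cast<float>(size.width / size.height);
    assert(aspect > 0.0f);  // both inputs are positive here
    return aspect;
}
```

The result would then be passed as the `aspect` argument of `makePerspective` each time the drawable size changes, instead of the constant 1.0.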
1
0
969
Apr ’23
Input with Metal-CPP, overriding MTK::View
Hi, I am new to Metal and macOS development, and trying to learn Metal with the CPP wrapper for a toy rendering engine. I am mostly following the "Learn Metal with C++" sample code. I am trying to read mouse and keyboard input. It seems like the Objective-C or Swift wrappers allow you to override your own MTK::View class, and then override the respective keyDown(), keyUp() methods. However, when looking at the CPP wrapper, MTK::View doesn't have any virtual functions to override. How can I read mouse and keyboard inputs in my application? Hopefully without having an Objective-C bridge. Thank you, Robin
4
1
3.5k
Apr ’23
Compilation Issues with Xcode 14 for <vector> <map> <iostream> etc.
I'm trying to compile with a brand new install of Xcode 14, using a very simple .cpp and .h for testing.

The .h has:

    #include "mcut.h"
    #include <stdio.h>
    #include <stdlib.h>

The .cpp has:

    #include "MCut_Wrapper.hpp"
    #include <iostream>

I know this doesn't do anything, but with the #include <iostream>, when I compile I suddenly get:

    No member named 'memcpy' in namespace 'std::__1'; did you mean simply 'memcpy'?
    No member named 'memmove' in namespace 'std::__1'; did you mean simply 'memmove'?
    etc.

Without the #include, everything compiles without any errors. The same errors show up for any #includes such as <vector>, <map>, etc. I've also tried adding using namespace std; but that doesn't work. I've tried a simple hello world program like this:

    #include "MCut_Wrapper.hpp"
    #include <iostream>
    #include <vector>
    #include <string>
    using namespace std;
    int main() {
        vector<string> msg{"Hello", "C++", "World", "from", "VS Code", "and the C++ extension!"};
        for (const string& word : msg) {
            cout << word << " ";
        }
        cout << endl;
    }

But it also throws the same error. Any and all assistance to get this working would be appreciated. I'm attempting to make a bundle with Xcode 14 on a machine with an M2 Max chip (if that makes any difference). Cheers
1
0
2.5k
Mar ’23
Why do developers say Xcode isn't that great?
Hi, I am very new to this coding gig. I am an art student who happened to develop an interest due to some problems. To a lot of developers' shock and disgust, I am in fact starting with C/C++, and I am using the humble Xcode for lack of better knowledge. I tried to use visual basic; however, I had an SVN problem because my Mac is on Catalina. I just want some intel on what makes Xcode not that great, so that I know what to expect if I run into problems in the future. Your info will be much appreciated.
0
0
987
Mar ’23
MTLCreateSystemDefaultDevice crash?
Device: iPod 7
iOS version: 14.4.1

GMetalDevice = MTLCreateSystemDefaultDevice();

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
    frame #0: 0x00000001d7a3784c libsystem_kernel.dylib`__pthread_kill + 8
  * frame #1: 0x00000001f39c29e8 libsystem_pthread.dylib`pthread_kill + 212
    frame #2: 0x00000001b4b738f4 libsystem_c.dylib`abort + 100
    frame #3: 0x00000001bb238030 libsystem_malloc.dylib`malloc_vreport + 556
    frame #4: 0x00000001bb2381e8 libsystem_malloc.dylib`malloc_report + 60
    frame #5: 0x00000001bb22e2e8 libsystem_malloc.dylib`free + 432
    frame #6: 0x00000001f48dab10 AGXMetalA10`___lldb_unnamed_symbol1249 + 1644
    frame #7: 0x00000001f49064d4 AGXMetalA10`___lldb_unnamed_symbol1504 + 72
    frame #8: 0x00000001c18ca694 Metal`+[MTLIOAccelDevice registerDevices] + 224
    frame #9: 0x00000001c18ccbf8 Metal`invocation function for block in MTLDeviceArrayInitialize() + 872
    frame #10: 0x0000000119505528 libdispatch.dylib`_dispatch_client_callout + 16
    frame #11: 0x0000000119506e6c libdispatch.dylib`_dispatch_once_callout + 84
    frame #12: 0x00000001c18cc470 Metal`MTLCreateSystemDefaultDevice + 200
    frame #13: 0x00000001054f5f54 MyProject`+[FIOSView layerClass](self=<unavailable>, _cmd=<unavailable>) at IOSView.cpp:134:18 [opt]
    frame #14: 0x00000001aebbd034 UIKitCore`UIViewCommonInitWithFrame + 1040
    frame #15: 0x00000001aebbcbcc UIKitCore`-[UIView initWithFrame:] + 124
    frame #16: 0x00000001054f6274 MyProject`-[FIOSView initWithFrame:](self=<unavailable>, _cmd=<unavailable>, Frame=<unavailable>) at IOSView.cpp:233:14 [opt]
    frame #17: 0x0000000102d237f8 MyProject`invocation function for block in FAppEntry::PlatformInit() [inlined] MainThreadInit() at LaunchIOS.cpp:348:24 [opt]
    frame #18: 0x0000000102d237a8 MyProject`invocation function for block in FAppEntry::PlatformInit(.block_descriptor=<unavailable>) at LaunchIOS.cpp:373:47 [opt]
    frame #19: 0x0000000119503ce4 libdispatch.dylib`_dispatch_call_block_and_release + 24
    frame #20: 0x0000000119505528 libdispatch.dylib`_dispatch_client_callout + 16
    frame #21: 0x0000000119513994 libdispatch.dylib`_dispatch_main_queue_callback_4CF + 972
    frame #22: 0x00000001abdf85e0 CoreFoundation`__CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__ + 12
    frame #23: 0x00000001abdf2a88 CoreFoundation`__CFRunLoopRun + 2480
    frame #24: 0x00000001abdf1ba0 CoreFoundation`CFRunLoopRunSpecific + 572
    frame #25: 0x00000001c2b5a598 GraphicsServices`GSEventRunModal + 160
    frame #26: 0x00000001ae6e32f4 UIKitCore`-[UIApplication _run] + 1052
    frame #27: 0x00000001ae6e8874 UIKitCore`UIApplicationMain + 164
    frame #28: 0x0000000102d312bc MyProject`main(argc=3, argv=0x000000016d4e7700) at LaunchIOS.cpp:584:13 [opt]
    frame #29: 0x00000001abad0568 libdyld.dylib`start + 4
2
0
2.9k
Mar ’23
Asset Formats in Example Metal Projects
Hey all, I have a few questions about the Loading textures and models using Metal fast resource loading project. I'm not experienced in 3D rendering or Mac development in general, so bear with me :) I noticed that the model and texture data for the scene objects had a .dat extension and appeared to be binary data files. This is different from the models and textures in Rendering a Scene with Deferred Lighting in C++, which contains .obj and .mtl files that seem to be commonly exported from 3D modelling programs like Blender or Maya.

First question: I didn't notice any explanation in the source code for the difference in format between the two projects. From reading a bit online, it seems like binary formats are more efficient and generally more representative of what you'd want to actually ship. Is that correct?

Second, I was also wondering whether the binary format used in "Loading textures and models using Metal fast resource loading" is a standard format, or whether it was created by the project's author just for the project.

Third, I was wondering what the typical process is for storing assets in a binary format. Are these usually exported directly from 3D modelling programs? Or is intermediate output from 3D modelling programs (such as .obj files) usually parsed by the developer and then written to the binary format? If so, are there commonly used libraries for this, or do people usually just hand-roll parsers?

Any recommended learning material would be appreciated. Thank you!
0
0
763
Mar ’23
Is it a good idea to learn C++ to use metal for AR optimization? Instead of RealityKit and Reality Composer ( LOL ?
I have been working with AR for a while now, sorta learning still, and getting increasingly frustrated. I created complex animations using Reality Composer and now that feels like a joke. The thing is huge. I need to rewrite it in RealityKit, but something tells me that if I go Metal on this mofo it's going to decrease the latency and make things run so much faster. I really need my app to be as light as possible because the 3D graphics will function like a 3D UI system. My greatest pain point is the anxiety I feel when I know my application will be HUGE and crash once I publish it. It makes me feel so anxious. Like an earthquake is coming. My goal is to create the lightest thing possible. I am reading a book on Linear Algebra for Machine Learning and I am leaning into this more mathy direction, so I am thinking I might as well just rewrite it using Metal and C++? Never used Metal btw, never used C++ either. I have some experience with animation and with sculpture, so the 3D world IRL is not new to me.
1
0
1.1k
Feb ’23
Setting up Metal cpp with GLFW and ImGUI
Dear experts, I'm working on adding a UI to my C++-based path tracer renderer. I want to create a metal-cpp device and pass it to the renderer, but I also want to use ImGui and GLFW (as window manager and for input event handling). I've found a way to mix the Objective-C code that GLFW window setup requires with C++ code: https://github.com/ikryukov/MetalCppImGui

    // Here is the key thing for integrating GLFW with metal-cpp
    // GLFW supports only an Objective-C window handle
    NSWindow *nswin = glfwGetCocoaWindow(window);
    CA::MetalLayer* layer = CA::MetalLayer::layer();
    layer->setDevice(device);
    layer->setPixelFormat(MTL::PixelFormatBGRA8Unorm);
    // bridge to Objective-C here because NSWindow expects an Objective-C CAMetalLayer
    CAMetalLayer* l = (__bridge CAMetalLayer*)layer;
    nswin.contentView.layer = l;
    nswin.contentView.wantsLayer = YES;

Is there any official way to handle events in metal-cpp without Objective-C support? Maybe MetalKit will have such features in the future?
1
0
3.9k
Feb ’23
Metal Ray Tracing per-primitive data size
Hi experts, I'm working on a path tracer using metal-cpp, and I use per-primitive data as shown in your latest presentation: https://developer.apple.com/videos/play/wwdc2022/10105/ and described here: https://developer.apple.com/documentation/metal/ray_tracing_with_acceleration_structures

Currently I want to store something like this structure:

    struct Triangle {
        vector_float3 positions[3];
        uint32_t normals[3];
        uint32_t tangent[3];
        uint32_t uv[3];
    };

Are there any guidelines on how much of this small per-primitive data I can store in the acceleration structure? And if it is stored inside the BVH nodes, could that affect traversal performance? Thanks!
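To make the size question concrete, here is a rough sketch of how large the posted struct is on the CPU side. It assumes `vector_float3` occupies 16 bytes with 16-byte alignment, as simd's three-component float vectors do; `PaddedFloat3`, `TriangleAligned`, and `TrianglePacked` are hypothetical names used so the snippet compiles without Apple headers. It says nothing about what size Metal recommends, only what the layouts cost.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for simd's vector_float3: three floats padded
// to 16 bytes with 16-byte alignment.
struct alignas(16) PaddedFloat3 {
    float x, y, z;
};

// Same member layout as the Triangle struct in the post.
struct TriangleAligned {
    PaddedFloat3 positions[3];  // 3 * 16 = 48 bytes
    std::uint32_t normals[3];   // 12 bytes
    std::uint32_t tangent[3];   // 12 bytes
    std::uint32_t uv[3];        // 12 bytes, then padding up to a multiple of 16
};

// Alternative layout with tightly packed positions (nine plain floats).
struct TrianglePacked {
    float positions[9];         // 36 bytes
    std::uint32_t normals[3];
    std::uint32_t tangent[3];
    std::uint32_t uv[3];
};

static_assert(sizeof(PaddedFloat3) == 16, "padded float3 is 16 bytes");
static_assert(sizeof(TriangleAligned) == 96, "84 bytes of data rounded up to 96");
static_assert(sizeof(TrianglePacked) == 72, "no padding with 4-byte members");
```

Under these assumptions, packing the positions saves 24 bytes per primitive. On the shader side the matching type would be something like MSL's `packed_float3`, and the two sides need to agree on the layout.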
0
0
933
Feb ’23
Extending NS::Object removes constructor
Hi! I'm currently trying to convert this Objective-C example project ( https://developer.apple.com/documentation/metal/performing_calculations_on_a_gpu?language=objc ) to one using the metal-cpp wrapper. However when I make the MetalAdder class extend NS::Object (just like in the original codebase) it removes my constructor. class MetalAdder : public NS::Object{ ... } is what I have. When I instantiate this MetalAdder class as: MetalAdder adder; adder.initWithDevice(device); or auto adder = NS::TransferPtr(new MetalAdder); I get the error Call to implicitly-deleted default constructor of 'MetalAdder'. Is there something I'm doing wrong? Should I instantiate in a different way or should my MetalAdder class just not extend the NS::Object class? Thanks in advance!
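A small sketch of why this error appears and one possible way around it. It assumes, as the metal-cpp headers appear to do, that `NS::Object` declares its constructors as deleted because it only wraps Objective-C objects that the runtime allocates. `OpaqueObject` and `DeviceHandle` are hypothetical stand-ins for `NS::Object` and `MTL::Device`, so the snippet compiles without metal-cpp.

```cpp
#include <type_traits>

// Hypothetical stand-in for NS::Object: it cannot be constructed from C++.
class OpaqueObject {
public:
    OpaqueObject() = delete;
    OpaqueObject(const OpaqueObject&) = delete;
};

// Deriving from it implicitly deletes the derived default constructor,
// which is the error quoted in the post.
class DerivedAdder : public OpaqueObject {};

static_assert(!std::is_default_constructible<DerivedAdder>::value,
              "a base with a deleted constructor deletes the derived one too");

// Hypothetical stand-in for MTL::Device.
struct DeviceHandle {
    int id;
};

// Alternative: a plain C++ class that holds the device instead of
// inheriting from the wrapper type.
class MetalAdder {
public:
    explicit MetalAdder(DeviceHandle* device) : device_(device) {}
    DeviceHandle* device() const { return device_; }

private:
    DeviceHandle* device_;  // not owned in this sketch
};
```

With real metal-cpp types, the member would be an `MTL::Device*` (or an `NS::SharedPtr<MTL::Device>` if the class should retain it), and `MetalAdder` would be created like any other C++ object.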
1
0
1.6k
Feb ’23
Metal and low performance with parallel execution of kernels (MTLComputeCommandEncoder)
Hello all, I have code in CUDA where I can create several CUDA streams, run my kernels in parallel, and get a performance boost for my task. I then rewrote the code for Metal and tried to parallelize the task in the same way.

[Image: CUDA streams]

Metal device: Mac Studio with M1 Ultra (the code is written with metal-cpp). I create several MTLCommandBuffer objects in one MTLCommandQueue, or several MTLCommandQueue objects with more MTLCommandBuffer objects. Regarding Metal resources, there are two options:

[1] The buffers (MTLBuffer) are created with the option MTLResourceStorageModeShared. In the profiler, all command buffers are executed sequentially on the Compute timeline.
[2] The buffers (MTLBuffer) are created with the options "MTLResourceStorageModeShared | MTLResourceHazardTrackingModeUntracked". In the profiler I do see parallelism, but the maximum number of threads in the Compute timeline is never more than 2 (see pictures), which is also odd. The compute commands do not depend on each other.

[Image: Metal Compute timeline]

About performance:
[1] In the first variant, the performance is the same for different numbers of MTLCommandQueue and MTLCommandBuffer objects.
[2] In the second variant, the performance with one MTLCommandBuffer is better than with two or more.

Question: why is this happening? How can I parallelize the compute kernels to get a performance increase?

Additional information: the CUDA code has also been rewritten in OpenCL, and it parallelizes perfectly on Windows (NVIDIA/AMD/Intel) when several OpenCL queues are running. The same code on the M1 Ultra performs the same with one or with many OpenCL queues. Metal, in turn, is faster than OpenCL, so I am trying to understand Metal specifically and make the kernels run in parallel on Metal.
4
0
2.4k
Dec ’22
Metal frame capture issues
I'm not sure if this is somehow a cadence issue or something. If I attempt to use the stylized M button in Xcode to kick off a GPU capture on iOS it seems to just go on forever capturing command buffers instead of exiting when we swap to the next display surface. We are presenting the current drawable with presentDrawable and invoking nextDrawable on the MetalLayer (but note that we are doing this by extending it to be supported from C++). If I trigger and end the capture myself it works fine, and so that works for now, but I'm curious if I'm doing something wrong that causes it not to recognize the end of frame correctly for the Xcode GUI version.
0
0
948
Dec ’22
Design libraries for Matrix Multiplication
Dear developers, I need support to develop a simple computation on the GPU. I would like to perform matrix multiplication; metal-cpp is a good fit because I need to export this as a C++ library. Following the documentation:

File Multiply.metal:

    kernel void multiply(device float *pMatA, device float *pMatB, device float *pMatC, device float *pMatR)
    {
        simdgroup_float8x8 sgMatA;
        simdgroup_float8x8 sgMatB;
        simdgroup_float8x8 sgMatR;
        simdgroup_load(sgMatA, pMatA);
        simdgroup_load(sgMatB, pMatB);
        simdgroup_multiply(sgMatR, sgMatA, sgMatB);
        simdgroup_store(sgMatR, pMatR);
    }

File Multiply.hpp:

    #include <Foundation/Foundation.hpp>
    #include <Metal/Metal.hpp>

    class Multiply {
    public:
        MTL::Device* m_device;
        MTL::ComputePipelineState *m_add_function_pso;
        MTL::CommandQueue *m_command_queue;
        MTL::Buffer *m_buffer_A;
        MTL::Buffer *m_buffer_B;
        MTL::Buffer *m_buffer_result;
        void init_with_device(MTL::Device*);
        void prepare_data();
        void send_compute_command();
    private:
        void generate_random_float_data(MTL::Buffer* buffer);
        void encode_dot_command(MTL::ComputeCommandEncoder* compute_encoder);
        void verify_results();
    };

File Multiply.cpp:

    #include <iostream>
    #include "Multiply.hpp"

    const unsigned int array_length = 1 << 5;
    const unsigned int buffer_size = array_length * sizeof(float);

    void Multiply::init_with_device(MTL::Device* device){
        m_device = device;
        NS::Error* error;
        auto default_library = m_device->newDefaultLibrary();
        if(!default_library){
            std::cerr << "Failed to load default library.";
            std::exit(-1);
        }
        auto function_name = NS::String::string("multiply", NS::ASCIIStringEncoding);
        auto dot_function = default_library->newFunction(function_name);
        if(!dot_function){
            std::cerr << "Failed to find the dot function.";
        }
        m_dot_function_pso = m_device->newComputePipelineState(dot_function, &error);
        m_command_queue = m_device->newCommandQueue();
    };

    void Multiply::prepare_data(){
        m_buffer_A = m_device->newBuffer(buffer_size, MTL::ResourceStorageModeShared);
        m_buffer_B = m_device->newBuffer(buffer_size, MTL::ResourceStorageModeShared);
        m_buffer_result = m_device->newBuffer(buffer_size, MTL::ResourceStorageModeShared);
        generate_random_float_data(m_buffer_A);
        generate_random_float_data(m_buffer_B);
    }

    void Multiply::generate_random_float_data(MTL::Buffer* buffer) {
        float* data_ptr = (float*)buffer->contents();
        for (unsigned long index = 0; index < array_length; index++) {
            for(unsigned long index2 =0; index2 < array_length; index2++) {
                data_ptr[index][index2] = (float)rand() / (float)(RAND_MAX);
            }
        }

    void Multiply::send_compute_command() {
        MTL::CommandBuffer* command_buffer = m_command_queue->commandBuffer();
        // assert(command_buffer != nullptr);
        MTL::ComputeCommandEncoder* compute_encoder = command_buffer->computeCommandEncoder();
        encode_dot_command(compute_encoder);
        compute_encoder->endEncoding();
        // MTL::CommandBufferStatus status = command_buffer->status();
        // std::cout << status << std::endl;
        command_buffer->commit();
        command_buffer->waitUntilCompleted();
        verify_results();
    }

    void Multiply::encode_dot_command(MTL::ComputeCommandEncoder* compute_encoder){
        compute_encoder->setComputePipelineState(m_dot_function_pso);
        compute_encoder->setBuffer(m_buffer_A, 0, 0);
        compute_encoder->setBuffer(m_buffer_B, 0, 1);
        compute_encoder->setBuffer(m_buffer_result, 0, 2);
        MTL::Size grid_size = MTL::Size(array_length, 1, 1);
        NS::UInteger thread_group_size_ = m_dot_function_pso->maxTotalThreadsPerThreadgroup();
        if(thread_group_size_ > array_length){
            thread_group_size_ = array_length;
        }
        MTL::Size thread_group_size = MTL::Size(thread_group_size_, 1, 1);
        compute_encoder->dispatchThreads(grid_size, thread_group_size);
    }

    void Multiply::verify_results(){
        auto a = (float*) m_buffer_A->contents();
        auto b = (float*) m_buffer_B->contents();
        auto result = (float*) m_buffer_result->contents();
        for (unsigned long index = 0; index < array_length; index++) {
            for (unsigned long index2 = 0; index < array_length; index2++) {
                if (result[index][index2] != (a[index][index2] * b[index][index2])) {
                    std::cout << "Comput ERROR: index=" << index << "result=" << result[index][index2] << "vs " << a[index][index2] + b[index][index2] << "=a*b\n";
                    assert(result[index][index2] == (a[index][index2] * b[index][index2]));
                }
            }
            std::cout << "Compute results as expected\n";}}

Is this implementation correct? Can someone kindly give suggestions about speed improvements or other solutions? Thank you in advance.
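On the verification step in particular: the posted check compares each result against an element-wise product, whereas a matrix product sums over the inner dimension. Also, a `float*` obtained from `contents()` can only be indexed once, so a 2D element has to be addressed as `row * n + col`. Below is a minimal CPU reference that could serve as a baseline for checking GPU output, assuming square, row-major matrices stored in flat arrays. `multiplyReference` and `nearlyEqual` are hypothetical helper names, not part of metal-cpp.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Row-major n x n matrix product: r[i][j] = sum over k of a[i][k] * b[k][j].
std::vector<float> multiplyReference(const std::vector<float>& a,
                                     const std::vector<float>& b,
                                     std::size_t n) {
    std::vector<float> r(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < n; ++k) {
            const float aik = a[i * n + k];
            for (std::size_t j = 0; j < n; ++j) {
                r[i * n + j] += aik * b[k * n + j];
            }
        }
    }
    return r;
}

// Relative comparison with a tolerance: GPU and CPU floating-point results
// rarely match bit for bit, so exact equality is too strict a check.
bool nearlyEqual(float x, float y, float tolerance = 1e-4f) {
    const float scale = std::fmax(1.0f, std::fmax(std::fabs(x), std::fabs(y)));
    return std::fabs(x - y) <= tolerance * scale;
}
```

For the 32 x 32 case in the post, each buffer would need `n * n * sizeof(float)` bytes, and the GPU result would be compared element by element against `multiplyReference` using `nearlyEqual`.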
1
0
1.9k
Nov ’22
Metal & C++
What is the best source for information/tutorial material on using Metal with C++? metal-cpp?
Replies
1
Boosts
0
Views
1.7k
Activity
May ’23
How to compute aspect using metal-cpp
Hi, I am experimenting with the 05-perspective.cpp code in LearnMetalCPP. I have made the window resizable by dragging an edge or corner with the mouse. But this causes the image to distort because the original drawing was square and the aspect set to 1.0. Now I want to compute aspect = width/height to use in simd::float4x4 makePerspective( float fovRadians, float aspect, float znear, float zfar ) in the 05-perspective.cpp app. What is the best way to get width and height, such as from CGSize.width and CGSize.height?
Replies
1
Boosts
0
Views
969
Activity
Apr ’23
Input with Metal-CPP, overriding MTK::View
Hi, I am new to Metal and macOS development, and trying to learn Metal with the CPP wrapper for a toy rendering engine. I am mostly following the "Learn Metal with C++" sample code. I am trying to read mouse and keyboard input. It seems like the Objective-C or Swift wrappers allow you to override your own MTK::View class, and then override the respective keyDown(), keyUp() methods. However, when looking at the CPP wrapper, MTK::View doesn't have any virtual functions to override. How can I read mouse and keyboard inputs in my application? Hopefully without having an Objective-C bridge. Thank you, Robin
Replies
4
Boosts
1
Views
3.5k
Activity
Apr ’23
Compilation Issues with Xcode 14 for <vector> <map> <iostream> etc.
I'm trying to compile with a brand new install of Xcode 14. A very simple .cpp and .h for testing. The .h has: #include "mcut.h" #include <stdio.h> #include <stdlib.h> The .cpp has: #include "MCut_Wrapper.hpp" #include <iostream> I know this doesn't do anything.... but with the #include When I compile I suddenly get: No member named 'memcpy' in namespace 'std::__1'; did you mean simply 'memcpy'? No member named 'memmove' in namespace 'std::1'; did you mean simply 'memmove'?_ etc. without the #include everything compiles without any errors. The same errors show up for any #includes such as etc. I've also tried adding using namespace std; But that doesn't work. I've tried a simple hello world program like this: #include "MCut_Wrapper.hpp" #include <iostream> #include <vector> #include <string> using namespace std; int main() { vector<string> msg{"Hello", "C++", "World", "from", "VS Code", "and the C++ extension!"}; for (const string& word : msg) { cout << word << " "; } cout << endl; } But it also throws out the same error. Any and all assistance to get this working would be appreciated. I'm attempting to make a bundle with Xcode 14 on a machine with an M2 Max Chip (if that makes any difference). Cheers
Replies
1
Boosts
0
Views
2.5k
Activity
Mar ’23
Why do developers say Xcode isn't that great?
Hi i am very new to this coding gig. I am an Art Student who happened to develop an interest due to some problems. To a lot of developers shock and disgust i am in fact starting with c/c++ and I am using the humble Xcode, out of lack of better knowledge. I tried to use visual basic however i had an SVN problem because my mac is on Catalina. I just want some intel into what makes Xcode not that great in case i run into problems in the future I would be able to know. Your info will be much appreciated.
Replies
0
Boosts
0
Views
987
Activity
Mar ’23
MTLCreateSystemDefaultDevice crash?
Device: iPod 7 iOS version: 14.4.1 , GMetalDevice = MTLCreateSystemDefaultDevice(); (lldb) bt * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT   frame #0: 0x00000001d7a3784c libsystem_kernel.dylib`__pthread_kill + 8  * frame #1: 0x00000001f39c29e8 libsystem_pthread.dylib`pthread_kill + 212   frame #2: 0x00000001b4b738f4 libsystem_c.dylib`abort + 100   frame #3: 0x00000001bb238030 libsystem_malloc.dylib`malloc_vreport + 556   frame #4: 0x00000001bb2381e8 libsystem_malloc.dylib`malloc_report + 60   frame #5: 0x00000001bb22e2e8 libsystem_malloc.dylib`free + 432   frame #6: 0x00000001f48dab10 AGXMetalA10`___lldb_unnamed_symbol1249 + 1644   frame #7: 0x00000001f49064d4 AGXMetalA10`___lldb_unnamed_symbol1504 + 72   frame #8: 0x00000001c18ca694 Metal`+[MTLIOAccelDevice registerDevices] + 224   frame #9: 0x00000001c18ccbf8 Metal`invocation function for block in MTLDeviceArrayInitialize() + 872   frame #10: 0x0000000119505528 libdispatch.dylib`_dispatch_client_callout + 16   frame #11: 0x0000000119506e6c libdispatch.dylib`_dispatch_once_callout + 84   frame #12: 0x00000001c18cc470 Metal`MTLCreateSystemDefaultDevice + 200   frame #13: 0x00000001054f5f54 MyProject`+[FIOSView layerClass](self=<unavailable>, _cmd=<unavailable>) at IOSView.cpp:134:18 [opt]   frame #14: 0x00000001aebbd034 UIKitCore`UIViewCommonInitWithFrame + 1040   frame #15: 0x00000001aebbcbcc UIKitCore`-[UIView initWithFrame:] + 124   frame #16: 0x00000001054f6274 MyProject`-[FIOSView initWithFrame:](self=<unavailable>, _cmd=<unavailable>, Frame=<unavailable>) at IOSView.cpp:233:14 [opt]   frame #17: 0x0000000102d237f8 MyProject`invocation function for block in FAppEntry::PlatformInit() [inlined] MainThreadInit() at LaunchIOS.cpp:348:24 [opt]   frame #18: 0x0000000102d237a8 MyProject`invocation function for block in FAppEntry::PlatformInit(.block_descriptor=<unavailable>) at LaunchIOS.cpp:373:47 [opt]   frame #19: 0x0000000119503ce4 libdispatch.dylib`_dispatch_call_block_and_release + 
24   frame #20: 0x0000000119505528 libdispatch.dylib`_dispatch_client_callout + 16   frame #21: 0x0000000119513994 libdispatch.dylib`_dispatch_main_queue_callback_4CF + 972   frame #22: 0x00000001abdf85e0 CoreFoundation`__CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__ + 12   frame #23: 0x00000001abdf2a88 CoreFoundation`__CFRunLoopRun + 2480   frame #24: 0x00000001abdf1ba0 CoreFoundation`CFRunLoopRunSpecific + 572   frame #25: 0x00000001c2b5a598 GraphicsServices`GSEventRunModal + 160   frame #26: 0x00000001ae6e32f4 UIKitCore`-[UIApplication _run] + 1052   frame #27: 0x00000001ae6e8874 UIKitCore`UIApplicationMain + 164   frame #28: 0x0000000102d312bc MyProject`main(argc=3, argv=0x000000016d4e7700) at LaunchIOS.cpp:584:13 [opt]   frame #29: 0x00000001abad0568 libdyld.dylib`start + 4
Replies
2
Boosts
0
Views
2.9k
Activity
Mar ’23
Asset Formats in Example Metal Projects
Hey all, I have a few questions about the Loading textures and models using Metal fast resource loading project. I'm not experienced in 3D rendering or Mac development in general, so bear with me :) I noticed that the model and texture data for the scene objects had a .dat extension and appeared to be binary data files. This is different from the models and textures in Rendering a Scene with Deferred Lighting in C++, which contains .obj and .mtl files that seem to be commonly exported from 3D modelling programs like Blender or Maya. First question: I didn't notice any explanation in the source code for the difference in format between the two projects. From reading a bit online, it seems that binary formats are more efficient and generally more representative of what you'd actually want to ship. Is that correct? Second, I was wondering whether the binary format used in "Loading textures and models using Metal fast resource loading" is a standard format, or whether it was created by the project's author just for the project. Third, I was wondering what the typical process is for storing assets in a binary format. Are these usually exported directly from 3D modelling programs? Or is intermediate output from 3D modelling programs (such as .obj files) usually parsed by the developer and then written to the binary format? If so, are there commonly used libraries for this, or do people usually just hand-roll parsers? Any recommended learning material would be appreciated. Thank you!
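To make the third question concrete, here is a minimal sketch of what a hand-rolled binary mesh file could look like. Everything in it (the `MeshHeader` layout, the "MESH" magic value, `write_mesh`, `read_mesh`) is hypothetical and for illustration only; it is not the .dat format used by the sample project, and it assumes the writer and the reader run on machines with the same endianness.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical on-disk layout (NOT the sample project's .dat format):
//   [MeshHeader][vertex_count * Vertex][index_count * uint32_t]
struct Vertex {
    float position[3];
    float normal[3];
    float uv[2];
};

struct MeshHeader {
    char     magic[4];      // "MESH", identifies the file type
    uint32_t version;       // bump whenever the layout changes
    uint32_t vertex_count;
    uint32_t index_count;
};

struct Mesh {
    std::vector<Vertex>   vertices;
    std::vector<uint32_t> indices;
};

// Writes the mesh as raw bytes, header first.
bool write_mesh(const std::string& path, const Mesh& mesh) {
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    MeshHeader header{{'M', 'E', 'S', 'H'}, 1,
                      static_cast<uint32_t>(mesh.vertices.size()),
                      static_cast<uint32_t>(mesh.indices.size())};
    out.write(reinterpret_cast<const char*>(&header), sizeof(header));
    out.write(reinterpret_cast<const char*>(mesh.vertices.data()),
              static_cast<std::streamsize>(mesh.vertices.size() * sizeof(Vertex)));
    out.write(reinterpret_cast<const char*>(mesh.indices.data()),
              static_cast<std::streamsize>(mesh.indices.size() * sizeof(uint32_t)));
    return static_cast<bool>(out);
}

// Reads the mesh back and rejects files with an unexpected magic or version.
bool read_mesh(const std::string& path, Mesh& mesh) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    MeshHeader header{};
    in.read(reinterpret_cast<char*>(&header), sizeof(header));
    if (!in || std::memcmp(header.magic, "MESH", 4) != 0 || header.version != 1) {
        return false;
    }
    mesh.vertices.resize(header.vertex_count);
    mesh.indices.resize(header.index_count);
    in.read(reinterpret_cast<char*>(mesh.vertices.data()),
            static_cast<std::streamsize>(mesh.vertices.size() * sizeof(Vertex)));
    in.read(reinterpret_cast<char*>(mesh.indices.data()),
            static_cast<std::streamsize>(mesh.indices.size() * sizeof(uint32_t)));
    return static_cast<bool>(in);
}
```

In a pipeline like this, an offline tool would parse the .obj once, fill a `Mesh`, and call `write_mesh`; the app then only calls `read_mesh`, which avoids text parsing at load time.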
Replies
0
Boosts
0
Views
763
Activity
Mar ’23
Is it a good idea to learn C++ to use metal for AR optimization? Instead of RealityKit and Reality Composer ( LOL ?
I have been working with AR for a while now, sorta still learning, and getting increasingly frustrated. I created complex animations using Reality Composer and now that feels like a joke. The thing is huge, I need to rewrite it in RealityKit but something tells me if I go Metal on this mofo it's going to decrease the latency and make things run so much faster. I really need my app to be as light as possible because the 3D graphics will function like a 3D UI system. My greatest pain point is the anxiety I feel when I know my application will be HUGE and crash once I publish it. It makes me feel so anxious. Like an earthquake is coming. My goal is to create the lightest thing possible. I am reading a book on Linear Algebra for Machine Learning and I am leaning into this more mathy direction so I am thinking I might as well just rewrite it using Metal and C++? Never used Metal btw, never used C++ either. I have some experience with animation and experience with sculpture so the 3D world IRL is not new to me.
Replies
1
Boosts
0
Views
1.1k
Activity
Feb ’23
Setting up Metal cpp with GLFW and ImGUI
Dear experts, I'm working on adding a UI to my C++-based path tracer renderer. I want to create a metal-cpp device and pass it to the renderer, but I also want to use ImGui and GLFW (for window management and input event handling). I've found a way to mix the Objective-C code that the GLFW window setup requires with C++ code: https://github.com/ikryukov/MetalCppImGui
// Here is the key thing for integrating GLFW with metal-cpp
// GLFW supports only an Objective-C window handle
NSWindow *nswin = glfwGetCocoaWindow(window);
CA::MetalLayer* layer = CA::MetalLayer::layer();
layer->setDevice(device);
layer->setPixelFormat(MTL::PixelFormatBGRA8Unorm);
// bridge to Objective-C here because NSWindow expects an Objective-C object
CAMetalLayer* l = (__bridge CAMetalLayer*)layer;
nswin.contentView.layer = l;
nswin.contentView.wantsLayer = YES;
Is there any official way to handle events in metal-cpp without Objective-C support? Maybe MetalKit will have such features in the future?
Replies
1
Boosts
0
Views
3.9k
Activity
Feb ’23
Metal Ray Tracing per-primitive data size
Hi experts, I'm working on a path tracer using metal-cpp, and I use per-primitive data as shown in your latest presentation: https://developer.apple.com/videos/play/wwdc2022/10105/ and described here: https://developer.apple.com/documentation/metal/ray_tracing_with_acceleration_structures Currently I want to store something like this structure:
struct Triangle {
    vector_float3 positions[3];
    uint32_t normals[3];
    uint32_t tangent[3];
    uint32_t uv[3];
};
Are there any guidelines on the size of the small amounts of data that I can store in an acceleration structure? If the data is stored inside the BVH nodes, that could affect traversal performance. Thanks!
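As a rough aid for the size question, the sketch below computes the footprint of the struct from the post. `AlignedFloat3` and `PackedFloat3` are hypothetical stand-ins so the example compiles without the simd headers; the first assumes that vector_float3 occupies 16 bytes with 16-byte alignment, and the second models a tightly packed three-float alternative. It says nothing about how Metal stores per-primitive data internally; it only shows how padding changes the bytes per triangle.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for vector_float3: three floats padded to
// 16 bytes with 16-byte alignment (an assumption of this sketch).
struct alignas(16) AlignedFloat3 { float x, y, z; };

// Hypothetical tightly packed alternative: 12 bytes, 4-byte alignment.
struct PackedFloat3 { float x, y, z; };

struct TriangleAligned {
    AlignedFloat3 positions[3];  // 3 * 16 = 48 bytes
    uint32_t normals[3];         // 12 bytes
    uint32_t tangent[3];         // 12 bytes
    uint32_t uv[3];              // 12 bytes
};                               // 84 bytes of fields, padded to 96

struct TrianglePacked {
    PackedFloat3 positions[3];   // 3 * 12 = 36 bytes
    uint32_t normals[3];         // 12 bytes
    uint32_t tangent[3];         // 12 bytes
    uint32_t uv[3];              // 12 bytes
};                               // 72 bytes, no tail padding

static_assert(sizeof(AlignedFloat3) == 16, "aligned float3 is padded to 16 bytes");
static_assert(sizeof(TriangleAligned) == 96, "struct size is rounded up to its alignment");
static_assert(sizeof(TrianglePacked) == 72, "packed layout has no padding");

// Total bytes of per-primitive data for a mesh with triangle_count triangles.
constexpr std::size_t primitive_data_bytes(std::size_t triangle_count,
                                           std::size_t element_size) {
    return triangle_count * element_size;
}
```

Under these assumptions a mesh with one million triangles carries 96,000,000 bytes of per-primitive data with the aligned layout and 72,000,000 bytes with the packed one, so the choice of position type alone changes the total by a quarter.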
Replies
0
Boosts
0
Views
933
Activity
Feb ’23
Extending NS::Object removes constructor
Hi! I'm currently trying to convert this Objective-C example project ( https://developer.apple.com/documentation/metal/performing_calculations_on_a_gpu?language=objc ) to one using the metal-cpp wrapper. However when I make the MetalAdder class extend NS::Object (just like in the original codebase) it removes my constructor. class MetalAdder : public NS::Object{ ... } is what I have. When I instantiate this MetalAdder class as: MetalAdder adder; adder.initWithDevice(device); or auto adder = NS::TransferPtr(new MetalAdder); I get the error Call to implicitly-deleted default constructor of 'MetalAdder'. Is there something I'm doing wrong? Should I instantiate in a different way or should my MetalAdder class just not extend the NS::Object class? Thanks in advance!
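The error message suggests that the base class has no usable default constructor, which a derived class then inherits as an implicitly deleted one. The sketch below reproduces that situation in plain C++ and contrasts it with composition. `OpaqueObject`, `Device`, `DerivedAdder` and `Adder` are hypothetical names; that metal-cpp's NS::Object behaves like `OpaqueObject` is an assumption based on the error text in the post, not something verified against the headers here.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a wrapper base class whose instances are only
// ever handed out as pointers by a factory, so its constructors are deleted.
class OpaqueObject {
public:
    OpaqueObject() = delete;
    OpaqueObject(const OpaqueObject&) = delete;
};

// Deriving from such a base implicitly deletes the derived default
// constructor, which matches the "implicitly-deleted default constructor"
// diagnostic quoted in the post.
class DerivedAdder : public OpaqueObject {
public:
    int add(int a, int b) const { return a + b; }
};

// Hypothetical stand-in for a device handle owned by a framework.
struct Device { int id; };

// Composition: an ordinary C++ class that stores the handle as a member
// instead of inheriting from the wrapper type.
class Adder {
public:
    explicit Adder(Device* device) : m_device(device) {}
    int add(int a, int b) const { return a + b; }
    Device* device() const { return m_device; }
private:
    Device* m_device;
};

static_assert(!std::is_default_constructible<DerivedAdder>::value,
              "the derived class cannot be default-constructed");
static_assert(std::is_constructible<Adder, Device*>::value,
              "the composed class is constructed from a device pointer");
```

With this shape, the equivalent of `MetalAdder adder(device);` works as a normal C++ object, and the class keeps the framework pointers it needs as members.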
Replies
1
Boosts
0
Views
1.6k
Activity
Feb ’23
Does metal-cpp support big sur?
does it?
Replies
2
Boosts
0
Views
979
Activity
Dec ’22
Metal and low performance with parallel execution of kernels (MTLComputeCommandEncoder)
Hello all, I have CUDA code in which I can create several CUDA streams, run my kernels in parallel, and get a performance boost for my task. I then rewrote the code for Metal and tried to parallelize the task in the same way. [Image: CUDA Streams] Metal device: Mac Studio with M1 Ultra (the code is written with metal-cpp). I create several MTLCommandBuffer objects in one MTLCommandQueue, or several MTLCommandQueue objects with more MTLCommandBuffer objects. Regarding Metal resources, there are two options: [1] The buffers (MTLBuffer) were created with the option MTLResourceStorageModeShared. In the profiler, all command buffers are executed sequentially on the Compute timeline. [2] The buffers (MTLBuffer) were created with the options "MTLResourceStorageModeShared | MTLResourceHazardTrackingModeUntracked". In the profiler, I do see parallelism, but the maximum number of threads in the Compute timeline is never more than 2 (see pictures), which is also odd. The compute commands do not depend on each other. [Image: Metal Compute timeline] About performance: [1] In the first variant, the performance is the same for different numbers of MTLCommandQueue and MTLCommandBuffer objects. [2] In the second variant, the performance with one MTLCommandBuffer is better than with 2 or more. Question: why is this happening? How can I parallelize the compute kernels to get a performance increase? Additional information: the CUDA code has also been rewritten in OpenCL, and it parallelizes perfectly on Windows (NVIDIA/AMD/Intel) when several OpenCL queues are running. The same code on the M1 Ultra performs the same with one or with many OpenCL queues. Metal, in turn, is faster than OpenCL, so I am trying to figure out Metal specifically and make the kernels run in parallel on Metal.
Replies
4
Boosts
0
Views
2.4k
Activity
Dec ’22
drawIndexedPrimitives index buffer not showing up in iOS GPU capture?
I've stepped through to verify that I have MTL::Buffer objects for each index buffer, but when I capture the GPU frame it just reads "indexBuffer: Null" for each draw. Is this just a bug? I'm guessing so as some of the geometry is appearing correctly.
Replies
1
Boosts
0
Views
1.6k
Activity
Dec ’22
Metal frame capture issues
I'm not sure if this is somehow a cadence issue or something. If I attempt to use the stylized M button in Xcode to kick off a GPU capture on iOS it seems to just go on forever capturing command buffers instead of exiting when we swap to the next display surface. We are presenting the current drawable with presentDrawable and invoking nextDrawable on the MetalLayer (but note that we are doing this by extending it to be supported from C++). If I trigger and end the capture myself it works fine, and so that works for now, but I'm curious if I'm doing something wrong that causes it not to recognize the end of frame correctly for the Xcode GUI version.
Replies
0
Boosts
0
Views
948
Activity
Dec ’22
Design libraries for Matrix Multiplication
Dear developers, I need help developing a simple computation on the GPU. I would like to perform matrix multiplication; metal-cpp is a good fit because I need to export this as a C++ library. Following the documentation:

File Multiply.metal:
#include <metal_stdlib>
#include <metal_simdgroup_matrix>
using namespace metal;

// Multiplies two row-major 8x8 float matrices using one SIMD-group.
// The unused pMatC parameter was removed so that the result lands in buffer index 2.
kernel void multiply(device float *pMatA, device float *pMatB, device float *pMatR) {
    simdgroup_float8x8 sgMatA;
    simdgroup_float8x8 sgMatB;
    simdgroup_float8x8 sgMatR;
    simdgroup_load(sgMatA, pMatA);
    simdgroup_load(sgMatB, pMatB);
    simdgroup_multiply(sgMatR, sgMatA, sgMatB);
    simdgroup_store(sgMatR, pMatR);
}

File Multiply.hpp:
#pragma once
#include <Foundation/Foundation.hpp>
#include <Metal/Metal.hpp>

class Multiply {
public:
    MTL::Device* m_device;
    MTL::ComputePipelineState* m_dot_function_pso;
    MTL::CommandQueue* m_command_queue;
    MTL::Buffer* m_buffer_A;
    MTL::Buffer* m_buffer_B;
    MTL::Buffer* m_buffer_result;
    void init_with_device(MTL::Device*);
    void prepare_data();
    void send_compute_command();
private:
    void generate_random_float_data(MTL::Buffer* buffer);
    void encode_dot_command(MTL::ComputeCommandEncoder* compute_encoder);
    void verify_results();
};

File Multiply.cpp:
#include <cmath>
#include <cstdlib>
#include <iostream>
#include "Multiply.hpp"

// The kernel handles a single 8x8 tile, so each matrix holds 8 * 8 floats.
const unsigned int matrix_dim = 8;
const unsigned int element_count = matrix_dim * matrix_dim;
const unsigned int buffer_size = element_count * sizeof(float);

void Multiply::init_with_device(MTL::Device* device) {
    m_device = device;
    NS::Error* error = nullptr;
    auto default_library = m_device->newDefaultLibrary();
    if (!default_library) {
        std::cerr << "Failed to load default library.\n";
        std::exit(-1);
    }
    auto function_name = NS::String::string("multiply", NS::ASCIIStringEncoding);
    auto dot_function = default_library->newFunction(function_name);
    if (!dot_function) {
        std::cerr << "Failed to find the multiply function.\n";
        std::exit(-1);
    }
    m_dot_function_pso = m_device->newComputePipelineState(dot_function, &error);
    if (!m_dot_function_pso) {
        std::cerr << "Failed to create the compute pipeline state.\n";
        std::exit(-1);
    }
    m_command_queue = m_device->newCommandQueue();
}

void Multiply::prepare_data() {
    m_buffer_A = m_device->newBuffer(buffer_size, MTL::ResourceStorageModeShared);
    m_buffer_B = m_device->newBuffer(buffer_size, MTL::ResourceStorageModeShared);
    m_buffer_result = m_device->newBuffer(buffer_size, MTL::ResourceStorageModeShared);
    generate_random_float_data(m_buffer_A);
    generate_random_float_data(m_buffer_B);
}

void Multiply::generate_random_float_data(MTL::Buffer* buffer) {
    // The buffer is a flat row-major array, so a single index is enough.
    float* data_ptr = (float*)buffer->contents();
    for (unsigned int index = 0; index < element_count; index++) {
        data_ptr[index] = (float)rand() / (float)(RAND_MAX);
    }
}

void Multiply::send_compute_command() {
    MTL::CommandBuffer* command_buffer = m_command_queue->commandBuffer();
    MTL::ComputeCommandEncoder* compute_encoder = command_buffer->computeCommandEncoder();
    encode_dot_command(compute_encoder);
    compute_encoder->endEncoding();
    command_buffer->commit();
    command_buffer->waitUntilCompleted();
    verify_results();
}

void Multiply::encode_dot_command(MTL::ComputeCommandEncoder* compute_encoder) {
    compute_encoder->setComputePipelineState(m_dot_function_pso);
    compute_encoder->setBuffer(m_buffer_A, 0, 0);
    compute_encoder->setBuffer(m_buffer_B, 0, 1);
    compute_encoder->setBuffer(m_buffer_result, 0, 2);
    // SIMD-group matrix functions are executed cooperatively by one full
    // SIMD-group, so dispatch exactly one SIMD-group for the single tile.
    NS::UInteger simd_width = m_dot_function_pso->threadExecutionWidth();
    MTL::Size grid_size = MTL::Size(simd_width, 1, 1);
    MTL::Size thread_group_size = MTL::Size(simd_width, 1, 1);
    compute_encoder->dispatchThreads(grid_size, thread_group_size);
}

void Multiply::verify_results() {
    auto a = (float*)m_buffer_A->contents();
    auto b = (float*)m_buffer_B->contents();
    auto result = (float*)m_buffer_result->contents();
    for (unsigned int row = 0; row < matrix_dim; row++) {
        for (unsigned int col = 0; col < matrix_dim; col++) {
            // CPU reference: dot product of a row of A with a column of B.
            float expected = 0.0f;
            for (unsigned int k = 0; k < matrix_dim; k++) {
                expected += a[row * matrix_dim + k] * b[k * matrix_dim + col];
            }
            float actual = result[row * matrix_dim + col];
            if (std::fabs(actual - expected) > 1e-4f) {
                std::cout << "Compute ERROR: row=" << row << " col=" << col << " result=" << actual << " vs " << expected << "=a*b\n";
                std::exit(-1);
            }
        }
    }
    std::cout << "Compute results as expected\n";
}

Is this implementation correct? Could someone kindly suggest speed improvements or other solutions? Thank you in advance.
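Since the kernel in the post works on a single 8x8 tile, one way to think about larger matrices is to split them into 8x8 tiles and accumulate tile products. The CPU-only sketch below shows that decomposition for row-major n x n matrices where n is a multiple of 8. `matmul_naive`, `matmul_tiled` and `kTile` are hypothetical names; on the GPU, each output tile would presumably be owned by one SIMD-group and the innermost accumulation would map to something like simdgroup_multiply_accumulate, but that mapping is an assumption and is not shown here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

constexpr std::size_t kTile = 8;  // same edge length as simdgroup_float8x8

// Naive reference: C = A * B for row-major n x n matrices.
std::vector<float> matmul_naive(const std::vector<float>& a,
                                const std::vector<float>& b, std::size_t n) {
    std::vector<float> c(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (std::size_t k = 0; k < n; ++k)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    return c;
}

// Tiled version: every (tr, tc) pair names one 8x8 output tile. The tk loop
// walks the tiles along the shared dimension and accumulates their products
// into that output tile.
std::vector<float> matmul_tiled(const std::vector<float>& a,
                                const std::vector<float>& b, std::size_t n) {
    assert(n % kTile == 0);
    std::vector<float> c(n * n, 0.0f);
    for (std::size_t tr = 0; tr < n; tr += kTile)
        for (std::size_t tc = 0; tc < n; tc += kTile)
            for (std::size_t tk = 0; tk < n; tk += kTile)
                for (std::size_t i = 0; i < kTile; ++i)
                    for (std::size_t j = 0; j < kTile; ++j) {
                        float sum = c[(tr + i) * n + (tc + j)];
                        for (std::size_t k = 0; k < kTile; ++k)
                            sum += a[(tr + i) * n + (tk + k)] *
                                   b[(tk + k) * n + (tc + j)];
                        c[(tr + i) * n + (tc + j)] = sum;
                    }
    return c;
}
```

Comparing `matmul_tiled` against `matmul_naive` with a small tolerance gives a CPU reference that can also be used to check GPU results for sizes beyond 8x8.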
Replies
1
Boosts
0
Views
1.9k
Activity
Nov ’22