Why use dispatch_semaphore to explicitly synchronize MTLBuffer updates?

Hi. I'm new to Metal (and, honestly, to any kind of software development on Apple platforms). I have many questions about using MTL::Buffer, dispatch_semaphore, and drawInMTKView(). I read the README.md, but I need some more help understanding it.

This is from Apple's metal-cpp sample code (I downloaded it here). In this code, _pFrameData is an array of MTL::Buffer pointers, and kMaxFramesInFlight is the size of that array; it is a static const int whose value is 3. When the Renderer is created, _pFrameData is initialized like this:

void Renderer::buildFrameData()
{
    for ( int i = 0; i < Renderer::kMaxFramesInFlight; ++i )
    {
        _pFrameData[ i ] = _pDevice->newBuffer( sizeof( FrameData ), MTL::ResourceStorageModeManaged );
    }
}

Here is the draw method, called by drawInMTKView:

void Renderer::draw( MTK::View* pView )
{
    NS::AutoreleasePool* pPool = NS::AutoreleasePool::alloc()->init();

    _frame = (_frame + 1) % Renderer::kMaxFramesInFlight;
    MTL::Buffer* pFrameDataBuffer = _pFrameData[ _frame ];

    MTL::CommandBuffer* pCmd = _pCommandQueue->commandBuffer();

    dispatch_semaphore_wait( _semaphore, DISPATCH_TIME_FOREVER );
    Renderer* pRenderer = this;
    pCmd->addCompletedHandler( ^void( MTL::CommandBuffer* pCmd ){
        dispatch_semaphore_signal( pRenderer->_semaphore );
    });

    reinterpret_cast< FrameData * >( pFrameDataBuffer->contents() )->angle = (_angle += 0.01f);
    pFrameDataBuffer->didModifyRange( NS::Range::Make( 0, sizeof( FrameData ) ) );
    MTL::RenderPassDescriptor* pRpd = pView->currentRenderPassDescriptor();
    MTL::RenderCommandEncoder* pEnc = pCmd->renderCommandEncoder( pRpd );

    pEnc->setRenderPipelineState( _pPSO );
    pEnc->setVertexBuffer( _pArgBuffer, 0, 0 );
    pEnc->useResource( _pVertexPositionsBuffer, MTL::ResourceUsageRead );
    pEnc->useResource( _pVertexColorsBuffer, MTL::ResourceUsageRead );

    pEnc->setVertexBuffer( pFrameDataBuffer, 0, 1 );
    pEnc->drawPrimitives( MTL::PrimitiveType::PrimitiveTypeTriangle, NS::UInteger(0), NS::UInteger(3) );

    pEnc->endEncoding();
    pCmd->presentDrawable( pView->currentDrawable() );
    pCmd->commit();
    pPool->release();
}

Q1. What is the meaning of kMaxFramesInFlight's name and its value?

Q2. How do dispatch_semaphore_wait() and drawInMTKView() work together? At first I guessed that when the semaphore's count reaches 0, Renderer::draw is blocked by dispatch_semaphore_wait() until the GPU has read the buffer and dispatch_semaphore_signal() runs. But now I think that understanding is incomplete, because I don't know how drawInMTKView behaves. How many times per second is drawInMTKView called, and when?

Q3. And why use a dispatch_semaphore here at all? I tried changing my code to use a single MTLBuffer for the same work. With only small changes (adding a single buffer, removing the dispatch_semaphore code), the modified code seems to work the same.

Accepted Answer

The main reason is to synchronize the use of resources and prevent data races. The kMaxFramesInFlight constant is set to three to allow up to three frames to be in flight on the graphics processor at once. There is also an MTLDrawable object that your command buffer renders to; it has its own synchronization, which comes into play when you call pView->currentRenderPassDescriptor() / pView->currentDrawable() above. For an MTKView, drawInMTKView is typically called at 60 Hz because drawing is tied to vertical synchronization.
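
For reference, the semaphore's starting count is what ties it to kMaxFramesInFlight. Below is a minimal sketch of how the Renderer's constructor is probably set up (member names follow the sample you quoted, but treat the exact body as an assumption, and note that the rest of the setup is elided):

// Sketch (assumed constructor shape): the semaphore's initial count equals the
// number of per-frame buffers, so the CPU can get at most kMaxFramesInFlight
// frames ahead of the GPU before dispatch_semaphore_wait() blocks in draw().
Renderer::Renderer( MTL::Device* pDevice )
: _pDevice( pDevice->retain() )
, _frame( 0 )
, _angle( 0.f )
{
    _pCommandQueue = _pDevice->newCommandQueue();
    _semaphore = dispatch_semaphore_create( Renderer::kMaxFramesInFlight );
    buildFrameData(); // allocates _pFrameData[ 0 .. kMaxFramesInFlight - 1 ]
    // ... pipeline state, vertex/argument buffers, etc. elided ...
}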

The bigger question is your third one: why use three buffers and a semaphore at all? The CPU and GPU run on different timelines. If you record an Instruments trace (Metal System Trace, for instance) of a running graphics application, you will see this clearly. When you call pCmd->commit(), the driver schedules the work for that command buffer and puts it in a queue. After drawInMTKView returns, it won't be long before it is called again to create more work for the GPU; it may be called almost immediately, and possibly before the GPU has executed the command buffer you just submitted.
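
If you want to see those timelines without Instruments, you can log when the driver schedules the work and when the GPU completes it, relative to when commit() returned. Here is a rough sketch, assuming metal-cpp exposes addScheduledHandler alongside the addCompletedHandler you already use; msSince is a hypothetical helper and the timings are for illustration only:

#include <Metal/Metal.hpp>
#include <chrono>
#include <cstdio>

// Hypothetical helper: milliseconds elapsed since t0.
static double msSince( std::chrono::steady_clock::time_point t0 )
{
    using namespace std::chrono;
    return duration< double, std::milli >( steady_clock::now() - t0 ).count();
}

// Sketch: instrument one command buffer before committing it. In practice the
// "commit() returned" line prints first, well before "completed".
void commitWithTimeline( MTL::CommandBuffer* pCmd )
{
    auto t0 = std::chrono::steady_clock::now();

    pCmd->addScheduledHandler( ^void( MTL::CommandBuffer* pCb ){
        printf( "scheduled after %.2f ms\n", msSince( t0 ) );  // driver queued the work
    });
    pCmd->addCompletedHandler( ^void( MTL::CommandBuffer* pCb ){
        printf( "completed after %.2f ms\n", msSince( t0 ) );  // GPU finished the work
    });

    pCmd->commit();
    printf( "commit() returned after %.2f ms\n", msSince( t0 ) );
}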

Consider what happens with only one buffer. Say you update the camera transform for a frame, write it into the MTLBuffer, and submit your command buffer. When drawInMTKView is called next, that command buffer may have finished, may still be in progress, or may only be scheduled. If it has already finished, updating the camera transform for the new frame is no problem. The latter two cases are a problem (especially for a heavy graphics workload), because you may overwrite data that the graphics processor is reading or is about to read. So if you need to update a buffer every frame, reserve one buffer per frame in flight to avoid this hazard, as sketched below.
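
To make the hazard concrete, here is a deliberately simplified single-buffer version of the draw path (hypothetical: _pSharedFrameData is a lone buffer reused every frame, and the argument-buffer setup is omitted), with comments marking where the race lives:

// Sketch of the problematic single-buffer pattern described above.
void Renderer::drawWithSingleBuffer( MTK::View* pView )
{
    NS::AutoreleasePool* pPool = NS::AutoreleasePool::alloc()->init();

    MTL::CommandBuffer* pCmd = _pCommandQueue->commandBuffer();

    // RACE: previously committed command buffers may still be scheduled or
    // executing, and they all read _pSharedFrameData. This CPU write can land
    // while the GPU is reading (or about to read) the same memory.
    reinterpret_cast< FrameData* >( _pSharedFrameData->contents() )->angle = (_angle += 0.01f);
    _pSharedFrameData->didModifyRange( NS::Range::Make( 0, sizeof( FrameData ) ) );

    MTL::RenderCommandEncoder* pEnc = pCmd->renderCommandEncoder( pView->currentRenderPassDescriptor() );
    pEnc->setRenderPipelineState( _pPSO );
    pEnc->setVertexBuffer( _pSharedFrameData, 0, 1 );
    pEnc->drawPrimitives( MTL::PrimitiveType::PrimitiveTypeTriangle, NS::UInteger(0), NS::UInteger(3) );
    pEnc->endEncoding();

    pCmd->presentDrawable( pView->currentDrawable() );
    pCmd->commit();   // returns immediately; the GPU reads the buffer later
    pPool->release();
}

This version can appear to work, as you saw in Q3, because the workload is tiny and the CPU write usually lands before the GPU gets to the buffer, but nothing enforces that ordering. The per-frame buffers plus the semaphore make it explicit.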

This is especially important if you use MTL::ResourceStorageModeShared, which is a common pattern on Apple systems with a unified memory architecture. We have a WWDC video that illustrates these kinds of data races; the relevant content starts at about 12 minutes in.

https://developer.apple.com/videos/play/wwdc2021/10148/
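
For completeness, here is what the per-frame buffers look like with shared storage (a sketch; buildFrameDataShared is a hypothetical name). With MTL::ResourceStorageModeShared the CPU and GPU see the same memory, so the didModifyRange() call is unnecessary, but the semaphore and the per-frame buffers are still required:

// Sketch: per-frame buffers using shared storage on unified-memory systems.
void Renderer::buildFrameDataShared()
{
    for ( int i = 0; i < Renderer::kMaxFramesInFlight; ++i )
    {
        _pFrameData[ i ] = _pDevice->newBuffer( sizeof( FrameData ),
                                                MTL::ResourceStorageModeShared );
        // No didModifyRange() is needed when writing these buffers in draw();
        // protection against in-flight GPU reads still comes from the
        // semaphore and from cycling through kMaxFramesInFlight buffers.
    }
}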
