Issue creating MTLBuffer from MTLTexture used as inputs in CoreML Custom Layer for GPU execution

Hi there,

I am trying to create a CoreML Custom layer that runs on the GPU, using Objective-C for CoreML setup and Metal for GPU programming.

I have created the Core ML model with the custom layer and it executes successfully on the GPU. Inside my custom layer's GPU execution method I would like to create an MTLBuffer from one of the input MTLTextures, but I can't seem to do so, nor can I get access to the memory address backing the MTLTexture.

When defining a custom layer in Core ML to run on the GPU, the following method must be implemented, with the given prototype:

- (BOOL)encodeToCommandBuffer:(id<MTLCommandBuffer>)commandBuffer inputs:(NSArray<id<MTLTexture>> *)inputs outputs:(NSArray<id<MTLTexture>> *)outputs error:(NSError *__autoreleasing _Nullable *)error {

    // GPU Setup, moving data, encoding, execution and so on here

}

Here, the inputs are passed as an NSArray of MTLTextures, and these textures are then passed on to the Metal shader for computation. My problem is that I want to pass an MTLBuffer to the Metal shader that points to the input data, say inputs[0], but I am having trouble copying the input MTLTexture into an MTLBuffer.

I have tried using an MTLBlitCommandEncoder to copy the data from the MTLTexture to an MTLBuffer like so:

id<MTLBuffer> test_buffer = [command_PSO.device newBufferWithLength:8 options:MTLResourceStorageModeShared]; // 8 bytes = one RGBA16Float pixel
id <MTLBlitCommandEncoder> blitCommandEncoder = [commandBuffer blitCommandEncoder];
[blitCommandEncoder copyFromTexture:inputs[0]
                            sourceSlice:0
                            sourceLevel:0
                           sourceOrigin:MTLOriginMake(0, 0, 0)
                             sourceSize:MTLSizeMake(1, 1, 1)
                               toBuffer:test_buffer
                      destinationOffset:0
                 destinationBytesPerRow:8
               destinationBytesPerImage:8];
[blitCommandEncoder endEncoding];

The above should copy a single pixel from the MTLTexture, inputs[0], into the MTLBuffer, test_buffer, but test_buffer never ends up containing the pixel data.

MTLTexture's getBytes method also doesn't work, as the inputs have MTLStorageModePrivate set.

When I inspect the input MTLTexture I notice that its buffer attribute is <null>. I wonder whether this is the issue, since the texture was not created from a buffer and therefore may not expose its backing memory directly, but surely there must be some way to get at the memory address?

For further reference, here is the debug description of the input MTLTexture:

<CaptureMTLTexture: 0x282469500> -> <AGXA14FamilyTexture: 0x133d9bb00>
    label = <none> 
    textureType = MTLTextureType2DArray 
    pixelFormat = MTLPixelFormatRGBA16Float 
    width = 8 
    height = 1 
    depth = 1 
    arrayLength = 1 
    mipmapLevelCount = 1 
    sampleCount = 1 
    cpuCacheMode = MTLCPUCacheModeDefaultCache 
    storageMode = MTLStorageModePrivate 
    hazardTrackingMode = MTLHazardTrackingModeTracked 
    resourceOptions = MTLResourceCPUCacheModeDefaultCache MTLResourceStorageModePrivate MTLResourceHazardTrackingModeTracked  
    usage = MTLTextureUsageShaderRead MTLTextureUsageShaderWrite 
    shareable = 0 
    framebufferOnly = 0 
    purgeableState = MTLPurgeableStateNonVolatile 
    swizzle = [MTLTextureSwizzleRed, MTLTextureSwizzleGreen, MTLTextureSwizzleBlue, MTLTextureSwizzleAlpha] 
    isCompressed = 0 
    parentTexture = <null> 
    parentRelativeLevel = 0 
    parentRelativeSlice = 0 
    buffer = <null> 
    bufferOffset = 0 
    bufferBytesPerRow = 0 
    iosurface = 0x0 
    iosurfacePlane = 0 
    allowGPUOptimizedContents = YES
    label = <none>

The code posted looks correct as far as encoding the commands goes, so we'd need a little more context to be sure you're doing all the right things. You should not need to create a texture from a buffer to get the pixel values out of a texture.

Some obvious things to check:

  • Have you committed the command buffer?
  • Do you wait for the command buffer to complete before reading from the buffer with the CPU? (See the sketch after this list.)
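
For reference, outside of a Core ML custom layer, where you create and own the command buffer yourself, the usual read-back pattern looks roughly like this (a minimal sketch, assuming a device and the test_buffer from your snippet):

id<MTLCommandQueue> queue = [device newCommandQueue];
id<MTLCommandBuffer> commandBuffer = [queue commandBuffer];

// ... encode the blit from the private texture into the shared buffer here ...

[commandBuffer commit];
[commandBuffer waitUntilCompleted];   // block until the GPU has finished the copy

// The shared buffer is now coherent with the GPU, so the CPU can read it directly
const uint16_t *px = (const uint16_t *)test_buffer.contents;   // one RGBA16Float pixel = 4 half values
NSLog(@"first pixel (raw half bits): %04x %04x %04x %04x", px[0], px[1], px[2], px[3]);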

Hi there, many thanks for the quick response.

Below is an example encodeToCommandBuffer: implementation, as required by Core ML for custom layers to run on the GPU:

- (BOOL) encodeToCommandBuffer:(id<MTLCommandBuffer>)commandBuffer inputs:(NSArray<id<MTLTexture>> *)inputs outputs:(NSArray<id<MTLTexture>> *)outputs error:(NSError *__autoreleasing _Nullable *)error{
   
  id<MTLBuffer> test_buffer = [PSO.device newBufferWithLength:(8) options:MTLResourceStorageModeShared];
  id <MTLBlitCommandEncoder> blitCommandEncoder = [commandBuffer blitCommandEncoder];

  [blitCommandEncoder copyFromTexture:inputs[0]
                          sourceSlice:0
                          sourceLevel:0
                         sourceOrigin:MTLOriginMake(0, 0, 0)
                           sourceSize:MTLSizeMake(1, 1, 1)
                             toBuffer:test_buffer
                    destinationOffset:0
               destinationBytesPerRow:8
             destinationBytesPerImage:8];

  [blitCommandEncoder endEncoding];

  id<MTLComputeCommandEncoder> computeEncoder = [commandBuffer computeCommandEncoderWithDispatchType:MTLDispatchTypeSerial];
  assert(computeEncoder != nil);

  [computeEncoder setComputePipelineState:PSO];
  [computeEncoder setTexture:inputs[0] atIndex:0]; // Input
  [computeEncoder setTexture:outputs[0] atIndex:2]; // Output
   
  MTLSize dimThreadsBlock, dimThreadsGrid;
   
  dimThreadsGrid = MTLSizeMake(1, 1, 1);
  dimThreadsBlock = MTLSizeMake(1, 1, 1);

  [computeEncoder dispatchThreads: dimThreadsGrid threadsPerThreadgroup:dimThreadsBlock];
   
  [computeEncoder endEncoding];

  return YES;
}

Note that the above does not include [commandBuffer commit]; or [commandBuffer waitUntilCompleted];. If these are added, an error occurs, stating:

failed assertion _status < MTLCommandBufferStatusCommitted at line 300 in -[IOGPUMetalCommandBuffer setCurrentCommandEncoder:]

I can run code on the GPU successfully for my custom layer without explicitly committing the commandBuffer within the encodeToCommandBuffer function, which makes me believe the commit is handled by Core ML internally (or so the name of the function would imply)?

Note that all GPU setup is done before the above function is called, in the initWithParameterDictionary method required by Core ML:

- (instancetype) initWithParameterDictionary:(NSDictionary<NSString *,id> *)parameters error:(NSError *__autoreleasing _Nullable *)error{
 //kernel_height = [parameters[@"kernel_height"] intValue];
 //kernel_width = [parameters[@"kernel_width"] intValue];
   
 //NSLog(@"kh kw %@", [parameters allKeys]);
 //self = [super init]; // This line might introduce breaks !!!!!

 // Metal GPU setup
 id<MTLDevice> device = MTLCreateSystemDefaultDevice();

 NSError* error_pso = nil;
   
 id<MTLLibrary> defaultlibrary = [device newDefaultLibrary];
 if (defaultlibrary == nil){
  NSLog(@"Failed to find the default library");
  return nil;
 }
   
 id<MTLFunction> function = [defaultlibrary newFunctionWithName:@"kernel"];
 if (function == nil){
  NSLog(@"Failed to find the function");
  return nil;
 }

 PSO = [device newComputePipelineStateWithFunction:function error:&error_pso];
 if (PSO == nil || error_pso != nil){
  NSLog(@"Failed to create the compute pipeline state: %@", error_pso);
  return nil;
 }
   
 return self;
}

Thanks again!

You are correct. That assertion is letting you know that you are submitting a command buffer twice:

"Don’t commit the command buffer in this method; Core ML executes the command buffer after this method returns." https://developer.apple.com/documentation/coreml/mlcustomlayer/2936859-encodetocommandbuffer

I believe the next thing to check is whether your test_buffer resource is actually used and synchronized. How are you reading the data in test_buffer? In your code snippet it does not look like you are binding the buffer to the compute pipeline ([computeEncoder setBuffer:test_buffer offset:0 atIndex:0]), nor reading its contents on the CPU (for example with [commandBuffer addCompletedHandler:...]) to print the buffer's contents and verify that it is populated.
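
As a rough sketch of what I mean, assuming your existing test_buffer and command buffer (the buffer index 0 is just an example; buffers live in a separate argument table from textures):

// Bind the staging buffer alongside the textures so the kernel can actually use it
[computeEncoder setBuffer:test_buffer offset:0 atIndex:0];

// Read the buffer back on the CPU after Core ML has committed the command buffer
// and the GPU has finished; the handler must be added before the commit happens.
[commandBuffer addCompletedHandler:^(id<MTLCommandBuffer> cb) {
    const uint16_t *px = (const uint16_t *)test_buffer.contents;   // RGBA16Float = 4 half values per pixel
    NSLog(@"first pixel (raw half bits): %04x %04x %04x %04x", px[0], px[1], px[2], px[3]);
}];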

See also "Copying Data from a Private Texture to a Shared Buffer" in https://developer.apple.com/documentation/metal/resource_fundamentals/copying_data_to_a_private_resource?language=objc
