Buffer size limitations

I'm sitting on a working CUDA codebase that I would like to run on hardware other than Nvidia GPUs. On macOS, OpenCL and Metal are the obvious choices, and since Apple hasn't updated its OpenCL implementation in ages, I'm reading the Metal documentation right now.


One thing stuck out: the Metal device capabilities table lists "Maximum buffer length: 256MB". Does that mean I cannot allocate a memory buffer larger than that in Metal? If so, that already disqualifies Metal for my use cases, since I'd like to work on buffers of 1GB and more. Or am I misreading this?


What is the API of choice for GPU compute applications on macOS then?

Answered by thomas.pintaric in 173224022

Well, despite being documented as 512MB in one place and 256MB in another place, an exception is thrown on OS X 10.11.6 when trying to allocate buffers larger than 256MB.


Example code:

Tested with Xcode Version 8.0 beta 6 (8S201h), deployment target: OS X 10.11


NSArray <id<MTLDevice>> *devices = MTLCopyAllDevices();
id<MTLDevice> device = devices[0]; // in my case: "NVIDIA GeForce GT 750M"
id<MTLCommandQueue> commandQueue = [device newCommandQueue];
id<MTLCommandBuffer> commandBuffer = [commandQueue commandBuffer];
id<MTLComputeCommandEncoder> encoder = [commandBuffer computeCommandEncoder];
id<MTLBuffer> buffer = [device newBufferWithLength:(256*1024*1024) options:MTLResourceStorageModeShared]; // works
id<MTLBuffer> buffer2 = [device newBufferWithLength:(512*1024*1024) options:MTLResourceStorageModeShared]; // throws exception
[encoder endEncoding]; // an open encoder must be ended before the command buffer is committed
[commandBuffer commit];
[commandBuffer waitUntilCompleted];


Exception thrown by the code on line 7:


/Library/Caches/com.apple.xbs/Sources/Metal/Metal-56.6.1/ToolsLayers/Debug/MTLDebugDevice.mm:89: failed assertion `newBufferWith*:length 0x20000000 must not exceed 256 MB.'

You haven't misread the feature set tables. A single MTLBuffer is limited to a maximum length of 256MB, even on OSX (OS X GPU Family 1 v1).

The maximum buffer size on macOS is documented as 512 MB. If you need a total allocation of more than 512 MB, you can allocate multiple buffers and split your data among them.
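To make the "split your data among multiple buffers" suggestion concrete, here is a minimal sketch of the chunk arithmetic in plain C, which also compiles unchanged inside an Objective-C file. The function names (chunk_count, chunk_length, locate) and the MAX_BUFFER_BYTES constant are hypothetical, not part of Metal. The 256 MB cap is taken from the assertion message quoted in this thread, so it is an assumption for any other OS version or GPU.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed per-buffer cap: the 256 MB limit from the assertion in this thread. */
#define MAX_BUFFER_BYTES ((size_t)256 * 1024 * 1024)

/* Number of buffers needed to hold total_bytes when each holds at most max_chunk. */
size_t chunk_count(size_t total_bytes, size_t max_chunk)
{
    assert(max_chunk > 0);
    /* Ceiling division written so the addition cannot overflow. */
    return total_bytes / max_chunk + (total_bytes % max_chunk != 0 ? 1 : 0);
}

/* Byte length of chunk `index` (0-based); every chunk is full except possibly the last. */
size_t chunk_length(size_t total_bytes, size_t max_chunk, size_t index)
{
    assert(index < chunk_count(total_bytes, max_chunk));
    size_t remaining = total_bytes - index * max_chunk;
    return remaining < max_chunk ? remaining : max_chunk;
}

/* Map a byte offset in the logical array to (chunk index, offset inside that chunk). */
void locate(size_t global_offset, size_t max_chunk,
            size_t *chunk_index, size_t *local_offset)
{
    assert(max_chunk > 0);
    *chunk_index = global_offset / max_chunk;
    *local_offset = global_offset % max_chunk;
}
```

With these helpers, a 1GB data set would need four buffers. Each value returned by chunk_length would be the length passed to one newBufferWithLength:options: call. Kernels then either run once per buffer or take several buffer arguments and use the same index mapping as locate, so this only fits algorithms that can tolerate the split.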

Thanks, that saves me a lot of headaches then - I'll pass on using Metal for now.

Thanks for investigating. I've filed a radar (28058425) to address this.

We've updated the documentation to address this discrepancy. Thanks again for the report.
