How do I get multiple Floats available to a single thread in Metal?

I've got an MTLBuffer with 20 Floats in it. I want each thread to have 4 Floats available to work on, for a total of 5 threads.

I've been playing around with this for a day and so far no luck. I have tried adjusting the two MTLSizes that are passed to dispatchThreads, without success. I can get it to work if both MTLSizes are (MemoryLayout<Float>.stride*20, 1, 1), but then far more threads are launched than I need.

Any ideas?

Thanks ahead of time.

Lee

lbarney,

You can access any part of a buffer set as an input to a kernel or shader. For instance, this kernel adds 20 floats in a buffer in a single thread:

kernel void add_floats(device const float* mybuffer)
{
    float sum = 0;

    for (int i = 0; i < 20; i++)
    {
        sum += mybuffer[i];
    }

    ...
}
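
For the layout in the question (20 floats, 4 per thread, 5 threads), one possible approach is to index the buffer by the thread's position in the grid. The following is only a minimal sketch: the kernel name `sum_four_floats`, the output buffer `results`, and the buffer indices 0 and 1 are assumptions made for illustration and do not come from the original post.

```metal
#include <metal_stdlib>
using namespace metal;

// Hypothetical kernel: each thread sums its own group of 4 consecutive floats.
// Assumes `mybuffer` holds 20 floats and `results` holds 5 floats.
kernel void sum_four_floats(device const float* mybuffer [[buffer(0)]],
                            device float*       results  [[buffer(1)]],
                            uint                gid      [[thread_position_in_grid]])
{
    const uint floatsPerThread = 4;

    // First element owned by this thread: thread 0 -> 0, thread 1 -> 4, ...
    const uint base = gid * floatsPerThread;

    float sum = 0.0f;
    for (uint i = 0; i < floatsPerThread; i++)
    {
        sum += mybuffer[base + i];
    }

    // One result per thread.
    results[gid] = sum;
}
```

With a kernel along these lines, the grid size passed to dispatchThreads would be 5 threads, for example MTLSize(width: 5, height: 1, depth: 1), because the sizes count threads rather than bytes. The threads-per-threadgroup size can also be 5 here, or anything smaller that the device allows. If the grid is ever larger than the number of 4-float groups, the kernel would need a bounds check before reading.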