Hi - I've been developing code to determine star shapes using the GPU.
The code passes a patch of intensity values around each star to the GPU and then uses a Levenberg-Marquardt algorithm to get a least-squares fit.
I can get the code to work well for up to five stars at a time, provided I pass an array of size 15x15. However, if I pass more than five stars, I get a GPU timeout error.
If I change the size of the array to 16x16 or 14x14, then even one star causes a GPU timeout error.
The Levenberg-Marquardt algorithm does use lots of if statements (about 10 per loop), and the loop is executed multiple times. Is there a limit to the number of if statements allowed?
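To make the loop structure concrete, here is a minimal CPU-side C++ sketch of a damped least-squares (Levenberg-Marquardt style) loop with a hard iteration cap, so that it always terminates even if a fit fails to converge. It is not the actual kernel: the 1D Gaussian model, the names `Fit` and `fitGaussian`, the damping factors and the `maxIter` cap are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 1D stand-in for a star profile:
//   f(x) = A * exp(-x^2 / (2 s^2)), with x measured from the patch centre.
struct Fit { double A; double s; int iterations; };

// Sum of squared residuals for the current parameters.
static double cost(const double* y, int n, double A, double s) {
    double c = 0.0;
    for (int i = 0; i < n; ++i) {
        double x = i - (n - 1) / 2.0;
        double r = y[i] - A * std::exp(-x * x / (2.0 * s * s));
        c += r * r;
    }
    return c;
}

// Damped Gauss-Newton loop; maxIter is an assumed safety cap on the loop.
Fit fitGaussian(const double* y, int n, double A, double s, int maxIter) {
    assert(n > 0 && maxIter >= 0);
    double lambda = 1e-3;             // assumed initial damping
    double c = cost(y, n, A, s);
    int it = 0;
    for (; it < maxIter; ++it) {
        // Accumulate J^T J (hAA, hAS, hSS) and J^T r (gA, gS).
        double hAA = 0, hAS = 0, hSS = 0, gA = 0, gS = 0;
        for (int i = 0; i < n; ++i) {
            double x = i - (n - 1) / 2.0;
            double e = std::exp(-x * x / (2.0 * s * s));
            double jA = e;                          // df/dA
            double jS = A * e * x * x / (s * s * s); // df/ds
            double r = y[i] - A * e;
            hAA += jA * jA; hAS += jA * jS; hSS += jS * jS;
            gA += jA * r;   gS += jS * r;
        }
        // Damp the diagonal, then solve the 2x2 system in closed form.
        double dAA = hAA * (1.0 + lambda);
        double dSS = hSS * (1.0 + lambda);
        double det = dAA * dSS - hAS * hAS;
        if (std::fabs(det) < 1e-300) break;         // singular system: give up
        double dA = (dSS * gA - hAS * gS) / det;
        double dS = (dAA * gS - hAS * gA) / det;
        if (std::fabs(dA) + std::fabs(dS) < 1e-12) break; // converged
        double cNew = cost(y, n, A + dA, s + dS);
        if (cNew < c) {               // accept the step, trust the model more
            A += dA; s += dS; c = cNew; lambda *= 0.1;
        } else {                      // reject the step, increase damping
            lambda *= 10.0;
        }
    }
    return {A, s, it};
}
```

For a noise-free 15-sample profile with A = 100 and s = 2, calling `fitGaussian(y, 15, 100.0, 3.0, 50)` should converge well before the cap is reached. On a GPU, a cap like this is what bounds the worst-case run time of each thread.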
I use one threadgroup with a single thread per star.
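As an illustration of the one-thread-per-star layout, the C++ sketch below shows how a thread could locate its patch in a flat buffer when the patch width is passed as a parameter rather than hard-coded. The buffer layout (square patches stored back to back, row-major) and the names `patchForStar` and `pixelAt` are assumptions, not the actual kernel code.

```cpp
#include <cstddef>
#include <vector>

// Returns a pointer to the first pixel of the given star's patch, or nullptr
// if that patch would run past the end of the buffer.
// Assumed layout: patchSize * patchSize floats per star, stored contiguously.
const float* patchForStar(const std::vector<float>& buffer,
                          std::size_t star, std::size_t patchSize) {
    std::size_t count = patchSize * patchSize;
    std::size_t offset = star * count;
    if (count == 0 || offset + count > buffer.size()) {
        return nullptr; // out of range: the caller should skip this star
    }
    return buffer.data() + offset;
}

// Row-major pixel lookup inside one patch; row and col must be < patchSize.
float pixelAt(const float* patch, std::size_t patchSize,
              std::size_t row, std::size_t col) {
    return patch[row * patchSize + col];
}
```

With two 15x15 patches in the buffer (450 floats), star index 1 maps to offset 225, while asking for star index 1 with a 16x16 patch size is rejected because it would read past the end of the data.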
Is there any way to debug this to work out what causes the problem? In one run I did get a slightly different error:
Stack Overflow Exception. Please check the [MTLComputePipelineDescriptor maxCallStackDepth] setting.
The subroutine does call routines which call others, so that may be an issue. I think the maximum call depth is 2.
Any other thoughts gratefully received.
Colin