CoreML Inference Acceleration

Hello everyone, I have a convolutional vision model and a video that has been decoded into many frames. Running inference on each frame in a loop is somewhat slow, so I started 4 threads, each running inference simultaneously. However, the overall throughput is the same as serial inference, and each individual forward pass is actually slower. I used the mactop tool to check GPU utilization, and it was only around 20%. Is this normal? How can I accelerate it?
Hello, can anyone help me? Please.
Instruments is your friend. Check this WWDC video: https://developer.apple.com/videos/play/wwdc2023/10049.
Core ML used to serialize predictions per MLModel instance. In recent years this per-instance lock has been relaxed, but the optimization is often available only for the newer model type (ML Program) and API usage (async predictions).
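To illustrate, here is a minimal sketch of concurrent inference with the async prediction API (macOS 14 / iOS 17 and later). The input feature name "image", the pixel-buffer input type, and `modelURL` are assumptions — substitute your model's actual interface:

```swift
import CoreML
import CoreVideo

// A minimal sketch, assuming macOS 14 / iOS 17+, an ML Program model at
// `modelURL`, and an input feature named "image" that takes a CVPixelBuffer.
// All of these names are placeholders -- match them to your model.
func predictAllFrames(_ frames: [CVPixelBuffer],
                      modelURL: URL) async throws -> [MLFeatureProvider?] {
    let config = MLModelConfiguration()
    config.computeUnits = .all  // let Core ML pick GPU / Neural Engine
    let model = try MLModel(contentsOf: modelURL, configuration: config)

    return try await withThrowingTaskGroup(
        of: (Int, MLFeatureProvider).self
    ) { group in
        for (index, frame) in frames.enumerated() {
            group.addTask {
                let input = try MLDictionaryFeatureProvider(
                    dictionary: ["image": MLFeatureValue(pixelBuffer: frame)])
                // The async prediction API does not take the per-instance
                // lock, so these calls can actually overlap on the device.
                let output = try await model.prediction(
                    from: input, options: MLPredictionOptions())
                return (index, output)
            }
        }
        // Collect results back into frame order.
        var results = [MLFeatureProvider?](repeating: nil, count: frames.count)
        for try await (index, output) in group {
            results[index] = output
        }
        return results
    }
}
```

In practice you would usually bound the number of in-flight predictions rather than spawning one task per frame, so memory use stays predictable.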
Using Instruments, we can see which activities are serialized and make an informed decision about how to better utilize the compute resources.
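If you can't adopt the async API (for example, because of an older deployment target), the batch prediction API is another way to keep the device busy: you hand Core ML the whole set of inputs in one call and let it schedule them itself. Again, the "image" feature name is an assumption:

```swift
import CoreML
import CoreVideo

// Sketch of the batch API (available since macOS 10.14). Core ML pipelines
// the whole batch itself, so one call replaces the per-frame loop.
// The "image" feature name is a placeholder -- match it to your model.
func predictBatch(_ frames: [CVPixelBuffer],
                  model: MLModel) throws -> MLBatchProvider {
    let inputs: [MLFeatureProvider] = try frames.map { frame in
        try MLDictionaryFeatureProvider(
            dictionary: ["image": MLFeatureValue(pixelBuffer: frame)])
    }
    let batch = MLArrayBatchProvider(array: inputs)
    return try model.predictions(from: batch, options: MLPredictionOptions())
}
```

Whether the batch or the async path wins depends on the model and hardware — Instruments will show you which one actually keeps the GPU busy.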