Is there any way to set the number of threads used during Core ML inference? My model is relatively small, and the overhead of launching new threads is too expensive. With the TensorFlow C API, forcing a single thread results in a significant decrease in CPU usage. (So far, Core ML with multiple threads has three times the CPU usage of TensorFlow with a single thread.)
Also, has anyone compared the performance of the TensorFlow C API with that of Core ML?
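
For context, here is a rough sketch of one common way to force single-threaded execution with the TensorFlow C API: pass a serialized `ConfigProto` to `TF_SetConfig`. The helper name `single_thread_config` is hypothetical, and the bytes are hand-encoded on the assumption that `intra_op_parallelism_threads` is field 2 and `inter_op_parallelism_threads` is field 5 of `ConfigProto`. This is not necessarily the exact setup behind the numbers above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes a serialized tensorflow.ConfigProto that
 * limits TensorFlow to one intra-op and one inter-op thread.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
size_t single_thread_config(uint8_t *buf, size_t cap) {
    static const uint8_t proto[] = {
        0x10, 0x01, /* field 2 (varint): intra_op_parallelism_threads = 1 */
        0x28, 0x01  /* field 5 (varint): inter_op_parallelism_threads = 1 */
    };
    if (cap < sizeof proto) {
        return 0;
    }
    for (size_t i = 0; i < sizeof proto; ++i) {
        buf[i] = proto[i];
    }
    return sizeof proto;
}

/* Intended use with the TensorFlow C API (needs tensorflow/c/c_api.h):
 *
 *   uint8_t cfg[8];
 *   size_t n = single_thread_config(cfg, sizeof cfg);
 *   TF_SessionOptions *opts = TF_NewSessionOptions();
 *   TF_Status *status = TF_NewStatus();
 *   TF_SetConfig(opts, cfg, n, status);
 */
```

The question is whether Core ML exposes any comparable knob for its CPU thread count.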