For apps using the 'MLX distributor' or local adapters, are there any specialized background processing entitlements needed to ensure inference isn't killed by the OS during long-running tasks?
For apps using the 'MLX distributor' or local adapters
I'm not sure what you mean here. Can you elaborate?
are there any specialized background processing entitlements needed to ensure inference isn't killed by the OS during long-running tasks?
In general, if you want model inference to keep running in the background, look at the Background Tasks framework (https://developer.apple.com/documentation/BackgroundTasks). For models, the memory footprint is especially important, so make sure your model is small enough to fit comfortably on the devices you target. If your model uses the GPU in the background, you may also need to add the com.apple.developer.background-tasks.continued-processing.gpu entitlement.
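As a rough sketch of what the GPU entitlement mentioned above could look like in an app's .entitlements file: the key name comes from the reply above, while the Boolean value and the surrounding file layout are assumptions based on how entitlement property lists are usually written. In practice you would normally let Xcode's Signing & Capabilities editor manage this file rather than editing it by hand.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <!-- Key taken from the reply above; the Boolean value is an assumption. -->
    <key>com.apple.developer.background-tasks.continued-processing.gpu</key>
    <true/>
</dict>
</plist>
```

The entitlement only covers GPU access. The background task itself still has to be requested through the Background Tasks framework, and the memory limits mentioned above still apply.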
Hope that helps!