We have a product which uses a Network Extension (a socket filter and a packet content filter). The application contains the network extension, as well as an un-sandboxed LaunchDaemon which connects to the service at the NEMachServiceName.
Occasionally, usually after an upgrade where the system extension is swapped for the new version, our un-sandboxed process isn't able to contact the network extension. From the logging, we receive the following XPC error
(libxpc.dylib) [com.apple.xpc:connection] [0x7fd6d0307f40] failed to do a bootstrap look-up: xpc_error=[3: No such process]
in the unsandboxed process. Eventually, we receive an invalidated
callback on the XPC connection with the error Couldn’t communicate with a helper application.
. We have confirmed that an appropriate service is running via the launchctl command, and the network extension process appears to have initialised correctly. We don't see any indication of a received connection at the Network Extension process however (probably not surprising given the error).
Once a system enters this state, repeated attempts to connect are unsuccessful and continue to produce the same error.
We've also confirmed that there are no XPC codec exceptions apparent that might cause the connection to fail.
I'm at a bit of a loss to explain why this failure might be occurring, other than a problem in the bootstrap/launchd being able to find the appropriate service. Is there possibly some problem with unsandboxed processes accessing the sandboxed network extension via XPC? They are both provisioned in an app group together. Is there possibly some issue where attempting to connect at a critical point during network extension installation causes it to become inaccessible?
We've observed this specifically on macOS 14.5 (23F79), however this is something we've noticed on other versions of macOS and our code. The problem isn't systematic, and systems end up in this state only occasionally. We do seem to find some customers have more instances of this problems than others, but we haven't been successful at teasing out any common thread that might explain why.