Launch daemon running but XPC is down

Hello, I have a question about an edge case scenario.

Before that, some info on my project: I have a launch daemon that carries out some business logic. It also has an XPC listener (built using the C APIs).

Questions:

  1. Can there be a situation when the daemon is up and running but the XPC listener is down (due to some error or crash)? If yes, do I need to handle it in my code, or will launchd handle it?

  2. When the daemon is stopped or shut down, how do I stop the XPC listener? After getting the listener object from xpc_connection_create_mach_service, should I just call xpc_connection_cancel followed by a call to xpc_release?

Thanks!

K

Answered by DTS Engineer in 832531022
Written by Kray16 in 778924021
1. Can there be a situation when the daemon is up and running but the XPC listener is down … ?

Yes, but that doesn’t call for action on your part.

To understand what’s going on here you need more backstory. Lemme start you out with the discussion of launchd’s on-demand architecture in XPC and App-to-App Communication. I realise you’re not asking about app-to-app, but the on-demand architecture discussion is relevant to you.

So, if your launchd daemon publishes a named XPC endpoint via MachServices, launchd sets up its own listener for that endpoint. If a message comes to the endpoint, it starts your daemon process. Your code is then responsible for ‘checking in’ with launchd for that listener. Once it does that, it starts receiving connections.

So, if your daemon starts and fails to check in, your daemon will be running but the listener will be non-functional. Likewise, if your daemon checks in but then stops servicing the listener, it’ll be similarly non-functional. However, it doesn’t make sense to add additional code to your daemon to handle that case. Rather, the correct solution is to write your daemon so that it works reliably.

The crashing case is different. If your daemon exits, launchd starts the on-demand cycle again. That is, it resumes listening for messages on your named XPC endpoints and, if one arrives, it starts your daemon process. So, again, there’s nothing for you to do here.

Well, that’s not strictly true. If your daemon crashes, your clients’ connections will report an interrupted error. You need to design your XPC protocol such that your clients can handle this. For example, if a connection is interrupted the client has to assume that the daemon has lost any state about the connection. You can handle that by either not storing state in the daemon or having some mechanism for the daemon to recover that state, either itself or with the assistance of the client.
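To make the client-assisted recovery option concrete, here’s a minimal client-side sketch using Foundation. The protocol MyDaemonProtocol, its openSession method, and the DaemonClient class are hypothetical names made up for this example. It assumes the daemon keeps per-client state that the client can recreate by opening its session again.

```swift
import Foundation

// Hypothetical XPC protocol, made up for this sketch.
@objc protocol MyDaemonProtocol {
    func openSession(clientID: String, reply: @escaping (Bool) -> Void)
}

final class DaemonClient {
    private let connection: NSXPCConnection
    private let clientID: String
    private let lock = NSLock()
    private var sessionIsOpen = false

    init(machServiceName: String, clientID: String) {
        self.clientID = clientID
        // `.privileged` is required to reach a daemon in the global launchd domain.
        connection = NSXPCConnection(machServiceName: machServiceName, options: .privileged)
        connection.remoteObjectInterface = NSXPCInterface(with: MyDaemonProtocol.self)
        connection.interruptionHandler = { [weak self] in
            self?.connectionWasInterrupted()
        }
        connection.activate()
    }

    var hasOpenSession: Bool {
        lock.lock(); defer { lock.unlock() }
        return sessionIsOpen
    }

    private func setSessionOpen(_ value: Bool) {
        lock.lock(); defer { lock.unlock() }
        sessionIsOpen = value
    }

    // The daemon exited or crashed, so assume it has forgotten this client.
    // The connection itself stays usable; the next message makes launchd
    // start the daemon again.
    func connectionWasInterrupted() {
        setSessionOpen(false)
    }

    // Re-opens the session if the daemon may have lost it, then runs `body`.
    func withSession(_ body: @escaping (MyDaemonProtocol) -> Void) {
        let proxy = connection.remoteObjectProxyWithErrorHandler { error in
            print("XPC error: \(error)")
        }
        guard let daemon = proxy as? MyDaemonProtocol else { return }
        if hasOpenSession {
            body(daemon)
            return
        }
        daemon.openSession(clientID: clientID) { [weak self] ok in
            guard ok else { return }
            self?.setSessionOpen(true)
            body(daemon)
        }
    }
}
```

The point of the design is that every request goes through withSession, so a client never talks to a freshly relaunched daemon without first re-establishing whatever state the daemon needs.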

Written by Kray16 in 778924021
2. when the daemon is stopped or shut down, how do I stop the XPC listener?

You don’t.

First, terminology:

  • A daemon is either loaded or unloaded, which is whether launchd knows about it or not.

  • A loaded daemon may be started or stopped, which is whether the process is running or not.

If the daemon stops, the listener doesn’t go away. Rather, responsibility for it falls back to launchd, which does the on-demand thing as I described above.

OTOH, if someone unloads the daemon then the first thing that launchd does is stop it. That causes the responsibility for the listener to fall back to launchd, which then completes the unload, which removes the listener.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Likewise, if your daemon checks in but then stops servicing the listener, it’ll be similarly non-functional.

What do you mean by "servicing" the listener? Can you elaborate?

Written by Kray16 in 832960022
What do you mean by "servicing" the listener? Can you elaborate?

Sure. However, it’s a bit tricky because XPC has so many different APIs. They all do pretty much the same stuff, but the terminology doesn’t line up exactly. So, I’m gonna focus on the API I know best, Foundation.

In Foundation you manage a listener using NSXPCListener. For a launchd daemon, you construct that by calling init(machServiceName:). That name has to match one of the entries in your MachServices array. You activate the listener by calling activate(). When you do that, Foundation checks in with launchd to get a receive right on the underlying Mach port.

That’s step 1, checking in.

If there’s a connection request waiting on that port, Foundation spins up an NSXPCConnection for the new client connection and passes that to your listener(_:shouldAcceptNewConnection:). Your delegate then chooses whether to accept the connection or not. If it does, the connection now runs independently of the listener. This process repeats for each new connection request.

That’s step 2, servicing the listener.

If you skip either of these steps, you’re not servicing your listener and things will be borked.
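The two steps above can be sketched as follows. This is a minimal example under stated assumptions, not a complete daemon: MyDaemonProtocol, DaemonService, and the Daemon class are hypothetical names, and the service name you pass in must match an entry in your own MachServices.

```swift
import Foundation

// Hypothetical protocol and service object, made up for this sketch.
@objc protocol MyDaemonProtocol {
    func ping(reply: @escaping (String) -> Void)
}

final class DaemonService: NSObject, MyDaemonProtocol {
    func ping(reply: @escaping (String) -> Void) {
        reply("pong")
    }
}

final class Daemon: NSObject, NSXPCListenerDelegate {
    private let xpcListener: NSXPCListener

    init(machServiceName: String) {
        xpcListener = NSXPCListener(machServiceName: machServiceName)
        super.init()
        // The listener does not keep its delegate alive, so the Daemon
        // object itself must stay alive for the life of the process.
        xpcListener.delegate = self
    }

    // Step 1: checking in. Activating the listener checks in with launchd.
    func start() {
        xpcListener.activate()
    }

    // Step 2: servicing the listener. Called once per connection request.
    func listener(_ listener: NSXPCListener,
                  shouldAcceptNewConnection newConnection: NSXPCConnection) -> Bool {
        newConnection.exportedInterface = NSXPCInterface(with: MyDaemonProtocol.self)
        newConnection.exportedObject = DaemonService()
        newConnection.activate()
        return true
    }
}

// In the daemon's main.swift (service name is a placeholder):
//
//     let daemon = Daemon(machServiceName: "com.example.my-daemon")
//     daemon.start()
//     dispatchMain()
```

Keeping the Daemon instance in a top-level constant matters here, because if it were deallocated the listener would have no delegate and would stop being serviced.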

Note that in Foundation the listener has an internal serial Dispatch queue and it calls your delegate on that queue. If your delegate never returns — let’s say it does Thread.sleep(until: .distantFuture) — then that stops the listener from servicing requests, with similar borkage.
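One way to avoid that is to keep the delegate callback short and push anything slow onto a separate queue. Here’s a sketch of that idea; PromptDelegate and its slowSetup closure are hypothetical, and the queue label is a placeholder.

```swift
import Foundation

final class PromptDelegate: NSObject, NSXPCListenerDelegate {
    // A separate queue for slow work, so the listener's queue stays free.
    private let workQueue = DispatchQueue(label: "com.example.my-daemon.work")
    private let slowSetup: () -> Void

    init(slowSetup: @escaping () -> Void) {
        self.slowSetup = slowSetup
        super.init()
    }

    func listener(_ listener: NSXPCListener,
                  shouldAcceptNewConnection newConnection: NSXPCConnection) -> Bool {
        // Configure exportedInterface and exportedObject here as usual.
        newConnection.activate()
        // Run slow work elsewhere instead of blocking this callback.
        workQueue.async { self.slowSetup() }
        return true
    }
}
```

Because the callback returns right away, the listener can go on to hand you the next connection request while the slow work is still running.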

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"
