On and off I've been trying to figure out how to do hang detection in-application (at least from the user's point of view). Qualitatively what I'd like to do is have a process which runs sample(1) on the application after it's been unresponsive for more than a second or so. Basically, an in-app replacement for Spin Control. The problem I've been stuck on is: how do I tell?
There used to be Core Graphics SPI (CGSRegisterNotifyProc with a value of kCGSEventNotificationAppIsUnresponsive) for doing this, but it doesn't work anymore (either due to sandboxing or system-wide security changes, I can't tell which but it doesn't matter).
One thought I had was to have an XPC service which would expect to receive a checkin once per second from the host (via a timer set up by the host). If it didn't, it would start sample(1). This seems pretty heavyweight to me, since it means that once per second, I'm going to be consuming cycles to check in with the service. But I haven't been able to come up with a scheme that doesn't include some kind of check-in by the target process.
Are there any APIs or strategies that I could use to accomplish this? Or is there some entitlement which would allow the application to request "application became unresponsive"/"application became responsive" notifications from the window server?
This is an interesting problem, and Kevin and I sat down to chat about it yesterday. We have some suggestions for you but, before I go into the details, I have a three point preface:
- Have you looked at MetricKit for this? It seems like its MXHangDiagnostic payload would be really helpful. And it’s definitely a lot easier than anything I’m suggesting here.
- Beyond that, there’s no API for this. If you’d like to see us add something to help in this space, you should file an enhancement request describing your requirements. If you do, please post your bug number, just for the record.
- The approach I’m going to suggest is risky. It has many of the same challenges as building a crash reporter, which is something I cover in depth in Implementing Your Own Crash Reporter. If you do go down this path, read that post carefully.
In terms of doing this yourself, here’s how I’d approach it.
- Start a thread that preallocates all of the necessary resources and waits for events.
- Add something to your run loop that pings that thread.
- If the thread doesn’t get a ping from the run loop within your timeout, have your thread sublaunch a helper tool.
- Have that helper tool run sample on your app and wrangle the result.
IMPORTANT This thread is kinda like an async signal handler. It can run at any time, including when various global locks are held. In fact, that’s exactly the sort of time when you’d expect it to run! So it can’t use Objective-C or Swift, call malloc, and so on. It has to be C or C++, and it’d be best if you restricted it to using just system calls [1].
That’s why step 1 preallocates stuff. You don’t want to call NSBundle to locate the helper tool when your process is potentially hung, so do that stuff in advance [2]. When the thread determines that the main thread is hung, it should run the helper tool with posix_spawn and that’s about it.
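To make the preallocation point concrete, here’s a minimal C sketch. It’s an illustration under assumptions, not a vetted implementation: the spawn_plan structure and both function names are hypothetical, and it assumes the helper tool takes the target process ID as its only argument. All string formatting happens up front, so the watchdog thread only has to call posix_spawn.

```c
#include <spawn.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical container for everything the watchdog thread needs.
   It is filled in while the app is healthy, then only read. */
struct spawn_plan {
    const char *path;   /* helper tool path, resolved in advance */
    char pid_text[16];  /* this process's pid, formatted in advance */
    char *argv[3];      /* { path, pid_text, NULL } */
};

/* Call once at startup from ordinary code. This is the only place
   that formats strings; nothing here runs on the watchdog thread. */
static void spawn_plan_init(struct spawn_plan *plan, const char *helper_path) {
    plan->path = helper_path;
    snprintf(plan->pid_text, sizeof plan->pid_text, "%d", (int)getpid());
    plan->argv[0] = (char *)helper_path;
    plan->argv[1] = plan->pid_text;
    plan->argv[2] = NULL;
}

/* Call from the watchdog thread once it decides the main thread is hung.
   Returns the helper's pid, or -1 if the spawn failed. */
static pid_t spawn_plan_run(const struct spawn_plan *plan) {
    pid_t child = -1;
    int err = posix_spawn(&child, plan->path, NULL, NULL, plan->argv, environ);
    return err == 0 ? child : -1;
}
```

In a real app the helper path would come from whatever lookup you do at launch (the NSBundle call mentioned above), and the helper itself would be responsible for running sample against the pid it receives.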
Regarding step 2, Kevin and I had different suggestions on that front. I’d experiment with using a run loop observer. The .afterWaiting activity is a good place to ‘start’ your timer, and the .beforeWaiting activity is a good place to ‘stop’ it.
IMPORTANT I’m not literally talking about a timer. Rather, you’d do something to deactivate the waiting thread while your main thread is waiting in the run loop.
OTOH, Kevin suggested using just a timer for this. Add a slow-running NSTimer that pings the waiting thread and that’s it. That has the disadvantage of preventing your app from suspending for long periods of time, but it’s a lot simpler and, if you choose a sufficiently long interval, it’s not going to have much impact. Kevin’s point is that simplicity trumps absolute efficiency in this space, and I can’t disagree with that (-:
In terms of how to communicate between threads, I’d probably use a Unix domain socket for that. The advantage of a socket is that you can send and receive messages and block with a timeout, all using system calls.
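Here’s a rough C sketch of what the waiting side of that socket might look like. Treat it as one possible shape rather than a recommendation: the function names and the one-byte WD_BUSY / WD_IDLE protocol are made up for illustration, and it uses poll for the timeout. The main thread would send WD_BUSY when its run loop wakes up (or from the periodic timer) and WD_IDLE just before it goes back to sleep; the watchdog only runs its timeout while the last message was WD_BUSY.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical one-byte messages sent by the main thread. */
enum { WD_BUSY = 'B', WD_IDLE = 'I' };

/* Main-thread side: send one byte to the watchdog. Returns 0 on success. */
static int watchdog_notify(int fd, char msg) {
    return write(fd, &msg, 1) == 1 ? 0 : -1;
}

/* Watchdog-thread side. Blocks until one of the following happens:
     returns 1  - the main thread reported busy and then stayed silent
                  for timeout_ms (treat this as a hang);
     returns 0  - the other end of the socket was closed;
     returns -1 - an unexpected error occurred.
   Uses only poll and read, with no allocation. */
static int watchdog_wait_for_hang(int fd, int timeout_ms) {
    int busy = 0;
    for (;;) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        /* While idle, wait forever; while busy, wait up to the timeout. */
        int n = poll(&pfd, 1, busy ? timeout_ms : -1);
        if (n == 0) {
            return 1;
        }
        if (n < 0) {
            if (errno == EINTR) continue;  /* note: this restarts the timeout */
            return -1;
        }
        char msg;
        ssize_t got = read(fd, &msg, 1);
        if (got == 0) {
            return 0;
        }
        if (got < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        /* Every WD_BUSY re-arms the timeout; WD_IDLE disarms it. */
        busy = (msg == WD_BUSY);
    }
}
```

The two ends would typically come from socketpair(AF_UNIX, SOCK_STREAM, 0, fds), created at startup along with everything else the watchdog thread needs. With the timer-only approach, each timer fire sends WD_BUSY and nothing ever sends WD_IDLE, so the timeout has to be comfortably longer than the timer interval.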
Share and Enjoy
—
Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"
[1] Well, calls to libSystem routines that are directly backed by a system call.
[2] Caching the path is a problem if the user moves your app while it’s running. If your app is otherwise resilient to such shenanigans, you could open the helper tool in advance and then get its path (using fcntl with F_GETPATH) immediately before spawning it. That’s a bit more complex, but it should be safe as long as you’re working with a preallocated buffer.