Simulating a signal

Hello,

I'm trying to simulate miscellaneous crashes to test my handlers. Things work as expected with NSException and C++ exceptions, but I cannot find a way to trigger a C signal.

I tried with this code:

// Look up the running app by bundle identifier and send it SIGABRT.
NSArray *runningApplications = [NSRunningApplication runningApplicationsWithBundleIdentifier:@"com.myCompany.myApp"];
NSRunningApplication *app = runningApplications[0];
pid_t pid = [app processIdentifier];  // processIdentifier returns a pid_t
kill(pid, SIGABRT);

It is caught by my handler, but it doesn't crash the app (although it's detached from the debugger). I can even continue using the app normally.

I'm wondering if this could be related to something wrong in my handler (especially on how it ends):

signal(sig, SIG_IGN);
dispatch_source_t source = dispatch_source_create(DISPATCH_SOURCE_TYPE_SIGNAL, sig, 0, dispatch_get_global_queue(0, 0));
dispatch_source_set_event_handler(source, ^{

    // I write some logs on disk here, then uninstall the handlers associated with this or that signal:

    for (int i = 0; i < SignalSourceCount; i++) {
        if (_signalSources[i]) {
            dispatch_source_cancel(_signalSources[i]);
            _signalSources[i] = NULL;
        }
    }
});
dispatch_resume(source);

I've seen some examples that end the handler with exit() or abort() instead. abort() crashes the app as expected, but the crash report produced by Apple then focuses on the handler instead of the code that triggered the signal...

Any help appreciated, thanks

Accepted Reply

This context is not captured within the handler but before, during the normal usage of the app.

That’s still quite tricky. You have to protect this data structure from concurrent access, because the signal handler might fire while you’re updating it. And you can’t use a mutex for that, because the signal handler might fire regardless of whether the updating thread is holding that mutex.

To make this work you need an atomic. Consider this general outline:

  1. Maintain the data structure in the normal way, if necessary using a mutex.

  2. Then flatten it into a pointer-sized snapshot for the benefit of your signal handler.

  3. Atomically replace the old snapshot with the new.

  4. In the signal handler, work from the current snapshot. Remember that this read needs to be atomic.
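As a rough illustration of steps 2 through 4, here is a minimal C11 sketch. The struct layout and the names snapshot, publish_snapshot and current_snapshot are made up for this example. It assumes pointer-sized atomics are lock free on the target, which is what makes the load usable from a signal handler.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical flattened snapshot: fixed size, no pointers into other storage. */
struct snapshot {
    long launch_time;
    int  open_documents;
    char last_action[64];
};

/* The one pointer shared between normal code and the signal handler. */
static _Atomic(struct snapshot *) g_current_snapshot;

/* Normal code: atomically replace the old snapshot with a fully built new one.
   Returns the previous snapshot (NULL the first time). It is deliberately not
   freed here, because a handler may still be reading it. */
static struct snapshot *publish_snapshot(struct snapshot *fresh) {
    return atomic_exchange(&g_current_snapshot, fresh);
}

/* Signal handler: atomically read the current snapshot pointer. */
static const struct snapshot *current_snapshot(void) {
    return atomic_load(&g_current_snapshot);
}
```

Note that publish_snapshot hands the old snapshot back to the caller instead of releasing it. Deciding when that memory can safely be reused is the hard part.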

But even this isn’t as easy as it sounds. Consider this sequence:

  1. The signal handler fires on thread A.

  2. It reads the snapshot and starts working on it.

  3. Thread B, which continues to run even though you’re in a signal handler, replaces the snapshot with a new one.

  4. And then deallocates the old one.

At this point the signal handler is working with deallocated memory.

There are ways around this but none of them are straightforward:

  • You could have the signal handler suspend all the other threads before it starts working from the snapshot. But that introduces its own complexities, for example, how do you avoid deadlock if two threads crash at the same time?

  • You could create a more elaborate data structure. The key term here is lock free. If you search the ’net for that, you’ll find lots of info about it. However, you’re in a particularly bad place because of the signal handler constraints.

Finally … I should implement a second handler using signal or sigaction?

If you’re doing a crash reporter, ignore Dispatch sources and use only sigaction.
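For what it's worth, one common POSIX pattern for a handler that still lets the process die is to restore the default action and re-raise the signal. The sketch below is only an illustration of that pattern, not something this reply prescribes. The names fatal_signal_handler and termination_signal_of_child are hypothetical, and the fork is there purely so the example can be tried without killing the calling process. Whether the resulting Apple crash report points at the original code is not something this sketch can promise.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical fatal-signal handler: any minimal async-signal-safe work
   would go first, then the signal is handed back to the system. */
static void fatal_signal_handler(int sig) {
    signal(sig, SIG_DFL);  /* restore the default action */
    raise(sig);            /* stays pending while the handler runs, then
                              terminates the process with the default action */
}

/* Demonstration only: runs the pattern in a forked child and returns the
   signal that terminated the child, or -1 if it was not killed by a signal. */
static int termination_signal_of_child(int sig) {
    pid_t child = fork();
    if (child < 0) return -1;
    if (child == 0) {
        struct sigaction action;
        memset(&action, 0, sizeof(action));
        action.sa_handler = fatal_signal_handler;
        sigemptyset(&action.sa_mask);
        sigaction(sig, &action, NULL);
        raise(sig);
        _exit(0);  /* not reached if the default action terminates the child */
    }
    int status = 0;
    if (waitpid(child, &status, 0) != child) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Calling termination_signal_of_child(SIGABRT) should report SIGABRT, showing that the child really terminated from the signal rather than carrying on after the handler returned.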


Finally, this is on the Mac, right? If so, you have a much better option: Do this in a separate process. So, spawn a child process and have it monitor the app process. When the app process dies, the child process writes out the app process’s state. This is in a separate process, and thus has none of the constraints associated with a signal handler.

The traditional challenge with such an approach is getting state out of the crashed process. That involves Mach exception handlers. Given the choice between signal handlers, Mach exception handlers, and gnawing my own leg off, I’m not 100% sure which way I’d go (-:

However, in this case that’s not needed because you only want to capture app state. Thus, you could use the snapshot approach I’ve described above but put the result into memory that you share between the app and this child process. The mechanics of this are still tricky, but it’s likely to be more reliable than anything you do in process. Moreover, there’s no way it could disrupt the Apple crash reporter.
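To give a feel for the shared-memory idea, here is a small POSIX sketch that maps a file as memory both processes could see. Everything named here (shared_state, map_shared_state, record_open_documents, the generation counter) is an assumption made for illustration. The file path would have to be somewhere both the app and the helper can reach, which this sketch does not address, and it assumes the atomic types involved are lock free.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical app state shared with the monitoring process. */
struct shared_state {
    atomic_uint generation;     /* bumped after each complete update */
    atomic_int  open_documents;
};

/* Maps the file at `path` as shared memory, creating and sizing it if needed.
   Both the app and the monitor would call this with the same path.
   Returns NULL on failure. */
static struct shared_state *map_shared_state(const char *path) {
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) return NULL;
    if (ftruncate(fd, (off_t)sizeof(struct shared_state)) != 0) {
        close(fd);
        return NULL;
    }
    void *memory = mmap(NULL, sizeof(struct shared_state),
                        PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    return memory == MAP_FAILED ? NULL : (struct shared_state *)memory;
}

/* App side: store the new value, then bump the generation counter. */
static void record_open_documents(struct shared_state *state, int count) {
    atomic_store(&state->open_documents, count);
    atomic_fetch_add(&state->generation, 1u);
}
```

The monitor would map the same file and read the fields with atomic_load once it notices the app has gone away. How it notices is a separate problem not shown here.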

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Replies

In this case the only winning move is not to play (-: I have an extensive explanation as to why in Implementing Your Own Crash Reporter.

So, I’m going to be blunt here, and I apologise if this comes off as rude, but…

The code snippet you posted suggests you’ve massively underestimated the difficulty of this task. I recommend that you not work on your own crash reporter but instead invest your time in other efforts.

WARNING A badly implemented crash reporter will prevent the Apple crash reporter from working properly, which makes it harder to debug any real crashes you encounter.

If you do decide to go down the crash reporting rabbit hole, there’s a limit to how much I can follow you.


Anyway, coming back to your question, your code is failing because you’re using a Dispatch source for your signal handling. Normally a Dispatch source is the best way to implement a signal handler, but that’s not true when you’re trying to catch the types of signals involved in a crash (things like SIGBUS, SIGSEGV, and SIGABRT). In that case you need a signal handler. Install this using sigaction, or the legacy signal. Both of these have man pages.
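For reference, installing a handler with sigaction looks roughly like the following. This is a minimal sketch, not a crash reporter: the handler only records the signal number, and the names crash_signal_handler, install_crash_handler and g_last_signal are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Last signal seen by the handler. sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t g_last_signal = 0;

/* Hypothetical handler: records the signal number and nothing else. */
static void crash_signal_handler(int sig, siginfo_t *info, void *context) {
    (void)info;
    (void)context;
    g_last_signal = sig;
}

/* Installs the handler for one signal. Returns 0 on success, -1 on failure. */
static int install_crash_handler(int sig) {
    struct sigaction action;
    memset(&action, 0, sizeof(action));
    action.sa_sigaction = crash_signal_handler;
    action.sa_flags = SA_SIGINFO;  /* use the three-argument handler form */
    sigemptyset(&action.sa_mask);
    return sigaction(sig, &action, NULL);
}
```

To try it without crashing anything, install it for SIGUSR1 and call raise(SIGUSR1). g_last_signal should then hold SIGUSR1.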

Implementing an async signal handler is really hard because there are very tight limits on what you can do in that code. I talk about this in detail in Implementing Your Own Crash Reporter, but some highlights are:

  • There’s a short list of system routines you’re allowed to call.

  • That list doesn’t include malloc.

  • Which means that you can’t use Swift or Objective-C. It’s C or C++ only.

  • Even calling routines across a shared library boundary can get you into trouble.
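To make those limits concrete: printf-style formatting is among the things to avoid in a handler, so even printing a number has to be done by hand. The sketch below shows one way to do that using only write. The helper names format_decimal and write_all are hypothetical, and a real handler would also want to save and restore errno around calls like these.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Formats `value` as decimal digits into `buffer` without any printf-family
   call. Returns the number of characters written (no terminating NUL),
   or 0 if `capacity` is too small. */
static size_t format_decimal(unsigned long value, char *buffer, size_t capacity) {
    char reversed[24];
    size_t count = 0;
    do {
        reversed[count++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);
    if (count > capacity) return 0;
    for (size_t i = 0; i < count; i++) {
        buffer[i] = reversed[count - 1 - i];
    }
    return count;
}

/* Writes the whole buffer with write(), retrying on EINTR and short writes.
   Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *data, size_t length) {
    while (length > 0) {
        ssize_t written = write(fd, data, length);
        if (written < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        data += written;
        length -= (size_t)written;
    }
    return 0;
}
```

Both helpers work on caller-provided buffers, so nothing is allocated at crash time.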


One other option here is to use a third-party crash reporter. I’ve looked at a lot of third-party crash reporters, and I’ve yet to see one that actually follows all the rules )-:

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Thanks for the reply!

You're right, I probably underestimated the difficulty. However, I've already read your interesting post about implementing a custom crash reporter, and many other related articles on the web. Following a previous discussion with you, I finally decided to implement minimal handling of signals and C++ exceptions, and to postpone the dialog I was displaying (to inform the user that we're aware of the problem and are trying to fix it) until the next startup (if any...).

My goal here is not to test the real, unstable context of miscellaneous crashes, but to test the mechanism of saving some files to disk while crashing and reacting to the existence of these files during the next launch. This kind of thing is difficult to test, but I recently found that it is possible with UI tests thanks to their proxy mechanism.

I'm aware of the strong limitations: to write to disk, I only use open, write and other allowed C functions. The first file saved is just an empty file that marks the session as crashed. The data then written to disk is not a crash report but a short description of the application context at crash time, to help understand what happened. This context is not captured within the handler but before, during the normal usage of the app. This way, the writing is only done with dates, numbers and strings already stored, so that no memory is allocated.
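For readers following along, the marker-file mechanism described above could look roughly like this in plain C. This is a sketch under assumptions, not the poster's actual code: the function names and the fixed-size path buffer are invented for illustration, and only mark_session_crashed is meant to be called from a handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Marker path, filled in at startup so the handler never has to build a string. */
static char g_marker_path[1024];

/* Call during normal startup, not from the handler.
   Returns 0 on success, -1 if the path does not fit. */
static int prepare_marker_path(const char *path) {
    size_t length = strlen(path);
    if (length >= sizeof(g_marker_path)) return -1;
    memcpy(g_marker_path, path, length + 1);
    return 0;
}

/* Intended for the signal handler: creates the empty marker file using
   only open() and close(). Returns 0 on success, -1 on failure. */
static int mark_session_crashed(void) {
    int fd = open(g_marker_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return -1;
    return close(fd);
}

/* Call at the next launch: returns 1 and removes the marker if the previous
   session left one behind, otherwise returns 0. */
static int previous_session_crashed(void) {
    if (access(g_marker_path, F_OK) != 0) return 0;
    unlink(g_marker_path);
    return 1;
}
```

Because the marker is created through an ordinary function, the save-and-detect round trip can be exercised by calling mark_session_crashed directly, without raising any signal.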

Naturally, this is not perfect, but rest assured that I'm really trying to do my best and that I take your recommendations very seriously.

Finally, about dispatch_source: should I understand that to support SIGABRT and the other signals you mentioned, I should implement a second handler using signal or sigaction? I mean, not only for tests but even for production code?

There's no doubt that working with a child process would be the best option. However, it would also be a little dangerous for me to use such a technique, as I'm not at all familiar with it. Moreover, I'm afraid that it could lead to some complex sandbox issues that would take a lot of time to solve. But I'll keep this interesting suggestion in mind for a future update (I've seen a nice example you wrote in Swift and Objective-C; it could be a good starting point). For now, I will go with snapshots and lock-free structures, which are more familiar to me, and I will install my signal handlers with sigaction instead of Dispatch sources.

Thanks for the great help!