Using raise in GCD can cause timing issues with the signal mechanism.

When we call raise from a block running on a GCD queue, the signal handler is executed asynchronously, whereas on a thread created with pthread it is executed synchronously, as expected.

Example:

#include <Foundation/Foundation.h>
#include <pthread/pthread.h>

static void HandleSignal(int sigNum, siginfo_t* signalInfo, void* userContext) {
    printf("handle signal %d\n", sigNum);
    printf("begin sleep\n");
    sleep(3);
    printf("end sleep\n");
}

void InstallSignal(void) {
    static const int g_fatalSignals[] =
    {
        SIGABRT,
        SIGBUS,
        SIGFPE,
        SIGILL,
        SIGPIPE,
        SIGSEGV,
        SIGSYS,
        SIGTRAP,
    };
    int fatalSignalsCount = sizeof(g_fatalSignals) / sizeof(int);
    struct sigaction action = {{0}};
    action.sa_flags = SA_SIGINFO | SA_ONSTACK;
#if defined(__LP64__)
    action.sa_flags |= SA_64REGSET;
#endif
    sigemptyset(&action.sa_mask);
    action.sa_sigaction = &HandleSignal;
    struct sigaction pre_sa;
    for(int i = 0; i < fatalSignalsCount; i++) {
        int sigResult = sigaction(g_fatalSignals[i], &action, &pre_sa);
    }
}

void* RaiseAbort(void *userdata) {
    raise(SIGABRT);
    printf("signal handler has finished\n");
    return NULL;
}

int main(int argc, const char * argv[]) {

    InstallSignal();

    dispatch_async(dispatch_get_global_queue(0, 0), ^{
        raise(SIGABRT);
        // abort(); // abort() is ok
        RaiseAbort(NULL);
    });
    
    // pthread is ok
    // pthread_t tid;
    // int ret = pthread_create(&tid, NULL, RaiseAbort, NULL);
    // if (ret != 0) {
    //     fprintf(stderr, "create thread failed\n");
    //     return EXIT_FAILURE;
    // }
    [[NSRunLoop mainRunLoop] run];
    return 0;
}

Console log:

signal handler has finished
handle signal 6
begin sleep
end sleep
Answered by DTS Engineer in 850673022

OK, let’s take a step back: What are you trying to do with signals?

It’s very hard to use signals correctly. That’s especially true if you’re doing this in a process that uses Apple frameworks. For example, the code you posted is illegal because it calls printf from an async signal handler, and printf is not an async-signal-safe function.
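
To make the async-signal-safety point concrete, here is a minimal sketch of a handler that limits itself to write and a sig_atomic_t flag. The names handle_signal, install_handler, and g_last_signal are invented for this illustration and do not come from the posted code.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set by the handler so that ordinary code can see which signal arrived. */
static volatile sig_atomic_t g_last_signal = 0;

/* Hypothetical handler that only uses async-signal-safe operations:
   write(2) instead of printf(3), plus a store to a sig_atomic_t flag. */
static void handle_signal(int sig, siginfo_t *info, void *context) {
    (void)info;
    (void)context;
    static const char msg[] = "signal received\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)ignored;
    g_last_signal = sig;
}

/* Installs the handler for one signal. Returns 0 on success, -1 on failure. */
static int install_handler(int sig) {
    struct sigaction action;
    memset(&action, 0, sizeof(action));
    action.sa_flags = SA_SIGINFO;
    sigemptyset(&action.sa_mask);
    action.sa_sigaction = handle_signal;
    return sigaction(sig, &action, NULL);
}
```

Calling install_handler(SIGUSR1) and then raise(SIGUSR1) from an ordinary thread should leave g_last_signal equal to SIGUSR1 by the time raise returns. Anything more elaborate, such as formatting numbers, has to be done by hand in the handler rather than with the printf family.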

Most folks who ask about signal handlers are trying to implement a crash reporter. That’s impossible to do well. I talk about it in great depth in Implementing Your Own Crash Reporter.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

We attempted to capture Go panic crash information on Darwin, but found that Go panics triggered by cgo calls under GCD fail to generate logs. Through comparative testing, we discovered that the issue arises because calling raise to throw SIGABRT does not block the thread under GCD. I'm curious: is this problem caused by GCD or by XNU? Also, could you please explain how this issue occurs? Thanks.

Meanwhile, in this case, the stack trace of the crashed thread generated by Apple CrashReporter is also incorrect.

The test code is as follows:

#import <Foundation/Foundation.h>
int main(int argc, const char * argv[]) {
    dispatch_async(dispatch_get_global_queue(0, 0), ^{
        raise(SIGABRT);
    });
    printf("Test");
    [[NSRunLoop mainRunLoop] run];
    return 0;
}

Crash log generated by Apple CrashReporter:

Exception Type:        EXC_CRASH (SIGABRT)
Exception Codes:       0x0000000000000000, 0x0000000000000000

Termination Reason:    Namespace SIGNAL, Code 6 Abort trap: 6
Terminating Process:   CrashTestMac [32830]

Thread 0 Crashed:
0   libdispatch.dylib             	       0x195ef99ac 0x195ef5000 + 18860
1   CoreFoundation                	       0x1961226f0 0x19611d000 + 22256
2   CoreFoundation                	       0x19615b5a4 0x19611d000 + 255396
3   CoreFoundation                	       0x19615b4f4 0x19611d000 + 255220
4   CoreFoundation                	       0x19615c69c 0x19611d000 + 259740
5   CoreFoundation                	       0x19615c46c 0x19611d000 + 259180
6   CoreFoundation                	       0x19626bd34 0x19611d000 + 1371444
7   libdispatch.dylib             	       0x195f1085c 0x195ef5000 + 112732
8   libdispatch.dylib             	       0x195ef9a28 0x195ef5000 + 18984
9   CoreFoundation                	       0x19616062c 0x19611d000 + 276012
10  Foundation                    	       0x1977a793c 0x19770d000 + 633148
11  CrashTestMac                  	       0x10091bea0 main + 104 (main.m:7)
12  dyld                          	       0x195d0eb98 0x195d08000 + 27544

The crash site is incorrectly shown as [[NSRunLoop mainRunLoop] run], and when debugging with lldb, it still points to this line.

Sorry I didn’t reply sooner. I wasn’t notified of your earlier posts )-:

calling raise to throw SIGABRT does not block the thread under GCD.

By “under GCD” I presume you mean “when using a Dispatch signal event source to handle the signal”. If so, then, yes, that’s expected. Dispatch signal event sources are very much like kqueues, in that they are delivered asynchronously. Thus if you raise a signal by calling raise (or kill with your own pid) then the thread calling raise may well return before the signal is delivered to the queue.

If you’re building your own crash reporter — which, as I noted above, is something I specifically discourage — and you choose to implement in-process crash reporting via signals — again, not something I recommend — then you have to use a signal handler rather than a Dispatch signal event source.


Meanwhile, in this case, the stack trace of the crashed thread generated by Apple CrashReporter is also incorrect.

It took me a while to figure out what’s going on here. Lemme explain.

Your call to raise on line 4 is running on a Dispatch worker thread. Those threads block signal delivery by default. If you replace that statement with a call to abort, you’ll get the backtrace you expect:

Thread 1 Crashed::  Dispatch queue: com.apple.root.default-qos
0  libsystem_kernel.dylib  … __pthread_kill + 8
1  libsystem_pthread.dylib … pthread_kill + 296
2  libsystem_c.dylib       … abort + 124
3  Test794589              … __main_block_invoke + 28 (main.m:4)
4  libdispatch.dylib       … _dispatch_call_block_and_release + 32
5  libdispatch.dylib       … _dispatch_client_callout + 16
6  libdispatch.dylib       … _dispatch_queue_override_invoke.cold.3 + 32
7  libdispatch.dylib       … _dispatch_queue_override_invoke + 848
8  libdispatch.dylib       … _dispatch_root_queue_drain + 364
9  libdispatch.dylib       … _dispatch_worker_thread2 + 156
10 libsystem_pthread.dylib … _pthread_wqthread + 232
11 libsystem_pthread.dylib … start_wqthread + 8

The mechanics of this are interesting. It’s not just that these worker threads mask the signal using pthread_sigmask. Rather, they’re tied into a work queue facility that prevents them from taking signals. This causes pthread_kill to return ENOTSUP, in response to which raise calls kill on the process itself. This alternative delivery path disconnects the source of the signal from the calling thread, and so the Apple crash reporter acts like you’d sent a SIGABRT from some other process.
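
The fallback described above can be modeled roughly as follows. This is a simplified sketch written from that description, not the actual Libc source; raise_model, on_signal, and g_handled are hypothetical names used only here.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

/* Records the last signal seen by the handler below. */
static volatile sig_atomic_t g_handled = 0;

/* Trivial handler so that delivery can be observed from ordinary code. */
static void on_signal(int sig) {
    g_handled = sig;
}

/* Hypothetical model of the raise behavior described above. */
static int raise_model(int sig) {
    /* Preferred path: target the calling thread. The handler then runs on
       this thread before the call returns. */
    int err = pthread_kill(pthread_self(), sig);
    if (err == ENOTSUP) {
        /* Fallback path, assumed to be what happens on a Dispatch worker
           thread: signal the whole process. The signal is no longer tied
           to this thread, so it may be handled elsewhere and later. */
        return kill(getpid(), sig);
    }
    if (err != 0) {
        errno = err; /* pthread_kill reports errors via its return value */
        return -1;
    }
    return 0;
}
```

On an ordinary thread the first path is taken, so after signal(SIGUSR1, on_signal) a call to raise_model(SIGUSR1) returns only once g_handled has been set. On the fallback path there is no such ordering guarantee, which matches the "signal handler has finished" line appearing first in the original console log.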

If you’re curious, check out the use and implementation of __pthread_workqueue_setkill in the Darwin source.

The obvious fix here is to call abort rather than raise.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"
