Pipe and Poll MacM1.

Hi, for a couple of days I have been investigating a bug that my application has specifically on Mac M1. All of the code is C, on macOS Monterey (I began on Big Sur but updated recently).

In a nutshell, we allow the user to send custom commands to the OS. In practice, we create two pairs of pipes (and a third one for errors, but it is not pertinent here).

Input parent, where the command is written -> output child, where the child reads the command. Input child, where the child writes the output of the command -> output parent, where we read the return value.
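To make the layout above concrete, here is a minimal sketch of the two-pipe pattern. It is not our production code: the helper name run_command, the use of /bin/sh as the child, and the 2500 ms timeout are assumptions for illustration, and the third (error) pipe is left out.

```c
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: feeds `command` to /bin/sh and copies what the shell
 * prints into `out` (NUL-terminated). Returns the byte count, or -1. */
ssize_t run_command(const char *command, char *out, size_t cap)
{
    int to_child[2];   /* parent writes [1] -> child reads [0]  */
    int from_child[2]; /* child writes [1]  -> parent reads [0] */

    if (cap == 0 || pipe(to_child) < 0)
        return -1;
    if (pipe(from_child) < 0) {
        close(to_child[0]);
        close(to_child[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(to_child[0]);
        close(to_child[1]);
        close(from_child[0]);
        close(from_child[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: the pipes become stdin and stdout of the shell. */
        dup2(to_child[0], STDIN_FILENO);
        dup2(from_child[1], STDOUT_FILENO);
        close(to_child[0]);
        close(to_child[1]);
        close(from_child[0]);
        close(from_child[1]);
        execl("/bin/sh", "sh", (char *)NULL);
        _exit(127); /* only reached if exec failed */
    }

    /* Parent: keep only the ends it actually uses. */
    close(to_child[0]);
    close(from_child[1]);

    /* Send the command, then close so the shell sees end of input. */
    size_t len = strlen(command);
    int ok = write(to_child[1], command, len) == (ssize_t)len
          && write(to_child[1], "\n", 1) == 1;
    close(to_child[1]);

    size_t used = 0;
    struct pollfd pfd = { .fd = from_child[0], .events = POLLIN, .revents = 0 };
    while (ok && used < cap - 1) {
        int rc = poll(&pfd, 1, 2500);
        if (rc < 0 && errno == EINTR)
            continue;                 /* interrupted by a signal: poll again */
        if (rc <= 0) {                /* timeout or a real error */
            ok = 0;
            break;
        }
        ssize_t n = read(from_child[0], out + used, cap - 1 - used);
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0) {
            ok = 0;
            break;
        }
        if (n == 0)
            break;                    /* child closed its end: all output read */
        used += (size_t)n;
    }
    out[used] = '\0';
    close(from_child[0]);

    if (!ok)
        kill(pid, SIGKILL);           /* do not wait forever on a stuck child */
    while (waitpid(pid, NULL, 0) < 0 && errno == EINTR)
        ;
    return ok ? (ssize_t)used : -1;
}
```

Calling run_command("echo hello", buf, sizeof buf) should leave the shell's output in buf. The sketch assumes the command is short enough to fit in the pipe buffer and does not guard against SIGPIPE if the child exits early.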

We configure them with "close on exec", "non blocking" and "asynchronous", and use "FIOSETOWN" so that a signal is raised when data is ready.
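For reference, a configuration along those lines might look roughly like the sketch below. The helper name configure_pipe_end is hypothetical, and it uses fcntl(F_SETOWN) as a stand-in for the FIOSETOWN ioctl mentioned above, so it is not a copy of what our code does.

```c
#define _DEFAULT_SOURCE /* exposes O_ASYNC on glibc under strict -std modes */
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: marks one pipe end close-on-exec, non-blocking and
 * asynchronous, with SIGIO delivered to the calling process.
 * Returns 0 on success, -1 with errno set on failure. */
int configure_pipe_end(int fd)
{
    /* Close-on-exec is a descriptor flag (F_SETFD). */
    int fdflags = fcntl(fd, F_GETFD);
    if (fdflags < 0 || fcntl(fd, F_SETFD, fdflags | FD_CLOEXEC) < 0)
        return -1;

    /* Choose who receives SIGIO before enabling asynchronous mode.
     * Assumption: F_SETOWN replaces the FIOSETOWN ioctl here. */
    if (fcntl(fd, F_SETOWN, getpid()) < 0)
        return -1;

    /* Non-blocking and asynchronous are file status flags (F_SETFL). */
    int flflags = fcntl(fd, F_GETFL);
    if (flflags < 0 || fcntl(fd, F_SETFL, flflags | O_NONBLOCK | O_ASYNC) < 0)
        return -1;

    return 0;
}
```

A SIGIO handler should be installed before any data can arrive, because the default action for SIGIO differs between systems (on Linux it terminates the process).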

We fork and send the command from the child. Then we wait in the parent with poll until data is ready in output parent. This pattern works like a charm on all of our supported platforms (including Mac x64 Monterey with Xcode 14 and a Unix A64v8). But on Mac M1 (tested on Big Sur and Monterey with Xcode 13 and 14), it looks like poll never reaches the state "some data is ready to be read", even when there is data ready to read.

One possible conflict that I can see here is that our application heavily uses SIGIO for other purposes, which forces us to ignore the EINTR error and just wait until poll returns successfully. If the file descriptor raises the signal before poll returns, poll returns the EINTR error, but the next poll should return immediately and successfully, as the data is still ready. So I assume that is not the issue here.

But there is probably something that I am missing. So here are some questions: are there any significant known differences between poll() on x64 or x86 and on M1?

Are there any known issues with poll behaviour on Mac M1?

What could we possibly be missing here?

It’s hard to say what’s going on here. It’s quite possible that this is just a race condition in your code that’s exposed by the M1. It’s also perfectly possible that this is a bug in macOS that’s exposed by M1.

Are you able to extract this code from your main product and reproduce the problem in a reasonably-sized test project?

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Hi, I have been playing a lot with poll to try to reproduce this, but I have a lot of confusing results. I was still convinced that we were doing something wrong, as we have a lot of abstraction layers to cross, so the probability is good. But with all the code flattened, it strangely succeeds every time. And on Mac x64, I can make it fail, but not in the way I expected.

I dug a little bit: basically the same context with 3 examples, each of which I ran multiple times to confirm. I only change the loop, and something is fishy.

Here is the loop:

    do {
        fprintf(pdDebugLog(FALSE),"looping until we do not have EINTR  %d \n", childFileDescriptors[0]);
        errno = 0;
        needMessageToWakeup = FALSE;
        if ((result = poll(&pfds,1,2500)) < 0)
            perror("err return: ");
        fprintf(pdDebugLog(FALSE),"result %d \n", result);
        perror("second test: ");
    }
    while (result == 0 && breakD == 0);

This never reaches the end (result is always 0).

    do {
        fprintf(pdDebugLog(FALSE),"looping until we do not have EINTR  %d \n", childFileDescriptors[0]);
        errno = 0;
        needMessageToWakeup = FALSE;
        result = poll(&pfds,1,2500);
        perror("err return: ");
        fprintf(pdDebugLog(FALSE),"result %d \n", result);
        perror("second test: ");
    }
    while (result == 0 && breakD == 0);

The interrupt error is raised immediately, which is what I expected.

    do {
        fprintf(pdDebugLog(FALSE),"looping until we do not have EINTR  %d \n", childFileDescriptors[0]);
        errno = 0;
        needMessageToWakeup = FALSE;
        result = poll(&pfds,1,2500);
        //perror("err return: ");
        fprintf(pdDebugLog(FALSE),"result %d \n", result);
        perror("second test: ");
    }
    while (result == 0 && breakD == 0);

Here it is basically the same code without the first perror. poll never returns an error, always 0 (after the timeout).

I may be wrong, but it looks like the structure of the code changes the overall result, which may point to a clang issue. So I took the third example again and added the optimization none attribute on the function. That makes it work all the time, like the second one. So it looks like a clang issue. I found nothing pertinent in the assembly of the loop, but I didn't push really hard.
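One thing that may be worth ruling out before blaming the compiler: in the loops above, perror is called after fprintf and even when poll did not fail, so the message it prints may not reflect what poll itself reported. The sketch below is a hypothetical diagnostic helper (poll_once_logged is not a name from our code) that zero-initializes the pollfd, captures errno immediately after poll, and logs revents.

```c
#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical diagnostic helper: performs one poll on `fd` and logs the
 * outcome to `log`. Returns poll's result, with errno restored to the
 * value poll left behind. */
int poll_once_logged(int fd, int timeout_ms, FILE *log)
{
    struct pollfd pfd;
    memset(&pfd, 0, sizeof pfd);   /* no field is left uninitialized */
    pfd.fd = fd;
    pfd.events = POLLIN;

    int result = poll(&pfd, 1, timeout_ms);
    int saved_errno = errno;       /* capture before fprintf can change it */

    if (result < 0)
        fprintf(log, "poll failed: %s\n", strerror(saved_errno));
    else
        fprintf(log, "poll returned %d, revents=0x%x\n",
                result, (unsigned)(unsigned short)pfd.revents);

    errno = saved_errno;
    return result;
}
```

errno is only meaningful when poll returns -1, so the helper prints it only in that case. If the three loops still behave differently with logging of this shape, that would make the compiler theory more convincing.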

I will try to make a short standalone example once I have found a workaround to make our product work. Sadly, I am still not 100% sure that this is not just a noisy bug unrelated to our issue. Even so, it would explain a lot of things that were not making sense. I am just reporting this in case someone is interested in taking a look.

What is breakD in these examples?

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"
