UDP socket bind with ephemeral port on macOS results in the OS allocating an already bound/in-use port

We have been observing an issue where, when binding a UDP socket to an ephemeral port (i.e. port 0), the OS ends up allocating a port that is already bound and in use. We see this on every macOS version we have access to (10.x through the recently released 13.x).

Specifically, we (or some other process) create a udp4 socket bound to the wildcard address and an ephemeral port. Then our program attempts a bind on a udp46 socket with an ephemeral port. The OS binds this socket to a port that is already in use. For example, here is the netstat output when that happens:

netstat -anv -p udp | grep 51630
udp46      0      0  *.51630                *.*                                 786896    9216  89318      0 00000 00000000 00000000001546eb 00000000 00000800      1      0 000001
udp4       0      0  *.51630                *.*                                 786896    9216  89318      0 00000 00000000 0000000000153d9d 00000000 00000800      1      0 000001

51630 is the (OS-allocated) port here, which as you can see has been allocated to two sockets. The process ID is the same in this case (because we ran an explicit reproducer), but that isn't always so.

We have a reproducer which consistently shows this behaviour. Before filing a Feedback Assistant issue, I wanted to check whether this is indeed a bug or whether we are missing something, since this appears to be a very basic thing.

How do you create/bind the socket?

What flags do you use (for proto)?

Both sockets above are owned by PID 89318, what PID is that?

Hello enodev,

How do you create/bind the socket?

We use low-level system calls. The udp4 socket creation/bind looks something like:

int fd = socket(AF_INET, SOCK_DGRAM, 0);
if (fd < 0) {
    printf("Failed to open socket: %d\n", errno);
    return -1;
}
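// SOCKETADDRESS (not shown) is a union with members sa (struct sockaddr),
// sa4 (struct sockaddr_in) and sa6 (struct sockaddr_in6)
SOCKETADDRESS sa;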
memset((char *)&sa, 0, sizeof(SOCKETADDRESS));
sa.sa4.sin_family = AF_INET;
sa.sa4.sin_port = 0;
sa.sa4.sin_addr.s_addr = htonl(0x0); // bind to wildcard
socklen_t len = sizeof(sa.sa4);
int res;
res = bind(fd, &sa.sa, len);
if (res < 0) {
    printf("Failed to bind: %d\n", errno);
    return -2;
}

The udp46 socket creation/bind looks similar, except for an additional setsockopt call that disables IPV6_V6ONLY on that socket:

int fd;
fd = socket(AF_INET6, SOCK_DGRAM, 0);
if (fd < 0) {
    printf("Failed to open socket: %d\n", errno);
    return -1;
}
// mark it as dual socket
int ipv6_only = 0; // dual socket
if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &ipv6_only, sizeof(int)) < 0) {
    return -2;
}
SOCKETADDRESS sa;
memset((char *)&sa, 0, sizeof(SOCKETADDRESS));
sa.sa6.sin6_family = AF_INET6;
sa.sa6.sin6_port = 0;
char caddr[16];
memset((char *)caddr, 0, 16); // wildcard
memcpy((void *)&sa.sa6.sin6_addr, caddr, sizeof(struct in6_addr));
socklen_t len = sizeof(sa.sa6);
int res;
res = bind(fd, &sa.sa, len);
if (res < 0) {
    printf("Failed to bind: %d\n", errno);
    return -3;
}
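
To confirm a collision from inside the process rather than through netstat, one option is to read back the assigned port after each bind with getsockname and compare the two values. The following is only a sketch of that check, not part of our reproducer: `bound_port` is a hypothetical helper name, and it takes a plain file descriptor rather than our SOCKETADDRESS union.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns the local port of a socket in host byte
 * order, or -1 if the port cannot be determined. An unbound socket
 * reports port 0. */
static int bound_port(int fd)
{
    struct sockaddr_storage ss;
    socklen_t len = sizeof(ss);

    memset(&ss, 0, sizeof(ss));
    if (getsockname(fd, (struct sockaddr *)&ss, &len) < 0) {
        return -1;
    }
    if (ss.ss_family == AF_INET) {
        return ntohs(((struct sockaddr_in *)&ss)->sin_port);
    }
    if (ss.ss_family == AF_INET6) {
        return ntohs(((struct sockaddr_in6 *)&ss)->sin6_port);
    }
    return -1; /* not an IP socket */
}
```

Calling this on the udp4 descriptor and then on the udp46 descriptor, and finding the same non-zero value while both are still open, would correspond to the duplicate allocation shown in the netstat output.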

Both sockets above are owned by PID 89318, what PID is that?

Please ignore that process ID. As I noted in the original description, it belongs to the reproducer that we ran to trigger this issue. The reproducer first creates a udp4 socket and binds it to an ephemeral port (leaving it bound until the program exits), and then creates a udp46 socket and binds it to an ephemeral port. Occasionally (but reproducibly), the udp46 bind call is assigned the port that is already in use. So in this case the process IDs are the same, but in production the processes are different and unrelated. For example, we have seen one of our programs bind a udp46 socket to an ephemeral port and get assigned a port already in use by the system's syslogd process, which has it bound as udp4.

tcp46 (and presumably udp46) are an ongoing source of pain with BSD Sockets. I recommend that you avoid them, and instead bind separate file descriptors for IPv4 and IPv6. If you want both to use the same ephemeral port, you need a loop like this:

repeat
  open and bind v4 to port 0
  get v4 port
  open and bind v6 to that port
  if success exit
  close v4
  close v6
end repeat
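
The loop above could be written in C roughly as follows. This is a sketch under stated assumptions, not code from Apple: `bind_udp_pair` and its `max_attempts` cap are hypothetical, and setting IPV6_V6ONLY to 1 on the IPv6 socket is an assumption made so that the two descriptors stay independent of each other.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: binds one UDP IPv4 socket and one v6-only UDP IPv6
 * socket to the same OS-chosen port. Returns 0 and fills in the descriptors
 * and the port (host byte order) on success, or -1 on a hard error or when
 * no shared port was found within max_attempts. */
static int bind_udp_pair(int *fd4_out, int *fd6_out, uint16_t *port_out,
                         int max_attempts)
{
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        /* open and bind v4 to port 0, then read back the assigned port */
        int fd4 = socket(AF_INET, SOCK_DGRAM, 0);
        if (fd4 < 0) {
            return -1;
        }
        struct sockaddr_in sa4;
        memset(&sa4, 0, sizeof(sa4));
        sa4.sin_family = AF_INET;
        sa4.sin_port = 0;
        sa4.sin_addr.s_addr = htonl(INADDR_ANY);
        socklen_t len4 = sizeof(sa4);
        if (bind(fd4, (struct sockaddr *)&sa4, sizeof(sa4)) < 0 ||
            getsockname(fd4, (struct sockaddr *)&sa4, &len4) < 0) {
            close(fd4);
            return -1;
        }

        /* open v6 (assumption: v6-only) and bind it to that same port */
        int fd6 = socket(AF_INET6, SOCK_DGRAM, 0);
        int v6only = 1;
        if (fd6 < 0 ||
            setsockopt(fd6, IPPROTO_IPV6, IPV6_V6ONLY,
                       &v6only, sizeof(v6only)) < 0) {
            if (fd6 >= 0) {
                close(fd6);
            }
            close(fd4);
            return -1;
        }
        struct sockaddr_in6 sa6;
        memset(&sa6, 0, sizeof(sa6));
        sa6.sin6_family = AF_INET6;
        sa6.sin6_port = sa4.sin_port; /* already in network byte order */
        sa6.sin6_addr = in6addr_any;
        if (bind(fd6, (struct sockaddr *)&sa6, sizeof(sa6)) == 0) {
            *fd4_out = fd4;
            *fd6_out = fd6;
            *port_out = ntohs(sa4.sin_port);
            return 0;
        }

        /* the port was taken on the v6 side: close both and try again */
        close(fd4);
        close(fd6);
    }
    return -1;
}
```

A caller would invoke something like `bind_udp_pair(&fd4, &fd6, &port, 16)` and then use the two descriptors separately for IPv4 and IPv6 traffic. The attempt cap is there only so that the loop cannot spin forever.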

Having said that…

We have a reproducer which consistently shows this behaviour.

You should definitely file a bug about this. Please post your bug number, just for the record.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Hello Quinn,

Having said that…

We have a reproducer which consistently shows this behaviour.

You should definitely file a bug about this. Please post your bug number, just for the record.

Thank you for your input. I've filed the feedback issue as suggested. The issue id is FB12128351.

Has there been any progress on this issue? This bug has cost me a few days of debugging work to track down flaky test failures in quic-go. It also seems to be the root cause behind https://github.com/golang/go/issues/67226.

The issue id is FB12128351

Is the issue public? I'm getting a "Feedback Not Found" under this link.

Hello Marten,

Is the issue public? I'm getting a "Feedback Not Found" under this link.

The issue isn't public. None of the issues filed with Feedback Assistant are public. It's still open, and we run into it very regularly. I have been told in a different discussion that Apple is investigating the issue. There's no fix for it right now.

This bug has cost me a few days of debugging work to track down flaky test failures in quic-go. It also seems to be the root cause behind https://github.com/golang/go/issues/67226.

I am not from Apple, but my recommendation would be to file a Feedback Assistant issue of your own with these details (and any others you have) so that this gets additional attention. When filing it, I would recommend following Quinn's suggestions here: https://forums.developer.apple.com/forums/thread/751587?answerId=787971022#787971022 (specifically, choose Developer Technologies & SDKs at the top level when filing the issue).

P.S: I didn't receive any notification from this thread when you posted your message. I only accidentally happened to view this thread today and noticed your post.
