macos 15.3.x local network restrictions leading to EHOSTUNREACH "No route to host"

Continuing with my investigations of several issues that we have been noticing in our testing of the JDK with macosx 15.x, I have now narrowed down at least 2 separate problems for which I need help. For a quick background, starting with macosx 15.x several networking related tests within the JDK have started failing in very odd and hard to debug ways in our internal lab. Reading through the macos docs and with help from others in these forums, I have come to understand that a lot of these failures are to do with the new restrictions that have been placed for "Local Network" operations. I have read through https://developer.apple.com/documentation/technotes/tn3179-understanding-local-network-privacy and I think I understand the necessary background about these restrictions.

There's more than one issue in this area that I will need help with, so I'll split them out into separate topics in this forum. That above doc states:

macOS 15.1 fixed a number of local network privacy bugs. If you encounter local network privacy problems on macOS 15.0, retest on macOS 15.1 or later.

We did have (and continue to have) 15.0 and 15.1 macos instances within our lab which are impacted by these changes. They too show several networking related failures. However, I have decided not to look into those systems and instead focus only on 15.3.1.

People might see unexpected behavior in System Settings > Privacy & Security if they have multiple versions of the same app installed (FB15568200).

This feedback assistant issue and several others linked in this documentation are inaccessible (even when I log in with my existing account). I think it would be good to have some facility in the feedback assistant tool/site to make such issues visible (even if read-only) to be able to watch for updates to those issues.

So now coming to the issue. Several of the networking tests in the JDK do multicast testing (through the BSD sockets API) in order to test the Java SE multicast socket API implementations. One repeated failure we have been seeing in our labs is an exception with the message "No route to host". It shows up as:

Process id: 58700
...
java.net.NoRouteToHostException: No route to host
at java.base/sun.nio.ch.DatagramChannelImpl.send0(Native Method)
at java.base/sun.nio.ch.DatagramChannelImpl.sendFromNativeBuffer(DatagramChannelImpl.java:914)
at java.base/sun.nio.ch.DatagramChannelImpl.send(DatagramChannelImpl.java:871)
at java.base/sun.nio.ch.DatagramChannelImpl.send(DatagramChannelImpl.java:798)
at java.base/sun.nio.ch.DatagramChannelImpl.blockingSend(DatagramChannelImpl.java:857)
at java.base/sun.nio.ch.DatagramSocketAdaptor.send(DatagramSocketAdaptor.java:178)
at java.base/java.net.DatagramSocket.send(DatagramSocket.java:593)

(This is just one example stack trace from a Java program.)

That "send0" is implemented by the JDK by invoking the sendto() system call. In this case, the sendto() is returning an EHOSTUNREACH error which is what is then propagated to the application.
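For anyone who wants to reproduce this outside the JDK test suite, here is a minimal sketch of a standalone probe. It is only an illustration under assumptions: the class name MulticastSendProbe, the group 239.255.0.1 and the port 9999 are hypothetical values and are not taken from the failing tests. It sends one datagram through DatagramChannel and reports whether the failure is a NoRouteToHostException, which is how the EHOSTUNREACH from sendto() shows up in Java.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.NoRouteToHostException;
import java.net.StandardProtocolFamily;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.nio.charset.StandardCharsets;

public class MulticastSendProbe {

    // Maps a send failure to a coarse label. An EHOSTUNREACH from sendto()
    // is reported by the JDK as NoRouteToHostException.
    public static String classify(IOException e) {
        if (e instanceof NoRouteToHostException) {
            return "EHOSTUNREACH";
        }
        return "other";
    }

    // Sends a single small datagram to group:port and returns "sent" on
    // success, or the failure label plus the exception message.
    public static String probe(String group, int port) {
        try (DatagramChannel ch = DatagramChannel.open(StandardProtocolFamily.INET)) {
            ByteBuffer payload = ByteBuffer.wrap("ping".getBytes(StandardCharsets.UTF_8));
            ch.send(payload, new InetSocketAddress(InetAddress.getByName(group), port));
            return "sent";
        } catch (IOException e) {
            return classify(e) + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Hypothetical group and port, chosen only for illustration.
        System.out.println(probe("239.255.0.1", 9999));
    }
}
```

Running the same probe from a Terminal session and from the launchd-started hierarchy should show whether the failure depends on how the process was launched rather than on the Java code itself.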

The forum text editor doesn't allow me to post long text, so I'm going to post the rest of this investigation and logs as a reply.

Here's the relevant system logs from one such failing setup:

PID Type Date & Time Process Message
706 default 2025-03-13 09:53:42.297782 +0000 cfprefsd [0x74c933e80] activating connection: mach=false listener=false peer=true name=com.apple.cfprefsd.daemon.peer[58700].0x74c933e80
default 2025-03-13 09:53:42.337316 +0000 kernel AppleAVD: addFrameWorkLoad(): clientID 0 clientFPS too slow! Forcing VMAX! rollingAvgFPS: 186 < streamFPS: 240 c[0].activeLoadRate 1990656000 c[0].clientCount 1 minSpeed 1
default 2025-03-13 09:53:42.391103 +0000 kernel AppleAVD: addFrameWorkLoad(): clientID 0 clientFPS too slow! Forcing VMAX! rollingAvgFPS: 186 < streamFPS: 240 c[0].activeLoadRate 1990656000 c[0].clientCount 1 minSpeed 1
638 info 2025-03-13 09:53:42.444295 +0000 UserEventAgent Got local network blocked notification: pid: 7384, uuid: 4E7709E7-AD5C-38B8-9ED0-0354767877BD, bundle_id: (null)
638 default 2025-03-13 09:53:42.444323 +0000 UserEventAgent LocalNetwork: found bundle id marco-foobar by PID
638 info 2025-03-13 09:53:42.444489 +0000 UserEventAgent LocalNetwork: did not find bundle ID for UUID 4E7709E7-AD5C-38B8-9ED0-0354767877BD
638 info 2025-03-13 09:53:42.444489 +0000 UserEventAgent Found bundle ID: marco-foobar
726 info 2025-03-13 09:53:42.444717 +0000 nehelper application record search init. Node: (null) bundleID: <private> itemID: 0
708 info 2025-03-13 09:53:42.445100 +0000 runningboardd PERF: Received request from [osservice<com.apple.nehelper>:726] (euid 0, auid 0) (persona (null)): lookupHandleForPredicate:error:
708 default 2025-03-13 09:53:42.445107 +0000 runningboardd PERF: Received lookupHandleForPredicate request from [osservice<com.apple.nehelper>:726] (euid 0, auid 0) (persona (null))
708 info 2025-03-13 09:53:42.445481 +0000 runningboardd [osservice<com.apple.nehelper>:726] handle lookup could not find a matching process
708 info 2025-03-13 09:53:42.445554 +0000 runningboardd Error handling message from [osservice<com.apple.nehelper>:726]: <Error Domain=RBSRequestErrorDomain Code=3 "Specified predicate did not match any processes" UserInfo={NSLocalizedFailureReason=Specified predicate did not match any processes}>
debug 2025-03-13 09:53:42.445918 +0000 kernel igmp_append_relq: adding inm e76b811a46a86ff3 on relq ifp en0
debug 2025-03-13 09:53:42.445930 +0000 kernel igmp_append_relq: adding inm e76b811a46a86f13 on relq ifp en0
debug 2025-03-13 09:53:42.445954 +0000 kernel igmp_flush_relq: flushing e76b811a46a86ff3 on relq ifp en0
debug 2025-03-13 09:53:42.445957 +0000 kernel igmp_flush_relq: flushing e76b811a46a86f13 on relq ifp en0
708 info 2025-03-13 09:53:42.446219 +0000 runningboardd PERF: Received request from [osservice<com.apple.nehelper>:726] (euid 0, auid 0) (persona (null)): lookupHandleForPredicate:error:
708 default 2025-03-13 09:53:42.446224 +0000 runningboardd PERF: Received lookupHandleForPredicate request from [osservice<com.apple.nehelper>:726] (euid 0, auid 0) (persona (null))
708 info 2025-03-13 09:53:42.446495 +0000 runningboardd _multiInstance = 0
708 info 2025-03-13 09:53:42.446497 +0000 runningboardd _executablePath = /opt/marco-foobar/start-marco-foobar.bash
708 info 2025-03-13 09:53:42.446500 +0000 runningboardd no additional launch properties found for <private>
708 default 2025-03-13 09:53:42.446543 +0000 runningboardd _resolveProcessWithIdentifier pid 7384 euid 0 auid 0
708 default 2025-03-13 09:53:42.446584 +0000 runningboardd Resolved pid 7384 to [osservice<polo-foobar>:7384]
708 error 2025-03-13 09:53:42.446629 +0000 runningboardd memorystatus_control error: MEMORYSTATUS_CMD_CONVERT_MEMLIMIT_MB(-1) returned -1 22 (Invalid argument)
708 error 2025-03-13 09:53:42.446631 +0000 runningboardd memorystatus_control error: MEMORYSTATUS_CMD_CONVERT_MEMLIMIT_MB(0) returned -1 22 (Invalid argument)
708 default 2025-03-13 09:53:42.447432 +0000 runningboardd Full encoding handle <private>, with data 47a8317e00001cd8, and pid 7384
708 default 2025-03-13 09:53:42.447678 +0000 runningboardd [osservice<polo-foobar>:7384] is not RunningBoard jetsam managed.
708 default 2025-03-13 09:53:42.447686 +0000 runningboardd [osservice<polo-foobar>:7384] This process will not be managed.
708 info 2025-03-13 09:53:42.448200 +0000 runningboardd PERF: Received request from [osservice<com.apple.nehelper>:726] (euid 0, auid 0) (persona (null)): lookupProcessName:error:
726 info 2025-03-13 09:53:42.448319 +0000 nehelper No team ID found for (bundleID: marco-foobar, name: marco-foobar)

This log was captured when the java process 58700 initiated the sendto() and ran into an exception. That's the entirety of the log during that duration and I haven't left out anything.

Here's what I understand of the log messages that I pasted in my previous reply. Remember that the process id of interest is 58700. In the log above, you will notice that the only place where this process id 58700 is present is the line:

cfprefsd	[0x74c933e80] activating connection: mach=false listener=false peer=true name=com.apple.cfprefsd.daemon.peer[58700].0x74c933e80

That line isn't too interesting and I don't know if it's relevant in this discussion, so I'll skip that one. The good thing however is that these logs appear to have captured enough details about the "Local Network" restriction's implementation. Several of those above log messages have something interesting, and I think it can be summarized by these few:

UserEventAgent Got local network blocked notification: pid: 7384, uuid: 4E7709E7-AD5C-38B8-9ED0-0354767877BD, bundle_id: (null)
UserEventAgent LocalNetwork: found bundle id marco-foobar by PID
UserEventAgent LocalNetwork: did not find bundle ID for UUID 4E7709E7-AD5C-38B8-9ED0-0354767877BD
UserEventAgent Found bundle ID: marco-foobar
nehelper application record search init. Node: (null) bundleID: <private> itemID: 0
...
708 info 2025-03-13 09:53:42.446497 +0000 runningboardd _executablePath = /opt/marco-foobar/start-marco-foobar.bash
708 info 2025-03-13 09:53:42.446500 +0000 runningboardd no additional launch properties found for <private>
708 default 2025-03-13 09:53:42.446543 +0000 runningboardd _resolveProcessWithIdentifier pid 7384 euid 0 auid 0
708 default 2025-03-13 09:53:42.446584 +0000 runningboardd Resolved pid 7384 to [osservice<polo-foobar>:7384]
...
runningboardd [osservice<polo-foobar>:7384] is not RunningBoard jetsam managed.
runningboardd [osservice<polo-foobar>:7384] This process will not be managed.
runningboardd PERF: Received request from [osservice<com.apple.nehelper>:726] (euid 0, auid 0) (persona (null)): lookupProcessName:error:
nehelper No team ID found for (bundleID: marco-foobar, name: marco-foobar)

So when the java program (in process 58700) initiated the sendto() local network operation, it appears to have triggered a "local network blocked notification". I'm not sure if that log line from UserEventAgent means that the notification pop-up (asking the user to allow/disallow the operation) was generated or whether it is determining if the pop-up needs to be generated. In any case, that log message indicates that the pid is 7384. Looking at the additional data I collected from that system and the documentation of "Local Networking" which states:

When a process performs a local network operation, macOS tries to track down the responsible code. For example, if your app spawns a helper tool and the helper tool performs a local network operation, macOS considers the app to be the responsible code.

So what that log tells me is that the local network restriction has identified 7384 process as the top level root application for the 58700 java process which is doing the networking operation. So I went back and looked into the process and launch hierarchy on this setup.
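As a rough aid for this kind of investigation, the launch hierarchy can also be printed from inside the java process itself. The following is a minimal sketch using the standard ProcessHandle API; the class name AncestorChain and the depth limit of 64 are my own choices for illustration. Note that this only walks the BSD parent chain. I am assuming, not asserting, that this chain is related to what macOS treats as the "responsible code", since the documentation doesn't say how that is determined.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class AncestorChain {

    // Returns the given process followed by its ancestors, nearest first.
    // The depth limit is a safety guard against unexpectedly long chains.
    public static List<ProcessHandle> ancestors(ProcessHandle start) {
        List<ProcessHandle> chain = new ArrayList<>();
        Optional<ProcessHandle> cur = Optional.of(start);
        while (cur.isPresent() && chain.size() < 64) {
            chain.add(cur.get());
            cur = cur.get().parent();
        }
        return chain;
    }

    // Formats pid and executable path; "?" when the information is not available.
    public static String describe(ProcessHandle p) {
        return p.pid() + " cmd=" + p.info().command().orElse("?");
    }

    public static void main(String[] args) {
        for (ProcessHandle p : ancestors(ProcessHandle.current())) {
            System.out.println(describe(p));
        }
    }
}
```

On the failing setup, I would expect the output to include the 7384 process somewhere above the java process, but I haven't verified that with this exact code.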

7384 is a process launched through launchd with the /opt/marco-foobar/start-marco-foobar.bash bash script file as the executable. When I say launchd, I may not be using the right term here. I think the /opt/marco-foobar/start-marco-foobar.bash bash script gets launched whenever that macos system starts (I will confirm that today). The plist file which configures this bash script as the entry point looks something like:

<plist version="1.0">
<dict>
<key>Label</key>
<string>polo-foobar</string>
<key>ProgramArguments</key>
<array>
<string>/opt/marco-foobar/start-marco-foobar.bash</string>
...

and launchctl list on that system shows:

PID Status Label
...
7384 0 polo-foobar
...

The /opt/marco-foobar/start-marco-foobar.bash ultimately ends up doing a:

...
exec /opt/polo/bin/polo-foobar

polo-foobar is a 3rd party binary and doesn't have any plist file of its own:

launchctl plist /opt/polo/bin/polo-foobar
64-bit Mach-O does not have a __TEXT,__info_plist or is invalid.

polo-foobar is a generic (3rd party) framework which allows application specific code to be executed. In this case, it ends up launching a java process which then further launches (through Java's ProcessBuilder API https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/lang/ProcessBuilder.html#start()) the ultimate 58700 java process.

So given all this, it seems to me that the entry point process /opt/marco-foobar/start-marco-foobar.bash (which "exec"ed the polo-foobar binary) is being denied access to the network operation. Did I understand it right? If so, how do I go about addressing this issue? The local networking document states:

If you ship a launchd agent that’s not installed using SMAppService, make macOS aware of the responsible code by setting the AssociatedBundleIdentifiers property in your launchd property list.

Does that mean I need to add the AssociatedBundleIdentifiers property to the plist snippet that I shared previously? What value (values?) would I add to it?

Furthermore, these processes are running on a system which acts like a "server" and there's no user interaction involved. So what are the options of making this "allow this process (sequence) access to networking operations" non-interactive and configurable/automatable?

While at it, I would like to note that the output of ps -Meo pid,pcpu,cputime,start,pmem,vsz,rss,state,wchan,user,args for this top level 7384 process which seems to have been denied the network operation access shows that it is running as root:

USER PID TT %CPU STAT PRI STIME UTIME COMMAND PID %CPU TIME STARTED %MEM VSZ RSS STAT WCHAN USER ARGS
root 7384 ?? 0.0 S 20T 0:00.03 0:00.55 /opt/ma 7384 0.0 73:40.19 1Mar25 0.4 412145952 58976 S - root /opt/polo/bin/polo-foobar

So this local network access denial for this process seems to go against what the local networking documentation states:

macOS considerations

macOS maintains separate local network privacy state for each user account.

macOS automatically allows local network access by:

Any daemon started by launchd
Any program running as root
...

Overall this feels way too complicated to manage and if I understand it correctly, none of these issues has to do with java itself and I can imagine the exact same launch sequence leading to a go (or even python) application which uses that language's standard networking APIs to run into this same thing. Have I misunderstood this?

While at it, I would also like to understand if those above log messages show any other issues that might need to be addressed.

That "send0" is implemented by the JDK by invoking the sendto() system call. In this case, the sendto() is returning an EHOSTUNREACH error which is what is then propagated to the application.

I would like to note that this (and one other error message that I'm investigating), I feel, are a bit misleading. When these issues showed up several weeks back, I and others started looking into this. At first we spent several days trying to understand if this was something to do with the networking configurations on the host or other devices. We had to involve some network admins from our lab to try and debug this. After several days of investigation, we came to realize that the "Local Networking" restrictions put in place in 15.x of macosx are at play.

7384 is a process launched through launchd with the /opt/marco-foobar/start-marco-foobar.bash bash script file as the executable. When I say launchd, I may not be using the right term here. I think the /opt/marco-foobar/start-marco-foobar.bash bash script gets launched whenever that macos system starts (I will confirm that today).

I've confirmed that this process is indeed launched by placing the plist file in the /Library/LaunchDaemons directory. That plist file, as noted in my previous post contains:

<key>ProgramArguments</key>
<array>
<string>/opt/marco-foobar/start-marco-foobar.bash</string>
</array>

So the 7384 process that has been identified as the root top level app/process is a LaunchDaemon.

Written by jaikiran in 776552021
This feedback assistant issue and several others linked in this documentation are inaccessible

Indeed. This is the way that Apple’s bug tracking system works. There are some ways to get some insight into existing bugs — see Bug Reporting: How and Why? for more on that — but there’s no way to view the contents of arbitrary bugs filed by other developers.

Coming back to your main issue, you wrote:

Written by jaikiran in 829329022
I've confirmed that this process is indeed launched by placing the plist file in the /Library/LaunchDaemons directory.

OK, cool, that was the next thing I was gonna ask.

Earlier you posted an excerpt from your launchd property list file, but not the whole thing. I’m specifically curious if:

  • The launchd property list has a UserName property to run the job as a user other than root.

  • If this script, or anything in the chain between it and the Java code that’s calling sendto, is changing the user ID. For example, by calling setuid or setgid or using tools like su or sudo.

In general, a launchd daemon shouldn’t be troubled by local network privacy. That includes the daemon itself and any processes that it spawns. I’ve seen cases where that’s not the case but, at least so far, those are associated with daemons that change their user ID, either via the UserName property or explicitly. That can cause problems on macOS because macOS maintains a bunch of execution context beyond the standard BSD user and group IDs [1].

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

[1] The very old, but still surprisingly relevant, TN2083 Daemons and Agents covers this in much more detail.

Hello Quinn,

Earlier you posted an excerpt from your launchd property list file, but not the whole thing. I’m specifically curious if:

The launchd property list has a UserName property to run the job as a user other than root.

No, the plist file doesn't have a UserName.

For reference, here's the complete plist file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>polo-foobar</string>
<key>ProgramArguments</key>
<array>
<string>/opt/marco-foobar/start-marco-foobar.bash</string>
</array>
<key>WorkingDirectory</key>
<string>/cores</string>
<key>RunAtLoad</key>
<true/>
<key>OnDemand</key>
<false/>
<key>EnvironmentVariables</key>
<dict>
</dict>
<key>StandardErrorPath</key>
<string>/var/log/foo-err.log</string>
<key>StandardOutPath</key>
<string>/var/log/foo-out.log</string>
<key>KeepAlive</key>
<true/>
<key>HardResourceLimits</key>
<dict>
<key>Core</key>
<integer>9223372036854775807</integer> <!-- RLIM_INFINITY -->
</dict>
<key>SoftResourceLimits</key>
<dict>
<key>Core</key>
<integer>9223372036854775807</integer> <!-- RLIM_INFINITY -->
</dict>
<key>ProcessType</key>
<string>Interactive</string>
</dict>
</plist>

If this script, or anything in the chain between it and the Java code that’s calling sendto, is changing the user ID. For example, by calling setuid or setgid or using tools like su or sudo.

I will look into that part and get back with the details.

[1] The very old, but still surprisingly relevant, TN2083 Daemons and Agents covers this in much more detail.

Yes, that's very good documentation. I've been reading through it the past few days.

While at it, coming to this point:

In this case, the sendto() is returning an EHOSTUNREACH error which is what is then propagated to the application.

I would like to note that this (and one other error message that I'm investigating), I feel, are a bit misleading.

Do you think it would be possible to reconsider how these BSD socket APIs report these local network privacy restrictions with a more appropriate errno?

Instead of this and other related BSD socket APIs returning errnos like EHOSTUNREACH, which have a pre-established meaning, would it be possible to instead return EPERM from these APIs?

man errno states:

1 EPERM Operation not permitted. An attempt was made to perform an operation limited to processes with appropriate privileges or to the owner of a file or other resources.

This feels much closer and more accurate to what these local network operation restrictions are about. Of course, returning EPERM errno from these APIs would also mean that the man pages of relevant BSD functions like connect(), sendto() and such would have to be updated to state that this is now one of the possible errno values they return. If that can be done, then I think this can reduce the confusion and at the same time more accurately represent the nature of the operation failure.

Hello Quinn,

Earlier you posted an excerpt from your launchd property list file, but not the whole thing. I’m specifically curious if:

... If this script, or anything in the chain between it and the Java code that’s calling sendto, is changing the user ID. For example, by calling setuid or setgid or using tools like su or sudo.

I had a look at the process launching code involved in this entire hierarchy. So the flow is as follows - there's a launchd daemon process (launched through a plist file in /Library/LaunchDaemons) which acts as a task management "server". Just for additional information, it's a third party framework that's implemented in C++ (so it's not a Java program in itself). That launchd daemon receives regular requests for launching "tasks". You can imagine a task to be a process that this server launches on the same macos host.

The task request that comes in to the launchd daemon has the ability to say that the task needs to be run as a specific user. The launchd daemon then creates a new process. If a user is specified, then the newly launched process first identifies the "uid", "gid" and "grouplist" of the chosen user through system calls getuid(), getgid() and getgroups() respectively. The newly launched process then calls the setuid(), setgid() and setgroups() system calls to change the user id of itself.

In our setup where we are noticing this issue, the launchd daemon (running as root) launches processes which then have their user changed to a different user. These processes, running as a different user, then ultimately end up launching the java executable which then, as part of the application code, ends up calling the sendto() system call.
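To see where in the hierarchy the user changes, something like the following sketch could be run from the leaf java process. It is an illustration only: the class name UserSwitchFinder is hypothetical, and the user names come from ProcessHandle.Info.user(), which may be unavailable for some processes (shown as "?"). It also only reflects the BSD user of each process, not the additional execution context that macOS maintains.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class UserSwitchFinder {

    // Given user names ordered from the leaf process up to the root ancestor,
    // returns every index i where users.get(i) differs from users.get(i + 1),
    // that is, where a child runs as a different user than its parent.
    public static List<Integer> switchPoints(List<String> users) {
        List<Integer> points = new ArrayList<>();
        for (int i = 0; i + 1 < users.size(); i++) {
            if (!users.get(i).equals(users.get(i + 1))) {
                points.add(i);
            }
        }
        return points;
    }

    public static void main(String[] args) {
        // Collect the user of this process and of each ancestor, leaf first.
        List<String> users = new ArrayList<>();
        Optional<ProcessHandle> cur = Optional.of(ProcessHandle.current());
        while (cur.isPresent() && users.size() < 64) {
            users.add(cur.get().info().user().orElse("?"));
            cur = cur.get().parent();
        }
        System.out.println("users (leaf to root): " + users);
        System.out.println("switch points: " + switchPoints(users));
    }
}
```

For a chain such as tester, tester, root, root (hypothetical names), switchPoints reports index 1, meaning the second process from the leaf is the one whose parent runs as a different user.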

In general, a launchd daemon shouldn’t be troubled by local network privacy. That includes the daemon itself and any processes that it spawns. I’ve seen cases where that’s not the case but, at least so far, those are associated with daemons that change their user ID, either via the UserName property or explicitly. That can cause problems on macOS because macOS maintains a bunch of execution context beyond the standard BSD user and group IDs [1].

So yes, it appears that switching of the user is playing a role here. Having said that, is this a bug in macos? The implementation of local network restrictions appears to have identified the launchd daemon, running as root, as the top level application against which (as per the local network documentation) it is evaluating the permissions. Yet, it appears to be tripped by the user id of the leaf (and intermediate?) processes when making this decision.

Would you have some suggestion on how to get past this?

Do you think it would be possible to reconsider how these BSD socket APIs report these local network privacy restrictions with a more appropriate errno?

Instead of this and other related BSD socket APIs returning errnos like EHOSTUNREACH, which have a pre-established meaning, would it be possible to instead return EPERM from these APIs?

I am considering filing this and the other issue (the user switch causing the launchd daemon, running as root, to lose its local network permission) as two separate issues through feedback assistant later today. Would that be OK?
