Network framework on macOS

This was mentioned in another thread 4 years ago:

This whole discussion assumes that every network connection requires a socket. This isn’t the case on most Apple platforms, which have a user-space networking stack that you can access via the Network framework [1].

[1] The one exception here is macOS, where Network framework has to run through the kernel in order to support NKEs. This is one of the reasons we’re in the process of phasing out NKE support, starting with their deprecation in the macOS 10.15 SDK.

Is macOS still an unfortunate exception that requires a socket per Network framework's connection?

Is macOS still an unfortunate exception that requires a socket per Network framework's connection?

No. User-space networking landed in macOS 12. That came with the fabulous skywalkctl command-line tool that lets you view the state of the user-space networking stack. See the skywalkctl man page for details.

Having said that, Network framework retains its ability to use a socket under the covers, and you’ll see that in action in various edge cases (like Unix domain sockets).

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Do I somehow opt-out of using sockets or should they be not used by default?

When running this test app:

import SwiftUI
import Network

var activeConnections: [NWConnection] = [] {
    didSet {
        print("connection count: \(activeConnections.count)")
    }
}

let site = "www.apple.com"

func openNewConnections(count: Int) {
    if count <= 0 { return }
    let c = NWConnection(host: NWEndpoint.Host(site), port: .https, using: .tls)
    activeConnections.append(c)
    c.stateUpdateHandler = { state in
        switch state {
            case .cancelled: fatalError("• cancelled")
            case .failed(let error): fatalError("• Error: \(error)")
            case .setup: break
            case .waiting: break
            case .preparing: break
                
            case .ready:
                c.send(content: "GET https://\(site)/index.html HTTP/1.0\n\n".data(using: .utf8), completion: NWConnection.SendCompletion.contentProcessed { error in
                    if let error {
                        fatalError("• send ended with Error: \(error)")
                    }
                })
                c.receive(minimumIncompleteLength: 1, maximumLength: 20) { data, contentContext, isComplete, error in
                    if data != nil {
                        // The response content is irrelevant to this test; any received bytes trigger the next connection.
                        openNewConnections(count: count - 1)
                    } else {
                        fatalError("• Error: \(String(describing: error))")
                    }
                }
            @unknown default:
                fatalError("TODO")
        }
    }
    c.start(queue: .main)
}

@main struct NetTestApp: App {
    init() {
        var r = rlimit()
        let err = getrlimit(RLIMIT_NOFILE, &r)
        precondition(err == 0)
        print("max files:", r.rlim_cur)
        openNewConnections(count: 20_000)
    }
    var body: some Scene {
        WindowGroup { Text("Hello, World") }
    }
}

I'm hitting the "too many open files" error close to the limit reported by getrlimit with RLIMIT_NOFILE (on the 249th connection, with a limit of 256 files). I can also see the word "socket" in the log, both in function names and in the "Failed to initialize socket" error message:

max files: 256
connection count: 1
connection count: 2
...
connection count: 246
connection count: 247
connection count: 248
nw_socket_initialize_socket <private> Failed to create socket(2,1) [24: Too many open files]
nw_socket_initialize_socket Failed to create socket(2,1) [24: Too many open files]
nw_socket_initialize_socket Failed to create socket(2,1) [24: Too many open files], dumping backtrace:
        [arm64] libnetcore-3100.140.3
    0   Network                             0x00000001938e4564 __nw_create_backtrace_string + 192
    1   Network                             0x0000000193b0b164 _ZL27nw_socket_initialize_socketP11nw_protocol + 2008
    2   Network                             0x0000000193b2917c _ZL27nw_socket_add_input_handlerP11nw_protocolS0_ + 1416
    3   Network                             0x0000000193c901b4 nw_endpoint_flow_attach_socket_protocol + 380
    4   Network                             0x0000000193c800a0 nw_endpoint_flow_attach_protocols + 6492
    5   Network                             0x0000000193c7d304 nw_endpoint_flow_setup_protocols + 3664
    6   Network                             0x0000000193c989e8 -[NWConcrete_nw_endpoint_flow startWithHandler:] + 4092
    7   Network                             0x00000001937625ac nw_endpoint_handler_path_change + 9400
    8   Network                             0x000000<…>
nw_socket_add_input_handler [C248.1.1:2] Failed to initialize socket

I think the same would happen in the console app (where getrlimit with RLIMIT_NOFILE returns a higher number, 7168 = 0x1C00) if I wait long enough, possibly after first making the test more robust so that it handles cancellation errors and so on.

I am on macOS 13.6.


Edit: It's also not obvious how to recover from that error, as it is not reported back via the normal Swift error mechanism; it looks like it uses either a C++ or an Objective-C exception mechanism. The last NWConnection() call, the stateUpdateHandler assignment, and the "start" call all complete. Then the OS internals log this error and the update handler is called two more times (with "preparing" and "waiting"; normally "waiting" is not delivered) but never with "ready". As a result, the logic that handles the response or starts a new connection stops without getting a chance to handle or print the relevant error!

More specifically if I add more log entries I see this at the very end:

state: preparing, connectionCount: 248
// here goes string of errors
state: waiting, connectionCount: 248

and nothing else afterwards. For the earlier connections, before the file limit is hit, the "ready" callout happens after "preparing", with no "waiting" callout in between.
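The test app above discards the payload of the waiting state (case .waiting: break). As a minimal sketch, the handler could print the NWError that .waiting and .failed carry, which may be where the file descriptor failure becomes visible to the app. Whether the error delivered in this situation is actually EMFILE is an assumption that this thread does not confirm, and the helper name describe is hypothetical.

```swift
import Network

/// Hypothetical helper: renders a connection state, including the error
/// payload that `.waiting` and `.failed` carry.
func describe(_ state: NWConnection.State) -> String {
    switch state {
    case .setup: return "setup"
    case .preparing: return "preparing"
    case .ready: return "ready"
    case .cancelled: return "cancelled"
    case .waiting(let error): return "waiting (\(error))"
    case .failed(let error): return "failed (\(error))"
    @unknown default: return "unknown"
    }
}

// Usage inside the test app's handler:
// c.stateUpdateHandler = { state in print("state: \(describe(state))") }
```

Logging the payload this way at least shows what the framework reports when a connection parks in "waiting" instead of reaching "ready".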

looks like it uses either C++ or Objective-C exception mechanism.

That’s not an exception. Rather, this is error handling code that just happens to log a backtrace.

Do I somehow opt-out of using sockets or should they be not used by default?

Network framework uses user-space networking by default but there are lots of constraints involved. Consider this test project:

import Foundation
import Network

func start() -> NWConnection {
    print("connection will start")
    let connection = NWConnection(to: .hostPort(host: "example.com", port: 80), using: .tcp)
    connection.stateUpdateHandler = { newState in
        print("connection did change state, new: \(newState)")
    }
    connection.start(queue: .main)
    return connection
}

func main() throws {
    let connections = (0..<10).map { _ in start() }
    withExtendedLifetime(connections) {
        print("did start connections, pid: \(getpid())")
        dispatchMain()
    }
}

try main()

Running it on macOS 14.4.1, this is what I see:

% ./Test751581
connection will start
…
connection will start
did start connections, pid: 84766
connection did change state, new: preparing
…
connection did change state, new: preparing
connection did change state, new: ready
…
connection did change state, new: ready

Now let’s look at the process’s file descriptors:

% lsof -p 84766
COMMAND    … USER   FD      TYPE DEVICE SIZE/OFF     NODE NAME
Test751581 … quinn  cwd       DIR   1,15      192 27676134 /Users/quinn/…
Test751581 … quinn  txt       REG   1,15   161760 29137837 /Users/quinn/…
Test751581 … quinn  txt       REG   1,15    58072 29015793 /Library/…
Test751581 … quinn    0u      CHR   16,1   0t5163     3215 /dev/ttys001
Test751581 … quinn    1u      CHR   16,1   0t5163     3215 /dev/ttys001
Test751581 … quinn    2u      CHR   16,1   0t5163     3215 /dev/ttys001
Test751581 … quinn    3   NPOLICY                          
Test751581 … quinn    4      CHAN flowsw                   E2F7D96B-046F-…

There are not 10 TCP sockets there. There is a flowsw (flow switch) entry, which suggests user-space networking is in play.
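For a rough in-process version of the lsof check, the sketch below counts how many of the process's open descriptors fstat reports as sockets. The helper name countSocketDescriptors and the 65,536 scan cap are assumptions made for illustration. The thread does not establish how a flow switch channel descriptor appears to fstat, so this only tells you about ordinary sockets.

```swift
import Darwin

/// Hypothetical helper: counts this process's open descriptors that
/// fstat reports as sockets.
func countSocketDescriptors() -> Int {
    var limit = rlimit()
    guard getrlimit(RLIMIT_NOFILE, &limit) == 0 else { return 0 }
    // Assumption: cap the scan so a very large soft limit doesn't slow the loop.
    let upperBound = Int32(min(limit.rlim_cur, 65_536))
    var count = 0
    for fd in 0..<upperBound {
        var info = stat()
        // fstat fails with EBADF for descriptors that are not open.
        if fstat(fd, &info) == 0 && (info.st_mode & S_IFMT) == S_IFSOCK {
            count += 1
        }
    }
    return count
}
```

Calling this before and after starting the connections gives a number to compare against the connection count: a result that grows with each connection suggests the socket path is in use.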

Deploying skywalkctl, I see the 10 connections:

% sudo skywalkctl flow -n -P 84766
Proto Local Address          Remote Address      …
 tcp4 192.168.1.171.51497    93.184.215.14.80    …
 tcp4 192.168.1.171.51496    93.184.215.14.80    …
 tcp4 192.168.1.171.51489    93.184.215.14.80    …
 tcp4 192.168.1.171.51498    93.184.215.14.80    …
 tcp4 192.168.1.171.51490    93.184.215.14.80    …
 tcp4 192.168.1.171.51494    93.184.215.14.80    …
 tcp4 192.168.1.171.51495    93.184.215.14.80    …
 tcp4 192.168.1.171.51491    93.184.215.14.80    …
 tcp4 192.168.1.171.51493    93.184.215.14.80    …
 tcp4 192.168.1.171.51492    93.184.215.14.80    …

Note that I had to disable iCloud Private Relay to get these results. Without that I see just a single user-space networking flow, the UDP (QUIC IIUC) flow to the private relay server.

Additionally, there are things that opt you out of user-space networking entirely. I don’t have a definitive list, but the most obvious candidate is legacy VPNs. If you can’t reproduce my results:

  1. Retry on macOS 14.

  2. Set up a vanilla VM and retry there.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Thank you Quinn!

I tried your test two different ways:

  1. To run it in a UI app I slightly modified it (moved connections to a global and increased the count to 500):

var connections: [NWConnection] = []

func quinnTest() throws {
    connections = (0..<500).map { _ in start() }
}

// call quinnTest() from view's init()

  • The app logs "Too many open files" errors.

  • "lsof" lists 251 entries like these:

NetTest 62541 yam  252u     IPv4 0xcbd9eab08f46aaf1        0t0                 TCP yam-mac:53997->93.184.215.14:http (ESTABLISHED)

  • There's no flowsw in the list.

  • skywalkctl returns an empty list:

Proto Local Address Remote Address InBytes OutBytes InPkts/InSPkts OutPkts/OutSPkts SvC NetIf Port Adv Flags Local State Remote State Local RTT Remote RTT Process.PID

  2. I also tried your test unmodified. The results are very similar to (1): many socket entries in lsof, no flowsw entry, and an empty list in skywalkctl. I am not getting "too many open files" in your test, though, for two reasons: 10 is not a big number, and getrlimit with RLIMIT_NOFILE by default returns a significantly higher number in console apps than in UI apps.

Other than that:

  • I do not use iCloud+, and as for iCloud I use FindMy only, with the rest of the iCloud features turned off.

  • I have VPN installed but it is switched off, could it still matter?

  • I am on macOS 13.6.3, will recheck on macOS 14 when possible.

I wonder what results you are getting if you run my test if you can do it.

Could it be the case that this feature (of not using a socket per NWConnection) is macOS 14+ feature only?

I wonder what results you are getting if you run my test if you can do it.

Sorry, I’m out of the office and don’t have access to my macOS 13 VMs.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Hi, I'd really appreciate if anyone here could run my self-contained test above on their current macOS version to see if it matches my results or not. If you do please share your result here indicating the OS version and whether you have VPN installed or not (and whether it's activated or disabled).

I’m back in my office now so I was able to repeat my test on macOS 13. Specifically:

  • This was real hardware running 13.6.1.

  • I disabled iCloud Private Relay.

  • There’s no active VPN.

My Test751581 tool produced exactly the same results as I saw on macOS 14.4.1.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Wow, thank you. What would you recommend me to do to troubleshoot and find the culprit that causes a different outcome on my computer? Other than the inactive VPN I also have an inactive Network Link Conditioner; otherwise I can't think of anything else. Could either of those two things affect the outcome?

I could write an app that tests the various conditions that must be met for user-space networking to work properly (so long as those can be tested with public APIs and I know what calls to make :)

My hardware: 2021 16 inch MacBook Pro, running macOS 13.6.3 (22G436)

What would you recommend me to do to troubleshoot and find the culprit that causes a different outcome on my computer?

Set up a VM with a vanilla macOS 13 in it and see if that matches your behaviour or mine. I suspect it’ll match mine, at which point you can start adding stuff to the VM to see if you can flip it over to yours.

My guess is that it’s your VPN. Prior to the introduction of the Network Extension architecture, third-party macOS VPN products adopted a variety of ad hoc techniques. If you’re using one of those, it’s possible that its behaviour is causing Network framework to fall back to BSD Sockets even when the VPN is disconnected.

However, that’s just a guess. Running a VM test will lead to a definitive answer.

Let me know what you discover, ’cause I’m curious myself.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Thank you Quinn, I haven't tried a VM yet. Just now I tried removing both the Network Link Conditioner and the VPN and then restarting: no joy, same outcome. I don't use iCloud+ (and as for iCloud I only use FindMy, with everything else disabled). Should I be worried about iCloud Private Relay? How do I switch that off?

I tried Quinn's test program, and I did see 10 TCP sockets. This was on macOS 13.6.7. I have iCloud+ but had Private Relay turned off (Private Relay is part of iCloud+). No VPN.

By the way, since the skywalkctl command was recommended here, I want to make a couple of comments about it. First, it can only be run using sudo. Second, the man page says you can get help on a command using skywalkctl COMMAND help, but that doesn't work.

I did the test again on a different Mac on the same local network, this time running macOS 14.5. And then I saw the flowsw thing instead of 10 TCP sockets. Another difference between the machines is that the one running Ventura has an Intel CPU, while the one with Sonoma is M1.

I tried Quinn's program once more under macOS 14.5 on my Intel-based Mac, and saw flowsw instead of 10 TCP sockets. So it appears that the difference is Sonoma versus Ventura.

it appears that the difference is Sonoma versus Ventura.

But it’s not just that because, as I mentioned above, my macOS 13 machine uses user-space networking.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

I asked about this internally and one of my colleagues pointed out that the built-in firewall (System Settings > Network > Firewall) disables user-space networking. I just tried it here in my office and, yeah, my test tool starts using sockets as soon as I enable the firewall.

Note As with everything we’ve been talking about on this thread, this is an implementation detail that could well change in the future.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

I just tried it here in my office and, yeah, my test tool starts using sockets as soon as I enable the firewall.

On macOS 14 as well?

Sorry, I should’ve mentioned, I was testing on macOS 14.4.1.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

I see, thank you!

Where does that leave us... There are currently a few things that could stop user-space networking from working. Imagine asking users to go through a number of steps, like switching the firewall off, to make your app work. Not many users will like doing that.

Note As with everything we’ve been talking about on this thread, this is an implementation detail that could well change in the future.

I hope it will!

There are currently a few things that could stop user-space networking from working. Imagine asking users to go through a number of steps

I’m not sure how the latter follows from the former. If your app uses a lot of network connections simultaneously, use setrlimit to raise the number of file descriptors it has access to (RLIMIT_NOFILE). See the setrlimit man page for details. If Network framework uses BSD Sockets, that’ll avoid this problem. And if Network framework uses user-space networking, the file descriptor limit isn’t relevant but there’s very little downside to increasing it.
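As a minimal sketch of the setrlimit suggestion, the code below raises the soft RLIMIT_NOFILE limit early in the process's life. The helper name raiseFileDescriptorLimit and the value 4096 are assumptions chosen for illustration; the thread does not prescribe a specific number. The system may reject values it considers too large, so the sketch treats a failing setrlimit call as a normal outcome. See the setrlimit man page for the values it accepts.

```swift
import Darwin

/// Hypothetical helper: raises the soft RLIMIT_NOFILE limit to at least
/// `requested`, capped by the hard limit. Returns the soft limit in effect
/// afterwards, or nil if getrlimit or setrlimit failed.
func raiseFileDescriptorLimit(to requested: rlim_t) -> rlim_t? {
    var limit = rlimit()
    guard getrlimit(RLIMIT_NOFILE, &limit) == 0 else { return nil }
    // The soft limit can never exceed the hard limit.
    let target = min(requested, limit.rlim_max)
    if target > limit.rlim_cur {
        limit.rlim_cur = target
        // May fail (for example with EINVAL) if the system rejects the value.
        guard setrlimit(RLIMIT_NOFILE, &limit) == 0 else { return nil }
    }
    return limit.rlim_cur
}

if let newLimit = raiseFileDescriptorLimit(to: 4096) {
    print("max files:", newLimit)
} else {
    print("could not raise RLIMIT_NOFILE, errno:", errno)
}
```

The helper never lowers the limit: if the current soft limit already exceeds the requested value, it is returned unchanged.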

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Ah, sorry, I must have misread your answer in the original thread! I thought you were implying that, while there is a limit of 64K on sockets (which translates to the corresponding limit on the maximum number of simultaneous connections), this limit no longer applies now that we have user-space networking. The logic I've been following was: since there are still situations where user-space networking is inactive, we still have the 64K limit on sockets, and therefore on connections! I must admit I assumed quite a lot from what you didn't say explicitly.

So instead of assuming any further: please confirm there is no issue with having more than 64K connections even in cases where user-space networking is not active. At worst I'd have to call setrlimit to bump the count beyond 64K (which is a non-privileged AppStore compatible call).

Thanks!

I'd have to call setrlimit to bump the count beyond 64K (which is a non-privileged AppStore compatible call).

setrlimit is a supported API which you can call in any App Store app.

I’m not aware of any meaningful limit to how high you can raise this. However, if you go too far you’ll hit some other limit. For example, for open files, you’ll run out of inodes.

For sockets, you’ll hit other limits. At some point you’ll run the system out of mbufs. However, I think you’ll hit the NECP limit first [1]. That isn’t a simple count, but dependent on resource usage. It usually kicks in around 500-ish active flows.

This NECP limit applies to NW flows as well as sockets.

The symptom of hitting the NECP limit is that things fail with otherwise hard-to-explain ENOSPC errors.
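To make such failures easier to spot in logs, a prototype could classify the errors it receives. The sketch below flags the POSIX codes mentioned in this thread: ENOSPC for the NECP limit and EMFILE for the file descriptor limit, plus ENFILE as a related guess. It assumes these conditions reach the app as NWError.posix values, which the thread does not confirm; the helper name looksLikeResourceExhaustion is hypothetical.

```swift
import Network

/// Hypothetical helper: true for POSIX errors that suggest a resource
/// limit (file descriptors or NECP) was hit rather than a network fault.
func looksLikeResourceExhaustion(_ error: NWError) -> Bool {
    // Only POSIX errors are considered; DNS and TLS errors return false.
    guard case .posix(let code) = error else { return false }
    switch code {
    case .EMFILE, .ENFILE, .ENOSPC: return true
    default: return false
    }
}

// Usage inside a state update handler:
// case .waiting(let error) where looksLikeResourceExhaustion(error):
//     print("resource limit suspected: \(error)")
```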

Taking a step back, if your goal is to run tens of thousands of network flows in a single process, I doubt you’ll be able to achieve that on Apple platforms. If that’s important to you, I recommend that you start by creating some prototypes. Make sure to:

  • Test that the flows work, not just that you can open them.

  • Test on all the platforms you’re targeting. You might, for example, see different results on iOS and macOS.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

[1] I discuss NECP in A Peek Behind the NECP Curtain, but I don’t go into details about its limits.
