Following previous question here :https://developer.apple.com/forums/thread/801397, I've decided to move my VPN implementation using NEPacketTunnelProvider on a dedicated networkExtension.
My extension receives packets using readPacketsWithCompletionHandler and forwards them immediately to a daemon through a shared memory ring buffer with Mach port signaling. The daemon then encapsulates the packets with our VPN protocol and sends them over a UDP socket.
I'm seeing significant throughput degradation, much higher than the tunnel overhead itself. On our side, the IPC path supports parallel handling, but I'm not not sure whether the provider has any internal limitation that prevents packets from being processed in parallel. The tunnel protocol requires packet ordering, but preparation can be done in parallel if the provider allows it.
Is there any inherent constraint in NEPacketTunnelProvider that prevents concurrent packet handling, or any recommended approach to improve throughput in this model? For comparison, when I create a utun interface manually with ifconfig and route traffic through it, I observe performance that is about four times faster.