macOS NEFilterDataProvider best practices?

I've seen some discussion around performance on the forums but nothing official. What is the best practice for the [handleNewFlow:], [handleOutboundDataFromFlow:], and [handleInboundDataFromFlow:] callbacks in a content filter? Are all flows funneled through a single serial queue that calls into my subclass? If so, this seems like we are back in the days of early OS X, with the kernel network funnel serializing all network traffic. Should I offload flow processing onto a concurrent queue, pause the flow, and return from my callback? Or just do all processing in the callbacks?

And once I return an allow/deny verdict for the flow (without asking for more data) do I no longer see that flow's traffic in my content filter? That's what I would expect and it seems to be the case in actual practice too.

For reference I never need to interact with the user. All of the rules are loaded from an EDR platform.

I bring this up because we have users complaining of "stuttering" during Google Meet, Zoom, etc. when our extension is active, yet our perf metrics show that time spent in the callbacks is minimal (a few hundred microseconds). But if all traffic is serialized through our content filter and the system is very busy, I wonder if this could lead to dropped packets.

What if multiple content filters are present? Are they all serialized with each other? Oof.

Are all flows funneled through a single serial queue that calls into my subclass?

Yes.

Should I offload flow processing onto a concurrent queue and then pause the flow and return from my callback? Or just do all processing in the callbacks?

It’s kinda up to you. Certainly, I wouldn’t use a concurrent queue for this because I wouldn’t use a concurrent queue for anything (-: See “Avoid Dispatch Global Concurrent Queues”. However, handing work out to your own code that has its own concurrency model is perfectly reasonable.
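For readers following along, the pause-and-resume shape being discussed might look roughly like the sketch below. It is only an illustration, not an official recommendation: `RuleSet`, `SketchFilterProvider`, the queue label, and the `blocked.example` host are all hypothetical names, and the policy of looking only at `flow.url?.host` is an assumption made to keep the example short (that URL is frequently nil for non-browser flows). It uses a private serial queue rather than a global concurrent one, in line with the advice above, and assumes Swift 5 language mode.

```swift
import Foundation
import NetworkExtension

// Hypothetical rule store; stands in for rules loaded from the EDR platform.
struct RuleSet {
    var blockedHosts: Set<String>

    // Returns true when the flow should be allowed.
    func isAllowed(host: String?) -> Bool {
        guard let host = host else { return true }  // no host info: allow (sketch policy)
        return !blockedHosts.contains(host)
    }
}

final class SketchFilterProvider: NEFilterDataProvider {
    // Private serial queue (hypothetical label), not a global concurrent queue.
    private let workQueue = DispatchQueue(label: "com.example.filter.work")
    private let rules = RuleSet(blockedHosts: ["blocked.example"])

    override func handleNewFlow(_ flow: NEFilterFlow) -> NEFilterNewFlowVerdict {
        // Hand the decision to our own queue so this callback returns quickly.
        workQueue.async { [weak self] in
            guard let self = self else { return }
            let allowed = self.rules.isAllowed(host: flow.url?.host)
            let verdict = allowed ? NEFilterNewFlowVerdict.allow()
                                  : NEFilterNewFlowVerdict.drop()
            // Deliver the final verdict for the paused flow.
            self.resumeFlow(flow, with: verdict)
        }
        // Tell the system to hold the flow until resumeFlow(_:with:) is called.
        return NEFilterNewFlowVerdict.pause()
    }
}
```

A paused flow stays stalled until `resumeFlow(_:with:)` is called, so every path through the offloaded work needs to end in a verdict. Whether this shape beats deciding inline is something only measurement will tell, particularly when the inline work already takes just a few hundred microseconds.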

Now, as to whether that’ll actually improve the performance, that’s a very different question, one that doesn’t have an easy answer. My experience is that most networking code is I/O bound, so getting more CPUs to work on the problem doesn’t help. However, the only way to know for sure is to measure and test.

And once I return an allow/deny verdict for the flow … do I no longer see that flow's traffic in my content filter?

Correct.
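To make the distinction concrete, here is a minimal sketch contrasting a final verdict with a request for more data. The helper `wantsPayload(scheme:)`, the class name, the 512-byte peek sizes, and the "only plain HTTP needs inspection" policy are all assumptions for illustration. As I understand the API, returning `allow()` or `drop()` from `handleNewFlow` ends the filter's involvement with that flow, while `filterDataVerdict(withFilterInbound:peekInboundBytes:filterOutbound:peekOutboundBytes:)` asks for the data callbacks to be invoked.

```swift
import Foundation
import NetworkExtension

// Hypothetical policy: only plain-HTTP flows need their payload inspected.
func wantsPayload(scheme: String?) -> Bool {
    return scheme?.lowercased() == "http"
}

final class VerdictSketchProvider: NEFilterDataProvider {
    override func handleNewFlow(_ flow: NEFilterFlow) -> NEFilterNewFlowVerdict {
        if wantsPayload(scheme: flow.url?.scheme) {
            // Ask to see data: the handle*Data callbacks will run for this flow.
            return NEFilterNewFlowVerdict.filterDataVerdict(
                withFilterInbound: true, peekInboundBytes: 512,
                filterOutbound: true, peekOutboundBytes: 512)
        }
        // Final verdict: no further callbacks are expected for this flow.
        return NEFilterNewFlowVerdict.allow()
    }

    override func handleOutboundData(from flow: NEFilterFlow,
                                     readBytesStartOffset offset: Int,
                                     readBytes: Data) -> NEFilterDataVerdict {
        // Final data verdict after peeking at the first outbound bytes.
        return NEFilterDataVerdict.allow()
    }
}
```

For a rules-only filter that never needs payload bytes, returning a final verdict straight from `handleNewFlow` keeps the filter out of the data path for that flow.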

What if multiple content filters are present? Are they all serialized with each other?

No. Remember that each filter is running in its own process. However, it wouldn’t surprise me if your profiling revealed serialisation bottlenecks within the NE infrastructure.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Now, as to whether that’ll actually improve the performance, that’s a very different question, one that doesn’t have an easy answer. My experience is that most networking code is I/O bound, so getting more CPUs to work on the problem doesn’t help.

It's not for getting more CPU time, it would be so multiple independent requests are not waiting on all requests before them.

No. Remember that each filter is running in its own process. However, it wouldn’t surprise me if your profiling revealed serialisation bottlenecks within the NE infrastructure.

Correct, but the kernel must wait on all filters to complete. Does it dispatch requests to each in parallel or does it simply have a for(filter){request} serial loop?

Accepted Answer

Does it dispatch requests to each in parallel or does it simply have a for(filter){request} serial loop?

I don’t know, but if you forced me to guess I’d say that the latter is the more likely option.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Does it dispatch requests to each in parallel or does it simply have a for(filter){request} serial loop?

I don’t know, but if you forced me to guess I’d say that the latter is the more likely option.

OK, as I feared. So not only do content filters have a head-of-line blocking issue by design, but the kernel itself does too. We are back to the days of the OS X network funnel serializing all (IP) network traffic, but only when one or more third-party content filters are installed, and then users blame us for performance issues.

Thanks for the insight Quinn, at least I have an idea of what might be happening now.
