Capturing file read events in Endpoint Security client

Hi everyone! I'd like to create an application for system monitoring using the Endpoint Security framework. I already have a working prototype and now I am trying to expand its capabilities to capture more event types.

Started looking at filesystem-related events as one of the most important ones for my use case. These seem to be supported fairly well by the framework (ES_EVENT_TYPE_NOTIFY_OPEN/CLOSE/CREATE/WRITE, etc.). However, the "READ FILE" event seems to be absent… Am I missing something here, or does the Endpoint Security framework not provide this kind of information? If it doesn't, what is the reason behind this? Capturing this type of event seems quite relevant for security-related software.

Thanks & Best regards, Roman

Answered by DTS Engineer in 855633022

Hi everyone! I'd like to create an application for system monitoring using the Endpoint Security framework. I already have a working prototype, and now I am trying to expand its capabilities to capture more event types.

So, I need to pass you a broad warning here. It is VERY easy to put together a test/development app that "works" in the basic sense that it runs and doesn't create any obvious failure, but is in fact fundamentally flawed and will fail over and over again under real-world conditions. Even worse, most of these failures will not be simple crashes or hangs in your product but will instead be a variety of strange failures "everywhere else". See the forum threads here, here, and here for a few examples.

Started looking at filesystem-related events as one of the most important ones for my use case. These seem to be supported fairly well by the framework (ES_EVENT_TYPE_NOTIFY_OPEN/CLOSE/CREATE/WRITE, etc.). However, the "READ FILE" event seems to be absent… Am I missing something here, or does the Endpoint Security framework not provide this kind of information?

No, you haven't missed anything; there is no ES_EVENT_TYPE_NOTIFY_READ.

If it doesn't, what is the reason behind this?

So, there are a few different reasons it's missing:

  1. read() is one of the single highest-volume syscalls, so the cost of notifying on it is extremely high.

  2. From a security perspective, it isn't all that useful/meaningful. The critical security "gate" here is open, not the I/O functions (particularly in light of #3).

  3. It wouldn't tell you what you think it would and, in fact, would encourage a false sense of security.

The issue with #3 is that the system HEAVILY relies on memory-mapped I/O, which bypasses the standard I/O functions (both read and write) and makes the issue of "tracking real file access" EXTREMELY difficult. There's an extended forum thread on this here, but the bottom line is that once a file is mapped, it's VERY difficult to know if/when it's been modified. That's even more true for read access, since the shared cache of the system means that I don't think you can trust st_atime for memory-mapped reads.
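
To make point 3 concrete, here is a minimal C sketch, assuming a POSIX system. The helper name sum_bytes_via_mmap is hypothetical and not part of any framework. It consumes every byte of a file using only open(), fstat(), mmap(), and close(); read() is never called, so a per-read() notification would have nothing to report.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (illustration only): sums every byte of the file at
 * `path` without ever calling read(). Returns 0 on success, -1 on error. */
int sum_bytes_via_mmap(const char *path, unsigned long *out_sum)
{
    assert(path != NULL && out_sum != NULL);
    *out_sum = 0;

    /* The open() call is the one step that is an explicit file operation. */
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap() rejects zero-length mappings */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    const unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping remains valid after close() */
    if (p == MAP_FAILED)
        return -1;

    /* Plain memory loads: the data arrives via page faults or straight from
     * already-cached pages, not through the read() syscall. */
    unsigned long sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += p[i];

    munmap((void *)p, len);
    *out_sum = sum;
    return 0;
}
```

After the mmap() call, each access to p[i] is an ordinary memory load, so the only point where this access is reliably visible as a file operation is the initial open().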

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Hi Kevin @DTS Engineer, thank you so much for the response! This is indeed a comprehensive answer. I suspected it was missing for reasons similar to the ones you mentioned, and just wanted to make sure it was intentional.

Following the topic of intercepting file access events: could you suggest other technologies that are better suited to achieve this goal? Let's say it does not have to be general-purpose software for end users, so anything like a kernel module or even a custom kernel build also counts.

P.S. Thank you for the warning and for the links to other forum threads on this topic; very useful. As someone who came from the kernel driver world, I can imagine the possible performance impact of a subsystem like Endpoint Security, and the responsibility that comes with it. My app is monitoring-only, so hopefully it will be less prone to such errors. Anyway, will use your advice and try to be more careful.

Following the topic of intercepting file access events: could you suggest other technologies that are better suited to achieve this goal?

So, I think the big question here is "what are you actually trying to do?" The core problem here is that:

  • Read I/O is REALLY frequent, so the system doesn't want to do anything that would make it slower/more complicated.

  • Memory mapping makes the idea of tracking reads somewhat... questionable. As the most extreme example of this, take the file backing "libsystem.dylib". EVERY (yes, EVERY) process in the system has mapped it into memory and is accessing it very frequently. Tying any I/O activity on that file to any specific process isn't really possible or meaningful. Even if you could track actual I/O activity (meaning, reads from disk) and tie it to a particular process, all that would tell you is "which process happened to ask for a page I didn't have". It wouldn't tell you anything about the other X processes that accessed the data directly from VM without ever touching the disk.

As far as reads are concerned, most ES clients side-step the issue by just assuming that opening for read access means that the file has in fact been fully "read".
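
A rough sketch of that "opened for read means read" heuristic in portable C follows. Everything named here is a stand-in for illustration: DEMO_FREAD and DEMO_FWRITE mirror what I understand to be the values of the kernel-style FREAD and FWRITE flags from macOS <sys/fcntl.h>, and classify_open and counts_as_read are hypothetical helpers. In a real client the flags would come from the fflag field of the open event, which as far as I know uses these kernel-style flags rather than the O_RDONLY/O_RDWR values passed to open(2).

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins (assumption) for FREAD / FWRITE as defined in <sys/fcntl.h> on
 * macOS. Defined locally so the sketch compiles on any platform. */
enum { DEMO_FREAD = 0x1, DEMO_FWRITE = 0x2 };

typedef enum {
    ACCESS_NONE,
    ACCESS_READ,
    ACCESS_WRITE,
    ACCESS_READ_WRITE
} access_kind;

/* Hypothetical helper: maps kernel-style open flags to an access kind. */
access_kind classify_open(int32_t fflag)
{
    bool r = (fflag & DEMO_FREAD) != 0;
    bool w = (fflag & DEMO_FWRITE) != 0;
    if (r && w)
        return ACCESS_READ_WRITE;
    if (r)
        return ACCESS_READ;
    if (w)
        return ACCESS_WRITE;
    return ACCESS_NONE;
}

/* Heuristic from the text: any open that grants read access is recorded as
 * if the whole file had been read, since later reads cannot be observed. */
bool counts_as_read(int32_t fflag)
{
    access_kind kind = classify_open(fflag);
    return kind == ACCESS_READ || kind == ACCESS_READ_WRITE;
}
```

A monitoring client built this way would call counts_as_read() once per open event and log the file as read at that moment, accepting that it over-reports files that were opened but never actually consumed.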

P.S. Thank you for the warning and for the links to other forum threads on this topic; very useful. As someone who came from the kernel driver world, I can imagine the possible performance impact of a subsystem like Endpoint Security, and the responsibility that comes with it.

The critical point there is what the ES system actually is: the user space side of a kauth client. To a large extent, this is kernel development in much the same way writing a kauth client was. Case in point, it is not particularly hard for an ES client to panic the kernel. Delay the wrong events long enough and the kernel will assume user space is hung and panic to reset the machine.

Creating that kind of delay is harder in a monitor-only client; however, once multiple ES clients are involved, all sorts of interesting interactions become possible.

My app is monitoring-only, so hopefully it will be less prone to such errors. Anyway, I will use your advice and try to be more careful.

Being monitor-only certainly helps simplify things, but it definitely doesn't solve the issue here. The big problem here is deadlocking issues, particularly with other ES clients involved.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware
