Reclaiming cached data from an `enumerateDirectory` call

If I'm in an enumerateDirectory call, I can very quickly fill in the fileID, parentID, and (maybe) the type attributes based on the directory entry I have loaded. That is, I can quickly fill in anything that is contained in the dirent structure in dirent.h, plus the parentID.

However, if any other attributes are requested (say, flags), or if the file system doesn't store the filetype in the directory entry, then I need to do additional I/O and load an inode. If I have to load an inode, I might keep a reference to it and assume that I can clean it up later whenever there is a matching call to reclaimItem. But in the enumerateDirectory call, I never provide an FSItem to the system!

By observation, I see that normally, a call to enumerateDirectory of this nature is followed up by a lookupItem call for every single fetched item, and then presumably the system can later reclaim it if need be. At least, every way I tried of listing directories showed this behavior. If that's the case, then I can rely on a later reclaimItem call telling me when to clean up this cached data from memory.

Is this guaranteed, however? I don't see a mention of this in the documentation, so I'm not sure if I can rely on this.

Or do I need to handle the case where, if I do additional I/O after enumerateDirectory, I have to figure out on my own when cached data should be cleaned up to avoid a "leak"? (I'm using the term "leak" loosely here, since in theory looking up the file later would make it reclaimable, but that might never happen.)

By observation, I see that normally, a call to enumerateDirectory of this nature is followed up by a lookupItem call for every single fetched item, and then presumably the system can later reclaim it if need be.

Really? I haven't specifically tested it, but my expectation would be that iterating the directory with "getattrlistbulk" and a minimal attribute set would avoid any additional lookupItem calls.

However...

If that's the case, then I can rely on a later reclaimItem call telling me when to clean up this cached data from memory.

What kind of filesystem are you working with? For a block storage file system, caching doesn't really work in those terms, as the block data you’re reading from disk doesn’t really align with the filesystem’s own structures. That's particularly true on macOS, where your I/O is page oriented, so the minimum cache unit is 16 KB (one page).

On the other hand, in a network file system, not only is caching the primary tool you have to improve performance but the entire architecture is basically one giant exercise in cache management.

Last but not least, in a filter file system the problem is that caching ANYTHING becomes dangerous[1]: you could cache the data from enumerateDirectory, but there's no guarantee that it hasn't already changed by the time lookupItem is called.

[1] This is also true in a network file system, but user expectations and usage patterns tend to reduce these issues.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

my expectation would be that iterating the directory with "getattrlistbulk" and a minimal attribute set would avoid any additional lookupItem call

Oh, I think I was a little unclear in what I wrote. What I was trying to say in "a call to enumerateDirectory of this nature" was that if I call enumerateDirectory with a non-minimal attribute set (i.e. include attributes that need more I/O to fetch) then I see that behavior. But you're right in that iterating over a directory with minimal attributes doesn't generally have additional lookupItem calls in my tests.

What kind of filesystem are you working with?

It's a block device file system.

I think maybe "cached data" wasn't the best term for me to use... I'm referring to references I keep to instances of my FSItem subclass. The subclass has a helper method that fetches the "non-minimal" attributes, so in my enumerateDirectory implementation I check whether I already have a reference to the item; if I do, I simply use it and its helper method, and if not, I create a new instance of the FSItem and proceed. (If only minimal attributes are requested, i.e. nothing that needs any more I/O, I don't do anything with these FSItems.)
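To make that concrete, here's a minimal sketch of the bookkeeping I mean. All the names here (MyItem, ItemCache, the UInt64 fileID key) are made up for illustration; they aren't the real FSKit types:

```swift
// Hypothetical sketch of keeping live FSItem references between
// enumerateDirectory and a (hoped-for) later reclaimItem. MyItem stands
// in for my FSItem subclass; ItemCache is the table I consult.
final class MyItem {
    let fileID: UInt64
    init(fileID: UInt64) { self.fileID = fileID }
}

final class ItemCache {
    // Live items, keyed by file ID.
    private var items: [UInt64: MyItem] = [:]

    // enumerateDirectory path: reuse an existing item if we have one,
    // otherwise create (and remember) a new one so its "non-minimal"
    // attributes can be fetched.
    func itemForEnumeration(fileID: UInt64) -> MyItem {
        if let existing = items[fileID] { return existing }
        let item = MyItem(fileID: fileID)
        items[fileID] = item
        return item
    }

    // reclaimItem path: drop our reference so the item can be freed.
    func reclaim(fileID: UInt64) {
        items[fileID] = nil
    }

    var count: Int { items.count }
}
```

The open question is who ends up calling reclaim: can I count on a lookupItem/reclaimItem pair eventually arriving for every item I create this way, or do I need my own eviction policy?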

My question is more: if I'm creating a new FSItem in enumerateDirectory like that (for the purpose of getting those "non-minimal" attributes), can I assume a lookupItem call is going to follow, and thus give the system a chance to later reclaim that FSItem? Or should I throw away my reference to any FSItem I created inside enumerateDirectory? Throwing it away seems a bit wasteful if I know a lookupItem call for the same item is coming shortly afterward, but I don't see a guarantee that it will.

That's particularly true on macOS, where your I/O is page oriented, so the minimum cache unit is 16 KB (one page).

Oh? Does this mean it's not good to pass lengths smaller than 16 KB to e.g. metadataRead, and is that what its docs are referring to when they say "This method doesn’t support partial reading of metadata"? I've often been using 4 KB, and things seem to run fine, although right now I only have reading implemented, so I don't know whether this causes a problem later if I implement write.
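In case it matters, if lengths do need to be page multiples, I'd imagine aligning requests with a helper along these lines. This is purely a sketch: the hard-coded 16 KB constant is my assumption from your comment, and real code should presumably query the system page size instead:

```swift
// Illustrative helper (not an FSKit API): widen an (offset, length) pair
// out to page boundaries before issuing a metadata read.
// ASSUMPTION: 16 KB pages, per the discussion above; do not hard-code
// this in real code.
let pageSize: UInt64 = 16_384

func pageAligned(offset: UInt64, length: UInt64) -> (offset: UInt64, length: UInt64) {
    // Round the start down to a page boundary...
    let start = offset - (offset % pageSize)
    // ...and the end up to the next page boundary.
    let end = ((offset + length + pageSize - 1) / pageSize) * pageSize
    return (start, end - start)
}
```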

Part 1...

Disclaimer: Some of the information here may already be obvious or well understood to you. If it is, then thank you for your patience as I use this as an excuse to push out background information that other developers may find useful.

Oh, I think I was a little unclear in what I wrote. What I was trying to say in "a call to enumerateDirectory of this nature" was that if I call enumerateDirectory with a non-minimal attribute set (i.e. include attributes that need more I/O to fetch), then I see that behavior. But you're right in that iterating over a directory with minimal attributes doesn't generally have additional lookupItem calls in my tests.

Again, you have to be careful about which API you're using. The "classic" Unix directory iteration pattern[1] is to read the directory (readdir → enumerateDirectory) and then retrieve metadata (stat → lookupItemNamed).

There are two problems with that:

  1. At a basic level, it generates a lot of syscalls, each of which nibble away at performance.

  2. From the file system side, it throws away some very easy and obvious performance wins.

Expanding on that last point, the vast majority of block file systems all:

  • Store most of a file’s metadata in a single data structure.

  • Organize those records such that the records for objects in the same directory are often physically close to each other on disk.

  • Use a storage allocation scheme ("allocation block size") that guarantees data is read in chunks large enough that many records are retrieved at once.

Putting all those points into concrete terms, when a file system retrieves the name of any given object, it invariably has a bunch of other information "at hand" (size, dates, etc.), since all of that information was in the same record. Similarly, the blocks it read to retrieve the data about one file are VERY likely to include the data about other files in the same directory, since they were already close to each other on disk.
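To put toy numbers on that locality effect (the sizes below are illustrative, not any particular file system's): with 256-byte metadata records packed into 4 KB allocation blocks, reading the block that holds one record necessarily fetches its fifteen neighbors as well.

```swift
// Sketch of the "records travel together" point: one allocation-block
// read yields many fixed-size metadata records at once.
// ASSUMPTION: illustrative sizes only.
let blockSize = 4_096
let recordSize = 256

// Reading the block containing record `index` also brings in every
// other record stored in that same block.
func recordsFetched(byReadingBlockContaining index: Int) -> Range<Int> {
    let perBlock = blockSize / recordSize   // 16 records per block here
    let first = (index / perBlock) * perBlock
    return first..<(first + perBlock)
}
```

That's the win the bulk APIs are designed to surface: the neighboring records are already in memory, so handing their attributes back costs almost nothing extra.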

All of that is why APIs like getattrlistbulk(), readdir_r(), and getdirentriesattr() exist: they let the file system return the data for a bunch of files "at once," taking advantage of the fact that the file system often has much of that data already. One interesting detail about that process: the reason getattrlistbulk takes a list of attributes isn't to optimize the retrieval process on the file system side. As I talked about above, there often isn't anything to "optimize," since all the data is in a single record that the file system can't partially retrieve. What it's actually doing is reducing what has to be RETURNED to user space, so that more replies can be packed into the same syscall.

With all that background, let me jump back to here:

What I was trying to say in "a call to enumerateDirectory of this nature" was that if I call enumerateDirectory with a non-minimal attribute set (i.e. include attributes that need more I/O to fetch)

The thing to be aware of here is that different APIs will generate different behavior in ways that aren't entirely obvious/predictable. Notably, NSFileManager's URL enumerator does "bulk" enumeration while its older path enumerator ends up calling stat() on every file. I have a forum post here that covers this and includes some performance comparisons.
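Here's a quick, self-contained illustration of the two FileManager styles against a scratch directory. (The performance difference doesn't show at this scale; the point is just the two API shapes.)

```swift
import Foundation

// The URL-based enumerator lets you request attributes up front
// (bulk-friendly); the path-based one hands back names only, so
// fetching attributes afterward means a per-file stat().
let fm = FileManager.default
let dir = fm.temporaryDirectory
    .appendingPathComponent("enum-demo-\(UUID().uuidString)")
try fm.createDirectory(at: dir, withIntermediateDirectories: true)
for name in ["a.txt", "b.txt"] {
    fm.createFile(atPath: dir.appendingPathComponent(name).path,
                  contents: Data("x".utf8))
}

// URL enumerator: attributes requested up front.
var urlNames: [String] = []
if let e = fm.enumerator(at: dir, includingPropertiesForKeys: [.fileSizeKey]) {
    for case let url as URL in e { urlNames.append(url.lastPathComponent) }
}

// Path enumerator: names only; attribute access later costs extra stats.
var pathNames: [String] = []
if let e = fm.enumerator(atPath: dir.path) {
    for case let name as String in e { pathNames.append(name) }
}
```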

Switching over to the FSKit side, that's why enumerateDirectory has the design it has: you're using FSDirectoryEntryPacker to fill up a buffer, which is then returned to the kernel (once it's full or you run out of entries).
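Conceptually the packing loop looks like this. Note that this is a deliberately simplified stand-in, not FSDirectoryEntryPacker's actual interface:

```swift
// Hypothetical stand-in for a directory-entry packer: entries are
// appended until the fixed reply buffer is full, at which point
// enumeration stops and the buffer is handed back.
struct EntryPacker {
    let capacity: Int
    private(set) var used = 0
    private(set) var names: [String] = []

    // Returns false when the entry doesn't fit, i.e. the buffer is full.
    mutating func pack(name: String, entrySize: Int) -> Bool {
        guard used + entrySize <= capacity else { return false }
        used += entrySize
        names.append(name)
        return true
    }
}
```

Once pack returns false (or you run out of entries), the buffer goes back to the kernel and a later call resumes enumeration from where you left off.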

[1] As a side note, while writing this up I noticed that "Building a passthrough file system" not only uses the (slow) readdir/stat pattern, but appears to leak the directory descriptor and has a serious issue with its telldir() usage (r.175523886). Anyone actually implementing a passthrough file system should probably NOT use that implementation as a direct model.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware
