Subdirectory navigation fails for several GUI apps on custom VFS.

Hi.

I am developing a custom virtual file system and facing such behaviour:

Upon using some graphical apps, for example Adobe Media Encoder, attempting to navigate inside my filesystem deeper than root folder will fail - nothing happens on "double click" on that subfolder. Another problem is that when I then try to navigate back into the root directory, it appears empty.

The problem is not present for most GUI apps - for example navigation inside Finder, upon choosing download path for file in Safari, apps like Microsoft Word, Excel and other range of applications work totally correctly.

A quick note here. From what I have seen - all apps that work correctly actually have calls to VFS_VGET - a predefined vfs layer hook. Adobe Media Encoder, however, does not call it - neither in my filesystem, nor in Samba, so my guess is that some applications have a different browsing and retrieving algorithm. Is there anything I should examine further? Default routines (vnop_open, vnop_lookup, vnop_readdir, vnop_close) behave as expected, without any errors.

P.S. This application (Adobe Media Encoder) works properly on Samba.

Upon using some graphical apps, for example Adobe Media Encoder, attempting to navigate inside my filesystem deeper than root folder will fail

What does the interface for this actually look like? Is it presenting its own file navigation UI (I suspect yes) or is this the open/save panel?

Similarly, when looking at the other things you tested:

The problem is not present for most GUI apps - for example navigation inside Finder, upon choosing download path for file in Safari, apps like Microsoft Word, Excel and other range of applications work totally correctly.

Did any of those (aside from the Finder) actually present file navigation UI or did they rely on the system panels?

The reason this matters is that the system interface elements are all going to channel into a fairly narrow range of APIs, most likely one or more "getattrlistbulk" calls (to retrieve directory contents), which would eventually return a file reference URL. That would then lead to:

A quick note here. From what I have seen - all apps that work correctly actually have calls to VFS_VGET - a predefined vfs layer hook.

From "mount.h":

*  @field vfs_vget
*  @abstract Get a vnode by file id (inode number).
*  @discussion This routine is chiefly used to build paths to vnodes.  Result should be turned with an iocount that the

In other words, VFS_VGET is what would eventually be called when a file reference URL is converted to a full path.

Adobe Media Encoder, however, does not call it - neither in my filesystem, nor in Samba, so my guess is that some applications have a different browsing and retrieving algorithm. Is there anything I should examine further?

Yes, though it's hard to be very specific. As you've already guessed, there are multiple ways to achieve very similar results through the VFS system. Indeed, using "getattrlistbulk" isn't the way most developers would think of walking a file system hierarchy. The reason you're seeing it in "most" places isn't because the approach itself is common (perhaps "obvious" would be a better word) but because it is how most of "our" code happens to do so (for example, it's also what NSFileManager does).

This is an issue you need to be very aware of when testing something like a VFS driver. Broad based testing like this:

...Safari, apps like Microsoft Word, Excel and other range of applications

...is often FAR less valuable than you might think. What you actually tested there wasn't "do all these apps work" but was basically "does NSFileManager do the same thing everywhere". In other words, the reason so many apps "look" the same is that they ARE in fact "the same".

Default routines (vnop_open, vnop_lookup, vnop_readdir, vnop_close) behave as expected, without any errors.

FYI, the word "errors" makes me very nervous. VFS operations don't really "fail" by returning errors, they fail by returning data that doesn't create the expected/intended result.

In terms of specific recommendations, I have a few suggestions:

  • Set up as controlled a test as you possibly can, with minimal directory contents and external interactions. The ideal here is that the ONLY VFS activity that's occurring is coming directly from the app you're trying to understand. You can often figure out what's going on by first isolating the issue (as above) then closely examining EXACTLY what every vfs operation returned.

  • The man page for "opendir" has a snippet showing "basic" POSIX style directory iteration and that's the first thing I would test with. As I talked about above, that's not the approach most of our APIs would use, but it is the approach someone implementing their own iteration might use, particularly if they were focusing on something like cross platform support.

  • If "opendir" works, then it would probably be worth testing with the Carbon File Manager. If the Carbon File Manager is what's failing, then I would take a look at what you're returning for volfs support and persistent ID support.
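
For reference, the POSIX-style iteration mentioned in the second bullet can be sketched roughly as follows. This is a minimal illustration and not the man page snippet itself: the function name list_directory and its return convention are assumptions made for this example. It prints the inode number that readdir reports for each entry, since that is data the driver supplies directly.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: lists the entries of `path` using only
 * opendir/readdir/closedir and writes one line per entry to `out`.
 * Returns the number of entries seen (excluding "." and ".."),
 * or -1 if the directory could not be opened or reading it failed. */
int list_directory(const char *path, FILE *out) {
    DIR *dir = opendir(path);
    if (dir == NULL) {
        fprintf(out, "opendir(%s) failed: %s\n", path, strerror(errno));
        return -1;
    }

    int count = 0;
    for (;;) {
        /* readdir() returns NULL both at end-of-directory and on error;
         * clearing errno first is the only way to tell them apart. */
        errno = 0;
        struct dirent *entry = readdir(dir);
        if (entry == NULL) {
            break;
        }
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) {
            continue;
        }
        fprintf(out, "%s (ino %llu)\n", entry->d_name,
                (unsigned long long)entry->d_ino);
        count++;
    }

    int read_error = errno; /* save before closedir() can change it */
    closedir(dir);
    if (read_error != 0) {
        fprintf(out, "readdir(%s) failed: %s\n", path, strerror(read_error));
        return -1;
    }
    return count;
}
```

Calling list_directory("/Volumes/myvfs/mysubfolder", stdout) from a small main() would exercise vnop_open and vnop_readdir on the subfolder without going through any higher-level framework.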

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Thanks a lot in the first place.

What does the interface for this actually look like? Is it presenting its own file navigation UI (I suspect yes) or is this the open/save panel? Similarly, when looking at the other things you tested:

You are right: most of the apps I tested, such as Word and Safari, use the system navigation panels with open/save buttons, while this program has its own UI that lists only media files and directories and looks different from the Finder UI.


Regarding the cases you have asked to test.

Set up as controlled a test as you possibly can, with minimal directory contents and external interactions. The ideal here is that the ONLY VFS activity that's occurring is coming directly from the app you're trying to understand. You can often figure out what's going on by first isolating the issue (as above) then closely examining EXACTLY what every vfs operation returned.

I have been sitting with samba and my driver with dtrace for around a week, comparing every field and every return code.

This might be too specific a detail, but from what I have seen, the app (Adobe Media Encoder) follows the below pattern on query:

  1. Firstly entering vfs driver, we get a vnop_open for root directory and exactly four readdirs for root (both samba and my driver have around 2-3 files and 1-2 subfolders, not much so everything is read in one vnop_readdir call). This is identical for both samba and my driver.
  2. Upon double-click on "subfolder", the following sequence is initiated: For samba everything works correctly and logically - vnop_open for root, once again four readdirs for root, vnop_open for "subfolder" and readdir for "subfolder". However, for my driver, the behaviour is: vnop_open for root, four readdirs for root

And that's all. No more queries, no more requests, no aborting "vnop_open" calls, nothing. Afterwards, even renavigating back to my root will not work inside Adobe - the driver will simply represent itself as empty.


The man page for "opendir" has a snippet showing "basic" POSIX style directory iteration and that's the first thing I would test with. As I talked about above, that's not the approach most of our APIs would use, but it is the approach someone implementing their own iteration might use, particularly if they were focusing on something like cross platform support.

Yes, I have tried their example program, and I do get proper output with my objects, so they are indeed found and this one works.


If "opendir" works, then it would probably be worth testing with the Carbon File Manager.

For this one, I wrote a little program:


#include <Carbon/Carbon.h>
#include <stdio.h>

void listFilesInFolder(FSRef *folder) {
    FSIterator iterator;
    FSCatalogInfo catalogInfo;
    FSRef fileRef;
    ItemCount actualFetched;

    if (FSOpenIterator(folder, kFSIterateFlat, &iterator) != noErr) {
        printf("FSOpenIterator failed\n");
        return;
    }

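    // FSGetCatalogInfoBulk can return errFSNoMoreItems together with the final
    // item, so process whatever was fetched before deciding whether to stop.
    OSErr err;
    do {
        err = FSGetCatalogInfoBulk(iterator, 1, &actualFetched, NULL, kFSCatInfoNone, &catalogInfo, &fileRef, NULL, NULL);
        if ((err == noErr || err == errFSNoMoreItems) && actualFetched > 0) {
            char filePath[1024];
            if (FSRefMakePath(&fileRef, (UInt8 *)filePath, sizeof(filePath)) == noErr) {
                printf("File: %s\n", filePath);
            }
        }
    } while (err == noErr);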

    FSCloseIterator(iterator);
}

int main() {
    FSRef folderRef;
    if (FSPathMakeRef((const UInt8 *)"/Volumes/myvfs/mysubfolder", &folderRef, NULL) != noErr) {
        printf("FSPathMakeRef failed\n");
        return 1;
    }

    listFilesInFolder(&folderRef);
    return 0;
}

And I also properly retrieve all objects inside that subfolder. So I would say that the Carbon File Manager works as well.

However.

I would take a look at what you're returning for volfs support and persistent ID support.

I do not have MNT_DOVOLFS in mount flags. Setting it actually breaks some applications: the volume's icon changes in GUI apps, and in Adobe Media Encoder the volume no longer even shows the root folder.

For persistent object ids (VOL_CAP_FMT_PERSISTENTOBJECTIDS), I do have this one enabled, however, I am not sure I am doing anything to actually "support" it.

I have been sitting with samba and my driver with dtrace for around a week, comparing every field and every return code.

So, the risk with this kind of comparison is details (and differences) like these:

I do not have MNT_DOVOLFS in mount flags.

AND

For persistent object ids (VOL_CAP_FMT_PERSISTENTOBJECTIDS), I do have this one enabled, however, I am not sure I am doing anything to actually "support" it.

The behavior of a VFS driver isn't just determined by the data it directly returns. The details of how you "describe" your own capabilities to the system matter because they can and will change:

  1. How the VFS system interacts with your driver and interprets your data.

  2. How user space processes choose to interact with your file system.

Claiming your file system supports a capability it does not will create EXACTLY the kind of problem you're seeing. Based on what you've said here:

Afterwards, even renavigating back to my root will not work inside Adobe - the driver will simply represent itself as empty.

I'd also suggest looking at "VOL_CAP_FMT_PATH_FROM_ID", however, my real answer is that I would do a full "audit" looking at exactly what you've declared support for, whether you actually provide the functionality that support requires, and also looking at what samba does. Note that samba's behavior will vary based on the server and the underlying volume format.

Moving to the testing side:

This might be too specific a detail, but from what I have seen, the app (Adobe Media Encoder) follows the below pattern on query:

Have you had any success figuring out what syscalls they're actually making? Tools like dtrace may get you what you need, but if you're having trouble getting good data on this, one "trick" that might work would be doing something like the following:

  • Modify EVERY operation in your VFS driver to take an "absurd" amount of time (say 0.1s to 1s).

  • Run your test while using Instruments or a command line tool like "sample" to profile the app.

The large delays you've added mean that each of "your" syscalls will be obvious, as they'll end up blocking for inordinately long periods of time.

(both samba and my driver have around 2-3 files and 1-2 subfolders, not much so everything is read in one vnop_readdir call).

Personally, I'd take this all the way down to exactly 1 directory and 1 file. Similarly, I'd be looking at EXACTLY what data I returned for every call and what that data actually "meant" to the larger system, particularly for any kind of variation in the data I'm actually returning.

This might be too specific a detail, but from what I have seen, the app (Adobe Media Encoder) follows the below pattern on query:

A few comments on your overall test process here:

  • How closely have you compared the actual data being returned, particularly the data you return in #1 vs #2?

  • As another API to look at, does "fsgetpath" work (see its man page for the details)?

  • Are you SURE you haven't overlooked ANY possible VNOP/call/etc? Notably, when you navigate "back" and the directory is empty, did any activity reach your driver? If that answer is "no", then you've either overlooked a syscall or the problem is specifically in the data you already gave them.

  • Reversing my previous point, what happened in samba when it navigated back? Critically, what was the first call of the "back" operation and what was the input data into that call? It's reasonably likely that the input data samba gave is exactly the data that's wrong in your implementation.

  • You've focused on comparing with samba, but have you tried comparing your working case against your failing case? I think you'll see very different vnop patterns but I'm talking about comparing the actual data returned, not just the calling patterns.
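
One rough way to compare the data actually returned, rather than the calling pattern, is to dump per-entry attributes from user space and diff the output between the working and failing volumes. The sketch below is only an illustration under stated assumptions: the function name dump_directory_attributes, the output format, and the choice to flag a difference between the inode number reported by readdir and the one reported by lstat are all choices made for this example, not something the thread prescribes.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: for every entry in `dir_path`, prints the inode
 * number from readdir() next to the attributes from lstat(), so the output
 * can be diffed between two volumes. Returns the number of entries whose
 * data looks inconsistent (lstat failed, or the two inode numbers differ),
 * or -1 if the directory could not be opened. */
int dump_directory_attributes(const char *dir_path, FILE *out) {
    DIR *dir = opendir(dir_path);
    if (dir == NULL) {
        fprintf(out, "opendir(%s) failed: %s\n", dir_path, strerror(errno));
        return -1;
    }

    int inconsistencies = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) {
            continue;
        }

        char path[4096];
        int n = snprintf(path, sizeof(path), "%s/%s", dir_path, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof(path)) {
            fprintf(out, "%s: path too long, skipped\n", entry->d_name);
            continue;
        }

        struct stat st;
        if (lstat(path, &st) != 0) {
            /* Listed by readdir but not reachable by name. */
            fprintf(out, "%s: lstat failed: %s\n", entry->d_name, strerror(errno));
            inconsistencies++;
            continue;
        }

        int differs = ((unsigned long long)st.st_ino != (unsigned long long)entry->d_ino);
        if (differs) {
            inconsistencies++;
        }
        fprintf(out, "%s readdir_ino=%llu stat_ino=%llu mode=%06o nlink=%lu size=%lld%s\n",
                entry->d_name,
                (unsigned long long)entry->d_ino,
                (unsigned long long)st.st_ino,
                (unsigned)st.st_mode,
                (unsigned long)st.st_nlink,
                (long long)st.st_size,
                differs ? " [ino differs]" : "");
    }

    closedir(dir);
    return inconsistencies;
}
```

Running it against the root and the subfolder of both the samba mount and the custom volume gives four listings to diff. Note that an inode difference is not always a bug (entries that are themselves mount points can legitimately differ), so treat the flag as a prompt for a closer look rather than a verdict.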

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware
