NSURL - Are Cached Resource Values Really Automatically Removed After Each Pass Through the Run Loop?

The documentation says:

The caching behavior of the NSURL and CFURL APIs differ. For NSURL, all cached values (not temporary values) are automatically removed after each pass through the run loop. You only need to call the removeCachedResourceValueForKey: method when you want to clear the cache within a single execution of the run loop. The CFURL functions, on the other hand, do not automatically clear cached resource values. The client has complete control over the cache lifetimes, and you must use CFURLClearResourcePropertyCacheForKey or CFURLClearResourcePropertyCache to clear cached resource values.

https://developer.apple.com/documentation/foundation/nsurl/removeallcachedresourcevalues()?language=objc

Is this really true? In my experience I've had to explicitly remove cached resource values via -removeAllCachedResourceValues or -removeCachedResourceValueForKey:, otherwise the URL contains stale values.

For example, on a URL that no longer exists I attempted to read NSURLIsHiddenKey and the last value was already cached. Instead of getting an NSFileNoSuchFileError I get the old cached value unless I explicitly call -removeCachedResourceValueForKey: first, and I'm fairly certain the value was cached on a previous run loop churn.
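A minimal sketch of the workaround described above, assuming ARC and a file URL. FreshIsHidden is a hypothetical helper name, not an API; it only combines the two NSURL calls mentioned in this post so that the read cannot be served from an earlier cached value.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not an Apple API): drops any cached value for
// NSURLIsHiddenKey, then reads the key again so the result reflects the
// file system at the time of the call.
// Returns nil and fills *error if the value cannot be read.
static NSNumber *FreshIsHidden(NSURL *url, NSError **error) {
    [url removeCachedResourceValueForKey:NSURLIsHiddenKey];

    NSNumber *isHidden = nil;
    if (![url getResourceValue:&isHidden forKey:NSURLIsHiddenKey error:error]) {
        return nil;
    }
    return isHidden;
}
```

Called as `NSNumber *hidden = FreshIsHidden(url, &error);`, a nil result means the read failed and `error` describes why.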

Answered by DTS Engineer in 873772022
But whose run loop 'owns' the URL resources?

This question doesn’t make sense. Well, it makes sense given how this feature is documented, but that documentation is aiming to offer a high-level overview of the effect of the feature, not a detailed description of its implementation.

This isn’t implemented using active cache flushing. Rather, it’s based on passively detecting invalidation. When you access a property on the main thread, the system checks to see if the cached value is still valid, that is, has the main thread ‘turned’ between when you last got the property and now.
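To make the difference between active flushing and passive invalidation concrete, here is a toy model, assuming ARC. It is not how NSURL is implemented (those mechanics are undocumented); ToyTurnCache, currentTurn and the loader block are all invented for illustration. A cached value is stamped with the "turn" in which it was fetched and is simply ignored once the turn counter has moved on.

```objective-c
#import <Foundation/Foundation.h>

// Toy model only, NOT NSURL's real implementation.
// Nothing is ever flushed; a value is just treated as stale when its stamp
// no longer matches the current turn.
@interface ToyTurnCache : NSObject
@property (nonatomic) NSUInteger currentTurn;          // stands in for "the run loop has turned"
@property (nonatomic, readonly) NSUInteger fetchCount; // how many times the loader actually ran
- (id)cachedValueForKey:(NSString *)key loader:(id (^)(void))loader;
@end

@implementation ToyTurnCache {
    NSMutableDictionary<NSString *, id> *_values;
    NSMutableDictionary<NSString *, NSNumber *> *_stamps;
}

- (instancetype)init {
    if ((self = [super init])) {
        _values = [NSMutableDictionary dictionary];
        _stamps = [NSMutableDictionary dictionary];
    }
    return self;
}

- (id)cachedValueForKey:(NSString *)key loader:(id (^)(void))loader {
    NSNumber *stamp = _stamps[key];
    if (stamp != nil && stamp.unsignedIntegerValue == self.currentTurn) {
        return _values[key]; // fetched during this turn, still considered valid
    }
    // Missing or stamped in an earlier turn: fetch again and restamp.
    // The loader is assumed to return a non-nil object.
    id fresh = loader();
    _fetchCount += 1;
    _values[key] = fresh;
    _stamps[key] = @(self.currentTurn);
    return fresh;
}
@end
```

In this model two reads within the same turn run the loader once, and a read after `currentTurn` is incremented runs it again, without any code ever having cleared the dictionary.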

The exact mechanics of this aren’t documented, and for good reason. It’s easy to imagine how this implementation might change over time. Rather, the take-home message is:

  • You have to worry about caching.
  • If you’re working on the main thread, that worry is bounded by the turning of the run loop.

it doesn't make any sense to me

This design dates from macOS 10.6, which is over 15 years ago, and things have evolved a bit since then.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Thanks for the post. I must say, it is extremely interesting, as this is something I have always kept an eye on.

Based on my experience and the documentation provided, it seems the documentation may explain how the API works; however, in my modest opinion, there are other layers on the request itself that may affect the results.

Due to the request made by iOS and the server cache, the user may see some discrepancies or additional nuances in how NSURL and caching behave, particularly regarding stale values.

While the documentation states that NSURL caches values automatically and purges them at the end of each run loop, your observation suggests that cached resource values can persist across run loops unless explicitly cleared. The automatic cache clearing behavior may not always function as expected; might there be differences in how cached values are handled depending on the system and on server changes?

Explicit cache management is recommended, especially when dealing with URLs that may have changed state (e.g., deleted or modified), to ensure your application behaves predictably across different environments and runtime sessions.

However, the best thing you can do is to create a small, focused project that can add different URLs and test them against servers, as well as adding different modifiers to the URLs, like ?t=123456. That may help you see how it works and how those URLs behave. Maybe the documentation needs a change after your findings?

Looking forward to seeing how far you get with this one.

Albert Pascual
  Worldwide Developer Relations.

Thanks for responding.

I haven't yet been able to reproduce the issue in a small test project, so either they made changes in a macOS update, or there is some timing issue and/or the way the NSURL is initialized in my real project teases the issue out. But I promise I got a stale NSURLIsHiddenKey value for a nonexistent file until I started explicitly dumping it.

In my real project the URL comes from +fileURLWithPath: (and is initially created on a background thread). My real project is large and complex, so when I have more time I'll have to see if I can isolate the issue in a small sample.

Due to the request made by iOS and the server cache, the user may see some discrepancies or additional nuances in how NSURL and caching behave, particularly regarding stale values.

I'm on macOS.

Explicit cache management is recommended, especially when dealing with URLs that may have changed state (e.g., deleted or modified), to ensure your application behaves predictably across different environments and runtime sessions.

Also, CFURL is documented to have different behavior, but aren't CFURL and NSURL 'toll-free bridged'? So how does that work if you cast a CFURL to an NSURL?

I would certainly prefer explicit cache management. Chucking cached resource values at every churn of the run loop seems silly and IMO could cause unintended performance issues, because the next time the app asks for the cached resource value I presume we hit the disk unnecessarily when nothing has changed. Chucking cached values at every churn of the run loop really reduces the benefits of caching in the first place: developers have to cache on top of the cache, but you don't know what's cached. The only way to be sure you don't get a stale value is to call the -removeCachedResourceValueForKey: / -removeAllCachedResourceValues methods, which again seems to reduce the benefits of NSURL caching in the first place.

Also, the problem with explicit cache management is that I don't know if resource values are shared across NSURL instances. In the past I had a talk with an Apple badged person and I think he told me they were (or could be at that time, but I'll have to check my notes). At that time I discovered that calls to -removeCachedResourceValueForKey: and/or -removeAllCachedResourceValues could cause crashes. Do you know if each NSURL instance gets its own isolated cache, or are they still shared by NSURLs that point to the same file (which I don't think would be safe)?

So I've discovered that if you create the NSURL on a background thread you can get stale resource values even on the next churn of the run loop. But whose run loop 'owns' the URL resources? The main thread's run loop? The run loop of the thread the URL was created on? Or does NSURL do something like this:

- (BOOL)getResourceValue:(out id _Nullable * _Nonnull)value forKey:(NSURLResourceKey)key error:(out NSError ** _Nullable)error
{
    // Do whatever is needed to get the resource value.
    // ...
    // Schedule removal from the cache on the next turn of this thread's run loop.
    [NSObject cancelPreviousPerformRequestsWithTarget:self selector:@selector(clearResourceValueForKey:) object:key];
    [self performSelector:@selector(clearResourceValueForKey:) withObject:key afterDelay:0.0];
    return YES;
}

I don't see how you can tie cached resource value state to a run loop - it doesn't make any sense to me, especially since the documentation advises doing file operations on a background thread/queue because hitting the disk can be slow. What if I'm sorting an array of NSURLs by some metadata on a background thread and the main thread run loop is just happily invalidating the cache resources behind my back?

Maybe I'm not understanding something. IMO better behavior would be just to cache on first access and keep the value until I explicitly invalidate it, but I guess I have to use CFURL for that. I think it makes more sense to let the developer flush the cache in response to file coordination or whatever API is being used to monitor the URL, but I guess I have to build a cache on top of the cache.
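One way to sidestep the question for the sorting case is to read each value once into a private snapshot and sort against that, so the comparator never touches NSURL's cache. This is only a sketch, assuming ARC and file URLs; SortedByModificationDate is a hypothetical helper, and the choice of NSURLContentModificationDateKey is just an example of "some metadata".

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not an Apple API): sorts file URLs by modification date,
// oldest first, using a snapshot taken once per URL.
static NSArray<NSURL *> *SortedByModificationDate(NSArray<NSURL *> *urls) {
    NSMutableDictionary<NSURL *, NSDate *> *dates = [NSMutableDictionary dictionary];

    for (NSURL *url in urls) {
        // Clear first so the snapshot is not built from an older cached value.
        [url removeCachedResourceValueForKey:NSURLContentModificationDateKey];

        NSDate *date = nil;
        [url getResourceValue:&date forKey:NSURLContentModificationDateKey error:NULL];

        // URLs whose date cannot be read sort first.
        dates[url] = date ?: [NSDate distantPast];
    }

    // The comparator only consults the snapshot, so the ordering stays
    // consistent for the whole sort regardless of what the cache does.
    return [urls sortedArrayUsingComparator:^NSComparisonResult(NSURL *a, NSURL *b) {
        return [dates[a] compare:dates[b]];
    }];
}
```

It can be called from a background queue, for example inside a `dispatch_async` block on a global queue, with the sorted array handed back to the main queue afterwards.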

Thanks for the post. I think your understanding is great, and it is very interesting how you use the API and how it behaves.

Your observations touch on some subtle behaviors and design philosophies around file resource handling with NSURL (and related APIs) in the frameworks; I have been reading the documentation looking for those. However, most of the time I find it better to create a test project to see how it works. I am not an expert in that field, so we should involve someone from that team. It seems these caches are generally managed by the system to balance performance and resource usage, and are not strictly tied to any particular run loop, such as the main thread's or a background thread's run loop?

I need to write some code, as creating NSURL instances or fetching their resource values on background threads may indeed be recommended for performance reasons, especially for potentially blocking I/O operations.

As you suggested, using CFURL, with its greater control over caching, might suit your needs if you require finer-grained management.

Your suggestion to have more developer-controlled cache invalidation aligns with use cases requiring deterministic behavior, especially in complex applications dealing with dynamic file systems.

I’m looking forward to seeing where you get to. I’m definitely planning to create my own test harness to move forward and better understand it.

Albert Pascual
  Worldwide Developer Relations.

When you access a property on the main thread, the system checks to see if the cached value is still valid, that is, has the main thread ‘turned’ between when you last got the property and now.

I see. IMHO, binding cache invalidation to the turning of a run loop while simultaneously recommending calling -getResourceValue:forKey:error: on a background queue feels somewhat peculiar, but I guess that peculiarity came about during that long evolution. The main benefit of caching just until the run loop turns, I guess, is that it helps if you are continuously calling -getResourceValue:forKey:error: in a loop or in the middle of event handling, but I wouldn't expect that to be so common given the recommendation to call it off the main thread when possible. The downside of this short-lived cache is that other threads can read stale values from -getResourceValue:forKey:error: unless they explicitly clear the cache using -removeCachedResourceValueForKey: etc. Those methods weren't always safe to use, but hopefully they are now.

I was working on just getting the metadata I need with other APIs and caching it, but there are some important resource values I need that seemingly are only available through the NSURL APIs :)

Also, CFURL is documented to have different behavior, but aren't CFURL and NSURL 'toll-free bridged'? So how does that work if you cast a CFURL to an NSURL?

So I just did a dumb little test to answer this. I created a CFURLRef and read kCFURLIsHiddenKey and got false (as expected), so it would cache it.

Then on another CFURL instance (that points to the same file) I did the same thing (so it would cache the value too).

After that I set hidden to YES:

NSURL *toNSURL = (__bridge NSURL * _Nonnull)cfURL;
[toNSURL setResourceValue:@(YES) forKey:NSURLIsHiddenKey error:nil];

Then back to the other instance I cast that one to another NSURL:

NSURL *nsCastCopiedVersion = (__bridge NSURL * _Nonnull)aCopiedCFURL;

I read isHidden on it and it is NO. I read hidden on aCopiedCFURL and it is also NO.

Then I used dispatch_after:

dispatch_after(dispatch_time(DISPATCH_TIME_NOW, (int64_t)(2 * NSEC_PER_SEC)), dispatch_get_main_queue(), ^{
    [URLPrinter printCF:aCopiedCFURL];
    [URLPrinter printNS:nsCastCopiedVersion];
});

And got:

Print copied CF Not hidden // cached value - real value is true.

Print copied NSCF: Hidden // hidden true value

So indeed, using CFURL instead would seem to avoid this. Learn something new every day. Now, if I didn't already have a million lines of code using NSURL I would just use CF instead (still might), because at least for my app I find dumping cached values at every run loop churn undesirable.
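For reference, the CFURL side of that test can be sketched as two small helpers, assuming ARC for any Objective-C code around them. CFIsHidden and CFIsHiddenFresh are hypothetical names; they use only CFURLCopyResourcePropertyForKey and the CFURLClearResourcePropertyCacheForKey function named in the documentation quoted at the top of the thread.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: reads kCFURLIsHiddenKey through the CFURL API.
// This may return a value cached on the CFURL; returns NO if the read fails.
static BOOL CFIsHidden(CFURLRef url) {
    CFBooleanRef hidden = NULL;
    if (!CFURLCopyResourcePropertyForKey(url, kCFURLIsHiddenKey, &hidden, NULL) || hidden == NULL) {
        return NO;
    }
    BOOL result = CFBooleanGetValue(hidden);
    CFRelease(hidden); // "Copy" function: the caller owns the returned value
    return result;
}

// Hypothetical helper: clears the cached value first, so the read goes back
// to the file system. With CFURL this step is the caller's responsibility.
static BOOL CFIsHiddenFresh(CFURLRef url) {
    CFURLClearResourcePropertyCacheForKey(url, kCFURLIsHiddenKey);
    return CFIsHidden(url);
}
```

In terms of the experiment above, `CFIsHidden(aCopiedCFURL)` corresponds to the read that kept reporting "Not hidden", while `CFIsHiddenFresh(aCopiedCFURL)` is the explicit invalidation the CFURL documentation asks for.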

I did have a better caching system implemented at some point, but it seems some of it got dumped when I implemented a workaround to avoid -removeCachedResourceValueForKey: crashes a couple of years ago. It'll all be back in biz soon.

Thanks for the help guys.
