Investigating hard-to-reproduce keychain problems

This thread has been locked by a moderator.

Keychain is a bit of a ‘call driver’ for DTS. I’ve sent instructions like this to many developers over the years. Today I decided to write it down for everyone’s benefit.

If you have questions or comments, put them in a new thread here on DevForums. Tag it with Security so that I see it.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"


Investigating hard-to-reproduce keychain problems

SecItem is the primary API for the keychain on all Apple platforms. If you’re getting started with that API, I have two DevForums posts that you’ll find useful:

Also, if you’re on macOS, make sure to read TN3137 On Mac keychain APIs and implementations.

Every now and again folks encounter a hard-to-reproduce keychain problem. For example, you might have code that works most of the time but fails for some users in the field. This failure might be seen regularly by some users, or it might happen sporadically across all your users. Problems like this are often caused by a bug in the system itself, but the SecItem API is sufficiently tricky that I’ve seen cases of this where the actual problem was a bug in the developer’s code.

This post outlines a process for investigating such problems. I’ve helped a number of developers use it to good effect, and I figured I should share it for the enjoyment of all (hey hey!).

This process depends on the system log. If you’re not best friends with the system log, you should be! See Your Friend the System Log.

This process is a special case of the process I describe in Using a Sysdiagnose Log to Debug a Hard-to-Reproduce Problem. Read that before continuing.

Unwrap the wrapper

Many developers use a wrapper around the SecItem API. Such wrappers make it hard to investigate problems like this. For example, you might call a wrapper routine that returns a password string or nil if there’s a failure. That means you can’t distinguish between an expected nil, where the password hasn’t been saved yet, or a weird error.

If you need to debug a hard-to-reproduce keychain problem, look through the wrapper to find the calls to Apple’s SecItem APIs. It’s fine to leave your wrapper in place, but do this debugging at the SecItem level.

Add before and after log points

The basic strategy here is:

  1. Add a log point before the call to the SecItem API, including the parameters that you’ll pass in.

  2. Add a log point after the call to the SecItem API, including the returned error code and any response you got.

This is trickier than it might seem due to the way that the SecItem API is structured. Consider this example:

func copyAccountPassword(_ userName: String) -> String? {
    var copyResult: CFTypeRef? = nil
    let err = SecItemCopyMatching([
        kSecClass: kSecClassGenericPassword,
        kSecAttrService: "WaffleVarnish",
        kSecAttrAccount: userName,
        kSecReturnData: true,
    ] as NSDictionary, &copyResult)
    guard
        err == errSecSuccess,
        let result = String(data: copyResult! as! Data, encoding: .utf8)
    else { return nil }
    return result
}

There are a bunch of issues here:

  • The query dictionary is created inline, so it’s not easy to log it.

  • The query dictionary contains the kSecAttrAccount property, which is likely to hold private data.

  • The query can fail, leaving copyResult set to nil.

  • The query can work but return invalid data. In this case the function will return nil but there’s no error.

  • The password is obviously private data.

WARNING Be careful when logging keychain data. The whole point of the keychain is to protect the user’s secrets. It would be bad to then go and log those secrets to the system log.

Here’s an updated version of that routine:

func copyAccountPasswordWithLogging(_ userName: String) -> String? {
    let query = [
        kSecClass: kSecClassGenericPassword,
        kSecAttrService: "WaffleVarnish",
        kSecAttrAccount: userName,
        kSecReturnData: true,
    ] as NSDictionary
    var copyResult: CFTypeRef? = nil
    log.log("will copy account password, query: \(query)")
    let err = SecItemCopyMatching(query, &copyResult)
    guard err == errSecSuccess else {
        log.log("did not copy account password, err: \(err)")
        return nil
    }
    guard let result = String(data: copyResult! as! Data, encoding: .utf8) else {
        log.log("did not copy account password, malformed")
        return nil
    }
    log.log("did copy account password")
    return result
}

This example assumes that log is a value of type Logger.

Redact private data

This new code isn’t perfect:

  • The query value is considered private data and not recorded in the log.

  • It logs nothing about the resulting password.

Addressing the second problem is a challenge. You could do something creative like log a salted hash of the password but, honestly, I think it’s best to err on the side of caution here.

You might try to fix the first problem with code like this:

!!! DO NOT DO THIS !!!
log.log("will copy account password, query: \(query, privacy: .public)")
!!! DO NOT DO THIS !!!

However, that’s problematic because it logs the account value (kSecAttrAccount) when it should be redacted. What you do about this depends on the scope of your deployment:

  • If you’re targeting an internal test harness, you might choose to leave it as is. The machines on the test harness don’t have any truly private data.

  • If this code is going to be used by actual humans, you must further redact your logging.

For example, you might write a helper like this:

func redactedQuery(_ query: NSDictionary) -> String {
    let query = query.mutableCopy() as! NSMutableDictionary
    if query.object(forKey: kSecAttrAccount as NSString) != nil {
        query.setObject("REDACTED", forKey: kSecAttrAccount as NSString)
    }
    return "\(query)"
}

You’ll have to customise this code for your specific use case. For example, your code might put information that’s not private into kSecAttrAccount — in many of my projects, I use a fixed string for this — and so redacting that might be pointless. OTOH, your code might put private information into other properties.

If you call a SecItem API that returns a dictionary, you’ll need a similar redactedResponse(_:) routine for that response.

Test your logging

The next step is to test your logging. Make sure that the stuff you want logged is logged and the stuff you don’t want logged is not.

Test outside of Xcode, because Xcode automatically captures private data.

Look in the log

When you get a sysdiagnose log back from a user having this problem, unpack and open the system log snapshot. Find all the log entry pairs created by your before and after log points. Then look for the most recent one that illustrates the problem you’re investigating.

To learn more about the cause of this problem, look for other log entries of interest between those two points.

When looking at keychain-related log entries, keep in mind that most of the Security framework is open source. If you’re curious what a log entry means, search the source to see the context.

Note Darwin open source won’t always exactly match the source in the corresponding OS version. Also, the Security framework is only part of the story here, and some other subsystems that are relevant to the keychain aren’t open source. However, my experience is that looking at the source is super useful.

If you find that log entries are missing, remember that the system log purges older log entries to make room for newer ones. That’s why it’s important to take your sysdiagnose as soon as possible after encountering the problem.

If you escalate this problem to Apple — via a bug report, a DTS tech support incident, or whatever — make sure to include:

  • Your sysdiagnose log

  • The exact timestamps of your before and after log entries

We may not need this level of information, but in many cases it really helps.

Up vote post of eskimo
271 views