lldb sometimes doesn't stop at breakpoints

I have some Python scripts that I use within lldb to track the messages I receive from Endpoint Security (I'm trying to track down an issue where I may not be responding to an es_message_t). So I added breakpoints for es_retain_message, es_release_message, es_respond_auth_result and es_respond_flags_result. And it works great...

Some of the time.

But it--fairly frequently--won't trigger one of the breakpoints. I will see cases where the es_retain_message python code didn't run but the es_respond_auth_result and the es_release_message did. Or any combination thereof. And sometimes they will all be called and everything will work out fine.

I suspect it may be thread related. All of the breakpoints are hit from within blocks: one is the Endpoint Security handler block and the other is an application-specific block that runs on a concurrent queue.

The pseudo-code looks something like:

// Within the ES code block
... 
es_retain_message(msg)
...
dispatch_async(localConcurrentQueue, ^{
    ...
    es_respond_auth_result(msg) // or es_respond_flags_result, depending...
    ...
    es_release_message(msg)
});

I feel like I see this happen more often when my python script prints, but I haven't measured it to see if that's really happening. I've tried removing some of the printing and tried various creative uses of lldb.SBDebugger.SetAsync to no avail.

Is there a known issue in lldb that I'm missing?

So, you’re trying to use LLDB to debug a live ES client? Honestly, I’m surprised you’ve got this far. See this post.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Well, I've found es_mute_path is your friend in these cases 😀. Muting debugserver and lldb helps a lot. I can also put myself into a state where I'm not responding to any "Auth" messages, which allows me to connect to the process. I can then subscribe to the "Auth" messages and go on from there.

The problem is when I get a SIGKILL, I got nothin'--no good way to track what messages might be waiting. This way, I can get my SIGKILL and go, "Oh, okay, what messages have I got that are outstanding?"

All of my scripts end with frame.thread.process.Continue(), so I'm never stopping for any reason. For example:

4: name = 'es_release_message', locations = 1, resolved = 1, hit count = 685539
    Breakpoint commands (Python):
      # This is code that should be attached to es_release_message
      messageAddr = frame.EvaluateExpression("$rdi")
      eventID = hex(int(messageAddr.GetValue(), 0))
      EventWatcher.releaseEventNum(eventID)
      frame.thread.process.Continue()

...and, again, it works a lot of the time. But not all the time. Sometimes that Python code doesn't appear to run. I believe this is the case because ES will send me a new message with the same "eventID" as before, which shouldn't happen if the message is still retained. So either it was released and I got no sign of it, or ES is somehow deciding it can reuse retained messages that have already received a response, and I don't think that's how it works.
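For reference, here is a minimal sketch of the kind of bookkeeping described above. It is an illustration under assumptions, not the actual script: the EventWatcher class, its method names, and the on_retain/on_release functions are all hypothetical, and it assumes x86_64, where the first argument arrives in rdi. Instead of calling frame.thread.process.Continue() inside the callback, each function returns False, which is how a Python breakpoint function tells lldb not to stop. Whether that changes the missed-breakpoint behavior is untested.

```python
import threading


class EventWatcher:
    """Hypothetical tracker of messages that are retained but not yet released."""

    def __init__(self):
        self._lock = threading.Lock()
        self._outstanding = {}  # message address -> retain count

    def retain(self, addr):
        with self._lock:
            self._outstanding[addr] = self._outstanding.get(addr, 0) + 1

    def release(self, addr):
        """Return False if no matching retain was seen for this address."""
        with self._lock:
            count = self._outstanding.get(addr, 0)
            if count == 0:
                return False
            if count == 1:
                del self._outstanding[addr]
            else:
                self._outstanding[addr] = count - 1
            return True

    def outstanding(self):
        """Addresses of messages still retained, sorted for stable output."""
        with self._lock:
            return sorted(self._outstanding)


watcher = EventWatcher()


def first_arg(frame):
    # Assumption: x86_64, so the es_message_t pointer is in rdi on entry.
    return frame.FindRegister("rdi").GetValueAsUnsigned()


def on_retain(frame, bp_loc, internal_dict):
    watcher.retain(first_arg(frame))
    return False  # do not stop; lldb resumes the process itself


def on_release(frame, bp_loc, internal_dict):
    addr = first_arg(frame)
    if not watcher.release(addr):
        # A release with no recorded retain suggests a missed retain breakpoint.
        print("release without retain: %#x" % addr)
    return False  # do not stop
```

Assuming the file is loaded with "command script import", the functions can be attached with "breakpoint command add -F <module>.on_retain <breakpoint-id>" (and likewise for on_release). After a SIGKILL, watcher.outstanding() lists the messages that were still retained.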
