otool -L randomly crashing? (fatal error in otool-classic)

I recently switched to Apple Silicon on a MacBook Air / M1 and ever since have been facing a weird crash when using otool.

My scenario has me working with a repository of precompiled universal dylibs, all of which are code signed with an ad hoc signature. All of these dylibs are valid and fully readable on disk. However, at random, any number of them will cause otool to crash with this message:

otool: fatal error in /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/otool-classic

Copying the affected library to a different place, deleting the original and moving the copy back to where the original was solves the issue for a random amount of time.

The crash is always the same and the workaround is always as described above. However, I have no idea what's causing the issue, as the same project has been working without any issues on an Intel Mac for almost two years.

Looking into my crash logs I find the following:

Process:               otool-classic [8670]
Path:                  /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/otool-classic
Identifier:            otool-classic
Version:               980.1
Code Type:             ARM-64 (Native)
Parent Process:        ??? [8669]
Responsible:           Terminal [96515]
User ID:               501

Date/Time:             2021-08-27 17:46:21.774 +0200
OS Version:            macOS 11.5.1 (20G80)
Report Version:        12
Anonymous UUID:        E4021586-8704-4B85-AC4E-265554E01C00

Sleep/Wake UUID:       DD48C75B-1051-4B24-A92D-38EDEE55A6AE

Time Awake Since Boot: 28000 seconds
Time Since Wake:       5100 seconds

System Integrity Protection: enabled

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_BAD_ACCESS (Code Signature Invalid)
Exception Codes:       0x0000000000000032, 0x0000000105210000
Exception Note:        EXC_CORPSE_NOTIFY

Termination Reason:    Namespace CODESIGNING, Code 0x2

kernel messages:

VM Regions Near 0x105210000:
    __LINKEDIT                  105208000-10520c000    [   16K] r--/r-- SM=NUL  /usr/lib/dyld
--> mapped file                 10520c000-1055f4000    [ 4000K] rw-/rw- SM=COW  Object_id=580efbdb
    MALLOC_TINY                 13f600000-13f700000    [ 1024K] rw-/rwx SM=PRV  

Application Specific Information:
dyld2 mode

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   otool-classic                 	0x0000000104da7440 ofile_specific_arch + 448
1   otool-classic                 	0x0000000104da731c ofile_specific_arch + 156
2   otool-classic                 	0x0000000104da4690 ofile_process + 2668
3   otool-classic                 	0x0000000104da8434 main + 2336
4   libdyld.dylib                 	0x0000000181789430 start + 4

Thread 0 crashed with ARM Thread State (64-bit):
    x0: 0x000000013f606780   x1: 0x0000000104e0b3e0   x2: 0x0000000000000000   x3: 0x000000013f606787
    x4: 0x0000000000000000   x5: 0x0000000000000010   x6: 0x0000000000000000   x7: 0x0000000000000000
    x8: 0x00000001055f0660   x9: 0x0000000000216390  x10: 0x00000000003e0660  x11: 0x0000000105426390
   x12: 0x0000000000010000  x13: 0x0000000000000015  x14: 0x0000000000000800  x15: 0x000000008000001f
   x16: 0x00000001817b424c  x17: 0x000000018158e83c  x18: 0x0000000000000000  x19: 0x000000016b05f490
   x20: 0x000000016b05f4d8  x21: 0x0000000105210000  x22: 0x0000000000216390  x23: 0x000000016b05f5c8
   x24: 0x0000000000000000  x25: 0x0000000000000000  x26: 0x000000016b05f988  x27: 0x0000000000000000
   x28: 0x0000000000000001   fp: 0x000000016b05f430   lr: 0x0000000104da731c
    sp: 0x000000016b05f3e0   pc: 0x0000000104da7440 cpsr: 0x20000000
   far: 0x0000000105210000  esr: 0x92000007


Binary Images:
       0x104da0000 -        0x104e23fff +otool-classic (980.1) <67F17B71-A17E-3BDC-B6C2-038E0044413D> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/otool-classic
       0x105104000 -        0x105183fff  dyld (852.2) <17D14D9B-B6B2-35DC-B157-4FD60213BE99> /usr/lib/dyld
[remainder removed, see attachment]

Replies

This is odd. This should be captured in a bug report, but it also looks like otool is crashing when trying to determine or read information from one of these dylibs. Do you have any more information that you can share on these dylibs when the working case is present?

Matt Eaton
DTS Engineering, CoreOS
meaton3@apple.com
  • I can share pretty much anything that could be helpful. All of these dylibs are in-house compiled versions of open source projects, so I'm going to link to an XZ archive containing a few of those most often involved in the crashes (in this case openCASCADE, specifically libTKBO.7.dylib). There is no difference between the working and the crashing case, as I do not modify the dylibs other than applying the copy, delete, replace workaround when needed.

    One thing that makes me believe it might even be some weird caching or APFS issue is the fact that, if I create a simple "Hello World" C program and link it against one of the dylibs it will normally run just fine except in the case where otool would also crash. If I get a crash from otool -L and try to run the binary linking to the affected dylib it will also crash pointing to a failure to verify the code signature of @rpath/NAME_OF_LIBRARY.

    https://www.icloud.com/iclouddrive/0m1cKLZsmmdbXfK7xg9VORzFA#openCASCADE_sample it's shared for meaton3 apple.com, libTKBO.7.dylib is one of the libs mostly causing the problem.

    A few more details to my working environment:

    MacBook Air M1, late 2020
    macOS Big Sur 11.5.2 (issue also happened on 11.5.1)
    Xcode 12.5
    all files are stored on an external USB 3 SSD formatted with APFS

Copying the affected library to a different place, deleting the original and moving the copy back to where the original was solves the issue for a random amount of time.

OK, those symptoms closely align with the code signing kernel cache issue discussed in this thread. Something is modifying these files on disk after their signature has been cached by the kernel, and thus otool ends up crashing because the file on disk is out of sync with the cached signature.

as the same project has been working without any issues on an intel Mac for almost two years.

Yeah, I see this a lot because Intel Macs can run unsigned code, and thus there were fewer code signatures running around the system. Apple silicon Macs require that all code be signed, and thus there’s more opportunity for this problem to crop up.

To solve this problem you need to look at the process used to build, sign and install these libraries, making sure that all the steps taken after signing the code avoid overwriting a signed file. For example, imagine you install a new version of the library like so:

% ls -i MyLib.dylib
78290841 MyLib.dylib
% cp MyLib-new.dylib MyLib.dylib
% ls -i MyLib.dylib  
78290841 MyLib.dylib

Note how the inode number hasn’t changed. This is problematic because it shows that cp has overwritten the library’s file. Contrast this with ditto:

% ls -i MyLib.dylib
78290841 MyLib.dylib
% ditto MyLib-new.dylib MyLib.dylib
% ls -i MyLib.dylib  
78291734 MyLib.dylib

See how you get a new inode number and thus avoid this problem.

This is also why your copy and move technique avoids the problem.
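
For anyone who wants to see the difference on their own machine, here is a minimal sketch that compares an in-place overwrite with a replace-by-rename. It is an illustration only: the helper names inode_of and replace_file are made up for this example, the files are throwaway text files in a scratch directory rather than real dylibs, and copy-then-rename is just one common way to end up with a fresh inode (it is a stand-in for ditto here, not a claim about what ditto does internally).

```shell
#!/bin/sh
# Sketch: compare overwriting a file in place with replacing it by rename.
# inode_of and replace_file are hypothetical helpers for this example.

inode_of() {
    # Print the inode number of the file given as $1.
    ls -i "$1" | awk '{print $1}'
}

replace_file() {
    # Install $1 as $2 without reusing $2's existing inode:
    # copy to a temporary name in the same directory, then rename over $2.
    src=$1
    dst=$2
    tmp="$dst.tmp.$$"
    cp "$src" "$tmp"
    mv -f "$tmp" "$dst"
}

# Work in a scratch directory with placeholder files.
work=$(mktemp -d)
printf 'old\n' > "$work/MyLib.dylib"
printf 'new\n' > "$work/MyLib-new.dylib"

before=$(inode_of "$work/MyLib.dylib")

# Plain cp overwrites the existing file, so the inode stays the same.
cp "$work/MyLib-new.dylib" "$work/MyLib.dylib"
after_cp=$(inode_of "$work/MyLib.dylib")

# Replace-by-rename puts a different file (and inode) at the same path.
replace_file "$work/MyLib-new.dylib" "$work/MyLib.dylib"
after_replace=$(inode_of "$work/MyLib.dylib")

echo "original:      $before"
echo "after cp:      $after_cp"
echo "after replace: $after_replace"
```

The first two numbers should match and the third should differ. The temporary file is created next to the destination so that the final mv is a rename within one volume rather than a cross-volume copy.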

ps I’m in the process of writing up a doc that explains this (r. 82248510). And Matt’s gonna kick himself for not recognising this issue because he’s actually reviewed that doc (-:

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

  • That sounds very reasonable to me and I can follow your explanation as to why this would cause cache issues (the thread you linked to gives me "Error - Access Denied / content not found" though). In our setup here those libs are pulled from a repo once and pretty much never get touched again later though - other than being read by otool and friends / copied into a final .app bundle. Could there be any system process in the background causing this?

  • A small update - I created a disk image containing all of our dependency dylibs and verified they all could be read by otool -L before mounting the disk image read-only and using it as a source for headers and linking against the dylibs. Lo and behold about an hour later I get the same issue even though the image is mounted read-only so nothing should have changed in those libraries at all.
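
The "verify they can all be read by otool -L" step can be scripted, which makes it easier to pin down when a library flips from working to crashing. A rough sketch follows, with some assumptions spelled out: check_dylibs and CHECK_CMD are names invented for this example, the script relies on the check command exiting with a non-zero status when it fails (which is what I would expect from otool when otool-classic crashes, but I have not confirmed it), and it expects sh or bash word splitting for the default command.

```shell
#!/bin/sh
# Sketch: run a check command over every .dylib below a directory and
# report the ones it fails on. check_dylibs and CHECK_CMD are
# hypothetical names; CHECK_CMD defaults to "otool -L".

check_dylibs() {
    cd_dir=$1
    cd_status=0
    cd_list=$(mktemp)

    # Collect the candidates first so the loop runs in the current shell.
    find "$cd_dir" -type f -name '*.dylib' -print > "$cd_list"

    while IFS= read -r cd_lib; do
        # Unquoted on purpose: "otool -L" must split into command + flag.
        if ! ${CHECK_CMD:-otool -L} "$cd_lib" >/dev/null 2>&1; then
            echo "FAILED: $cd_lib"
            cd_status=1
        fi
    done < "$cd_list"

    rm -f "$cd_list"
    return "$cd_status"
}

# Usage (path is a placeholder):
#   check_dylibs /Volumes/Deps/lib || echo "at least one dylib failed"
```

Running this periodically (for example before and after each build step) and logging the output with a timestamp should narrow down which step coincides with the first failure.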


Lo and behold about an hour later I get the same issue even though the image is mounted read-only so nothing should have changed in those libraries at all.

Hmmm. Do these files, or the disk image itself, end up quarantined?
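
If it helps, quarantine state can be checked from Terminal by looking for the com.apple.quarantine extended attribute. The following is a small sketch only: is_quarantined and report_quarantine are names made up for this example, and it assumes the xattr command-line tool that ships with macOS is on the path.

```shell
#!/bin/sh
# Sketch: report whether files carry the com.apple.quarantine attribute.
# is_quarantined and report_quarantine are hypothetical helper names.

is_quarantined() {
    # Succeeds only if xattr can print the quarantine attribute of $1.
    xattr -p com.apple.quarantine "$1" >/dev/null 2>&1
}

report_quarantine() {
    # Print one status line per file given on the command line.
    for rq_file in "$@"; do
        if is_quarantined "$rq_file"; then
            echo "quarantined: $rq_file"
        else
            echo "not quarantined: $rq_file"
        fi
    done
}

# Usage (paths are placeholders):
#   report_quarantine Deps.dmg /Volumes/Deps/lib/*.dylib
```

Checking both the disk image file and the libraries inside it covers the two cases asked about above.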

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

  • The dylibs always get quarantined on the first SVN checkout, as the system deems them "downloaded from the internet" and they all carry only an ad hoc signature. However, once the quarantine xattr was removed, neither they nor the disk image got it again. I'm currently investigating another theory: it looks like someone messed up the check-in, and instead of a symlink "libTKBO.7.dylib" pointing to "libTKBO.7.5.0.dylib" we have copies. So there's a "libTKBO.7.dylib" which is a copy of "libTKBO.7.5.0.dylib", carrying the same ad hoc signature that was created on the original.
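
To find out how widespread the copies-instead-of-symlinks problem is, something like the sketch below could list every pair of regular (non-symlink) dylibs in a directory with byte-identical content. This is only a detection aid under stated assumptions: find_duplicate_dylibs is a name invented for this example, it looks at a single directory without recursing, and it says nothing about whether the duplicates are actually behind the crashes.

```shell
#!/bin/sh
# Sketch: list pairs of regular .dylib files in one directory whose
# contents are identical, i.e. copies where a symlink may have been
# intended. find_duplicate_dylibs is a hypothetical helper name.

find_duplicate_dylibs() {
    fd_dir=$1
    for fd_a in "$fd_dir"/*.dylib; do
        # Only consider regular files; skip symlinks and unmatched globs.
        [ -f "$fd_a" ] && [ ! -L "$fd_a" ] || continue
        fd_seen=0
        for fd_b in "$fd_dir"/*.dylib; do
            # Compare each unordered pair once: only files listed after fd_a.
            if [ "$fd_b" = "$fd_a" ]; then
                fd_seen=1
                continue
            fi
            [ "$fd_seen" -eq 1 ] || continue
            [ -f "$fd_b" ] && [ ! -L "$fd_b" ] || continue
            # cmp -s is silent and succeeds only for identical content.
            if cmp -s "$fd_a" "$fd_b"; then
                echo "identical copies: $fd_a $fd_b"
            fi
        done
    done
}

# Usage (path is a placeholder):
#   find_duplicate_dylibs /path/to/checkout/lib
```

Real symlinks are skipped on purpose, so a correctly checked-in "libTKBO.7.dylib" symlink would not be reported.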


I’m in the process of writing up a doc that explains this

And it’s now on the web as Updating Mac Software. Finally!

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"