Our app has an old codebase, originating in 2011, which started out as purely Objective-C (and a little bit of Objective-C++), with a good amount of Swift added over time. There is lots of Objective-C and Swift interop, but in general very few third-party libraries/frameworks. Like many other codebases of this size and age, we have a good amount of accumulated tech debt. In our case, that mostly comes in the form of old/deprecated APIs (OpenGL primary amongst them) and some ‘tricks’ that allowed us to do highly customized UI popups and the like before they were officially supported by iOS, which unfortunately are still in use to this day (e.g. adding views directly to the UIWindow so that they are ‘on top’ of everything, instead of presenting a VC). Overall though, the app is very powerful and capable, and generally has a relatively low crash rate.
About two months ago, we started seeing some new crashes that seemed to be totally unrelated to the code changes that were made at the time. Moreover, if a new branch with a feature or bug fix was merged in, the new crash would either disappear entirely or move somewhere else. These were not ‘normal’ crashes either - when hooked up to the debugger in Xcode, the crashes would often happen when calling into a system library (e.g. initializing a UIColor object).
Some of the steps taken to try and mitigate or eliminate these crashes include:
Rolling back merges
Often worked, but then most future merges would cause a new and different crash to appear
Using the TSan and ASan tools to try and diagnose thread or memory issues
TSan reported a couple of issues near launch, which have been fixed. It also reports others in some areas of the app, but those have been around a long time and don’t appear to correlate with any recent changes. Fixing the ones at launch (and others found while testing to try and reproduce the crashes) did not eliminate the new crashes
ASan does not identify any issues
Modifying the code changes in a branch before merging it in
In one case the changes were limited to declaring ‘@objc static var: Bool’ in a Swift class and setting its value in a couple of places; simply removing the @objc from the declaration made the crash go away. Since the var had to be exposed to Objective-C, it was eventually moved to an existing pure Objective-C singleton class (not ideal, but it’s been around a long time and has not yet been refactored). That preserved the functionality, and the crash was no longer reproducible
Removing all 3rd party libraries or frameworks
This mostly worked in that the crashes went away, but it is not a long-term solution, since it also removed long-existing features expected by our users
Updating 3rd party libraries and frameworks when possible (there were some very old ones)
Updating these did not have any effect on the crashes, except that the crashes moved around in the same way as when merging in a branch, and again, where the crash actually occurred was uncorrelated with the library/framework that was updated
Changes to the App’s Build Settings in Xcode
Set supported/valid architectures to arm64 exclusively
Stripping of all architectures other than arm64 from 3rd party binaries
Cleaning up of old/outdated linker flags
Removal of other custom build flags that were needed at one point, but are no longer relevant
Generally trying to make all the build settings in our (quite old/outdated) app match those of a newly created iOS app
Setting ‘Code Signing Inject Base Entitlements’ to YES
Removal of the old/deprecated Bitcode flag
These changes seemed to help and the codebase was more ‘stable’ (non-crashing) for a while, but as we tried to continue development, the crashes would reappear
Getting crash reports off of test devices and analyzing them based on the various documents about crash reports provided by Apple
This was helpful and pointed to new things to investigate, but ultimately did not help to identify the root cause of these crashes
Throughout all of the above, the crashes would come and go. A given crash was very reproducible for a given merged branch, but when a subsequent branch was merged in, the crash might go away or simply move somewhere else - sometimes it would crash in our code that calls other parts of our code, and other times when calling system frameworks (like the UIColor example above). One thing that was consistent, though, is that the crash never happened anywhere near the code that was changed or added by the merged branch.
Additional observations when trying to figure out the cause of these crashes:
Sometimes the smallest code change would result in a crash happening or not
The crash reports generated on-device vary quite a bit in terms of the type and reason for the crash
All crashes have an Exception Type of EXC_BAD_ACCESS, but the signal varies between SIGABRT, SIGBUS, SIGKILL, and SIGSEGV
The crashing thread is often (but not always) Thread 0 (main thread), and the first line in the backtrace is often just ‘???’, sometimes followed by a valid memory address and file, but often just ‘0x0 ???’
Most crash reports have an exception subtype of KERN_PROTECTION_FAILURE
Many also state that the Termination Reason is ‘CODESIGNING 2 Invalid Page’
This in particular was investigated thoroughly, including working through Apple’s ‘Placing Content in a Bundle’ document, but after further changes to ensure that everything is in the right place, the crashes were still observed
Another odd thing in most of the crash reports is in the Binary Images section, there is a line that once again is mostly ???s or 000s - specifically ‘0x0 - 0xffffffffffffffff ??? unknown-arch <00000000000000000000000000000000> ???’
The crashes occur on different physical devices, typically the same crash for a given branch, and regardless of iOS version
This includes building from different Macs. We did observe some differences between versions of Xcode (crashed similarly when built from an older version of Xcode, but not from a newer one), but we recently had all developers ensure they are running Xcode 16.4 - we also tried Xcode 26, but the crashes were still observed
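Since the reports vary so much, it may help to tally them by the fields mentioned above (Exception Type, signal, Exception Subtype, Termination Reason) to see which combinations cluster together. The following is only a minimal sketch, assuming text-format crash reports that use those field labels. The `CrashSummary` type, the helper function names, and the sample report text are all hypothetical and are not taken from our actual reports.

```swift
import Foundation

/// Fields of interest pulled from one text-format crash report.
/// (Hypothetical helper type, for illustration only.)
struct CrashSummary: Hashable {
    var exceptionType = "unknown"
    var signal = "unknown"
    var subtype = "unknown"
    var terminationReason = "unknown"
}

/// Returns the trimmed text after "Key:" on the first line starting with that key.
func value(for key: String, in report: String) -> String? {
    let prefix = key + ":"
    for line in report.components(separatedBy: .newlines) {
        let trimmed = line.trimmingCharacters(in: .whitespaces)
        if trimmed.hasPrefix(prefix) {
            return String(trimmed.dropFirst(prefix.count))
                .trimmingCharacters(in: .whitespaces)
        }
    }
    return nil
}

/// Extracts the summary fields from a single report.
func summarize(_ report: String) -> CrashSummary {
    var summary = CrashSummary()
    if let type = value(for: "Exception Type", in: report) {
        // Assumed shape: "EXC_BAD_ACCESS (SIGSEGV)"
        let parts = type.components(separatedBy: " (")
        summary.exceptionType = parts[0]
        if parts.count > 1 {
            summary.signal = parts[1].replacingOccurrences(of: ")", with: "")
        }
    }
    if let subtype = value(for: "Exception Subtype", in: report) {
        // Drop the faulting address so identical subtypes group together.
        summary.subtype = subtype.components(separatedBy: " at ")[0]
    }
    if let reason = value(for: "Termination Reason", in: report) {
        summary.terminationReason = reason
    }
    return summary
}

/// Counts how many reports share each combination of fields.
func tally(_ reports: [String]) -> [CrashSummary: Int] {
    var counts: [CrashSummary: Int] = [:]
    for report in reports {
        counts[summarize(report), default: 0] += 1
    }
    return counts
}

// Made-up sample data; real reports contain many more fields.
let sampleReports = [
    """
    Exception Type:  EXC_BAD_ACCESS (SIGKILL)
    Exception Subtype: KERN_PROTECTION_FAILURE at 0x0000000000000010
    Termination Reason: CODESIGNING 2 Invalid Page
    """,
    """
    Exception Type:  EXC_BAD_ACCESS (SIGKILL)
    Exception Subtype: KERN_PROTECTION_FAILURE at 0x0000000000000020
    Termination Reason: CODESIGNING 2 Invalid Page
    """,
    """
    Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
    Exception Subtype: KERN_INVALID_ADDRESS at 0x0000000000000000
    """,
]

let counts = tally(sampleReports)
for (summary, count) in counts.sorted(by: { $0.value > $1.value }) {
    print("\(count)x \(summary.exceptionType) \(summary.signal) | "
        + "\(summary.subtype) | \(summary.terminationReason)")
}
```

In practice the sample strings would be replaced with the contents of the reports pulled off the test devices. Grouping by branch or build as well might show whether the ‘CODESIGNING 2 Invalid Page’ reports track particular merges.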
Overall, it seems like there is something very strange going on in terms of how the App binary is constructed such that a small code change somehow affects the binary in such a way that memory is not being accessed correctly, or is not where it is expected to be. This level of what appears to be a build-time issue that manifests in very strange run-time crashes is both confusing and difficult to diagnose. Despite the resources provided by Apple for investigation and diagnosis, we cannot seem to find a root cause for these crashes and eliminate them for good.