Hello! Some colleagues and work on Jujutsu, a version control system compatible with git, and I think we've uncovered a potential lock contention bug in either APFS or the Darwin kernel. There are four contributing factors to us thinking this is related to APFS or the Kernel:
- jj's testsuite uses nextest, a test runner for Rust that spawns each individual test as a separate process.
- The testsuite slowed down by a factor of ~5x on macOS after jj started using
fsync
. The slowdown increases as additional cores are allocated. - A similar slowdown did not occur on ext4.
- Similar performance issues were reported in the past by a former Mercurial maintainer: https://gregoryszorc.com/blog/2018/10/29/global-kernel-locks-in-apfs/.
My friend and colleague André has measured the test suite on an M3 Ultra with both a ramdisk and a traditional SSD and produced this graph:
(The most thorough writeup is the discussion on this pull request.)
I know I should file a feedback/bug report, but before I do, I'm struggling with profiling and finding kernel/APFS frames in my profiles so that I can properly attribute the cause of this apparent lock contention. Naively, I ran xctrace record --template 'Time Profiler' --output output.trace --launch /Users/dbarsky/.cargo/bin/cargo-nextest nextest run
, and while that detected all processes spawned by nextest
, it didn't record all processes as part of the same inspectable profile and didn't really show any frames from the kernel/APFS—I had to select individual processes. So I don't waste people's time and so that I can point a frame/smoking gun in the right system, how can I can use instruments to profile where the kernel and/or APFS are spending its time? Do I need to disable SIP?