Develop kernel-resident device drivers and kernel extensions using Kernel.

Posts under Kernel tag

49 Posts

Post

Replies

Boosts

Views

Activity

KEXT Code Signing Problems
On modern systems all KEXTs must be code signed with a Developer ID. Additionally, the Developer ID must be specifically enabled for KEXT development. You can learn more about that process on the Developer ID page. If your KEXT is having code signing problems, check that it’s signed with a KEXT-enabled Developer ID. Do this by looking at the certificate used to sign the KEXT. First, extract the certificates from the signed KEXT: % codesign -d --extract-certificates MyKEXT.kext Executable=/Users/quinn/Desktop/MyKEXT/build/Debug/MyKEXT.kext/Contents/MacOS/MyKEXT This creates a bunch of certificates of the form codesignNNN, where NNN is a number in the range from 0 (the leaf) to N (the root). For example: % ls -lh codesign* -rw-r--r--+ 1 quinn staff 1.4K 20 Jul 10:23 codesign0 -rw-r--r--+ 1 quinn staff 1.0K 20 Jul 10:23 codesign1 -rw-r--r--+ 1 quinn staff 1.2K 20 Jul 10:23 codesign2 Next, rename each of those certificates to include the .cer extension: % for i in codesign*; do mv "$i" "$i.cer"; done Finally, look at the leaf certificate (codesign0.cer) to see if it has an extension with the OID 1.2.840.113635.100.6.1.18. The easiest way to view the certificate is to use Quick Look in Finder. Note If you’re curious where these Apple-specific OIDs come from, check out the documents on the Apple PKI page. In this specific case, look at section 4.11.3 Application and Kernel Extension Code Signing Certificates of the Developer ID CPS. If the certificate does have this extension, there’s some other problem with your KEXT’s code signing. In that case, feel free to create a new thread here on DevForums with your details. If the certificate does not have this extension, there are two possible causes: Xcode might be using an out-of-date signing certificate. Re-create your Developer ID signing certificate using the developer site and see if the extension shows up there. If so, you’ll have to investigate why Xcode is not using the most up-to-date signing certificate. 
If a freshly-created Developer ID signing certificate does not have this extension, you need to apply to get your Developer ID enabled for KEXT development per the instructions on the Developer ID page. Share and Enjoy — Quinn “The Eskimo!” @ Developer Technical Support @ Apple let myEmail = "eskimo" + "1" + "@" + "apple.com" Change history: 20 Jul 2016 — First published. 28 Mar 2019 — Added a link to the Apple PKI site. Other, minor changes. 15 Mar 2022 — Fixed the formatting. Updated the section number in the Developer ID CPS. Made other minor editorial changes.
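As a scriptable alternative to inspecting the leaf certificate in Quick Look, the check for the 1.2.840.113635.100.6.1.18 extension can be sketched in a few lines of Python. This is only a rough sketch: it assumes the extracted codesign0.cer file is DER-encoded, and it searches for the encoded OID bytes rather than parsing the certificate properly, so treat a match as a hint and not as proof. The helper names `encode_oid` and `has_oid` are made up for this illustration.

```python
KEXT_OID = "1.2.840.113635.100.6.1.18"  # OID quoted in the post above


def encode_oid(dotted: str) -> bytes:
    """Return the DER content octets of an OID (without tag and length)."""
    arcs = [int(part) for part in dotted.split(".")]
    if len(arcs) < 2:
        raise ValueError("an OID needs at least two arcs")
    # DER folds the first two arcs into one value: 40 * first + second.
    values = [40 * arcs[0] + arcs[1]] + arcs[2:]
    out = bytearray()
    for value in values:
        # Base-128 digits, most significant first; every byte except the
        # last one of a value has its high bit set.
        chunk = [value & 0x7F]
        value >>= 7
        while value:
            chunk.append((value & 0x7F) | 0x80)
            value >>= 7
        out.extend(reversed(chunk))
    return bytes(out)


def has_oid(der_cert: bytes, dotted: str = KEXT_OID) -> bool:
    """Heuristic: is the encoded OID present anywhere in the DER blob?"""
    body = encode_oid(dotted)
    # 0x06 is the OBJECT IDENTIFIER tag; short-form length assumed (< 128).
    needle = bytes([0x06, len(body)]) + body
    return needle in der_cert


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python3 check_oid.py codesign0.cer
    with open(sys.argv[1], "rb") as handle:
        found = has_oid(handle.read())
    print("KEXT extension OID present" if found else "KEXT extension OID not found")
```

A byte search was chosen here to avoid third-party dependencies; a real ASN.1 parser would be the more robust option.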
0
0
6.6k
Mar ’22
Instructions for debugging recent macOS kernel versions?
Is there any recent, reasonably authoritative documentation that explains how to debug recent versions of the macOS kernel? I have found some blog posts from other users, but those are either outdated or don't work for some other reason. I am guessing kernel debugging is pretty common for developers working on macOS itself, so I'm hoping someone in this forum has working instructions for that.
2
0
45
19h
Blocking USB Devices on macOS – DriverKit or Other Recommended Approach
Hi Apple, We are working on a general USB device management solution on macOS for enterprise security. Our goal is to enforce policy-based restrictions on USB devices, such as: For USB storage devices: block mount, read, or write access. For other peripherals (e.g., USB headsets or microphones, Raspberry Pi, etc.): block usage entirely. We know that in the past a kernel extension would have been the way to go, but KEXTs have been deprecated and DriverKit is the framework now promoted in their place. At first, DriverKit looked like the right direction. However, after reviewing the documentation more closely, we noticed that using DriverKit for USB requires specific entitlements: DriverKit USB Transport – VendorID DriverKit USB Transport – VendorID and ProductID This raises a challenge: if our solution is meant to cover all types of USB devices, we would theoretically need entitlements for every VendorID/ProductID in existence. My questions are: Is DriverKit actually the right framework for this kind of general-purpose USB device control? If not, what framework or mechanism should we be looking at for enforcing these kinds of policies? We also developed an Endpoint Security product, but so far we haven’t found a relevant Endpoint Security event type that would allow us to achieve this. Any guidance on the correct technical approach would be much appreciated. Thanks in advance for your help.
6
0
77
1d
macOS 26 kernel open source?
Hi! I was wondering whether a new XNU version for macOS 26 will be published as open source. As far as I remember, the source code of previous versions was published the moment the OS was officially released, but not this time. If so, when can we expect it to be published?
1
0
48
3d
macOS 15.6.1 - BSD sendto() fails for IPv4-mapped IPv6 addresses
There appears to be some unexplained change in behaviour in the recent version of macos 15.6.1 which is causing the BSD socket sendto() syscall to no longer send the data when the source socket is bound to a IPv4-mapped IPv6 address. I have attached a trivial native code which reproduces the issue. What this reproducer does is explained as a comment on that code's main() function: // Creates a AF_INET6 datagram socket, marks it as dual socket (i.e. IPV6_V6ONLY = 0), // then binds the socket to a IPv4-mapped IPv6 address (chosen on the host where this test runs). // // The test then uses sendto() to send some bytes. For the sake of this test, it uses the same IPv4-mapped // IPv6 address as the destination address to sendto(). The test then waits for (a maximum of) 15 seconds to // receive that sent message by calling recvfrom(). // // The test passes on macos (x64 and aarch64) hosts of versions 12.x, 13.x, 14.x and 15.x upto 15.5. // Only on macos 15.6.1 and the recent macos 26, the test fails. Specifically, the first message that is // sent using sendto() is never sent (and thus the recvfrom()) times out. sendto() however returns 0, // incorrectly indicating a successful send. Interesting, if you repeat sendto() a second message from the // same bound socket to the exact same destination address, the send message is indeed correctly sent and // received immediately by the recvfrom(). It's only the first message which goes missing (the test uses // unique content in each message to be sure which exact message was received and it has been observed that // only the second message is received and the first one lost). // // Logs collected using "sudo log collect --last 2m" (after the test program returns) shows the following log // message, which seem relevant: // ... 
// default kernel cfil_hash_entry_log:6088 <CFIL: Error: sosend_reinject() failed>: // [86868 a.out] <UDP(17) out so 59faaa5dbbcef55d 127846646561221313 127846646561221313 age 0> // lport 65051 fport 65051 laddr 192.168.1.2 faddr 192.168.1.2 hash 201AAC1 // default kernel cfil_service_inject_queue:4472 CFIL: sosend() failed 22 // ... // As noted, this test passes without issues on various macosx version (12 through 15.5), both x64 and aarch64 but always fails against 15.6.1. I have been told that it also fails on the recently released macos 26 but I don't have access to such host to verify it myself. The release notes don't usually contain this level of detail, so it's hard to tell if something changed intentionally or if this is a bug. Should I report this through the feedback assistant? Attached is the source of the reproducer, run it as: clang dgramsend.c ./a.out On macos 15.6.1, you will see that it will fail to send (and thus receive) the message on first attempt but the second one passes: ... created and bound a datagram dual socket to ::ffff:192.168.1.2:65055 ::ffff:192.168.1.2:65055 sendto() ::ffff:192.168.1.2:65055 ---- Attempt 1 ---- sending greeting "hello 1" sendto() succeeded, sent 8 bytes calling recvfrom() receive timed out --------------------- ---- Attempt 2 ---- sending greeting "hello 2" sendto() succeeded, sent 8 bytes calling recvfrom() received 8 bytes: "hello 2" --------------------- TEST FAILED ... The output "log collect --last 2m" contains a related error (and this log message consistently shows up every time you run that reproducer): ... default kernel cfil_hash_entry_log:6088 <CFIL: Error: sosend_reinject() failed>: [86248 a.out] <UDP(17) out so 59faaa5dbbcef55d 127846646561221313 127846646561221313 age 0> lport 65055 fport 65055 laddr 192.168.1.2 faddr 192.168.1.2 hash 201AAC1 default kernel cfil_service_inject_queue:4472 CFIL: sosend() failed 22 ... I don't know what it means though. dgramsend.c
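The attached dgramsend.c is not reproduced in the listing, so here is a rough Python sketch of the same sequence as described in the post: a dual-stack AF_INET6 datagram socket bound to an IPv4-mapped address that sends to itself and waits for each message. It is an approximation under stated assumptions, not the author's reproducer: it binds to an ephemeral port instead of a fixed one, it sends the greeting without a trailing NUL, and the helper names `v4_mapped` and `send_and_receive` are made up here.

```python
import ipaddress
import socket


def v4_mapped(ipv4: str) -> str:
    """Return the IPv4-mapped IPv6 text form (::ffff:a.b.c.d) of an IPv4 address."""
    ipaddress.IPv4Address(ipv4)  # raises ValueError on malformed input
    return f"::ffff:{ipv4}"


def send_and_receive(ipv4: str, attempts: int = 2, timeout: float = 15.0) -> list:
    """Send `attempts` datagrams to our own bound address; report which arrive."""
    results = []
    with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as sock:
        # Dual-stack socket: allow IPv4 traffic via IPv4-mapped addresses.
        sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
        sock.bind((v4_mapped(ipv4), 0))  # port 0: let the kernel choose
        dest = sock.getsockname()        # send to the address we just bound
        sock.settimeout(timeout)
        for attempt in range(1, attempts + 1):
            message = f"hello {attempt}".encode()
            sock.sendto(message, dest)
            try:
                data, _ = sock.recvfrom(1024)
                results.append(data == message)
            except socket.timeout:
                results.append(False)
    return results


if __name__ == "__main__":
    # Replace with an IPv4 address configured on the host under test.
    print(send_and_receive("127.0.0.1"))
```

If the behaviour described in the post applies, the first entry of the returned list would be the one to watch; the sketch itself makes no claim about what any given macOS version returns.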
2
0
94
6d
Pinpointing dangling pointers in 3rd party KEXTs
I'm debugging the following kernel panic to do with my custom filesystem KEXT: panic(cpu 0 caller 0xfffffe004cae3e24): [kalloc.type.var4.128]: element modified after free (off:96, val:0x00000000ffffffff, sz:128, ptr:0xfffffe2e7c639600) My reading of this is that somewhere in my KEXT I'm holding a stale reference, 0xfffffe2e7c639600, to a 128-byte zone element, and that 0x00000000ffffffff was written at offset 96 through it after that particular chunk of memory had been released and zeroed out by the kernel. The panic itself is emitted when my KEXT requests the memory chunk that's been tampered with, via the following set of calls. zalloc_uaf_panic() __abortlike static void zalloc_uaf_panic(zone_t z, uintptr_t elem, size_t size) { ... (panic)("[%s%s]: element modified after free " "(off:%d, val:0x%016lx, sz:%d, ptr:%p)%s", zone_heap_name(z), zone_name(z), first_offs, first_bits, esize, (void *)elem, buf); ... } zalloc_validate_element() static void zalloc_validate_element( zone_t zone, vm_offset_t elem, vm_size_t size, zalloc_flags_t flags) { ... if (memcmp_zero_ptr_aligned((void *)elem, size)) { zalloc_uaf_panic(zone, elem, size); } ... } The panic is triggered if memcmp_zero_ptr_aligned(), which is implemented in assembly, detects that an n-sized chunk of memory has been written to after being freed. /* memcmp_zero_ptr_aligned() checks string s of n bytes contains all zeros. * Address and size of the string s must be pointer-aligned. * Return 0 if true, 1 otherwise. Also return 0 if n is 0. */ extern int memcmp_zero_ptr_aligned(const void *s, size_t n); Normally, one would resort to KASAN to aid with that. The KDK README states that KASAN kernels won't load on Apple Silicon. Attempting to follow the instructions given in the README for Intel-based machines does result in a failure for me on Apple Silicon. I stumbled on the Pishi project. 
But the custom boot kernel collection that gets created doesn't have any of the KEXTs that were specified to kmutil(8) via the --explicit-only flag, so it can't be instrumented in Ghidra. Which is confirmed as well by running: % kmutil inspect -B boot.kc.kasan boot kernel collection at /Users/user/boot.kc.kasan (AEB8F757-E770-8195-458D-B87CADCAB062): Extension Information: I'd appreciate any pointers on how to tackle UAFs in kernel space.
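To make the numbers in the panic string concrete, the following is a small user-space model of the "all zeros after free" check. It is not the kernel's implementation: it assumes 8-byte pointer-sized words in little-endian order, and it shows that a 32-bit store of 0xffffffff at offset 96 of a zeroed 128-byte element would produce exactly the off/val/sz values quoted above. A 64-bit store of the same value would look identical, so this narrows the search without proving the store width. The function name `first_modified_word` is made up for this sketch.

```python
import struct

WORD = 8  # assumed pointer size in bytes on a 64-bit kernel


def first_modified_word(element: bytes):
    """Return (offset, value) of the first non-zero word, or None if all zero."""
    if len(element) % WORD:
        raise ValueError("element size must be pointer-aligned")
    for offset in range(0, len(element), WORD):
        (value,) = struct.unpack_from("<Q", element, offset)
        if value:
            return offset, value
    return None


# Model a freed, zeroed 128-byte element ...
freed = bytearray(128)
# ... and a stale pointer writing a 32-bit 0xffffffff at offset 96.
struct.pack_into("<I", freed, 96, 0xFFFFFFFF)

hit = first_modified_word(bytes(freed))
print(f"off:{hit[0]}, val:0x{hit[1]:016x}, sz:{len(freed)}")
```

Under these assumptions, one avenue is to look through the KEXT's structures that are allocated from a 128-byte zone for a 32-bit field at offset 96 that is ever set to -1 or UINT32_MAX.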
3
0
161
2w
Network framework crashes on fork
Hello, I have a Cocoa application from which I fork a new process (a helper of sorts), and it crashes on fork due to some cleanup code, probably registered with pthread_atfork(), in the Network framework. This is the crash from the child process: Application Specific Information: *** multi-threaded process forked *** BUG IN CLIENT OF LIBPLATFORM: os_unfair_lock is corrupt Abort Cause 258 crashed on child side of fork pre-exec Thread 0 Crashed:: Dispatch queue: com.apple.main-thread 0 libsystem_platform.dylib 0x194551238 _os_unfair_lock_corruption_abort + 88 1 libsystem_platform.dylib 0x19454c788 _os_unfair_lock_lock_slow + 332 2 Network 0x19b1b4af0 nw_path_shared_necp_fd + 124 3 Network 0x19b1b4698 -[NWConcrete_nw_path_evaluator dealloc] + 72 4 Network 0x19af9d970 __nw_dictionary_dispose_block_invoke + 32 5 libxpc.dylib 0x194260210 _xpc_dictionary_apply_apply + 68 6 libxpc.dylib 0x19425c9a0 _xpc_dictionary_apply_node_f + 156 7 libxpc.dylib 0x1942600e8 xpc_dictionary_apply + 136 8 Network 0x19acd5210 -[OS_nw_dictionary dealloc] + 112 9 Network 0x19b1beb08 nw_path_release_globals + 120 10 Network 0x19b3d4fa0 nw_settings_child_has_forked() + 312 11 libsystem_pthread.dylib 0x100c8f7c8 _pthread_atfork_child_handlers + 76 12 libsystem_c.dylib 0x1943d9944 fork + 112 (...) I'm trying to create a child process with boost::process::child, which basically does just a fork() followed by execv(), and I do it before -[NSApplication run] is called. Is this a known bug or expected behavior that I've run into? Also, what is the correct way to spawn child processes in Cocoa applications? As far as I understand, basically all the available APIs (e.g. POSIX, NSTask) should be more or less the same thing, calling the same syscalls. So forking the process early, before the main run loop starts, and not starting another NSApplication in the forked child should be OK ...or not?
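One commonly suggested way to avoid running arbitrary code in the child between fork() and exec is to skip that window altogether with posix_spawn(). The sketch below uses Python's `os.posix_spawn` wrapper purely to show the call shape; in a Cocoa or C++ application the equivalent would be the C function posix_spawn(3). Whether this avoids the specific Network framework crash above is an assumption that would need testing, and `spawn_and_wait` is a name invented for this example.

```python
import os
import sys


def spawn_and_wait(argv, env=None) -> int:
    """Start argv[0] with posix_spawn and return the child's exit code."""
    environment = dict(os.environ) if env is None else env
    # argv[0] must be a path to the executable; no shell is involved.
    pid = os.posix_spawn(argv[0], argv, environment)
    _, status = os.waitpid(pid, 0)
    # Converts the raw wait status into an exit code (Python 3.9+).
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    code = spawn_and_wait([sys.executable, "-c", "print('hello from the child')"])
    print("child exited with", code)
```

The design point is that the parent never executes its own code in a half-initialised copy of itself, which is where atfork handlers and inherited locks tend to cause trouble in multi-threaded processes.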
4
0
2.2k
3w
Applications stuck in UDP sendto syscall
Hi, We’re seeing our build system (Gradle) get stuck in sendto system calls while trying to communicate with other processes via the local interface over UDP. To the end user it appears that the build is stuck or they will receive an error “Timeout waiting to lock ***. It is currently in use by another Gradle instance”. But when the process is sampled/profiled, we can see one of the threads is stuck in a sendto system call. The only way to resolve the issue is to kill -s KILL <pid> the stuck Gradle process. A part of the JVM level stack trace: "jar transforms Thread 12" #90 prio=5 os_prio=31 cpu=0.85ms elapsed=1257.67s tid=0x000000012e6cd400 nid=0x10f03 runnable [0x0000000332f0d000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.DatagramChannelImpl.send0(java.base@17.0.10/Native Method) at sun.nio.ch.DatagramChannelImpl.sendFromNativeBuffer(java.base@17.0.10/DatagramChannelImpl.java:901) at sun.nio.ch.DatagramChannelImpl.send(java.base@17.0.10/DatagramChannelImpl.java:863) at sun.nio.ch.DatagramChannelImpl.send(java.base@17.0.10/DatagramChannelImpl.java:821) at sun.nio.ch.DatagramChannelImpl.blockingSend(java.base@17.0.10/DatagramChannelImpl.java:853) at sun.nio.ch.DatagramSocketAdaptor.send(java.base@17.0.10/DatagramSocketAdaptor.java:218) at java.net.DatagramSocket.send(java.base@17.0.10/DatagramSocket.java:664) at org.gradle.cache.internal.locklistener.FileLockCommunicator.pingOwner(FileLockCommunicator.java:61) at org.gradle.cache.internal.locklistener.DefaultFileLockContentionHandler.maybePingOwner(DefaultFileLockContentionHandler.java:203) at org.gradle.cache.internal.DefaultFileLockManager$DefaultFileLock$1.run(DefaultFileLockManager.java:380) at org.gradle.internal.io.ExponentialBackoff.retryUntil(ExponentialBackoff.java:72) at org.gradle.cache.internal.DefaultFileLockManager$DefaultFileLock.lockStateRegion(DefaultFileLockManager.java:362) at org.gradle.cache.internal.DefaultFileLockManager$DefaultFileLock.lock(DefaultFileLockManager.java:293) at 
org.gradle.cache.internal.DefaultFileLockManager$DefaultFileLock.<init>(DefaultFileLockManager.java:164) at org.gradle.cache.internal.DefaultFileLockManager.lock(DefaultFileLockManager.java:110) at org.gradle.cache.internal.LockOnDemandCrossProcessCacheAccess.incrementLockCount(LockOnDemandCrossProcessCacheAccess.java:106) at org.gradle.cache.internal.LockOnDemandCrossProcessCacheAccess.acquireFileLock(LockOnDemandCrossProcessCacheAccess.java:168) at org.gradle.cache.internal.CrossProcessSynchronizingCache.put(CrossProcessSynchronizingCache.java:57) at org.gradle.api.internal.changedetection.state.DefaultFileAccessTimeJournal.setLastAccessTime(DefaultFileAccessTimeJournal.java:85) at org.gradle.internal.file.impl.SingleDepthFileAccessTracker.markAccessed(SingleDepthFileAccessTracker.java:51) at org.gradle.internal.classpath.DefaultCachedClasspathTransformer.markAccessed(DefaultCachedClasspathTransformer.java:209) at org.gradle.internal.classpath.DefaultCachedClasspathTransformer.transformFile(DefaultCachedClasspathTransformer.java:194) at org.gradle.internal.classpath.DefaultCachedClasspathTransformer.lambda$cachedFile$6(DefaultCachedClasspathTransformer.java:186) at org.gradle.internal.classpath.DefaultCachedClasspathTransformer$$Lambda$368/0x0000007001393a78.call(Unknown Source) at org.gradle.internal.UncheckedException.unchecked(UncheckedException.java:74) at org.gradle.internal.classpath.DefaultCachedClasspathTransformer.lambda$transformAll$8(DefaultCachedClasspathTransformer.java:233) at org.gradle.internal.classpath.DefaultCachedClasspathTransformer$$Lambda$372/0x0000007001398470.call(Unknown Source) at java.util.concurrent.FutureTask.run(java.base@17.0.10/FutureTask.java:264) at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:64) at org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:49) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@17.0.10/ThreadPoolExecutor.java:1136) at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@17.0.10/ThreadPoolExecutor.java:635) at java.lang.Thread.run(java.base@17.0.10/Thread.java:840) A part of the process sample: 2097 Thread_3879661: Java: jar transforms Thread 12 + 2097 thread_start (in libsystem_pthread.dylib) + 8 [0x18c42eb80] ...removed for brevity... + 2097 Java_sun_nio_ch_DatagramChannelImpl_send0 (in libnio.dylib) + 84 [0x102ef371c] + 2097 __sendto (in libsystem_kernel.dylib) + 8 [0x18c3f612c] We have observed the following system logs around the time the issue manifests: 2025-08-26 22:03:23.280255+0100 0x3b2c00 Default 0x0 0 0 kernel: cfil_hash_entry_log:6088 <CFIL: Error: sosend_reinject() failed>: [4628 java] <UDP(17) in so 9e934ceda1c13379 50826943645358435 50826943645358435 ag> 2025-08-26 22:03:23.280267+0100 0x3b2c00 Default 0x0 0 0 kernel: cfil_service_inject_queue:4472 CFIL: sosend() failed 22 The issue seems to be rooted in the built-in Application Firewall, as disabling it “fixes” the issue. It doesn’t seem to matter that the process is on the “allow” list. We’re using Gradle 7.6.4, 8.0.2 and 8.14.1 in various repositories, so the version doesn’t seem to matter, neither does which repo we use. The most reliable way to reproduce is to run two Gradle builds at the same time or very quickly after each other. We would really appreciate a fix for this as it really negatively affects the developer experience. I've raised FB19916240 for this. Many thanks,
1
1
223
4w
VNOP_MONITOR+vnode_notify() operation details
After perusing the sources of Apple's SMB and NFS clients' implementation of VNOP_MONITOR, my understanding of how VNOP_MONITOR+vnode_notify() operate is as follows: A user-space process advertises an interest in monitoring a file or directory via kqueue(2)/kevent(2). VFS calls the filesystem's implementation of VNOP_MONITOR. VNOP_MONITOR forwards the commencing or terminating of monitoring events request to the filesystem server. Network filesystem client nodes call vnode_notify() to notify the underlying VFS of a filesystem event, e.g. file/directory creation/removal, etc. What I'm still vague about is how does the server communicate back to client nodes that an event of interest has occurred? I'd appreciate being enlightened on the operation of `VNOP_MONITOR+vnode_notify()' in a network filesystem setting.
1
0
88
Aug ’25
Support for Multi-Homed IPv6 Networks, esp. RFC 8028
Hi everyone, I’m running a dual-homed IPv6-mostly LAN where two on-link routers advertise distinct global Provider-Assigned prefixes (one per ISP). On Linux, the host stack appears to follow RFC 8028. It keeps one default route per prefix, and packets appear to leave through a router that recognises their source address and pass ISP BCP 38 (https://datatracker.ietf.org/doc/bcp38/) checks. On macOS Sequoia, I'm only seeing a single un-scoped default route. As a result, traffic sourced from prefix B often exits via router A and is dropped upstream. Questions: Is the single-default-per-interface model in macOS an intentional design choice or simply legacy behaviour that has not yet been updated to RFC 8028? Does the kernel perform any hidden next-hop selection that isn’t reflected in netstat -rn output? Are there any road-map items for fully adopting RFC 8028 in macOS? As a bonus, I'd be very interested in any info you might be able to provide on the status of implementation/support for https://datatracker.ietf.org/doc/html/rfc8978 (Reaction of IPv6 Stateless Address Autoconfiguration (SLAAC) to Flash-Renumbering Events).
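The behaviour the post expects from RFC 8028 can be illustrated with a toy model: given the prefixes each on-link router advertises, pick the router whose prefix covers the packet's source address. This is only a sketch of the selection rule, not of any operating system's routing code; the router addresses and documentation prefixes are invented, and the longest-prefix tie-break is an assumption added for the example.

```python
import ipaddress


def pick_next_hop(source: str, routers: dict):
    """Return the router that advertised the prefix containing `source`.

    `routers` maps a router's link-local address to the list of prefixes it
    advertises. Returns None when no advertised prefix covers the source.
    """
    src = ipaddress.IPv6Address(source)
    best = None  # (router, prefix length)
    for router, prefixes in routers.items():
        for prefix in prefixes:
            network = ipaddress.IPv6Network(prefix)
            if src in network and (best is None or network.prefixlen > best[1]):
                best = (router, network.prefixlen)
    return best[0] if best else None


# Hypothetical dual-homed LAN: one prefix per ISP, one router per prefix.
routers = {
    "fe80::a": ["2001:db8:a::/64"],  # router A, ISP A
    "fe80::b": ["2001:db8:b::/64"],  # router B, ISP B
}

print(pick_next_hop("2001:db8:b::1234", routers))
```

With a single unscoped default route, the source address plays no part in choosing the router, which matches the symptom described: traffic sourced from prefix B leaving via router A.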
2
0
62
Jul ’25
No KDKs available for macOS 26.0 Developer Beta 2 and later
As of now, there is no Kernel Debug Kit (KDK) available for macOS 26.0 Developer Betas after the first build. Kernel Debug Kits are crucial for understanding panics and other bugs within custom Kernel Extensions. Without the KDK for the corresponding macOS version, tools like kmutil fail to recognize a KDK and certain functions are disabled. Additionally, as far as I am aware, a KDK for one build of macOS isn't able to be used on a differing build. Especially since this is a developer beta, where developers are updating their software to function with the latest versions of macOS, I'd expect a KDK to be available for more than one build.
2
0
307
Jul ’25
Hardlinks reported as non-existing on macOS Sequoia for 3rd party FS
After creating a hardlink on a distributed filesystem of my own via: % ln f.txt hlf.txt neither the original file, f.txt, nor the hardlink, hlf.txt, is immediately accessible, e.g. cat(1) fails with ENOENT. A short time later though, both the original file and the hardlink are accessible. Both files can be stat(1)ed though, which confirms that vnop_getattr returns success for both files. Dtruss(1) indicates it's the open(2) syscall that fails: % sudo dtruss -f cat hlf.txt 2038/0x4f68: open("hlf.txt\0", 0x0, 0x0) = -1 Err#2 ;ENOENT 2038/0x4f68: write_nocancel(0x2, "cat: \0", 0x5) = 5 0 2038/0x4f68: write_nocancel(0x2, "hlf.txt\0", 0x7) = 7 0 2038/0x4f68: write_nocancel(0x2, ": \0", 0x2) = 2 0 2038/0x4f68: write_nocancel(0x2, "No such file or directory\n\0", 0x1A) = 26 0 Dtrace(1)ing my KEXT no longer works on macOS Sequoia, so based on the diagnostic print statements I inserted into my KEXT, the following sequence of calls is observed: vnop_lookup(hlf.txt) -> EJUSTRETURN ;ln(1) vnop_link(hlf.txt) -> KERN_SUCCESS ;ln(1) vnop_lookup(hlf.txt) -> KERN_SUCCESS ;cat(1) vnop_open(/) ; I expected to see vnop_open(hlf.txt) here instead of the parent directory. Internally, hardlinks are created in vnop_link via a call to vnode_setmultipath with cache_purge_negatives called on the destination directory. On macOS Monterey for example, where the same code does result in hardlinks being accessible, the following calls are made: vnop_lookup(hlf.txt) -> EJUSTRETURN ;ln(1) vnop_link(hlf.txt) -> KERN_SUCCESS ;ln(1) vnop_lookup(hlf.txt) -> KERN_SUCCESS ;cat(1) vnop_open(hlf.txt) -> KERN_SUCCESS ;cat(1) Not sure how else to debug this. Perusing the kernel sources for uses of VISHARDLINK, VNOP_LINK and vnode_setmultipath call sites did not clear things up for me. Any pointers would be greatly appreciated.
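The manual ln/stat/cat sequence above can be turned into a repeatable user-space check, which may help when comparing macOS versions or bisecting KEXT changes. The sketch below creates a file and a hardlink in a given directory, stats both, and then tries to open both, recording the errno if open fails. It only observes the symptom from user space and says nothing about the cause inside the VFS layer; the function name and the report layout are invented for this example.

```python
import os
import tempfile


def check_hardlink(directory: str) -> dict:
    """Create f.txt and a hardlink hlf.txt, then stat and open both.

    Returns {name: (inode, link_count, content_or_None, errno_or_None)}.
    """
    source = os.path.join(directory, "f.txt")
    link = os.path.join(directory, "hlf.txt")
    with open(source, "w") as handle:
        handle.write("payload\n")
    os.link(source, link)  # same operation as `ln f.txt hlf.txt`

    report = {}
    for path in (source, link):
        info = os.stat(path)
        try:
            with open(path) as handle:
                content, error = handle.read(), None
        except OSError as exc:  # e.g. ENOENT, as in the dtruss output
            content, error = None, exc.errno
        report[os.path.basename(path)] = (info.st_ino, info.st_nlink, content, error)
    return report


if __name__ == "__main__":
    # Point this at a directory on the filesystem under test instead.
    with tempfile.TemporaryDirectory() as scratch:
        for name, result in check_hardlink(scratch).items():
            print(name, result)
```

On a filesystem that behaves as expected, both entries report the same inode, a link count of 2, and no errno.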
3
0
223
Jul ’25
Kext loads well after launchd and early os_log entries rarely appear in unified log
Is there a way to ensure a kernel extension in the Auxiliary Kernel Collection loads (and runs its start routines) before launchd? I'm emitting logs via os_log_t created with an os_log_create (custom subsystem/category) in both my KMOD's start function and the IOService::start() function. Those messages-- which both say "I've been run"-- inconsistently show up in log show --predicate 'subsystem == "com.bluefalconhd.pandora"' --last boot, which makes me think they are running very early. However, I also record timestamps (using mach_absolute_time, etc.) and expose them to user space through an IOExternalMethod. The results (for the most recent boot): hayes@fortis Pandora/tests main % build/pdtest Pandora Metadata: kmod_start_time: Time: 2025-07-22 14:11:32.233 Mach time: 245612546 Nanos since boot: 10233856083 (10.23 seconds) io_service_start_time: Time: 2025-07-22 14:11:32.233 Mach time: 245613641 Nanos since boot: 10233901708 (10.23 seconds) user_client_init_time: Time: 2025-07-22 14:21:42.561 Mach time: 14893478355 Nanos since boot: 620561598125 (620.56 seconds) hayes@fortis Pandora/tests main % ps -p 1 -o lstart= Tue Jul 22 14:11:27 2025 Everything in the kernel extension appears to be loading after launchd (PID 1) starts. Also, the kext isn't doing anything crazy which could cause that kind of delay. 
For reference, here's the Info.plist: <?xml version="1.0" encoding="UTF-8" ?> <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"> <plist version="1.0"> <dict> <key>CFBundleExecutable</key> <string>Pandora</string> <key>CFBundleIdentifier</key> <string>com.bluefalconhd.Pandora</string> <key>CFBundleName</key> <string>Pandora</string> <key>CFBundlePackageType</key> <string>KEXT</string> <key>CFBundleVersion</key> <string>1.0.7</string> <key>IOKitPersonalities</key> <dict> <key>Pandora</key> <dict> <key>CFBundleIdentifier</key> <string>com.bluefalconhd.Pandora</string> <key>IOClass</key> <string>Pandora</string> <key>IOMatchCategory</key> <string>Pandora</string> <key>IOProviderClass</key> <string>IOResources</string> <key>IOResourceMatch</key> <string>IOKit</string> <key>IOUserClientClass</key> <string>PandoraUserClient</string> </dict> </dict> <key>OSBundleLibraries</key> <dict> <key>com.apple.kpi.dsep</key> <string>24.2.0</string> <key>com.apple.kpi.iokit</key> <string>24.2.0</string> <key>com.apple.kpi.libkern</key> <string>24.2.0</string> <key>com.apple.kpi.mach</key> <string>24.2.0</string> </dict> </dict> </plist> My questions are: A. Why don't the early logs (from KMOD's start function and IOService::start) consistently appear in the unified log, while logs later in IOExternalMethods do? B. How can I force this kext to load earlier-- ideally before launchd? Thanks in advance for any guidance!
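The timestamps in the pdtest output can be cross-checked with a little arithmetic. The sketch below assumes a Mach timebase of 125/3 nanoseconds per tick; that ratio is inferred from the posted numbers (it reproduces all three "Nanos since boot" values), not read from `mach_timebase_info` on the machine in question. It also treats the `ps` start time for PID 1 as exact, although it only has one-second resolution.

```python
from datetime import datetime, timedelta
from fractions import Fraction

# Assumed timebase (numer/denom), inferred from the values in the post.
TIMEBASE = Fraction(125, 3)


def ticks_to_ns(ticks: int) -> int:
    """Convert mach_absolute_time ticks to whole nanoseconds."""
    return int(ticks * TIMEBASE)


kmod_start_ticks = 245612546
kmod_start_wall = datetime(2025, 7, 22, 14, 11, 32, 233000)
launchd_start_wall = datetime(2025, 7, 22, 14, 11, 27)  # from `ps -p 1 -o lstart=`

kmod_ns = ticks_to_ns(kmod_start_ticks)
boot_wall = kmod_start_wall - timedelta(microseconds=kmod_ns // 1000)
gap_after_launchd = (kmod_start_wall - launchd_start_wall).total_seconds()

print("kmod start, ns since boot:", kmod_ns)
print("estimated boot time:", boot_wall.time())
print("kmod start after launchd (s):", gap_after_launchd)
```

Under these assumptions the kext's start routine runs roughly five seconds after launchd's recorded start time, which is consistent with the observation in the post that the extension loads after PID 1.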
0
0
65
Jul ’25
FSKit caching by kernel and performance
I've run into some performance issues while developing my read-only filesystem using FSKit. Test setup: enumerateDirectory returns two hardcoded items; compiled with the release config; 3000 readdirSync calls are made from Node.js; macOS 15.5 (24F74). I see that the getdirentries syscall takes 121 µs on average. Because all other variables are minimised, it seems like this is FSKit<->kernel overhead. This in itself seems like a big number, though I need to compare it with FUSE to be sure. But what FUSE has and FSKit doesn't seem to have (I checked every page in the FSKit docs) is kernel caching. FUSE supports caching of: lookups (entry_timeout), negative lookups (entry_timeout), attributes (attr_timeout), readdir (via opendir cache_readdir and keep_cache), and read and write ops, but that's another topic. And as far as I know it works for both read-only and read-write file systems, because the kernel can assume (if the client opts in) that the cache is valid until the kernel performs write operations on the corresponding inodes (create, setattr, write, etc.). My questions are: Is 100+ µs a reasonable overhead for FSKit? Is there any way to have the kernel do caching? If not currently, are there any plans to implement it? Also, additional performance optimisation could be achieved by providing a lower-level API in which we can operate on raw inodes (UInt64); this would eliminate the overhead of storing, removing and retrieving FSItems in a hash map.
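For readers unfamiliar with the FUSE options mentioned above, the idea behind entry_timeout and attr_timeout can be modelled as a small time-to-live cache: a result, including a negative one, is reused until its timeout expires and only then fetched again from the filesystem. The class below is a generic illustration of that idea, not FUSE's or FSKit's implementation; `TTLCache`, its `loader` callback and the injectable `clock` are all invented for this sketch. For scale, the 3000 calls at about 121 µs each in the post add up to roughly 363 ms.

```python
import time


class TTLCache:
    """Cache loader results per key for `timeout` seconds."""

    def __init__(self, timeout: float, clock=time.monotonic):
        self._timeout = timeout
        self._clock = clock          # injectable so tests can control time
        self._entries = {}           # key -> (expiry, value)
        self.hits = 0
        self.misses = 0

    def get(self, key, loader):
        """Return the cached value for key, calling loader(key) on a miss."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            self.hits += 1
            return entry[1]
        self.misses += 1
        value = loader(key)          # None models a negative lookup
        self._entries[key] = (now + self._timeout, value)
        return value

    def invalidate(self, key) -> None:
        """Drop an entry, e.g. after a write to the corresponding inode."""
        self._entries.pop(key, None)
```

In this model a read-only filesystem never needs to call `invalidate`, which is why long timeouts are attractive in that case.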
2
1
173
Jul ’25
OpenDirectory module causes bootloop (kernel panic) on restart
With macOS 15 and the removal of DSPlugin support, we searched for an alternative method to inject users/groups into the system dynamically. We tried to write an OpenDirectory XPC-based module based on the documentation and Xcode template, which can be found here: https://developer.apple.com/library/archive/releasenotes/NetworkingInternetWeb/RN_OpenDirectory/chapters/chapter-1.xhtml.html It is more or less working until I restart the computer: then macOS kernel panics 90% of the time. When the panic occurs, our code does not seem to get run at all; I only see my logs from the beginning of main() when the machine starts successfully. I have also verified this by logging to a file. I also tried replacing the binary with e.g. a shell script or an empty main function that just does "return 0"; that also triggers the panic. But if I remove my executable (from /Library/OpenDirectory/Modules/com.quest.vas.xpc/Contents/MacOS/com.quest.vas), that always saves the day and macOS boots just fine. Do you have an idea what can cause this behavior? I can share the boot logs for the boot loops and/or the panic file. Is there any other way (other than an OpenDirectory module) to inject users/groups into the system dynamically nowadays? (MDM does not seem to be a viable option for us.)
3
0
153
Jul ’25
FSKit questions and clarifications
I work on EdenFS, an open-source Virtual Filesystem that runs on macOS, Linux, and Windows. My team is very interested in using FSKit as the basis for EdenFS on macOS, but have found the documentation to be lacking and contains some mixed messaging on the future of FSKit. Below are a few questions that don’t seem to be fully covered by the current documentation: Does FSKit support process attribution? Each FUSE request provides a requester Process ID (and other information) through the fuse_in_header structure. Does FSKit pass similar information along for each request? Does the reclaimItem API function similarly to FUSE’s forget operation? If not, what are the differences? See #1 below for why forget/reclaimItem matters to us. Is Apple committed to releasing and supporting FSKit? Is there any timeline for release that we can plan around? Does FSKit have known performance/scalability limitations? We provide alternative methods that clients can use to make bulk requests to EdenFS, but some clients will necessarily be unable to use those and stress the default filesystem APIs. Throughput (on the order of tens of thousands of filesystem requests per minute) and request size are the main concerns, followed closely by directory size restrictions. Why we’re interested in FSKit As mentioned above, my team supports EdenFS on 3 platforms. On Linux, we utilize FUSE; on Windows, we utilize ProjectedFS; and on macOS, we’ve utilized a few different solutions in the past. We first utilized the macFUSE kext, which was great while it lasted. Due to (understandable) changes in supporting kernel extensions, we were forced to move to NFS version 3. NFS has been lackluster in comparison (and our initial investigations show that NFS version 4(.2) would be similar). We have had numerous scalability and reliability issues, some listed below: NFS does not provide a forget API similar to FUSE. 
EdenFS is forced to remember all file handles that have been loaded because the kernel never informs us when all references to that file handle have been dropped. We can hackily infer that a file handle should never be referenced again in some cases, but a large number of file handles end up being remembered forever. Many of our algorithms scale with the number of file handles that Eden has to consider, and therefore performance issues are inevitable after some time. NFS does not provide information about clients (requesters). We cannot tell which processes are sending EdenFS requests. This attribution is important due to issue #1. We are forced to work with tool owners to modify their applications to be VFS-friendly. If we can’t track down which tools are behaving poorly, they will continue to load excess file handles and cause performance issues. NFS “Server connections interrupted:” dialog during heavy load. Under heavy load, either EdenFS or system-wide, our users experience this dialog pop-up and are confused as to how they should respond (Ignore or Disconnect All). They become blocked in their work, and will be further blocked if they click “Disconnect All” as that unmounts their EdenFS mount. This forces them to restart EdenFS or reboot their laptop to remediate the issue. The above issues make us extremely motivated to use FSKit and partner with Apple to flesh out the final version of the FSKit API. Our use case likely mirrors what other user-space filesystems will be looking for in the FSKit API (albeit at a larger scale than most), and we’re willing to collaborate to work out any issues in the current FSKit offerings.
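The role of FUSE's forget operation referred to above can be shown with a minimal bookkeeping model: every successful lookup reply increments an inode's lookup count, and a later forget carrying a count decrements it, so the filesystem may drop its state for the inode once the count reaches zero. This sketch models only that counting rule; it makes no claim about how FSKit's reclaimItem behaves, which is exactly the open question in the post. The `InodeTable` class is invented for the illustration.

```python
class InodeTable:
    """Track per-inode lookup counts the way a FUSE-style server would."""

    def __init__(self):
        self._nlookup = {}  # inode number -> outstanding lookup count

    def lookup(self, ino: int) -> None:
        """Record one lookup reply handed to the kernel for this inode."""
        self._nlookup[ino] = self._nlookup.get(ino, 0) + 1

    def forget(self, ino: int, nlookup: int) -> bool:
        """Apply a forget; return True if the inode's state can be dropped."""
        remaining = self._nlookup.get(ino, 0) - nlookup
        if remaining > 0:
            self._nlookup[ino] = remaining
            return False
        self._nlookup.pop(ino, None)
        return True

    def __len__(self) -> int:
        return len(self._nlookup)


table = InodeTable()
table.lookup(42)
table.lookup(42)
print(table.forget(42, 1), len(table))  # still referenced once
print(table.forget(42, 1), len(table))  # count reached zero, entry dropped
```

Without such a signal, as with the NFS setup described, the table can only grow, which is the scaling problem the post attributes to remembered file handles.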
4
0
1.6k
Jun ’25