Develop kernel-resident device drivers and kernel extensions using Kernel.

Posts under Kernel tag

49 Posts

Post

Replies

Boosts

Views

Activity

Subsequent expansion of same archive fails due to name collision
Extracting an archive into the same directory on my custom filesystem more than once fails with the following message: Unable to finish expanding 'misc.tar.xz' into 'extractme'. Could not move 'misc' into destination directory. I.e. initial extraction succeeds with archive contents extracted into extractme/misc. Subsequent extraction fails to rename extractme.sb-db71cd27-lFjN1f/misc to extractme/misc 2. This behaviour is observed on macOS Monterey and Ventura. It does work as expected on macOS Sonoma though. Dtrace(1)-ing the archive being extracted over smbfs results in the following sequence of calls being made: 2 -> smbfs_vnop_lookup AUHelperService-2163 -> extractme/misc 2 nameiop:0 2 <- smbfs_vnop_lookup AUHelperService-2163 -> extractme/misc 2 -> 2 ;ENOENT 2 -> smbfs_vnop_lookup AUHelperService-2163 -> extractme.sb-db71cd27-lFjN1f/misc nameiop:0x2 ;DELETE 2 <- smbfs_vnop_lookup AUHelperService-2163 -> extractme.sb-db71cd27-lFjN1f/misc -> 0 2 -> smbfs_vnop_lookup AUHelperService-2163 -> extractme/misc 2 nameiop:0x3 ;RENAME 2 <- smbfs_vnop_lookup AUHelperService-2163 -> extractme/misc 2 -> EJUSTRETURN 1 -> smbfs_vnop_rename AUHelperService-2163 -> extractme.sb-db71cd27-lFjN1f/misc -> extractme/nil 2 <- smbfs_vnop_rename AUHelperService-2163 -> extractme.sb-db71cd27-lFjN1f/misc -> extractme/nil -> 0 2 -> smbfs_vnop_lookup AUHelperService-2163 -> TheRooT/extractme/misc 2 nameiop:0 3 <- smbfs_vnop_lookup AUHelperService-2163 -> TheRooT/extractme/misc 2 -> 0 ;Successful lookup What I don't understand is what causes vnop_lookup to be called for misc to be removed from the temporary directory and renamed into 'misc 2' and placed in the destination directory, 'extractme' via vnop_rename? I had a look at smbfs_vnop_lookup and rename and didn't see anything that would cause 'misc 2' to come into being. Based on the output of the dtrace(1) script running on my custom filesystem, there are no vnop_lookup and vnop_rename calls being made to remove the 'misc' directory from the temporary directory and to rename it to 'misc 2' and place it in the destination directory at extractme. Archive extraction proceeds no further after extracting the archive contents into the temporary directory. What am I missing?
1
0
613
Oct ’24
How to enable PCIDriverKit Bus Leader? (and Memory Space enable?)
I am porting a working kernel extension IOKit driver to a DriverKit system extension. Our device is a PCI device accessed through Thunderbolt. The change from IOPCIFamily to PCIDriverKit has some differences in approach, though. Namely, in IOKit / IOPCIFamily, this was the correct way to become Bus Leader: mPCIDevice->setBusLeadEnable(true); // setBusMasterEnable(..) deprecated in OS 12.4 but now, PCIDriverKit's IOPCIDevice does not have that function. Instead I am doing the following: // Set Bus Leader and Memory Space enable uint16_t commandRegister = 0; ivars->mPCIDevice->ConfigurationRead16(kIOPCIConfigurationOffsetCommand, &commandRegister); commandRegister |= (kIOPCICommandBusLead | kIOPCICommandMemorySpace); ivars->mPCIDevice->ConfigurationWrite16(kIOPCIConfigurationOffsetCommand, commandRegister); But I am not convinced this is working (I am still experiencing unexpected errors when attempting to DMA from our device, using the same steps that work for the kernel extension). The only hint I can find in the online documentation is here, which reads: Note The endpoint driver is responsible for enabling the Memory Space Enable and Bus Master Enable settings each time it configures the PCI device. When a crash occurs, or when the system unloads your driver, the system disables these features. ...but that does not state directly how to enable bus leader status. What is the "PCIDriverKit approved" way to become bus leader? Is there a way to verify/confirm that a device is bus leader? (This would be helpful to prove that bus leadership is not the issue for DMA errors, as well as to confirm that bus leadership was granted). Thanks in advance!
1
0
702
Oct ’24
Reasonable time for fix to easy-to-reproduce kernel panic?
Since I haven't heard so much as a peep from Apple on this, I thought I'd take a poll here on how long I could expect an easily reproducible (albeit possibly obscure) kernel panic to be fixed. I was under the impression that kernel panics were a big deal but it's been almost 2 months since I updated from macOS 14 to macOS 15.0 dev beta 7 / public beta 5 when I originally came across and reported a panic triggered while playing StarCraft II. I've been able to consistently trigger panics playing certain (maybe all) Co-op maps in SC2 and since my first report Aug 22, I've filed 8 additional bug reports, each automatically generated after hitting yet another panic. (I'm not sure exactly who is able to view these but for what it's worth, these are the reports I've filed so far: FB14886510, FB14905773, FB14960435, FB15304609, FB15391195, FB15467943, FB15468127, FB15491485, FB15491684.) A few other people have reported the issue to SC2's developer, Blizzard, and apparently Blizzard has acknowledged they're aware of the problem so it's safe to rule out the possibility of a hardware defect or other issue specific only to my computer. The logs point the blame at the AppleDCP driver, although I suppose the problem could technically be in the DCP firmware instead. Regardless, Apple's code is clearly at fault here. I'll admit the importance of a video game isn't exactly like keeping the power on at a hospital but I don't know why it would be deemed particularly unimportant either. At 53 days in, am I wrong to expect this to have been fixed by now or is Apple really being that slow?
1
1
704
Oct ’24
Kernel Development Kit Missing
Hello, It seems like the Kernel Debug Kit for macOS 15.0.1 (24A348) is missing from the list of downloads at developer.apple.com. It would be great if you could add them to the list of available downloads. When trying to rebuild the kernel it fails with the following error message: Error Domain=KMErrorDomain Code=34 "Missing Developer Kit: As of macOS 13.0, you will need to install a KDK matching your build 24A348 to rebuild kernel collections." UserInfo={NSLocalizedDescription=Missing Developer Kit: As of macOS 13.0, you will need to install a KDK matching your build 24A348 to rebuild kernel collections.} But my macOS version is 15.0.1. Is there a workaround for this?
1
0
916
Oct ’24
What are the locking rules for VFS and VNOP operations?
I'm observing all sorts of race conditions occurring in various VNOPs my custom filesystem implements. I'm inclined to attribute this to my implementation not following the locking rules expected by the system of a 3rd party filesystem as well as it should. I've looked at how locking is done in Apple's own implementation of Samba and NFS clients. The Samba client uses read/write locks to protect its node from data races. While the NFS client uses mutex locks for the same purpose. I realised that I don't have a clear model in my head of how locking should be done properly. Thus my question, what are the locking rules for VFS and VNOP operations? Thanks.
4
0
748
Oct ’24
vnop_lookup returning ENOENT aborts rm(1)
When recursively removing a directory with a large number of entries that resides on my custom filesystem, rm(1) aborts with ENOENT. % rm -Rv /Volumes/myfs/linux-kernel/linux-6.10.6 [...] /Volumes/myfs/linux-kernel/linux-6.10.6/include/drm/bridge/aux-bridge.h /Volumes/myfs/linux-kernel/linux-6.10.6/include/drm/bridge/dw_hdmi.h rm: fts_read: No such file or directory I'm observing the following sequence of calls being made. 2024-09-17 17:58:25 vnops.c:281 myfs_vnop_lookup: rm-936 -> dw_hdmi.h ;initial lookup call 2024-09-17 17:58:25 vnops.c:315 myfs_vnop_lookup: -> cache_lookup(dw_hdmi.h) 2024-09-17 17:58:25 vnops.c:317 myfs_vnop_lookup: <- cache_lookup(dw_hdmi.h) -> 0 ;cache miss 2024-09-17 17:58:25 rpc.c:431 myfsLookup: rm-936 -> dw_hdmi.h ;do remote lookup 2024-09-17 17:58:25 rpc.c:500 myfsLookup: -> myfs_lookup_rpc(dw_hdmi.h) 2024-09-17 17:58:25 rpc.c:502 myfsLookup: <- myfs_lookup_rpc(dw_hdmi.h) -> 0 ;file found and added to vfs cache 2024-09-17 17:58:25 vnops.c:281 myfs_vnop_lookup: rm-936 -> dw_hdmi.h ;subsequent lookup call 2024-09-17 17:58:25 vnops.c:315 myfs_vnop_lookup: -> cache_lookup(dw_hdmi.h) 2024-09-17 17:58:25 vnops.c:317 myfs_vnop_lookup: <- cache_lookup(dw_hdmi.h) -> -1 ;cache hit 2024-09-17 17:58:25 vnops.c:1478 myfs_vnop_remove: -> myfs_unlink(dw_hdmi.h) ;unlink sequence 2024-09-17 17:58:25 rpc.c:1992 myfs_unlink: -> myfs_unlink_rpc(dw_hdmi.h) 2024-09-17 17:58:25 rpc.c:1994 myfs_unlink: <- myfs_unlink_rpc(dw_hdmi.h) -> 0 ;remote unlink succeeded 2024-09-17 17:58:25 vnops.c:1480 myfs_vnop_remove: <- myfs_unlink(dw_hdmi.h) -> 0 2024-09-17 17:58:25 vnops.c:1487 myfs_vnop_remove: -> cache_purge(dw_hdmi.h) 2024-09-17 17:58:25 vnops.c:1489 myfs_vnop_remove: <- cache_purge(dw_hdmi.h) 2024-09-17 17:58:25 vnops.c:1499 myfs_vnop_remove: -> vnode_recycle(dw_hdmi.h) 2024-09-17 17:58:25 vnops.c:1501 myfs_vnop_remove: <- vnode_recycle(dw_hdmi.h) 2024-09-17 17:58:27 vnops.c:281 myfs_vnop_lookup: fseventsd-101 -> dw_hdmi.h ;another lookup; why? 2024-09-17 17:58:27 vnops.c:315 myfs_vnop_lookup: -> cache_lookup(dw_hdmi.h) 2024-09-17 17:58:27 vnops.c:317 myfs_vnop_lookup: <- cache_lookup(dw_hdmi.h) -> 0 2024-09-17 17:58:27 rpc.c:431 myfsLookup: fseventsd-101 -> dw_hdmi.h 2024-09-17 17:58:27 rpc.c:500 myfsLookup: -> myfs_lookup_rpc(dw_hdmi.h) 2024-09-17 17:58:27 rpc.c:502 myfsLookup: <- myfs_lookup_rpc(dw_hdmi.h) -> ENOENT(2) 2024-09-17 17:58:27 vnops.c:371 myfs_vnop_lookup: SET(NNEGNCENTRIES): dw_hdmi.h 2024-09-17 17:58:27 vnops.c:373 myfs_vnop_lookup: ENOENT(2) <- shouldAddToNegativeNameCache(dw_hdmi.h) I checked the value of vnode's v_iocount when vnop_remove and vnop_reclaim are being called. Each vnop_remove and followed by vnop_reclaim with v_iocount set to 1 in both calls, as expected. What I don't understand is why after removing the file is there another lookup call being made, which returns ENOENT to rm(1), which causes it to abort. Any pointers on what could be amiss there would be much appreciated.
3
0
669
Sep ’24