Hello everyone,
We are migrating our KEXT for a Thunderbolt storage device to a DEXT based on IOUserSCSIParallelInterfaceController.
We've run into a fundamental issue where the driver's behavior splits based on the I/O source: high-level I/O from the file system (e.g., Finder, cp) is mostly functional (with a minor ls -al sorting issue for Traditional Chinese filenames), while low-level I/O directly to the block device (e.g., diskutil) fails or acts unreliably. Basic read/write with dd appears to be mostly functional.
We suspect that our DEXT is failing to correctly register its full device "personality" with the I/O Kit framework, unlike its KEXT counterpart. As a result, low-level I/O requests with special attributes (like cache synchronization) sent by diskutil are not being handled correctly by the IOUserSCSIParallelInterfaceController framework of our DEXT.
Actions Performed & Relevant Logs
1. Discrepancy: diskutil info Shows Different Device Identities for DEXT vs. KEXT
For the exact same hardware, the KEXT and DEXT are identified by the system as two different protocols.
KEXT Environment:
Device Identifier: disk5
Protocol: Fibre Channel Interface
...
Disk Size: 66.0 TB
Device Block Size: 512 Bytes
DEXT Environment:
Device Identifier: disk5
Protocol: SCSI
SCSI Domain ID: 2
SCSI Target ID: 0
...
Disk Size: 66.0 TB
Device Block Size: 512 Bytes
2. Divergent I/O Behavior: Partial Success with Finder/cp vs. Failure with diskutil
- High-Level I/O (Partially Successful): In the DEXT environment, if we operate on an existing volume (e.g., /Volumes/GammaCarry), file copy operations using Finder or cp succeed. Furthermore, the logs we've placed in our single I/O entry point, UserProcessParallelTask_Impl, are triggered.
  - Side Effect: However, running ls -al on such a volume shows an incorrect sorting order for files with Traditional Chinese names (they appear before . and ..).
- Low-Level I/O (Contradictory Behavior): In the DEXT environment, when we operate directly on the raw block device (/dev/disk5):
  - diskutil partitionDisk ... -> Fails 100% of the time with the error: Error: -69825: Wiping volume data to prevent future accidental probing failed.
  - dd command -> Basic read/write operations appear to work correctly (a write can be immediately followed by a read within the same DEXT session, and the data is correct).
3. Evidence of Cache Synchronization Failure (Non-deterministic Behavior)
The success of the dd command is not deterministic. Cross-environment tests prove that its write operations are unreliable:
- First Test:
  - In the DEXT environment, write a file with random data to /dev/disk5 using dd.
  - Reboot into the KEXT environment.
  - Read the data back from /dev/disk5 using dd. The result is a file filled with all zeros.
  - Conclusion: The write operation only went to the hardware cache, and the data was lost upon reboot.
- Second Test:
  - In the DEXT environment, write the same random file to /dev/disk5 using dd.
  - Key Variable: Immediately after, still within the DEXT environment, read the data back once for verification. The content is correct!
  - Reboot into the KEXT environment.
  - Read the data back from /dev/disk5. This time, the content is correct!
  - Conclusion: The additional read operation in the second test unintentionally triggered a hardware cache flush. This proves that the dd write operation (in our DEXT) by itself does not guarantee synchronization, making its behavior unreliable.
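To make the two tests above repeatable without relying on dd's incidental behavior, a small helper along these lines could write a reproducible pattern, explicitly request a flush, and compare the data on a later boot. This is only a sketch under stated assumptions: the function names (fill_pattern, write_and_flush, read_and_compare) are hypothetical, the macOS-only DKIOCSYNCHRONIZECACHE ioctl is assumed to be how a user-space tool asks a disk device to flush its cache, and whether that request actually reaches the DEXT as SYNCHRONIZE CACHE is exactly what remains to be verified.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>
#ifdef __APPLE__
#include <sys/ioctl.h>
#include <sys/disk.h>   /* DKIOCSYNCHRONIZECACHE */
#endif

/* Fill buf with a pattern that depends only on seed, so the same data
 * can be regenerated after rebooting into the other driver. */
void fill_pattern(unsigned char *buf, size_t len, unsigned int seed)
{
    unsigned int x = seed;
    for (size_t i = 0; i < len; i++) {
        x = x * 1103515245u + 12345u;
        buf[i] = (unsigned char)(x >> 16);
    }
}

/* Write len bytes at offset 0 of an existing path, then request a flush.
 * For device nodes len should be a multiple of the block size (512 in the
 * diskutil output above). Returns 0 on success, -1 on failure. */
int write_and_flush(const char *path, const unsigned char *buf, size_t len)
{
    struct stat st;
    int rc, fd = open(path, O_WRONLY);
    if (fd < 0) return -1;
    for (size_t done = 0; done < len; ) {
        ssize_t n = pwrite(fd, buf + done, len - done, (off_t)done);
        if (n < 0 && errno == EINTR) continue;
        if (n <= 0) { close(fd); return -1; }
        done += (size_t)n;
    }
    if (fstat(fd, &st) != 0) { close(fd); return -1; }
#ifdef __APPLE__
    if (S_ISBLK(st.st_mode) || S_ISCHR(st.st_mode))
        rc = ioctl(fd, DKIOCSYNCHRONIZECACHE);  /* assumption: disk device node */
    else
#endif
        rc = fsync(fd);                         /* regular files, other platforms */
    close(fd);
    return rc == 0 ? 0 : -1;
}

/* Returns 1 if the first len bytes match expected, 0 if not, -1 on I/O error. */
int read_and_compare(const char *path, const unsigned char *expected, size_t len)
{
    assert(len > 0);
    int result = 1, fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    unsigned char *got = malloc(len);
    if (!got) { close(fd); return -1; }
    for (size_t done = 0; done < len; ) {
        ssize_t n = pread(fd, got + done, len - done, (off_t)done);
        if (n < 0 && errno == EINTR) continue;
        if (n <= 0) { result = -1; break; }
        done += (size_t)n;
    }
    if (result == 1 && memcmp(got, expected, len) != 0) result = 0;
    free(got);
    close(fd);
    return result;
}
```

One possible use: call fill_pattern and write_and_flush under the DEXT without any read-back, reboot into the KEXT, regenerate the buffer with the same seed, and call read_and_compare. If the data survives only when the explicit flush succeeds, that would support the cache hypothesis; if write_and_flush itself reports failure on the device node, that points at the flush request rather than the write.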
Our Problem
Based on the observations above, we have reached the following conclusion:
- High-Level Path (triggered by Finder/cp): When an I/O request originates from the high-level file system, the framework seems to enter a fully-featured mode. In this mode, all SCSI commands, including READ/WRITE, INQUIRY, and SYNCHRONIZE CACHE, are correctly packaged and dispatched to our UserProcessParallelTask_Impl entry point. Therefore, Finder operations are mostly functional.
- Low-Level Path (triggered by dd/diskutil): When an I/O request originates from the low-level raw block device layer:
  - The most basic READ/WRITE commands can be dispatched (which is why dd appears to work).
  - However, critical management commands, such as INQUIRY and SYNCHRONIZE CACHE, are not being correctly dispatched or handled. This leads to the incorrect device identification in diskutil info and the failure of diskutil partitionDisk due to its inability to confirm cache synchronization.
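One way to check this hypothesis directly is to tally the opcode (byte 0 of each CDB) seen at the entry point and compare the counts between a Finder copy and a diskutil run. The following is a minimal sketch, not DriverKit API: the names scsi_opcode_name, cdb_tally, cdb_tally_record, and cdb_tally_saw_sync_cache are hypothetical, and how the CDB bytes are obtained from the task passed to UserProcessParallelTask_Impl is left out. The opcode values themselves are the standard SCSI ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Map a SCSI opcode (CDB byte 0) to a readable name for logging. */
const char *scsi_opcode_name(uint8_t opcode)
{
    switch (opcode) {
    case 0x00: return "TEST UNIT READY";
    case 0x03: return "REQUEST SENSE";
    case 0x12: return "INQUIRY";
    case 0x1A: return "MODE SENSE(6)";
    case 0x25: return "READ CAPACITY(10)";
    case 0x28: return "READ(10)";
    case 0x2A: return "WRITE(10)";
    case 0x35: return "SYNCHRONIZE CACHE(10)";
    case 0x88: return "READ(16)";
    case 0x8A: return "WRITE(16)";
    case 0x91: return "SYNCHRONIZE CACHE(16)";
    case 0x9E: return "SERVICE ACTION IN(16)"; /* includes READ CAPACITY(16) */
    default:   return "OTHER";
    }
}

/* Per-opcode counters; zero-initialize before use (e.g. with memset). */
typedef struct {
    uint32_t count[256];
} cdb_tally;

/* Record one command. cdb points at the first byte of the CDB. */
void cdb_tally_record(cdb_tally *t, const uint8_t *cdb)
{
    assert(t != NULL && cdb != NULL);
    t->count[cdb[0]]++;
}

/* Nonzero if any SYNCHRONIZE CACHE (10- or 16-byte form) was recorded. */
int cdb_tally_saw_sync_cache(const cdb_tally *t)
{
    return t->count[0x35] != 0 || t->count[0x91] != 0;
}
```

If the tally after a failing diskutil partitionDisk run shows no SYNCHRONIZE CACHE at all, the command is not reaching the DEXT; if it does appear, the problem is more likely in how the DEXT completes it. With a 66.0 TB disk at 512-byte blocks the LBA range exceeds 32 bits, so the 16-byte READ/WRITE forms are the ones to expect in the counts.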
We would greatly appreciate any guidance, suggestions, or insights on how to resolve this discrepancy. Specifically, what is the recommended approach within DriverKit to ensure that a DEXT based on IOUserSCSIParallelInterfaceController can properly declare its capabilities and handle both high-level and low-level I/O requests uniformly?
Thank you.
Charles