[DEXT Migration Issue] IOUserSCSIParallelInterfaceController fails to handle low-level I/O from `diskutil`

Hello everyone,

We are migrating our KEXT for a Thunderbolt storage device to a DEXT based on IOUserSCSIParallelInterfaceController.

We've run into a fundamental issue where the driver's behavior splits based on the I/O source: high-level I/O from the file system (e.g., Finder, cp) is mostly functional (with a minor ls -al sorting issue for Traditional Chinese filenames), while low-level I/O directly to the block device (e.g., diskutil) fails or acts unreliably. Basic read/write with dd appears to be mostly functional.

We suspect that our DEXT is failing to correctly register its full device "personality" with the I/O Kit framework, unlike its KEXT counterpart. As a result, low-level I/O requests with special attributes (like cache synchronization) sent by diskutil are not being handled correctly by the IOUserSCSIParallelInterfaceController framework of our DEXT.

Actions Performed & Relevant Logs

1. Discrepancy: diskutil info Shows Different Device Identities for DEXT vs. KEXT

For the exact same hardware, the KEXT and DEXT are identified by the system as two different protocols.

KEXT Environment:

   Device Identifier:         disk5
   Protocol:                  Fibre Channel Interface
   ...
   Disk Size:                 66.0 TB
   Device Block Size:         512 Bytes

DEXT Environment:

   Device Identifier:         disk5
   Protocol:                  SCSI
   SCSI Domain ID:            2
   SCSI Target ID:            0
   ...
   Disk Size:                 66.0 TB
   Device Block Size:         512 Bytes

2. Divergent I/O Behavior: Partial Success with Finder/cp vs. Failure with diskutil

  • High-Level I/O (Partially Successful): In the DEXT environment, if we operate on an existing volume (e.g., /Volumes/MyVolume), file copy operations using Finder or cp succeed. Furthermore, the logs we've placed in our single I/O entry point, UserProcessParallelTask_Impl, are triggered.

    • Side Effect: However, running ls -al on such a volume shows an incorrect sorting order for files with Traditional Chinese names (they appear before . and ..).
  • Low-Level I/O (Contradictory Behavior): In the DEXT environment, when we operate directly on the raw block device (/dev/disk5):

    • diskutil partitionDisk ... -> Fails 100% of the time with the error: Error: -69825: Wiping volume data to prevent future accidental probing failed.
    • dd command -> Basic read/write operations appear to work correctly (a write can be immediately followed by a read within the same DEXT session, and the data is correct).

3. Evidence of Cache Synchronization Failure (Non-deterministic Behavior)

The success of the dd command is not deterministic. Cross-environment tests prove that its write operations are unreliable:

  • First Test:

    1. In the DEXT environment, write a file with random data to /dev/disk5 using dd.
    2. Reboot into the KEXT environment.
    3. Read the data back from /dev/disk5 using dd. The result is a file filled with all zeros.
    • Conclusion: The write operation only went to the hardware cache, and the data was lost upon reboot.
  • Second Test:

    1. In the DEXT environment, write the same random file to /dev/disk5 using dd.
    2. Key Variable: Immediately after, still within the DEXT environment, read the data back once for verification. The content is correct!
    3. Reboot into the KEXT environment.
    4. Read the data back from /dev/disk5. This time, the content is correct!
    • Conclusion: The additional read operation in the second test unintentionally triggered a hardware cache flush. This proves that the dd (in our DEXT) write operation by itself does not guarantee synchronization, making its behavior unreliable.

Our Problem

Based on the observations above, we have reached the following conclusion:

  1. High-Level Path (triggered by Finder/cp): When an I/O request originates from the high-level file system, the framework seems to enter a fully-featured mode. In this mode, all SCSI commands, including READ/WRITE, INQUIRY, and SYNCHRONIZE CACHE, are correctly packaged and dispatched to our UserProcessParallelTask_Impl entry point. Therefore, Finder operations are mostly functional.

  2. Low-Level Path (triggered by dd/diskutil): When an I/O request originates from the low-level raw block device layer:

    • The most basic READ/WRITE commands can be dispatched (which is why dd appears to work).
    • However, critical management commands, such as INQUIRY and SYNCHRONIZE CACHE, are not being correctly dispatched or handled. This leads to the incorrect device identification in diskutil info and the failure of diskutil partitionDisk due to its inability to confirm cache synchronization.

We would greatly appreciate any guidance, suggestions, or insights on how to resolve this discrepancy. Specifically, what is the recommended approach within DriverKit to ensure that a DEXT based on IOUserSCSIParallelInterfaceController can properly declare its capabilities and handle both high-level and low-level I/O requests uniformly?

Thank you.

Charles

So, let me start with a clarification here:

In the DEXT environment, when we operate directly on the raw block device (/dev/disk5):

The "raw" block device here would be "/dev/rdisk5", not "/dev/disk5". You're actually interacting with the "cached" device, not the "raw" device.

That leads to here:

First Test:

  1. In the DEXT environment, write a file with random data to /dev/disk5 using dd.
  2. Reboot into the KEXT environment.
  3. Read the data back from /dev/disk5 using dd. The result is a file filled with all zeros.

Conclusion: The write operation only went to the hardware cache, and the data was lost upon reboot.

What actually happened in your DEXT for #1? Are you sure you received ANY I/O command? Also, have you tried this test entirely in the KEXT environment? In particular, what happens if you pull power from the machine (instead of cleanly shutting down)?

On KEXT side, I think what's actually happening here is the following:

  • The write hits the UBC (Universal Buffer Cache) without actually generating any I/O commands.

  • The shutdown process flushes everything out of the cache and the KEXT ensures everything is "finalized" to disk before it allows termination.

I'd expect this to generally work the same in a DEXT, but if your DEXT isn't ensuring everything is flushed during teardown, then that could cause data loss.

Second Test:

  1. In the DEXT environment, write the same random file to /dev/disk5 using dd.
  2. Key Variable: Immediately after, still within the DEXT environment, read the data back once for verification. The content is correct!

I'd be curious to see how the command stream differed here, but my guess is that the system had to commit the write to disk before it triggered the read. However, it's also possible that the UBC returned the data you'd written directly from the cache, while also issuing a "flush" on its own.

One thing I'd suggest testing here is exactly how your DEXT behaves if you run "diskutil eject". I think that should behave similar to what happens at shutdown, without the headache of trying to log/debug across reboots.

However, that leads to here:

diskutil partitionDisk ... -> Fails 100% of the time with the error: Error: -69825: Wiping volume data to prevent future accidental probing failed

Something different is going on here. In general, diskutil uses the rdisk node, which means you're getting direct I/O requests without the write cache being involved.

The error code itself basically means "wiping failed", but that does narrow the scope enough that I'm sure this targeted the rdisk, not the disk. Do any writes make it to your disk when you get this error?

Our code uses that error in two places, once for errors coming out of "pwrite" and the other if something goes wrong when setting up the initial buffers that are used as the I/O source.

In terms of that second path, take a look at the code in IOMediaBSDClient.cpp, particularly dkioctl. IOMediaBSDClient is where things like ioctl() calls are "converted" into IOKit's architecture and, as you'll see, most of them are just reading properties off their underlying IOMedia object, all of which were propagated "up" from your controller. If you're not getting any writes, then I would start by comparing the IOMedia configuration of KEXT vs. DEXT and see if you can find anything that's wrong.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Hi Kevin,

We've encountered an I/O issue with our storage DEXT and were hoping for your expert insight.

Our storage DEXT works correctly when handling I/O requests directed to the buffered device node (/dev/disk5).

However, when I/O requests target the raw device node (/dev/rdisk5), the operation fails with an Input/output error if the transfer size is 128KB or larger. In contrast, all transfers of 64KB or smaller succeed consistently.

Detailed Debugging Steps & Observations:

After a full reboot of macOS to ensure a clean test environment, we performed a series of tests exclusively on the /dev/rdisk5 node.

A 64KB transfer succeeds:

$ sudo dd if=/dev/zero of=/dev/rdisk5 bs=64k count=1
1+0 records in
1+0 records out
65536 bytes transferred in 0.003461 secs (18935568 bytes/sec)

A 128KB transfer consistently fails:

$ sudo dd if=/dev/zero of=/dev/rdisk5 bs=128k count=1
dd: /dev/rdisk5: Input/output error
1+0 records in
0+0 records out
0 bytes transferred in 0.000247 secs (0 bytes/sec)

Log Analysis:

  • When the 64KB transfer succeeds, our DEXT's core I/O handling logic is triggered, and we can see detailed logs for the WRITE(10) command being prepared, including the CDB content and a reported SGCount of 1.
[UserProcessParallelTask_Impl] fRequestedTransferCount  = 65536
[UserProcessParallelTask_Impl] CDB                      = 0x2a 0x00 ... 0x00 0x80 0x00
...
[AME_DAS_SCSI_Prepare]     > Opcode (from CDB[0]): 0x2a : WRITE (10)
[AME_DAS_SCSI_Prepare]     > DataLength: 0x00010000
[AME_DAS_SCSI_Prepare]     > SGCount: 1
  • When the 128KB transfer fails, we see absolutely no logs from our core I/O handling function. The I/O request seemingly disappears before it reaches the core part of our DEXT that prepares the SCSI command.

Related Code Implementation:

We have implemented the UserGetDMASpecification method in our DEXT to report our hardware's DMA capability limits to the system. The current code, where we explicitly set maxTransferSize to 64KB, is shown below.

kern_return_t
IMPL (DRV_MAIN_CLASS_NAME, UserGetDMASpecification)
{
    // The maximum size of a single DMA transfer. 64KB is a safe value.
    *maxTransferSize = 64 * 1024;
    
    // Standard 4-byte alignment.
    *alignment = 4;
    
    // We explicitly tell macOS that our hardware can ONLY handle
    // 32-bit physical addresses.
    *numAddressBits = 32;
    *segmentType = kDMAOutputSegmentHost64;

    Log("Reporting DMA Constraints: maxTransferSize=%llu, alignment=%u, numAddressBits=%u",
        *maxTransferSize, *alignment, *numAddressBits);
    
    return kIOReturnSuccess;
}

This implementation appears to be consistent with the 64KB threshold behavior we are observing.

Our Questions:

  1. Based on our test results and the UserGetDMASpecification implementation, what is the expected behavior of the DriverKit framework when an I/O request larger than 64KB is sent to the rdisk node?
  2. Is the maxTransferSize reported by UserGetDMASpecification the reason why the I/O request is rejected before reaching our DEXT's core logic?
  3. What is the recommended "best practice" within the DriverKit framework for handling raw I/O requests that are larger than maxTransferSize? Should we implement the request splitting logic within the DEXT ourselves, or should I/O Kit automatically handle splitting the request for us as long as we correctly report our limits?

Thank you again!

Charles

Hi Kevin,

We conducted a test in our legacy KEXT environment, and the results indicate a behavioral difference between the KEXT and DriverKit frameworks in handling I/O requests.

Test Scenario (KEXT Environment):

We executed a 10MB raw I/O write command to an rdisk node:

sudo dd if=/dev/zero of=/dev/rdisk7 bs=10m count=1

Observations:

This command is a single 10MB write request. Our KEXT logs show that the driver did not receive one 10MB request, but instead received 20 sequential WRITE(10) commands.

Upon analysis, we confirmed that the size of each request was 512KB.

  • Evidence 1: DataLength

    • The DataLength for each request in the log is 0x00080000, which is 524,288 bytes (512KB).
  • Evidence 2: CDB Content

    • In the CDB of the first request, the transfer length is 0x0400 blocks (1024 blocks). 1024 blocks * 512 bytes/block = 512KB.
    • The LBAs of subsequent requests were sequential (LBA 0, LBA 1024, LBA 2048, ...).

Partial log for reference:

[AME_DAS_SCSI_Prepare] --- [KEXT FINAL DUMP] AME_SCSI_REQUEST_DAS packet for BufferID 46 ---
[AME_DAS_SCSI_Prepare]     Opcode (from CDB[0]): 0x2a (WRITE (10))
[AME_DAS_SCSI_Prepare]     CDB (hex): 2a 00 00 00 00 00 00 04 00 00
[AME_DAS_SCSI_Prepare]     DataLength: 0x00080000
...
[AME_DAS_SCSI_Prepare] --- [KEXT FINAL DUMP] AME_SCSI_REQUEST_DAS packet for BufferID 47 ---
[AME_DAS_SCSI_Prepare]     Opcode (from CDB[0]): 0x2a (WRITE (10))
[AME_DAS_SCSI_Prepare]     CDB (hex): 2a 00 00 00 04 00 00 04 00 00
[AME_DAS_SCSI_Prepare]     DataLength: 0x00080000
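
The CDB arithmetic above can be double-checked mechanically. The following is a minimal, standalone C++ sketch (not DriverKit code) that decodes the big-endian LBA and transfer length from a 10-byte WRITE(10) CDB. The names `Rw10Fields`, `DecodeRw10`, and `TransferBytes` are hypothetical helpers introduced only for illustration; the 512-byte block size is taken from the `diskutil info` output earlier in the thread.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Fields of interest from a 10-byte READ(10)/WRITE(10) CDB.
// (Hypothetical helper type, not part of any Apple framework.)
struct Rw10Fields {
    uint8_t  opcode;         // CDB[0], 0x2a for WRITE(10)
    uint32_t lba;            // CDB[2..5], big-endian
    uint16_t transferBlocks; // CDB[7..8], big-endian
};

// Decode the big-endian LBA and transfer length from a 10-byte CDB.
inline Rw10Fields DecodeRw10(const std::array<uint8_t, 10>& cdb)
{
    Rw10Fields f{};
    f.opcode = cdb[0];
    f.lba = (uint32_t(cdb[2]) << 24) | (uint32_t(cdb[3]) << 16) |
            (uint32_t(cdb[4]) << 8)  |  uint32_t(cdb[5]);
    f.transferBlocks = uint16_t((uint32_t(cdb[7]) << 8) | cdb[8]);
    return f;
}

// Byte length implied by the CDB for a given logical block size.
inline uint64_t TransferBytes(const Rw10Fields& f, uint32_t blockSize)
{
    assert(blockSize != 0);
    return uint64_t(f.transferBlocks) * blockSize;
}
```

Feeding in the two CDBs from the log (`2a 00 00 00 00 00 00 04 00 00` and `2a 00 00 00 04 00 00 04 00 00`) with a 512-byte block size yields 1024 blocks (0x80000 bytes) each, with the second request starting at LBA 1024, which matches the DataLength values logged by the KEXT.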

Our Conclusion:

This test shows that in the KEXT environment, the I/O Kit storage stack splits a large raw I/O request that exceeds an internal limit into multiple smaller chunks (512KB in this case) before sending them to the driver.

This is different from the behavior we observed in our DEXT. In the DEXT, after we declared a maxTransferSize of 64KB via UserGetDMASpecification, any rdisk request larger than this limit is rejected by the framework, not split.

Our Updated Questions:

  1. Is this behavior in DriverKit an intentional design change?
  2. Is the request-splitting functionality of I/O Kit no longer a part of the DriverKit framework, with the expectation that DEXT developers should now implement the logic for splitting large raw I/O requests?

Thank you for your clarification.

Best, Charles

We conducted a test in our legacy KEXT environment, and the results indicate a behavioral difference between the KEXT and DriverKit frameworks in handling I/O requests.

So, the first thing to understand is that IOKit and DriverKit are NOT fundamentally different/separate technologies. The best way to understand DriverKit is that it's implemented as a very specialized user client built on to our existing IOKit infrastructure. Note that this dynamic is quite direct- for example, a DEXT's "IOKitPersonalities" dictionary doesn't just "look like" an IOKit match dictionary, it IS an IOKit matching dictionary. How DEXT loading/matching actually works is:

  1. The matching dictionary is added into the kernel's KEXT matching set just like any KEXT would be.

  2. When hardware is attached, that matching dictionary is used to match and load an in kernel driver in EXACTLY the same way ANY other KEXT would be.

  3. Once the kernel driver finishes loading, the system then uses the DEXT keys inside that matching dictionary to create your DEXT process, which then launches and proceeds with its own initialization.

...at which point the DEXT then operates by controlling and manipulating its underlying IOKit support driver. Understanding this dynamic is critical because the basic assumption you're making here is that there is a "I/O Kit storage stack" and a "DriverKit storage stack":

This test shows that in the KEXT environment, the I/O Kit storage stack splits a large raw I/O request that exceeds an internal limit into multiple smaller chunks (512KB in this case) before sending them to the driver.

This is different from the behavior we observed in our DEXT. In the DEXT, after we declared a maxTransferSize of 64KB via UserGetDMASpecification, any rdisk request larger than this limit is rejected by the framework, not split.

...but that assumption is false. Both of these cases are actually using EXACTLY the same "I/O Kit storage stack". That then leads to here:

  1. Is the request-splitting functionality of I/O Kit no longer a part of the DriverKit framework, with the expectation that DEXT developers should now implement the logic for splitting large raw I/O requests?

I think you've misunderstood what's causing the difference you're seeing here. The difference here isn't caused by IOKit vs. DriverKit, it's caused by different property configurations causing different behavior. Putting that another way, I believe that if you configured your KEXT such that it exported exactly the same property configuration as your DEXT is, then you'd see exactly the same behavior as your DEXT.

I'm not sure what you'll find, but the next step here is to get an IORegistryExplorer snapshot of both cases (KEXT and DEXT) and then compare the property configuration of both cases. I believe you'll find that one or more of the keys you specified through UserReportHBAConstraints is different than you're expecting, which is then leading to the failure.

Related to that point, returning to here:

In the DEXT, after we declared a maxTransferSize of 64KB via UserGetDMASpecification, any rdisk request larger than this limit is rejected by the framework, not split.

maxTransferSize is the maximum size of the entire transfer you're willing to receive, not the limitation of an individual DMA request. In most cases, I'd expect it to be the same or larger than:

kIOMaximumSegmentCount<Read/Write>Key * kIOMaximumSegmentByteCount<Read/Write>Key

However, setting it to less than kIOMaximumSegmentByteCount<Read/Write>Key would probably create exactly the error you're seeing. The high level storage stack would segment that transfer based on kIOMaximumSegmentByteCount<Read/Write>Key but the low level storage stack would then fail the request for exceeding maxTransferSize.
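
To make that interaction concrete, here is a deliberately simplified C++ model of the reasoning above. It is NOT the actual kernel logic: the struct `HbaLimits` and the functions `UpperLayerChunk`, `LowerLayerAccepts`, and `LimitsAreConsistent` are hypothetical names, and the model assumes the upper layer caps a request at (segment count * segment byte count) while the lower layer rejects anything larger than maxTransferSize.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical container for the constraints discussed above.
struct HbaLimits {
    uint64_t maxSegmentCount;     // kIOMaximumSegmentCount<Read/Write>Key
    uint64_t maxSegmentByteCount; // kIOMaximumSegmentByteCount<Read/Write>Key
    uint64_t maxTransferSize;     // value reported by UserGetDMASpecification
};

// Model assumption: the upper layer never hands down more than
// maxSegmentCount * maxSegmentByteCount bytes in one request.
inline uint64_t UpperLayerChunk(const HbaLimits& l, uint64_t requested)
{
    assert(l.maxSegmentCount != 0 && l.maxSegmentByteCount != 0);
    return std::min(requested, l.maxSegmentCount * l.maxSegmentByteCount);
}

// Model assumption: the lower layer fails any request that is
// larger than maxTransferSize instead of splitting it further.
inline bool LowerLayerAccepts(const HbaLimits& l, uint64_t requested)
{
    return UpperLayerChunk(l, requested) <= l.maxTransferSize;
}

// The relationship recommended above: maxTransferSize should be the
// same as or larger than segment count * segment byte count.
inline bool LimitsAreConsistent(const HbaLimits& l)
{
    return l.maxTransferSize >= l.maxSegmentCount * l.maxSegmentByteCount;
}
```

Under this model, with a segment count of 0x81, an assumed (purely illustrative) segment byte count of 4096, and a maxTransferSize of 64KB, a 64KB request is accepted and a 128KB request is rejected, which is the same pattern as the dd results reported earlier. Raising maxTransferSize to at least the product of the two segment limits makes the configuration consistent in this model.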

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Hi Kevin,

Thank you again for your guidance. We have an important update based on your advice.

Following your suggestion to "compare property configurations," we analyzed our KEXT's source code, as we could not find the relevant properties in the KEXT's IORegistryExplorer snapshot. We then attempted to replicate the KEXT's behavior in our DEXT, which has led us to a final, specific contradiction that we hope you can help us resolve.

Here is a summary of our findings and final questions:

1 - KEXT Source Code Contains setProperty Calls

We discovered that our KEXT's Start() function programmatically sets several IOBlockStorageCharacteristics using setProperty. The key value is:

// In KEXT's Start() function:
#define MAX_IO_LENGTH (512 * 1024) // 512KB
setProperty(kIOMaximumByteCountWriteKey, (UInt64)MAX_IO_LENGTH, 64);
// ... plus other similar properties.

2 - DEXT Fails Despite Replicating KEXT's Configuration

Based on this finding and your previous advice, we implemented the following in our DEXT:

  • We implemented SetupHBAConstraints() and call the UserReportHBAConstraints virtual function.
  • Inside it, we populated the dictionary with the exact same values found in our KEXT's source, including setting kIOMaximumByteCountWriteKey to 512KB.
  • We also kept our UserGetDMASpecification's maxTransferSize at 64KB, as this reflects our hardware's actual single DMA transaction limit.

However, a dd bs=128k command to the rdisk node still fails with an Input/output error.

3 - The Final Contradiction: KEXT's "Hidden" Splitting Behavior

This is the core of our confusion. When we test the KEXT with a large I/O (dd bs=10m), our driver logs show that IOKit is, in fact, splitting the request into 64KB chunks, not the 512KB size we specified in the KEXT's source code.

This leads to our final questions:

1 - Why does IOKit split I/O into 64KB chunks for our KEXT, even when kIOMaximumByteCountWriteKey is set to 512KB? Is there another, higher-priority property or mechanism (perhaps related to Protocol Characteristics) that is forcing this smaller split size?

2 - In the DEXT environment, this 64KB splitting behavior is not being triggered. Instead, it appears the upper layers of IOKit are respecting our 512KB hint (by not splitting a 128KB request), while the lower layers reject it for exceeding the 64KB maxTransferSize. How can we correctly replicate the KEXT's robust "64KB splitting" behavior in our DEXT?

We are now unable to explain the discrepancy between the KEXT's declared properties and its actual I/O splitting behavior. Any clarification you can provide would be immensely helpful.

Best regards, Charles

We are now unable to explain the discrepancy between the KEXT's declared properties and its actual I/O splitting behavior. Any clarification you can provide would be immensely helpful.

So, I believe this is all handled by IOBlockStorageDriver using a class called IOBreaker. You can find the start of this in IOBlockStorageDriver::breakUpRequest(), which then proceeds into IOBreaker. Take a look at the code and see how that correlates with what you're seeing.

Following your suggestion to "compare property configurations," we analyzed our KEXT's source code, as we could not find the relevant properties in the KEXT's IORegistryExplorer snapshot. We then attempted to replicate the KEXT's behavior in our DEXT, which has led us to a final, specific contradiction that we hope you can help us resolve.

If you want me to take a look at this, I'll need to see snapshots from both configurations. Please use IORegistryExplorer.app to save out the files, upload both files to a bug, then post the bug number back here.

*I hate trying to read ioreg text traces.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Hi Kevin,

Thank you for the detailed explanation and the pointer to IOBlockStorageDriver::breakUpRequest() and IOBreaker. That is incredibly helpful information.

As you requested, I have filed a Feedback report containing the .ioreg snapshots from both the KEXT and DEXT configurations.

A sysdiagnose log, generated by Feedback Assistant during the submission process, is also attached.

The Feedback ID is: FB20924370

We look forward to your analysis of the registry files.

Best regards,

Charles

We could not find the relevant properties in the KEXT's IORegistryExplorer snapshot.

Looking at the snapshots you sent, I've listed the configuration of both drivers. Note that the first section lists the properties of the direct driver itself, while the second is the actual IOSCSIPeripheralDeviceType00, the parent of IOBlockStorageServices. Here are the two configurations:

(1) KEXT configuration:

KEXT (subclass of IOSCSIParallelInterfaceController):

IOMaximumSegmentAddressableBitCount = 0x20
IOMaximumSegmentCountRead = 0x81
IOMaximumSegmentCountWrite = 0x81
IOMaximumByteCountRead = 0x80000
IOMaximumByteCountWrite = 0x80000
IOMinimumSegmentAlignmentByteCount = 0x4

IOSCSIPeripheralDeviceType00:
IOMaximumBlockCountRead = 0x400
IOMaximumBlockCountWrite = 0x400
IOMaximumByteCountWrite = 0x80000
IOMaximumByteCountRead = 0x80000

(2) DEXT Configuration

DEXT:
IOMaximumSegmentAddressableBitCount = 0x40
IOMaximumSegmentCountRead = 0x81
IOMaximumSegmentCountWrite = 0x81
IOMinimumSegmentAlignmentByteCount = 0

IOSCSIPeripheralDeviceType00:
IOMaximumBlockCountRead = 0xffff
IOMaximumBlockCountWrite = 0xffff

So, first off, IOMaximumBlockCount<Read/Write> is "0xffff" because that property can't be controlled by DriverKit. I don't know the EXACT reason for that, but I think it's because it acted as an unnecessarily complex "override" of the actual hardware configuration (see below).

Next, there are actually two issues:

  1. You've failed to define kIOMaximumSegmentByteCount* keys, as UserReportHBAConstraints basically "requires".

  2. maxTransferSize is too small for the configuration you’re actually creating.

Breaking that down in detail, above I said:

maxTransferSize is the maximum size of the entire transfer you're willing to receive, not the limitation of an individual DMA request. In most cases, I'd expect it to be the same or larger than:

kIOMaximumSegmentCount<Read/Write>Key * kIOMaximumSegmentByteCount<Read/Write>Key

That's because the expected flow here is the following:

  1. IOBreaker in IOBlockStorageDriver segments the transfer using those two factors.

  2. Your DEXT receives one call to UserProcessParallelTask which passes in a single large physical segment.

  3. Your DEXT breaks that physical segment into individual transfers that match whatever your hardware requires.

  4. Once you've finished processing the entire physical segment you complete the task.
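
Step 3 of that flow can be sketched as follows. This is a minimal, standalone C++ illustration rather than DriverKit code: `HwTransfer` and `SplitSegment` are hypothetical names, and the per-transfer limit stands in for whatever the hardware actually requires. In a real DEXT the base address and length would come from the task passed to UserProcessParallelTask.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One hardware-sized piece of the larger physical segment.
// (Hypothetical type for illustration only.)
struct HwTransfer {
    uint64_t address; // start address of this piece
    uint64_t length;  // size of this piece in bytes
};

// Break one contiguous segment [base, base + length) into pieces no
// larger than maxPerTransfer. The last piece may be shorter.
inline std::vector<HwTransfer> SplitSegment(uint64_t base,
                                            uint64_t length,
                                            uint64_t maxPerTransfer)
{
    assert(maxPerTransfer != 0);
    std::vector<HwTransfer> pieces;
    uint64_t offset = 0;
    while (offset < length) {
        const uint64_t n = std::min(maxPerTransfer, length - offset);
        pieces.push_back({base + offset, n});
        offset += n;
    }
    return pieces;
}
```

For example, splitting a 128KB segment with a 64KB per-transfer limit produces two back-to-back 64KB pieces. The task would only be completed (step 4) after every piece has been processed by the hardware.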

That leads me to your comment here:

// The maximum size of a single DMA transfer. 64KB is a safe value.

That limit doesn't really make sense on any hardware that DriverKit supports. The DART and the MMU (see here for more background) basically allow arbitrarily large chunks of non-contiguous memory to be mapped into the PCI address space. To the extent there is a limit, it's much, much larger than kilobytes.

That leads back to the comment I made in #2 about receiving a single segment, which comes from this comment in the documentation:

"Next, it sets the metadata for this I/O operation. The kernel generates one long physical segment with fBufferIOVMAddr as its start address, as seen here in the call to a hypothetical..."

That behavior DOESN'T actually come from the SCSI layer; it comes from IODMACommand. I discuss this in more detail here, but the summary is that the fact that IODMACommand.PrepareForDMA() returns a segment list doesn't mean that it can actually return more than one segment. Even in the kernel itself, it appears to only be possible when using some of our more "obscure" IOMemory subclasses, none of which are available through DriverKit.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware
