Hi Kevin,
I'm starting this new thread to focus on alignment optimization and recalibrating our HBA constraints.
Following up on your suggestion about UserReportHBAConstraints and alignment optimization, here are our current DEXT settings:
Via UserReportHBAConstraints():
- kIOMaximumSegmentCountRead/WriteKey: 129
- kIOMaximumSegmentByteCountRead/WriteKey: 65,536 (64 KB)
- kIOMinimumSegmentAlignmentByteCountKey: 4 bytes
- kIOMaximumSegmentAddressableBitCountKey: 32
- kIOMinimumHBADataAlignmentMaskKey: 0
Via SetProperties() (additional injection):
- kIOMaximumByteCountRead/WriteKey: 524,288 (512 KB)
- kIOMaximumBlockCountRead/WriteKey: 1,024
We inherited the segment count (129) and max I/O length (512 KB) from our legacy KEXT, which were originally calculated based on a 4 KB segment size (Max I/O 512 KB / 4 KB + 1 = 129). The current alignment value of 4 was essentially a placeholder, as the legacy hardware didn't enforce strict page-level alignment.
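For reference, the inherited arithmetic can be written out as a small sketch. The helper name worstCaseSegments is hypothetical and not part of DriverKit. It only assumes that a buffer which does not start on a segment boundary can straddle one extra segment, which is where the +1 comes from.

```cpp
#include <cstdint>

// Hypothetical helper, not a DriverKit API: worst-case number of segments
// for a transfer of maxBytes when each segment covers at most segBytes and
// the buffer may start mid-segment (hence the +1).
constexpr std::uint64_t worstCaseSegments(std::uint64_t maxBytes,
                                          std::uint64_t segBytes) {
    return (maxBytes + segBytes - 1) / segBytes + 1;
}

// Legacy KEXT assumption from this post: 512 KB max I/O, 4 KB segments.
static_assert(worstCaseSegments(512 * 1024, 4 * 1024) == 129,
              "matches the inherited segment count");
```

Under the same assumption, the formula gives 33 for 16 KB segments and 9 for 64 KB segments, so 129 is only tight for the 4 KB case.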
Given that our testing is on Apple Silicon, we are considering increasing kIOMinimumSegmentAlignmentByteCountKey to 16,384 (16 KB) to match the native page size. However, I have two specific questions regarding this:
- Stripe Size vs. Page Size: Our RAID stripe size is typically larger than 16 KB (e.g., 64 KB or 128 KB). Should we be aligning the system to the RAID stripe size for hardware efficiency, or is it more critical to stick to the 16 KB page size to optimize the IOMMU/DART mapping overhead in DriverKit?
- Recalibration: If we increase the alignment to 16 KB, should we also adjust kIOMaximumSegmentByteCount to match (i.e., 16 KB), or is it better to keep it at 64 KB to allow fewer, larger segments per I/O?
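To make the numbers behind both questions concrete, here is a minimal sketch of the arithmetic only. It says nothing about how DriverKit or the DART actually builds segments, and the helper names isAligned and minSegments are hypothetical, not DriverKit APIs.

```cpp
#include <cstdint>

// 16 KB page size on Apple Silicon, as stated in this post.
constexpr std::uint64_t kPageSize = 16 * 1024;

// True when value is a multiple of alignment.
// Assumes alignment is a non-zero power of two.
constexpr bool isAligned(std::uint64_t value, std::uint64_t alignment) {
    return (value & (alignment - 1)) == 0;
}

// Minimum number of segments needed to carry totalBytes when each segment
// holds at most segBytes (ignores any extra segment caused by misalignment).
constexpr std::uint64_t minSegments(std::uint64_t totalBytes,
                                    std::uint64_t segBytes) {
    return (totalBytes + segBytes - 1) / segBytes;
}

// Question 1: both candidate stripe sizes are multiples of the page size,
// so a stripe-aligned buffer is also page-aligned (the reverse is not true).
static_assert(isAligned(64 * 1024, kPageSize), "64 KB stripe is page-aligned");
static_assert(isAligned(128 * 1024, kPageSize), "128 KB stripe is page-aligned");

// Question 2: segments needed for one 512 KB I/O at each candidate size.
static_assert(minSegments(512 * 1024, 16 * 1024) == 32, "16 KB segments");
static_assert(minSegments(512 * 1024, 64 * 1024) == 8, "64 KB segments");
```

In other words, keeping 64 KB segments would need a quarter as many segments per maximum-size I/O as 16 KB segments, provided the buffers can actually be described in 64 KB pieces.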
We suspect that the 38% gain we saw in 4 KB Random Reads might improve even further if we fix this alignment bottleneck. Looking forward to your thoughts.
Best regards,
Charles