vnop_strategy unexpectedly zero-extends files
After implementing vnop_mmap, vnop_strategy, and the other related VNOPs as suggested in https://developer.apple.com/forums/thread/756358, my vnop_strategy routine ends up zero-extending files.
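Concretely, the set of vnode operations I register looks roughly like this (a sketch: the myfs_* names are mine, the handlers live elsewhere in the kext, and the real table has more entries):

#include <sys/vnode_if.h>

/* Handlers implemented elsewhere in the kext. */
int myfs_vnop_mmap(struct vnop_mmap_args *);
int myfs_vnop_mnomap(struct vnop_mnomap_args *);
int myfs_vnop_pagein(struct vnop_pagein_args *);
int myfs_vnop_pageout(struct vnop_pageout_args *);
int myfs_vnop_strategy(struct vnop_strategy_args *);
int myfs_vnop_blockmap(struct vnop_blockmap_args *);

static const struct vnodeopv_entry_desc myfs_vnodeop_entries[] = {
    { &vnop_mmap_desc,     (int (*)(void *))myfs_vnop_mmap },
    { &vnop_mnomap_desc,   (int (*)(void *))myfs_vnop_mnomap },
    { &vnop_pagein_desc,   (int (*)(void *))myfs_vnop_pagein },
    { &vnop_pageout_desc,  (int (*)(void *))myfs_vnop_pageout },
    { &vnop_strategy_desc, (int (*)(void *))myfs_vnop_strategy },
    { &vnop_blockmap_desc, (int (*)(void *))myfs_vnop_blockmap },
    { NULL, NULL },
};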
The end result is shown below.
Original file:
% hexdump -Cs 0x012085a0 progit.pdf
012085a0 30 30 20 6e 20 0a 74 72 61 69 6c 65 72 0a 3c 3c |00 n .trailer.<<|
012085b0 20 2f 53 69 7a 65 20 33 35 30 39 0a 2f 52 6f 6f | /Size 3509./Roo|
012085c0 74 20 32 20 30 20 52 0a 2f 49 6e 66 6f 20 31 20 |t 2 0 R./Info 1 |
012085d0 30 20 52 0a 3e 3e 0a 73 74 61 72 74 78 72 65 66 |0 R.>>.startxref|
012085e0 0a 31 38 38 33 38 33 39 30 0a 25 25 45 4f 46 0a |.18838390.%%EOF.|
012085f0
New file:
% hexdump -Cs 0x012085a0 /Volumes/myfs/progit_cp.pdf
012085a0 30 30 20 6e 20 0a 74 72 61 69 6c 65 72 0a 3c 3c |00 n .trailer.<<|
012085b0 20 2f 53 69 7a 65 20 33 35 30 39 0a 2f 52 6f 6f | /Size 3509./Roo|
012085c0 74 20 32 20 30 20 52 0a 2f 49 6e 66 6f 20 31 20 |t 2 0 R./Info 1 |
012085d0 30 20 52 0a 3e 3e 0a 73 74 61 72 74 78 72 65 66 |0 R.>>.startxref|
012085e0 0a 31 38 38 33 38 33 39 30 0a 25 25 45 4f 46 0a |.18838390.%%EOF.|
012085f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
0120c000
The size of the original file is 0x012085f0, i.e. 18908656 bytes.
The size of the new file is 0x0120c000, i.e. 18923520 bytes.
The difference, 14864 bytes, is filled with zeros.
The blocks of code in my vnop_write that set IO_HEADZEROFILL and IO_TAILZEROFILL are never entered.
The last cluster_write call is passed a uio whose offset is 18874368 and resid is 34288:
2024-07-24 11:34:05 vnops.c:497 myfs_vnop_write: uio_offset(uio): 18874368 uio_resid(uio): 34288
2024-07-24 11:34:05 vnops.c:498 myfs_vnop_write: -> cluster_write(vp, uio, originalEOF: 18874368, newEOF: 18908656, zero_head_off: 0, zero_tail_off: 18908656, lflag: 0x100001)
2024-07-24 11:34:05 vnops.c:500 myfs_vnop_write: <- cluster_write -> 0
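For context, the tail of my vnop_write that issues this call looks roughly like the sketch below; the helper names and the zero-fill condition are simplified, not my exact code:

#include <sys/param.h>
#include <sys/vnode.h>
#include <sys/uio.h>
#include <sys/ubc.h>

/*
 * Minimal sketch of the tail of my vnop_write; this is what produces
 * the cluster_write calls in the log above.
 */
static int
myfs_vnop_write(struct vnop_write_args *ap)
{
    vnode_t vp = ap->a_vp;
    uio_t uio = ap->a_uio;
    off_t originalEOF = ubc_getsize(vp);             /* 18874368 in the last call */
    off_t newEOF = MAX(originalEOF,
        uio_offset(uio) + uio_resid(uio));           /* 18908656 */
    off_t zero_head_off = 0;
    off_t zero_tail_off = newEOF;
    int lflag = ap->a_ioflag;                        /* 0x100001 in the trace */
    int error;

    if (uio_offset(uio) > originalEOF) {
        /* Write begins past EOF: never entered in this run. */
        zero_head_off = originalEOF;
        lflag |= IO_HEADZEROFILL;
    }

    error = cluster_write(vp, uio, originalEOF, newEOF,
        zero_head_off, zero_tail_off, lflag);
    if (error == 0 && newEOF > originalEOF) {
        /* Update the UBC's notion of EOF for the next call. */
        ubc_setsize(vp, newEOF);
    }
    return error;
}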
The last vnop_strategy call, however, is handed a buf_count of 49152 bytes, which is then used to create a new uio that is passed to the myfs_write routine:
2024-07-24 11:34:05 vnops.c:670 myfs_vnop_strategy: uio_offset(uio): 18874368 uio_resid(uio): 49152
2024-07-24 11:34:05 vnops.c:671 myfs_vnop_strategy: -> myfs_write
2024-07-24 11:34:05 vnops.c:673 myfs_vnop_strategy: <- myfs_write -> 49152
The difference between 49152 and 34288, which is 14864 bytes, is exactly what ends up appended as zeros.
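My strategy routine builds that uio straight from the buffer, along these lines (a minimal sketch: myfs_write is my backend writer, and the offset derivation is simplified):

#include <sys/buf.h>
#include <sys/vnode.h>
#include <sys/uio.h>
#include <sys/errno.h>
#include <mach/vm_param.h>

/* My backend writer (signature simplified); returns bytes written
 * or a negative errno. */
extern ssize_t myfs_write(vnode_t vp, uio_t uio);

/*
 * Minimal sketch of my vnop_strategy write path. The uio is sized
 * from buf_count(), which is 49152 in the last call even though only
 * 34288 bytes of file data remain before EOF.
 */
static int
myfs_vnop_strategy(struct vnop_strategy_args *ap)
{
    buf_t bp = ap->a_bp;
    vnode_t vp = buf_vnode(bp);
    uint32_t count = buf_count(bp);                   /* 49152 in the last call */
    off_t offset = (off_t)buf_blkno(bp) * PAGE_SIZE;  /* 18874368; derivation simplified */
    caddr_t io_addr;
    uio_t uio;
    ssize_t done;
    int error;

    error = buf_map(bp, &io_addr);
    if (error != 0) {
        goto done_bp;
    }

    uio = uio_create(1, offset, UIO_SYSSPACE, UIO_WRITE);
    if (uio == NULL) {
        error = ENOMEM;
    } else {
        uio_addiov(uio, (user_addr_t)(uintptr_t)io_addr, count);
        done = myfs_write(vp, uio);                   /* logs "<- myfs_write -> 49152" */
        error = (done < 0) ? (int)-done : 0;
        uio_free(uio);
    }
    buf_unmap(bp);

done_bp:
    buf_setresid(bp, error ? count : 0);
    if (error != 0) {
        buf_seterror(bp, error);
    }
    buf_biodone(bp);
    return error;
}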
Tracing the operation with dtrace(1) revealed the following sequence of calls.
About halfway through the operation, cluster_write calls my vnop_strategy:
4 -> cluster_write proc: cp -> cluster_write(vp: progit_cp.pdf, uio, originalEOF: 8388608, newEOF: 8912896, zero_head_off: 0, zero_tail_off: 8912896, lflag: 0x100001)
4 -> myfs_vnop_strategy proc: cp file: progit_cp.pdf b_bufsize: 0 b_bcount: 65536 b_resid: 0
kernel.release.t8101`0xfffffe0014bfb610+0xcec
kernel.release.t8101`0xfffffe0014bff2f0+0x300
kernel.release.t8101`0xfffffe0014c00bd0+0x54c
kernel.release.t8101`0xfffffe0014bfd898+0x724
kernel.release.t8101`cluster_write_ext+0x2a4
kernel.release.t8101`cluster_write+0x1c
myfs`myfs_vnop_write+0xb18
kernel.release.t8101`VNOP_WRITE+0x84
It then picks up where it left off:
4 -> cluster_write proc: cp -> cluster_write(vp: progit_cp.pdf, uio, originalEOF: 8912896, newEOF: 9437184, zero_head_off: 0, zero_tail_off: 9437184, lflag: 0x100001)
And eventually makes the final cluster_write call:
4 -> cluster_write proc: cp -> cluster_write(vp: progit_cp.pdf, uio, originalEOF: 18874368, newEOF: 18908656, zero_head_off: 0, zero_tail_off: 18908656, lflag: 0x100001)
This is followed by another series of vnop_strategy calls:
4 -> myfs_vnop_strategy proc: cp file: progit_cp.pdf b_bufsize: 0 b_bcount: 65536 b_resid: 0
kernel.release.t8101`0xfffffe0014bfb610+0xcec
kernel.release.t8101`0xfffffe0014bff2f0+0x300
kernel.release.t8101`0xfffffe0014c00bd0+0x54c
kernel.release.t8101`0xfffffe0014bfd898+0x724
kernel.release.t8101`cluster_write_ext+0x2a4
kernel.release.t8101`cluster_write+0x1c
myfs`myfs_vnop_write+0xb18
kernel.release.t8101`VNOP_WRITE+0x84
Tracing the same operation on smbfs with dtrace(1) shows a different sequence of calls.
cluster_write runs from start to finish:
4 -> cluster_write proc: cp -> cluster_write(vp: progit_cp.pdf, uio, originalEOF: 0, newEOF: 524288, zero_head_off: 0, zero_tail_off: 524288, lflag: 0x100001)
...
4 -> cluster_write proc: cp -> cluster_write(vp: progit_cp.pdf, uio, originalEOF: 18874368, newEOF: 18908656, zero_head_off: 0, zero_tail_off: 18908656, lflag: 0x100001)
smbfs`smbfs_vnop_write+0x354
vnop_strategy is instead invoked through the cluster_push path when the file is closed:
2 -> smbfs_vnop_strategy proc: cp file: progit_cp.pdf b_bufsize: 0 b_bcount: 8388608 b_resid: 0
kernel.release.t8101`0xfffffe0026cdb610+0xcec
kernel.release.t8101`0xfffffe0026cdf2f0+0x300
kernel.release.t8101`cluster_push_err+0x2e0
smbfs`smbfs_vnop_close+0x204
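In other words, smbfs appears to leave the dirty data in the UBC and push it out in bulk at close time, roughly like this (a simplified sketch implied by the stack trace, not the actual SMBClient code):

#include <sys/vnode.h>
#include <sys/ubc.h>

/*
 * Simplified sketch of the close path implied by the stack trace
 * above: dirty UBC pages are pushed in bulk, which is what hands
 * vnop_strategy the single 8388608-byte buffer.
 */
static int
smbfs_vnop_close_sketch(struct vnop_close_args *ap)
{
    vnode_t vp = ap->a_vp;

    if (vnode_isreg(vp)) {
        /* Each pushed cluster arrives in vnop_strategy. */
        (void)cluster_push(vp, IO_CLOSE);
    }
    /* ... the real smbfs_vnop_close does much more ... */
    return 0;
}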
I don't understand why my filesystem behaves as described above.
Perusing the relevant parts of the Darwin/XNU and SMBClient source code did not clarify things for me.
A nudge in the right direction would be greatly appreciated.