Posts

Post not yet marked as solved
0 Replies
167 Views
Not really a question. As part of porting code from other platforms (FreeBSD and Linux), there is a #define macro used to specify module parameters. It is desirable for these new sysctls to show up automatically when "upstream" adds them, without having to manually maintain a list. This is usually done with "Linker Sets", but they are not available in kexts, mostly due to __mh_execute_header. I took a different approach:

    #define ZFS_MODULE_PARAM(scope_prefix, name_prefix, name, type, perm, desc) \
        SYSCTL_DECL(_kstat_zfs_darwin_tunable_ ## scope_prefix); \
        SYSCTL_##type(_kstat_zfs_darwin_tunable_ ## scope_prefix, OID_AUTO, name, perm, \
            &name_prefix ## name, 0, desc); \
        __attribute__((constructor)) void \
        _zcnst_sysctl__kstat_zfs_darwin_tunable_ ## scope_prefix ## _ ## name (void) \
        { \
            sysctl_register_oid(&sysctl__kstat_zfs_darwin_tunable_ ## scope_prefix ## _ ## name); \
        } \
        __attribute__((destructor)) void \
        _zdest_sysctl__kstat_zfs_darwin_tunable_ ## scope_prefix ## _ ## name (void) \
        { \
            sysctl_unregister_oid(&sysctl__kstat_zfs_darwin_tunable_ ## scope_prefix ## _ ## name); \
        }

I.e., when the macro is used, I put __attribute__((constructor)) on a function named after the sysctl, which is then called automatically on kext load, and each of those functions calls sysctl_register_oid(). Likewise for the destructor and sysctl_unregister_oid(). So far it works quite well. Any known drawbacks? I've not tested it on M1.
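The constructor approach can be tried outside the kernel first. Below is a minimal userland sketch, assuming GCC or Clang. The names tunable_register(), tunable_unregister(), MODULE_PARAM and the zfs_example_* variables are hypothetical stand-ins for sysctl_register_oid(), sysctl_unregister_oid(), ZFS_MODULE_PARAM and the real tunables; nothing here touches the real sysctl API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's list of registered OIDs. */
#define MAX_TUNABLES 32
static const char *tunable_names[MAX_TUNABLES];
static size_t tunable_count;

/* Stand-in for sysctl_register_oid(): remember the tunable's name. */
static void tunable_register(const char *name)
{
    assert(tunable_count < MAX_TUNABLES);
    tunable_names[tunable_count++] = name;
}

/* Stand-in for sysctl_unregister_oid(): forget the name if present. */
static void tunable_unregister(const char *name)
{
    for (size_t i = 0; i < tunable_count; i++) {
        if (strcmp(tunable_names[i], name) == 0) {
            tunable_names[i] = tunable_names[--tunable_count];
            return;
        }
    }
}

/*
 * Each use of the macro emits one constructor and one destructor,
 * named after the parameter, so no central list has to be maintained.
 */
#define MODULE_PARAM(name) \
    __attribute__((constructor)) static void _cnst_ ## name(void) \
    { tunable_register(#name); } \
    __attribute__((destructor)) static void _dest_ ## name(void) \
    { tunable_unregister(#name); }

/* Two example parameters (hypothetical names). */
int zfs_example_a = 1;
int zfs_example_b = 2;
MODULE_PARAM(zfs_example_a)
MODULE_PARAM(zfs_example_b)

/* Returns 1 if a tunable with this name is currently registered. */
int tunable_is_registered(const char *name)
{
    for (size_t i = 0; i < tunable_count; i++) {
        if (strcmp(tunable_names[i], name) == 0)
            return 1;
    }
    return 0;
}
```

By the time main() runs, both parameters have already been registered. One thing to keep in mind with this pattern is that the order in which constructors from different source files run is not specified, so the registration functions should not depend on each other.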
Posted
by lundman.
Last updated
.
Post not yet marked as solved
6 Replies
363 Views
Having a peculiar issue trying to support the use of O_EXCL (fail if O_CREAT is given and the file exists). It fails the first time; then, if the call is repeated, it works as expected.

It is not entirely clear how macOS should handle O_EXCL. It has been mentioned that vnop_create() should always return EEXIST - does that mean that even in the success case it should return EEXIST instead of 0? That seems odd.

Output of the test program is:

    # (1) Create the file with (O_WRONLY|O_CREAT).
    open okay
    write okay
    close okay
    86 -rw-r----- 1 501 0 29 Jan 12 17:08 /Volumes/BOOM/teest.out
    Deleting /Volumes/BOOM/teest.out
    # (2) Try creating with (O_WRONLY|O_CREAT|O_EXCL).
    writef: Stale NFS file handle
    436207628 87 ---------- 1 501 wheel 0 0 "Jul 9 07:53:53 2037" "Jan 12 17:09:02 2022" "Jan 12 17:09:02 2022" "Jan 1 09:00:00 1970" 1048576 0 0 /Volumes/BOOM/teest.out

So, since the file is deleted in between the tests, O_EXCL shouldn't really kick in here, and yet something goes wrong. The NFS server sends ESTALE to the NFS client. The dtrace stack is:

    Stack:
    kernel.development`nfsrv_setattr+0x7c6
    kernel.development`nfssvc_nfsd+0xbdc
    kernel.development`nfssvc+0x106
    kernel.development`unix_syscall64+0x2ba
    kernel.development`hndl_unix_scall64+0x16
    Result:
    0 259014 nfsrv_setattr: entry
    0 259014 mac_vnode_check_open:entry
    0 259015 hook_vnode_check_open:return 2 nfsd
    0 259015 mac_vnode_check_open:return 2 nfsd
    0 229396 nfsrv_rephead:entry
       0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef
    0: 46 00 00 00 F...

So nfsrv_setattr() replies with 0x46/70 (ESTALE), seemingly because the call to hook_vnode_check_open() returns 2 (ENOENT). Why, though? The file was removed, and I verified the cache has no entry. Then it was created again, and I confirmed it IS in the cache.

    <zfs`zfs_vnop_remove (zfs_vnops_osx.c:1700)> zfs_vnop_remove error 0: checking cache: NOTFOUND
    <zfs`zfs_vnop_create (zfs_vnops_osx.c:1427)> *** zfs_vnop_create: with 1: EXCL
    <zfs`zfs_create (zfs_vnops_os.c:660)> zfs_create: zp is here 0x0
    <zfs`zfs_vnop_create (zfs_vnops_osx.c:1458)> ** zfs_vnop_create created id 82
    <zfs`zfs_vnop_create (zfs_vnops_osx.c:1475)> zfs_vnop_create error -1: checking cache: FOUND

I am having trouble finding where the code for hook_vnode_check_open comes from, anyway. The failing call in the NFS server is:

    if (!error && mac_vnode_check_open(ctx, vp, FREAD | FWRITE)) {
        error = ESTALE;
    }

So uh, why?

If I let the test run again, this time the file exists, and it returns EEXIST as expected. If I run the first test twice, i.e. without O_EXCL, both work. So it seems to go wrong only with O_EXCL when the file doesn't exist.

It is curious that the NFS server figures out that exclusive is set, then clears va_mode:

    case NFS_CREATE_EXCLUSIVE:
        exclusive_flag = 1;
        if (vp == NULL) {
            VATTR_SET(vap, va_mode, 0);

But it doesn't use exclusive_flag until after calling VNOP_CREATE(), and it doesn't pass it either.
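For reference, the behaviour userland expects from O_EXCL can be checked with a few lines of C. This is a minimal sketch and not the original test program; create_exclusive() and check_excl_semantics() are hypothetical helper names, and the 0640 mode is only chosen to match the -rw-r----- seen in the output above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Create path exclusively. Returns 0 on success, otherwise the errno value. */
int create_exclusive(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0640);
    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}

/*
 * Walk through the sequence from the post:
 *   create (must succeed), create again (must fail with EEXIST),
 *   delete, create again (must succeed, the file is gone).
 * Returns 0 if every step behaved as expected, a negative step number otherwise.
 */
int check_excl_semantics(const char *path)
{
    unlink(path);                       /* start clean; an error here is fine */
    if (create_exclusive(path) != 0)
        return -1;
    if (create_exclusive(path) != EEXIST)
        return -2;
    if (unlink(path) != 0)
        return -3;
    if (create_exclusive(path) != 0)    /* this is the step that fails over NFS above */
        return -4;
    unlink(path);
    return 0;
}
```

On a local filesystem check_excl_semantics() should return 0. Run against the NFS mount, the failure described above would be expected to show up as -4, i.e. the create after the delete.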
Posted
by lundman.
Last updated
.
Post not yet marked as solved
0 Replies
266 Views
Ever since 10.15.5 (I think it was) brought in the new proc_lock_ APIs, it has been quite easy to deadlock namei() lookups and mount at the same time.

Stack 1

    *1000 unix_syscall64 + 698 (kernel.development + 9558170) [0xffffff8000b1d89a]
    *1000 lstat64 + 47 (kernel.development + 4947279) [0xffffff80006b7d4f]
    *1000 fstatat_internal + 327 (kernel.development + 4944567) [0xffffff80006b72b7]
    *1000 nameiat + 117 (kernel.development + 4919557) [0xffffff80006b1105]
    *1000 namei + 3857 (kernel.development + 4813841) [0xffffff8000697411]
    *1000 lookup + 1842 (kernel.development + 4817810) [0xffffff8000698392]
    *1000 lookup_handle_found_vnode + 677 (kernel.development + 4814677) [0xffffff8000697755]
    *1000 vfs_busy + 79 (kernel.development + 4847775) [0xffffff800069f89f]
    *1000 IORWLockRead + 738 (kernel.development + 3527154) [0xffffff800055d1f2]

Stack 2

    1000 mount + 10 (libsystem_kernel.dylib + 41114) [0x7fff72fc109a]
    *1000 hndl_unix_scall64 + 22 (kernel.development + 1622534) [0xffffff800038c206]
    *1000 unix_syscall64 + 698 (kernel.development + 9558170) [0xffffff8000b1d89a]
    *1000 mount + 78 (kernel.development + 4901838) [0xffffff80006acbce]
    *1000 __mac_mount + 1330 (kernel.development + 4903186) [0xffffff80006ad112]
    *1000 mount_common + 4860 (kernel.development + 4897964) [0xffffff80006abcac]
    *1000 checkdirs + 115 (kernel.development + 4901059) [0xffffff80006ac8c3]
    *1000 proc_iterate + 892 (kernel.development + 8110892) [0xffffff80009bc32c]
    *1000 checkdirs_callback + 139 (kernel.development + 4901547) [0xffffff80006acaab]
    *1000 IORWLockWrite + 1240 (kernel.development + 3528664) [0xffffff800055d7d8]

The mount call will vfs_busy(), then wait for proc_dirs_lock_exclusive() (IORWLockWrite). The stat call, on the other hand, grabs proc_dirs_lock_share() in namei(); then, because it needs to cross a mountpoint, it calls lookup_traverse_mountpoints(), which calls vfs_busy(). Classic A-B, B-A deadlock.

I am having a hard time either 1) avoiding it or 2) detecting that it will happen, since everything is opaque, and settings like NOCROSSMNT are not something I can set.
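The ordering problem can be reduced to a toy model. The sketch below is plain userland C and deliberately ignores the shared/exclusive distinction of the real locks; struct toy_lock, toy_trylock() and lock_order_deadlocks() are hypothetical names used only to illustrate the A-B, B-A pattern from the two stacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock: owner is 0 when free, otherwise the id of the holder. */
struct toy_lock {
    int owner;
};

/* Take the lock if it is free. Returns false if the caller would have to wait. */
static bool toy_trylock(struct toy_lock *l, int who)
{
    assert(l != NULL && who != 0);
    if (l->owner != 0)
        return false;
    l->owner = who;
    return true;
}

enum { MOUNT_THREAD = 1, STAT_THREAD = 2 };

/*
 * Replay the interleaving from the two stacks.
 * Returns true if each thread ends up waiting for a lock the other one holds.
 */
bool lock_order_deadlocks(void)
{
    struct toy_lock busy = { 0 };       /* models the lock behind vfs_busy() */
    struct toy_lock proc_dirs = { 0 };  /* models the proc_dirs lock */

    /* mount: vfs_busy() comes first. */
    bool mount_has_busy = toy_trylock(&busy, MOUNT_THREAD);
    /* stat: namei() takes the proc_dirs lock first. */
    bool stat_has_dirs = toy_trylock(&proc_dirs, STAT_THREAD);

    /* mount: checkdirs() now wants the proc_dirs lock. */
    bool mount_gets_dirs = toy_trylock(&proc_dirs, MOUNT_THREAD);
    /* stat: crossing the mountpoint now wants vfs_busy(). */
    bool stat_gets_busy = toy_trylock(&busy, STAT_THREAD);

    return mount_has_busy && stat_has_dirs &&
        !mount_gets_dirs && !stat_gets_busy;
}
```

In this model lock_order_deadlocks() returns true: both second acquisitions fail, and neither thread can make progress until the other releases its first lock.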
Posted
by lundman.
Last updated
.
Post not yet marked as solved
2 Replies
555 Views
So what is the current status of symbolication on the M1? When I trigger something like:

    panic(cpu 5 caller 0xfffffe0027b72dc8): Break 0xC472 instruction exception from kernel. Ptrauth failure with DA key resulted in 0xbffffe16708b1aa0 at pc 0xfffffe002763c748, lr 0xfffffe00266449d4 (saved state: 0xfffffe30b4fc3470)
    OS version: 20E241
    Kernel version: Darwin Kernel Version 20.4.0: Thu Apr 22 21:46:41 PDT 2021; root:xnu-7195.101.2~1/RELEASE_ARM64_T8101
    Fileset Kernelcache UUID: 0B829878C98BF0B6E3AF7BF571B60BF2
    Kernel UUID: 1DC99FEF-0771-3229-974C-9B18710700AE
    KernelCache slide: 0x000000001f764000
    KernelCache base: 0xfffffe0026768000
    Kernel slide: 0x00000000202a4000
    Kernel text base: 0xfffffe00272a8000
    Kernel text exec base: 0xfffffe0027370000
    Panicked task 0xfffffe166ef76730: 251 pages, 1 threads: pid 1007: zfs
    Panicked thread: 0xfffffe166acb1980, backtrace: 0xfffffe30b4fc2b80, tid: 10850
    lr: 0xfffffe00273be920 fp: 0xfffffe30b4fc2bf0
    lr: 0xfffffe00266449d4 fp: 0xfffffe30b4fc3800
    lr: 0xfffffe002650ab60 fp: 0xfffffe30b4fc3830
    lr: 0xfffffe002650fad4 fp: 0xfffffe30b4fc3900
    lr: 0xfffffe002650dc88 fp: 0xfffffe30b4fc39e0
    lr: 0xfffffe0026517798 fp: 0xfffffe30b4fc3a10
    Kernel Extensions in backtrace:
    org.openzfsonosx.zfs(2.0)[EB1A7CDB-C33F-3E0A-A7C2-316765670F52]@0xfffffe002641c000-0xfffffe0026647fff

It would be nice to be able to look those symbols up. But both atos and lldb give "clearly not the correct symbols" for both the kext and the kernel:

    atos -o /Library/Extensions/zfs.kext/Contents/MacOS/zfs -arch arm64e -l 0xfffffe002641c000 0xfffffe00266449d4 0xfffffe002650ab60 0xfffffe002650fad4 0xfffffe002650dc88 0xfffffe0026517798 0xfffffe002763f82c
    ZSTD_compressBlock_btopt (in zfs) + 140
    dsl_dataset_get_holds (in zfs) (dsl_userhold.c:677)
    ldi_open_by_name (in zfs) (ldi_osx.c:1906)
    hkdf_sha512 (in zfs) (hkdf.c:162)
    handle_unmap_iokit (in zfs) (ldi_iokit.cpp:2008)
    vmem_init.initial_default_block (in zfs) + 12695596

Almost so random it could be ASLR.

Annoyingly, keepsyms=1 does not work here (or with this type of crash?) and debug=0x144 is ignored (it just boots again).
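As a sanity check on the inputs to atos, the arithmetic on the backtrace addresses can be done by hand. A minimal sketch, assuming the range printed under "Kernel Extensions in backtrace" is the kext's load range; struct kext_range and kext_offset() are hypothetical names, and this only computes offsets, it does not explain why the symbols come out wrong.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Load range of a kext as printed in the panic log (both ends inclusive). */
struct kext_range {
    uint64_t start;
    uint64_t end;
};

/*
 * If addr lies inside the kext, store its offset from the load address
 * in *offset and return true. Otherwise return false and leave *offset alone.
 */
bool kext_offset(const struct kext_range *r, uint64_t addr, uint64_t *offset)
{
    assert(r != NULL && offset != NULL);
    if (addr < r->start || addr > r->end)
        return false;
    *offset = addr - r->start;
    return true;
}
```

With the range 0xfffffe002641c000-0xfffffe0026647fff from the log, the frame at 0xfffffe00266449d4 is at offset 0x2289d4 inside the kext. The last address given to atos, 0xfffffe002763f82c, lies outside that range, so a result like "vmem_init.initial_default_block + 12695596" for it cannot be meaningful with the zfs binary.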
Posted
by lundman.
Last updated
.
Post not yet marked as solved
2 Replies
311 Views
This bug report is from Catalina, but we have confirmed it happens in BigSur as well; it is just tedious to do kext work in BigSur. The following process:

    zpool create mypool disk1
    chown -R lundman /Volumes/mypool
    chown: /Volumes/mypool/.Spotlight-V100/Store-V2: No such file or directory
    chown: /Volumes/mypool/.Spotlight-V100/VolumeConfiguration.plist: No such file or directory
    chown: /Volumes/mypool/.fseventsd: No such file or directory

Create a new filesystem, mount it, try to chown -R and get errors. The names of the files that error stay the same for subsequent chown runs, but different ones may fail if I re-create the filesystem. Then do:

    ssh localhost chown -R lundman /Volumes/mypool

So ssh to the exact same machine, and chown runs fine. It does something different if I'm on the UI versus ssh'ed in (ssh from the same UI or from a remote machine, ssh fixes it either way). The errored files stat just fine, and you can chown them just fine (without -R). Even after doing a working chown -R over ssh, the UI chown -R will still fail.

Digging as deep as I can with dtrace, I have traced it to:

    lookup:return 2 chown
    namei:return 2 chown
    vn_open_auth:return 2 chown

So it isn't even reaching VNOP_LOOKUP() in my filesystem yet. (But perhaps readdir could be returning something bad?)

So, triggering a panic when it is about to return ENOENT:

    dtrace -** 'lookup:return {printf("%d %s", arg1,execname); if (execname =="chown" && arg1 == 2 && val++ == 10) { printf("This one"); panic()}}'

    : mach_kernel : trap_from_kernel + 0x26
    : mach_kernel : _lookup + 0x208
    : mach_kernel : _namei + 0xea6
    : mach_kernel : _nameiat + 0x75
    : mach_kernel : _fstatat_internal + 0x147
    : mach_kernel : _stat64 + 0x2f

    frame #13: 0xffffff800489ff88 kernel.development`lookup(ndp=<unavailable>) at vfs_lookup.c:1457:1 [opt]
    (lldb) p *ndp
    (nameidata) $1 = {
      ni_dirp = 140556031248840
      ni_segflg = UIO_USERSPACE64
      ni_op = OP_SETATTR
      ni_startdir = 0x0000000000000000
      ni_rootdir = 0xffffff801f23d700
      ni_usedvp = 0x0000000000000000
      ni_vp = 0x0000000000000000
      ni_dvp = 0xffffff801f552700
      ni_pathlen = 1
      ni_next = 0xffffff8077d4bc1a <no value available>
      ni_pathbuf = {
        [0] = '.'
        [1] = 'f'
        [2] = 's'
        [3] = 'e'
        [4] = 'v'
        [5] = 'e'
        [6] = 'n'
        [7] = 't'
        [8] = 's'
        [9] = 'd'
        [10] = '\0'
        [255] = '\0'
      }
      ni_loopcnt = 0
      ni_cnd = {
        cn_nameiop = 0
        cn_flags = 1097792
        cn_context = 0xffffff80262c2120
        cn_ndp = 0xffffff8077d4bbc8
        cn_pnbuf = 0xffffff8077d4bc10 ".fseventsd"
        cn_pnlen = 256
        cn_nameptr = 0xffffff8077d4bc10 ".fseventsd"
        cn_namelen = 10
        cn_hash = 1753311157
        cn_consume = 0
      }
      ni_flag = 0
      ni_ncgeneration = 0
    }
    (lldb) p *ndp->ni_cnd.cn_context
    (vfs_context) $2 = {
      vc_thread = 0xffffff80206b8550
      vc_ucred = 0xffffff80254d1490
    (lldb) p *ndp->ni_dvp
      v_name = 0xffffff801f23b500 "Volumes"
    (lldb) frame variable
    (int) wantparent = 6
    (int) docache = 1

Nothing stands out to my green eyes, but it is annoying that I cannot see most variables. It is time to boot kernel.debug instead. But unfortunately, the chown -R failure does not happen when booting kernel.debug! D'oh. I tested re-creating the filesystem and running chown -R 4 times before it had a panic with

    xnu_debug/xnu-6153.101.5/osfmk/kern/thread.c:2535 Assertion failed: io_tier < IO_NUM_PRIORITIES

called from _apfs_vnop_strategy() - probably unrelated.

I don't think I've come across a problem with my filesystem before that changed depending on whether I had ssh'ed in. Using the UI vs ssh presumably changes the context? But it must be related to my code, since it doesn't happen with hfs.
Posted
by lundman.
Last updated
.
Post not yet marked as solved
0 Replies
332 Views
The userland code can pass an fd (file descriptor) into the kernel to do some IO on (file_vnode_withvid() + vn_rdwr()), but the "other platforms" can just access the equivalent of fp->fp_glob->fg_offset to know what offset we should start from.

I believe that all those structs are opaque. I don't see a method for accessing the offset of procfd/fp/fp_glob. There are various functions like fill_fileinfo(), but it looks like none of the *info functions are exported.

I was wondering if I can end up in vn_read() with FOF_OFFSET in flags, as that seems to set uio_offset to the fg_offset, and issue a zero-length read, but I don't think I can get there from an fd. It has to come from fo_read(), which is not exported.

Any other ideas? Obviously, since I pass the fd from userland, I can also pass the offset - and I will probably end up doing that. It would just be a smaller "change" if I could find the offset from the kernel.
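The fallback mentioned at the end, passing the offset along with the fd, is straightforward on the userland side. A minimal sketch, assuming the request is handed to the kernel in some argument block; struct io_request, io_request_init() and offset_after_write() are hypothetical names and not part of any existing interface.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical argument block handed to the kernel alongside the fd. */
struct io_request {
    int   fd;
    off_t offset;   /* current file offset at the time of the request */
};

/*
 * Fill in req for fd. lseek(fd, 0, SEEK_CUR) reports the current offset
 * without moving it. Returns 0 on success, -1 if the fd is not seekable
 * (for example a pipe), with errno set by lseek().
 */
int io_request_init(struct io_request *req, int fd)
{
    assert(req != NULL);
    off_t cur = lseek(fd, 0, SEEK_CUR);
    if (cur == (off_t)-1)
        return -1;
    req->fd = fd;
    req->offset = cur;
    return 0;
}

/*
 * Usage sketch: write len bytes to a scratch file and return the offset
 * that would be passed along, or -1 on any failure.
 */
off_t offset_after_write(const char *path, const char *data, size_t len)
{
    struct io_request req;
    off_t result = -1;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    if (write(fd, data, len) == (ssize_t)len && io_request_init(&req, fd) == 0)
        result = req.offset;
    close(fd);
    unlink(path);
    return result;
}
```

The offset captured this way is only a snapshot: if another thread or process sharing the descriptor moves it between the lseek() and the kernel call, the two will disagree, which is one argument for reading it in the kernel if a supported way exists.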
Posted
by lundman.
Last updated
.
Post not yet marked as solved
2 Replies
507 Views
I've been working hard trying to get rid of all the kernel functions that we aren't allowed to call, and now have only a handful left. The kext loads fine on Intel, but not on arm64e:

    2: Could not use 'net.lundman.zfs' because: Failed to bind '_cpu_number' in 'net.lundman.zfs' (at offset 0x3c0 in __DATA_CONST, __got) as could not find a kext which exports this symbol
    For arm64e:
    6 symbols not found in any library kext:
    _vnop_getnamedstream_desc
    _vnop_removenamedstream_desc
    _kmem_alloc
    _vnop_makenamedstream_desc
    _kmem_free
    _cpu_number

The documentation suggests I should use kmem_alloc(), and it is certainly in the t8101 kernel. I suppose it is in com.apple.kpi.unsupported - does that mean I'm not allowed to call it, or should I use some other method to allocate memory? The dependency list is:

    <key>OSBundleLibraries</key>
    <dict>
        <key>com.apple.iokit.IOStorageFamily</key>
        <string>1.6</string>
        <key>com.apple.iokit.IOAVFamily</key>
        <string>1.0.0</string>
        <key>com.apple.kpi.bsd</key>
        <string>8.0.0</string>
        <key>com.apple.kpi.iokit</key>
        <string>8.0.0</string>
        <key>com.apple.kpi.libkern</key>
        <string>10.0</string>
        <key>com.apple.kpi.mach</key>
        <string>8.0.0</string>
        <key>com.apple.kpi.unsupported</key>
        <string>8.0.0</string>
    </dict>

(As for the namedstream issues, I think perhaps that has been removed on arm, so I can just go without.) cpu_number() I can probably live without; it is mostly used to spread out the locks used semi-randomly. But I gotsa get me some memory!

Lund
Posted
by lundman.
Last updated
.
Post not yet marked as solved
0 Replies
197 Views
Having issues calling kauth_cred_getgroups() as non-root cred_t from BigSur. Get panic:

    0xffffffa843a737b0 : 0x0
    0xffffffa843a738e0 : 0xffffff7fa5ab889e net.lundman.zfs : _dsl_load_user_sets + 0xbe

    > 126 ret = kauth_cred_getgroups((kauth_cred_t)cr, gids, &count);

I see nothing suspicious with the arguments either:

    (lldb) p *cr
    (cred_t) $4 = {
        cr_link = {
            le_next = 0xffffff868f0ac370
            le_prev = 0xffffff80056582d0
        }
        cr_ref = 52
        cr_posix = {
            cr_uid = 501
            cr_ruid = 501
            cr_svuid = 501
            cr_ngroups = 16
            cr_groups = {
                [0] = 20
                [1] = 12
                [2] = 61
                [3] = 79
                [4] = 80
                [5] = 81
                [6] = 98
                [7] = 701
                [8] = 33
                [9] = 100
                [10] = 204
                [11] = 250
                [12] = 395
                [13] = 398
                [14] = 399
                [15] = 400
            }
            cr_rgid = 20
            cr_svgid = 20
            cr_gmuid = 501
            cr_flags = 2
        }
        cr_label = 0xffffff868fdb41c0
        cr_audit = {
            as_aia_p = 0xffffff934aef0a18
            as_mask = (am_success = 12288, am_failure = 12288)
        }
    }
    (lldb) p gids
    (gid_t [16]) $1 = {
        [0] = 0
        [1] = 0
        [2] = 0
        [3] = 0
        [4] = 0
        [5] = 0
        [6] = 0
        [7] = 0
        [8] = 0
        [9] = 0
        [10] = 0
        [11] = 0
        [12] = 0
        [13] = 0
        [14] = 0
        [15] = 0
    }
    (lldb) p count
    (int) $2 = 16

Works every time if I am root, but will panic as non-root. Stack having NULL is also odd. Runs on Catalina and before.
Posted
by lundman.
Last updated
.
Post not yet marked as solved
2 Replies
319 Views
We use kextsymboltool when compiling our kext, but the kexts will not load on Big Sur. Generally it simply claims it cannot find the Plugins/module.kext, or, if I try to codesign Plugins/module.kext, I get:

    ... because file does not have a __LINKEDIT segment

I'm guessing some slight changes have been made to bring kextsymboltool.c up to date. Is this anything I can handle now, or must I wait for the XNU source?
Posted
by lundman.
Last updated
.