I am facing a similar issue trying to port a numerical-relativity code to Apple Silicon (https://spectre-code.org). Your post helped me clarify the issue...thank you kindly!
Here's what's going on. TL;DR — on arm64 macs, executables turn out to fail if they are larger than 2GB.
macOS, for efficiency reasons, removed standard libraries from the disk, such as libSystem.B.dylib (which contains the c runtime), libc++, etc. Instead, they live in a “shared cache” that’s loaded into memory and then linked by dyld when the executable is run. However, if your executable is large enough that it extends into the address range where the shared cache is stored (it’s always loaded at a fixed address range), then the shared cache is no longer available, because you’ve overwritten part of the shared region it lives in. In that case, the linker refuses to use the shared linker cache, which means you can't link against the c runtime in libSystem, leading to the error you received.
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/mach/shared_region.h says what address the shared region begins at, and its size. Look for SHARED_REGION_BASE_ARM64 and SHARED_REGION_SIZE_ARM64: these say the size and starting address of the shared region containing the shared linker cache. On ARM64, the region starts at address 0x180000000, which is 6GB above 0x0. But on arm64, your executable gets loaded starting at 0x100000000 (4GB above 0x0), to ensure for security reasons that if you accidentally store a pointer in a 32-bit variable, when it gets converted back to 64 bits, the converted result is less than 4GB above address 0x0, which will crash the app. This leaves just 2GB for an executable to fit between the starting address and the start of the shared region.
Your example above uses global variables to extend the executable size to exceed 2GB, and that triggers the error. Reducing the array size enough will lower the executable size enough that it does not overlap with the shared region, which means dyld will use the shared cache and thus will be able to link against the c runtime in libSystem, letting the program run normally.
This problem won't occur on Intel Macs, because on intel the SHARED_REGION_BASE is at a much larger address (0x7FFF00000000).
My problem is that for my numerical-relativity code, it's not so clear how to get our executable size below 2GB. :(
So my follow-up question to your original post is the following: is there any way to run an executable larger than 2GB in size on M1 Macs? I.e., is there a workaround to the issue your example uncovered?
Anyway, I hope this helps! I'd be happy to discuss further if you have any questions, @TomDuff!