arm64 apple HFA alignment.

Question

Created Dec ’20


Hello, I am porting an app to arm64 Apple platforms, following the documented ABI differences from standard arm64: https://developer.apple.com/documentation/xcode/writing_arm64_code_for_apple_platforms

However, I found that HFA arguments are aligned to 4 bytes on the stack, whereas the standard arm64 convention requires 8 bytes (developer.arm.com/documentation/ihi0055/latest):
"If the argument is an HFA or an HVA then the NSRN is set to 8 and the size of the argument is rounded up to the nearest multiple of 8 bytes."

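Code Block
struct Vector3
{
    float x;
    float y;
    float z;
};
float __stdcall testVector3(
    Vector3 v1,   // HFA, passed in s0-s2
    float   f1,
    float   f2,
    float   f3,
    float   f4,
    float   f5,   // last argument that fits in s0-s7
    float   f6,   // first argument passed on the stack
    float   f7,
    Vector3 v2,   // HFA passed on the stack
    float   f8,
    float   f9,
    float   f10,
    float   f11,
    float   f12,
    float   f13);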

For this function I expected f6 and all later arguments to be passed on the stack, with v2 occupying 16 bytes (per the arm64 ABI). However, v2 takes only 12 bytes and there is no padding between v2 and f8.

Code Block
thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 3.1
    frame #0: 0x00000001000038f8 a.out`nativeCall_PInvoke_Vector3Arg_Unix(Vector3, float, float, float, float, float, float, float, Vector3, float, float, float, float, float, float)
a.out`nativeCall_PInvoke_Vector3Arg_Unix:
->  0x1000038f8 <+0>:  sub    sp, sp, #0x80             ; =0x80
    0x1000038fc <+4>:  stp    x29, x30, [sp, #0x70]
    0x100003900 <+8>:  add    x29, sp, #0x70            ; =0x70
    0x100003904 <+12>: ldr    w8, [x29, #0x10]
(lldb) memory read -s4 -f float -c20  $sp
0x16fdff9b0: 6 // float f6
0x16fdff9b4: 7 // float f7
0x16fdff9b8: 4 // Vector.x
0x16fdff9bc: 5 // Vector.y
0x16fdff9c0: 6 // Vector.z
0x16fdff9c4: 8 // float f8, where is padding?
0x16fdff9c8: 9 // float f9
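
To make the comparison concrete, here is a minimal sketch that computes stack offsets for the stack-passed arguments of testVector3 under two rules. The names StackArg, round_up, layout_stack_args and kStackArgs are hypothetical helpers, not part of any ABI tooling. The "Apple" rule assumes each argument uses its natural size and alignment (which is what the dump above suggests for floats and HFAs), and the "AAPCS64" rule assumes each argument is rounded up to a multiple of 8 bytes. It only models floats and HFAs, not other composite types.

```c
#include <assert.h>
#include <stddef.h>

/* One stack-passed argument: natural size and alignment in bytes. */
struct StackArg {
    const char *name;
    size_t size;
    size_t align;
};

/* Stack-passed arguments of testVector3 (v1 and f1..f5 use s0-s7). */
const struct StackArg kStackArgs[9] = {
    {"f6", 4, 4},  {"f7", 4, 4},  {"v2", 12, 4},
    {"f8", 4, 4},  {"f9", 4, 4},  {"f10", 4, 4},
    {"f11", 4, 4}, {"f12", 4, 4}, {"f13", 4, 4},
};

/* Round value up to a multiple of align (align must be a power of two). */
size_t round_up(size_t value, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/* Fill offsets[i] for each argument and return the total stack bytes.
 * apple_rule != 0: natural size and alignment per argument (assumed).
 * apple_rule == 0: size rounded up to 8, alignment at least 8.
 * In both cases the total is rounded up to a multiple of 8. */
size_t layout_stack_args(const struct StackArg *args, size_t count,
                         int apple_rule, size_t *offsets)
{
    size_t next = 0;
    for (size_t i = 0; i < count; i++) {
        size_t align = args[i].align;
        size_t size = args[i].size;
        if (!apple_rule) {
            if (align < 8)
                align = 8;
            size = round_up(size, 8);
        }
        next = round_up(next, align);
        offsets[i] = next;
        next += size;
    }
    return round_up(next, 8);
}
```

Calling layout_stack_args(kStackArgs, 9, 1, offsets) puts v2 at offset 8 and f8 at offset 20, matching the addresses 0x16fdff9b8 and 0x16fdff9c4 in the dump. With apple_rule set to 0, v2 lands at offset 16 and f8 at offset 32, which is the layout I was expecting.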

Is this expected behavior? Is it documented somewhere?

Answer 1

sandreenko OP

Dec ’20

Note that for non-HFA structs we have padding:

Code Block
#include <stdio.h>

struct SmallStruct // takes 8 bytes on the stack
{
    short s;
};
//__attribute__((noinline))
int __stdcall callWithSmallStruct(int i1, int i2, int i3, int i4, int i5, int i6, int i7, int i8, SmallStruct s, int i9, int i10, int i11)
{
    // i9..i11 follow s on the stack; wrong values here mean the callee read them from the wrong offsets.
    if (i9 != 9 || i10 != 10 || i11 != 11)
    {
        printf("%d, %d, %d, %d, %d, %d, %d, %d, %d, %d, %d, %d\n", i1,i2,i3,i4,i5,i6,i7,i8,(int)s.s,i9,i10,i11);
        return 101;
    }
    return 100;
}
struct BigStruct // takes 16 bytes on the stack
{
    int x;
    int y;
    int z;
};
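
For reference, a small sketch that checks the in-memory size and alignment of the three structs from this thread, to contrast with the stack slots observed above (8 bytes for SmallStruct, 16 for BigStruct, 12 for Vector3). It assumes a platform where float and int are 4 bytes and short is 2 bytes, as on arm64. The function print_struct_sizes is a hypothetical helper added for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct Vector3     { float x; float y; float z; };  /* HFA of three floats */
struct SmallStruct { short s; };                    /* non-HFA */
struct BigStruct   { int x; int y; int z; };        /* non-HFA */

/* Compile-time checks of sizeof, which is not the same as the stack slot size. */
static_assert(sizeof(struct Vector3) == 12, "three 4-byte floats");
static_assert(sizeof(struct SmallStruct) == 2, "one 2-byte short");
static_assert(sizeof(struct BigStruct) == 12, "three 4-byte ints");

/* Print size and alignment of each struct as the compiler sees them. */
void print_struct_sizes(void)
{
    printf("Vector3:     size %zu, align %zu\n",
           sizeof(struct Vector3), _Alignof(struct Vector3));
    printf("SmallStruct: size %zu, align %zu\n",
           sizeof(struct SmallStruct), _Alignof(struct SmallStruct));
    printf("BigStruct:   size %zu, align %zu\n",
           sizeof(struct BigStruct), _Alignof(struct BigStruct));
}
```

So Vector3 and BigStruct have the same sizeof (12) and alignment (4), yet only the HFA is stored on the stack without being rounded up to a multiple of 8.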


Answer 2

Engineer OP

Apple

Sep ’21

Hi, only just noticed this question, but this divergence from the AAPCS is documented on the Apple page you referenced:

When passing arguments to functions, Apple platforms diverge from the ARM64 standard ABI in the following ways:

Function arguments may consume slots on the stack that are not multiples of 8 bytes. If the total number of bytes for stack-based arguments is not a multiple of 8 bytes, insert padding on the stack to maintain the 8-byte alignment requirements.
