simd provides types and functions for small vector and matrix computations.

simd Documentation

Posts under simd tag

9 Posts
Post not yet marked as solved
7 Replies
322 Views
In my app I need to solve a set of 3 equations, but I am struggling because these 3 equations contain 4 unknowns, so the result would be some function of a fifth variable. A very simple example: u = cos(s), v = sin(s), 1 = t. I really don't know a method for solving it. I think it can't be solved numerically, because the result is a function. I thought about converting the functions into Fourier series and then solving the set of equations, but I am not sure whether this works either. Does anyone know of a framework for this? I have looked on the internet but found nothing.
Post not yet marked as solved
3 Replies
324 Views
OS: macOS 12.2.1
Hardware: MacBook Pro 2020, M1
Metal: 2.4
Xcode: 13.2.1

Here is my test compute kernel, which reads the input buffer with simdgroup_load and writes the output buffer with simdgroup_store:

    kernel void fun(
        const device half * Src [[ buffer(0) ]],
        constant uint4 & SrcShape [[ buffer(1) ]],
        device half * Dst [[ buffer(2) ]],
        constant uint4 & DstShape [[ buffer(3) ]],
        const device half * Weight [[ buffer(4) ]],
        ushort3 threadgroup_position_in_grid [[ threadgroup_position_in_grid ]],
        ushort3 thread_position_in_threadgroup [[ thread_position_in_threadgroup ]],
        ushort3 threads_per_threadgroup [[ threads_per_threadgroup ]],
        ushort3 thread_position_in_grid [[ thread_position_in_grid ]])
    {
        const int SrcSlices = (int)SrcShape[0];
        const int SrcHeight = (int)SrcShape[1];
        const int SrcWidth = (int)SrcShape[2];
        const int DstSlices = (int)DstShape[0];
        const int DstHeight = (int)DstShape[1];
        const int DstWidth = (int)DstShape[2];
        const int Kernel_X = 3;
        const int KernelElemNum = 3 * 3;
        const int N_Pack = 8;

        // test only 1 thread
        if(thread_position_in_grid.z != 0 || thread_position_in_grid.y != 0 || thread_position_in_grid.x * N_Pack != 0)
            return;

        simdgroup_half8x8 sgMatY;
        simdgroup_load(sgMatY, Src);
        simdgroup_store(sgMatY, Dst);
    }

It's a simple shader; however, the output buffer only contains the first 2 values from the input buffer, and the other 62 values are ALL ZERO. Here is the result from Xcode Metal Capture. How can I debug or fix it?
Post not yet marked as solved
0 Replies
387 Views
In my AR app, I rotate my node around some position in the real world. In order to do that, I use the pivot or simdPivot attribute of the node to rotate around that position. Otherwise it would rotate around the node's center, which is elsewhere, and the whole node would move around instead of just rotating in place. Up to here all is fine.

The problem is that in ARKit, using a pivot also moves the camera to the pivot's location. I can compensate by changing the node's position in the opposite direction. But then, if I continue this way, the user will never be in the correct "real" position inside the node, but in some "other" location, and only due to the pivot will he "see" the correct location. This is a problem, because later on I will need to catch collisions between the user and elements in the scene. I could continue compensating for the pivot's location, but that's a real pain and not so elegant.

I thought that I could use the pivot, rotate, and then reset the pivot. But it turns out that doing so will not keep the rotation that I set before. Rather, the rotation will behave as if the pivot was never created and will give me the exact behavior I didn't want to achieve. So let's say I do this in my code:

    let translation = simd_float4x4(SCNMatrix4MakeTranslation(0,0,-2))
    node?.simdPivot = translation
    let rotationMatrix = simd_float4x4(SCNMatrix4MakeRotation(.pi/4, 0, 1, 0))
    node?.simdTransform = rotationMatrix

(I found that setting the simdRotation attribute didn't rotate the node, so I set the simdTransform instead.) The above code will correctly rotate the node at position (0, 0, -2). But then if I reset the pivot as such:

    node?.simdPivot = matrix_identity_float4x4

Now the node will be rotated around (0, 0, 0).

Is there any other way to resolve this? Can I move my node in any other way without having to always compensate for the pivot's location, so that the user's location will always correspond to his real location inside the node? Or is there another way to remove the pivot, but leave the rotation that I accomplished with the pivot and freeze the node in its new position/rotation? Thanks!
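One possible way to "freeze" the result is to fold the pivot into the transform before resetting it. The sketch below rests on an assumption that should be checked against the SceneKit documentation: that a node's content is placed by `simdTransform * simdPivot.inverse`. The function name `bakingPivot` is hypothetical; only simd matrix types are used, so the math can be tried without a scene.

```swift
import simd

/// Hypothetical helper: returns a transform that places content the same way
/// `transform` combined with `pivot` did, so the pivot can be reset to identity.
/// Assumption (not confirmed by the post): placement = transform * inverse(pivot).
func bakingPivot(transform: simd_float4x4, pivot: simd_float4x4) -> simd_float4x4 {
    return transform * pivot.inverse
}

// Values from the post: pivot translated to (0, 0, -2), rotation of pi/4 about y.
var pivot = matrix_identity_float4x4
pivot.columns.3 = SIMD4<Float>(0, 0, -2, 1)
let rotation = simd_float4x4(simd_quatf(angle: .pi / 4, axis: SIMD3<Float>(0, 1, 0)))

// Transform to assign once the pivot has been reset to the identity.
let baked = bakingPivot(transform: rotation, pivot: pivot)

// Hypothetical usage with an SCNNode named `node`:
//   node.simdTransform = bakingPivot(transform: node.simdTransform, pivot: node.simdPivot)
//   node.simdPivot = matrix_identity_float4x4
```

Under that assumption, a local point maps to the same place before and after the reset, because the pivot's inverse has simply moved into the transform.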
Post not yet marked as solved
1 Reply
376 Views
I have a Metal compute kernel for dense matrix multiplication, and I'd like to optimize it with simdgroup_float8x8 and simdgroup_half8x8. However, it seems no one uses them in Metal. Can you give me some more examples of how to use them, beyond those in the Metal Shading Language Specification Version 2.4? Thanks!
Post marked as solved
4 Replies
302 Views
Hello, could I please ask how to use the simd framework function simd_incircle(_:_:_:_:)? I have imported simd but cannot find this function, which is documented here: https://developer.apple.com/documentation/accelerate/1646495-simd_incircle Details: Swift 5.2, Xcode 13.0. Thank you
Post marked as solved
9 Replies
838 Views
I have two triangles (T1, T2) and their vertices. I want to find the line along which the triangles intersect. For the vertices I use SIMD3. It would be great if someone could help me with my problem.
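A minimal sketch of one common approach, not necessarily the accepted answer: clip each triangle against the other triangle's plane, then overlap the two resulting segments along the line where the planes meet. The names `Vec3`, `dot3`, `cross3`, `clip`, and `intersectionSegment` are hypothetical, and the sketch assumes non-degenerate triangles. It uses only the standard-library `SIMD3` type. Coplanar triangles and single-point contact are not handled, and the exact `== 0` test would need a tolerance for real data.

```swift
typealias Vec3 = SIMD3<Double>

func dot3(_ a: Vec3, _ b: Vec3) -> Double { return (a * b).sum() }

func cross3(_ a: Vec3, _ b: Vec3) -> Vec3 {
    return Vec3(a.y * b.z - a.z * b.y,
                a.z * b.x - a.x * b.z,
                a.x * b.y - a.y * b.x)
}

/// Points where the triangle's boundary meets the plane through `p0` with normal `n`.
func clip(_ t: [Vec3], planePoint p0: Vec3, planeNormal n: Vec3) -> [Vec3] {
    var points: [Vec3] = []
    for i in 0..<3 {
        let a = t[i], b = t[(i + 1) % 3]
        let da = dot3(a - p0, n), db = dot3(b - p0, n)
        if da == 0 {
            points.append(a)                              // vertex lies on the plane
        } else if da * db < 0 {
            points.append(a + (da / (da - db)) * (b - a)) // edge crosses the plane
        }
    }
    return points
}

/// Segment shared by two triangles (each given as three vertices), or nil.
func intersectionSegment(_ t1: [Vec3], _ t2: [Vec3]) -> (Vec3, Vec3)? {
    let n1 = cross3(t1[1] - t1[0], t1[2] - t1[0])
    let n2 = cross3(t2[1] - t2[0], t2[2] - t2[0])
    let dir = cross3(n1, n2)                              // direction of the plane-plane line
    guard dot3(dir, dir) > 1e-12 else { return nil }      // parallel or coplanar: not handled
    let s1 = clip(t1, planePoint: t2[0], planeNormal: n2)
    let s2 = clip(t2, planePoint: t1[0], planeNormal: n1)
    guard s1.count == 2, s2.count == 2 else { return nil }
    // Order both segments along the line, then intersect the two intervals.
    let a = s1.sorted { dot3($0, dir) < dot3($1, dir) }
    let b = s2.sorted { dot3($0, dir) < dot3($1, dir) }
    let start = dot3(a[0], dir) > dot3(b[0], dir) ? a[0] : b[0]
    let end = dot3(a[1], dir) < dot3(b[1], dir) ? a[1] : b[1]
    guard dot3(start, dir) <= dot3(end, dir) else { return nil }
    return (start, end)
}
```

On Apple platforms the helper functions could be replaced by `dot` and `cross` from the simd module; they are spelled out here so the sketch has no dependencies.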
Post marked as solved
1 Reply
385 Views
I have a plane for which I know a point and the normal. Both are SIMD3. I want to calculate a 2D coordinate system (the directions of x and y in 3D space) that makes sense to the user. The planes can also be parallel to xy and so on (but that is easy). I would also be very thankful if there is a way to convert points that lie on the plane to the 2D format. I hope you can understand my English.
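A minimal sketch of one way to build such a coordinate system, assuming world z is "up": take the in-plane x-axis perpendicular to both up and the normal, and the y-axis as the normal crossed with x, so y points as far "up" as the plane allows. The names `PlaneFrame`, `dot3`, `cross3`, and `normalized` are hypothetical, and only the standard-library `SIMD3` and `SIMD2` types are used.

```swift
typealias Vec3 = SIMD3<Double>

func dot3(_ a: Vec3, _ b: Vec3) -> Double { return (a * b).sum() }

func cross3(_ a: Vec3, _ b: Vec3) -> Vec3 {
    return Vec3(a.y * b.z - a.z * b.y,
                a.z * b.x - a.x * b.z,
                a.x * b.y - a.y * b.x)
}

func normalized(_ v: Vec3) -> Vec3 { return v / dot3(v, v).squareRoot() }

/// Hypothetical type: a point on the plane plus two orthonormal in-plane axes.
struct PlaneFrame {
    let origin: Vec3
    let xAxis: Vec3
    let yAxis: Vec3

    init(point: Vec3, normal: Vec3) {
        let n = normalized(normal)
        let up = Vec3(0, 0, 1)                 // assumption: world z is "up"
        var x = cross3(up, n)
        if dot3(x, x) < 1e-12 {
            // Plane is parallel to the world xy-plane: fall back to the world x-axis.
            x = Vec3(1, 0, 0)
        }
        x = normalized(x)
        origin = point
        xAxis = x
        yAxis = cross3(n, x)                   // unit length, completes a right-handed frame
    }

    /// 2D coordinates of a 3D point (points off the plane are projected onto it).
    func to2D(_ p: Vec3) -> SIMD2<Double> {
        let d = p - origin
        return SIMD2<Double>(dot3(d, xAxis), dot3(d, yAxis))
    }

    /// 3D position of a point given in plane coordinates.
    func to3D(_ q: SIMD2<Double>) -> Vec3 {
        return origin + q.x * xAxis + q.y * yAxis
    }
}
```

With this choice, a plane parallel to xy gets the world x and y directions as its axes, and a vertical plane gets a y-axis pointing straight up. A different "up" direction can be swapped in if the app's world is y-up.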
Post not yet marked as solved
0 Replies
586 Views
I am doing SSE performance work on an Intel Mac. I found that the SSE4.1 version of my code performs worse when built with Xcode 12.4 than with Xcode 10.1, so I checked the generated assembly. A single _mm_mul_epi() is translated into three pmuludq instructions, which are SSE2 instructions. When compiling with Xcode 10.1 the output was as expected: _mm_mul_epi() was translated into pmuldq. Does anyone know how to fix this issue?
Post not yet marked as solved
2 Replies
719 Views
Hello, I am porting an app to arm64 Apple platforms using this description of the ABI differences from standard arm64: https://developer.apple.com/documentation/xcode/writing_arm64_code_for_apple_platforms

However, I found out that HFA arguments are aligned to 4 bytes on the stack, whereas the standard arm64 convention requires 8 bytes (developer.arm.com/documentation/ihi0055/latest): "If the argument is an HFA or an HVA then the NSRN is set to 8 and the size of the argument is rounded up to the nearest multiple of 8 bytes."

    struct Vector3 {
        float x;
        float y;
        float z;
    };

    float __stdcall testVector3(
        Vector3 v1,
        float f1,
        float f2,
        float f3,
        float f4,
        float f5,
        float f6,
        float f7,
        Vector3 v2,
        float f8,
        float f9,
        float f10,
        float f11,
        float f12,
        float f13)

So for such a method I was expecting f6 and later arguments to be on the stack, with v2 taking 16 bytes (according to the arm64 ABI). However, I see that it takes 12 bytes and there is no padding between v2 and f8.

    thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 3.1
        frame #0: 0x00000001000038f8 a.out`nativeCall_PInvoke_Vector3Arg_Unix(Vector3, float, float, float, float, float, float, float, Vector3, float, float, float, float, float, float)
    a.out`nativeCall_PInvoke_Vector3Arg_Unix:
    ->  0x1000038f8 <+0>:  sub    sp, sp, #0x80          ; =0x80
        0x1000038fc <+4>:  stp    x29, x30, [sp, #0x70]
        0x100003900 <+8>:  add    x29, sp, #0x70         ; =0x70
        0x100003904 <+12>: ldr    w8, [x29, #0x10]
    (lldb) memory read -s4 -f float -c20 $sp
    0x16fdff9b0: 6 // float f6
    0x16fdff9b4: 7 // float f7
    0x16fdff9b8: 4 // Vector.x
    0x16fdff9bc: 5 // Vector.y
    0x16fdff9c0: 6 // Vector.z
    0x16fdff9c4: 8 // float f8, where is padding?
    0x16fdff9c8: 9 // float f9

Is this expected behavior? Is it documented somewhere?