Xcode SSE problem

I am doing SSE performance work on an Intel Mac. The SSE4.1 version of my code performs worse when built with Xcode 12.4 than with Xcode 10.1, so I checked the generated assembly. With Xcode 12.4, a single _mm_mul_epi() is translated into three pmuludq instructions, and pmuludq is an SSE2 instruction. With Xcode 10.1 the output was as expected: _mm_mul_epi() was translated into pmuldq. Does anyone know how to fix this issue?

  • The compiler that generates the expected code is Apple LLVM version 10.0.0 (clang-1000.11.45.5), while the one that generates the bad code is Apple clang version 12.0.0 (clang-1200.0.32.29). The problem also seems to occur with Apple LLVM version 10.0.1 (clang-1001.0.46.4).
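
A minimal reproduction might look like the sketch below. It assumes the intrinsic in question is _mm_mul_epi32 (the SSE4.1 intrinsic that corresponds to pmuldq); the helper name mul_even_lanes and its signature are hypothetical and only serve to isolate the multiply so the generated assembly is easy to read.

```c
#include <stdint.h>
#include <smmintrin.h>  /* SSE4.1 intrinsics, including _mm_mul_epi32 */

/* Hypothetical helper, not from my real code: multiplies the signed
 * 32-bit values in lanes 0 and 2 of a and b and writes the two signed
 * 64-bit products to out. The target attribute enables SSE4.1 for this
 * function only; passing -msse4.1 on the command line is an alternative. */
__attribute__((target("sse4.1")))
void mul_even_lanes(const int32_t a[4], const int32_t b[4], int64_t out[2])
{
    /* Unaligned loads and stores, so callers need no special alignment. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    /* Signed 32x32 -> 64-bit multiply of the even lanes (0 and 2). */
    __m128i prod = _mm_mul_epi32(va, vb);

    _mm_storeu_si128((__m128i *)out, prod);
}
```

Compiling this file with each Xcode's clang using -O2 -S and searching the output for pmuldq and pmuludq should show whether the difference reproduces outside the larger project.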
