I work for a third-party library vendor, and we are pretty sure our customers will ask us for a bitcode version of our library.
We are very performance sensitive though, and we've optimized critical sections of our library in NEON assembly. When we compile to LLVM bitcode, we are assuming that NEON assembly will be encapsulated. Is that correct?
We're also worried about writing code in assembly, though. Bitcode seems to leave open the possibility that our application could end up running on a platform whose capabilities we did not optimize for in advance, so it seems like a good idea to make sure our software fallback paths work well. Would it also be a good idea to move away from hand-written NEON assembly and use the LLVM SIMD intrinsics instead? I believe Accelerate was also tried and found to be too high-level for our needs. We're also multiplatform.
One small correction: The option name to include bitcode is "-fembed-bitcode".
As for whether NEON intrinsics are preferable to writing assembly code, I would say "yes", at least in most cases. If you use intrinsics, the compiler can optimize the code to run well on different processors, and C code is generally easier to maintain than assembly. But if you need to hand-optimize the assembly to get the performance you want, you can certainly do that, and it should "just work" for iOS even when you have bitcode for the other parts of your code.
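To make the intrinsics suggestion concrete, here is a minimal sketch of what such a routine might look like. The function name `add_f32` and the element-wise add it performs are purely illustrative assumptions, not anything from your library; the intrinsics themselves (`vld1q_f32`, `vaddq_f32`, `vst1q_f32`) come from the standard `arm_neon.h` header. The same source also carries a plain C path, which covers both the leftover elements and any target where NEON is not available.

```c
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>   /* NEON intrinsics, only present on ARM targets */
#endif

/* Hypothetical kernel for illustration: dst[i] = a[i] + b[i] for n floats. */
static void add_f32(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;

#if defined(__ARM_NEON)
    /* Vector path: process four floats per iteration. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(dst + i, vaddq_f32(va, vb));
    }
#endif

    /* Scalar path: handles the tail, or the whole array without NEON. */
    for (; i < n; ++i) {
        dst[i] = a[i] + b[i];
    }
}
```

Because the scalar loop is always compiled, the software path gets exercised on every platform you build for, which may help with the concern about running on hardware you did not tune for in advance.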