SPM Performance

I've been toying with the new Swift Package Manager and can't get the performance I would expect, so I'm trying to work out whether I've made an error in my understanding or my design.


Initially, I was working on a rudimentary flood fill algorithm and found that I could break it apart into multiple Swift modules and thought this would be a grand time to try out the new SPM. :-)


I created a `Stack`, `FloodFill`, and `FloodFillApp` module. The FloodFill module depends on the Stack module, and the FloodFillApp application then calls upon the FloodFill module to perform the algorithm. Very linear, quite simple. Prior to breaking it into separate modules the algorithm was taking around 0.04s to complete. Upon breaking it apart into separate modules and then compiling the FloodFillApp using `swift build -c release` I found that suddenly the algorithm was completing in around 16s! If left in debug mode, it takes around 120s to finish.
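For reference, the module layout described above could be expressed as a single package manifest along these lines. This is only a sketch: the original project may have used separate packages rather than targets, and the tools version, package name, and target layout are assumptions.

```swift
// swift-tools-version:5.5
// Package.swift (hypothetical layout; the real manifest is not shown in this post)
import PackageDescription

let package = Package(
    name: "FloodFillApp",
    targets: [
        // Generic stack with no dependencies.
        .target(name: "Stack"),
        // The flood fill algorithm, which uses the stack.
        .target(name: "FloodFill", dependencies: ["Stack"]),
        // The executable that calls into FloodFill.
        .executableTarget(name: "FloodFillApp", dependencies: ["FloodFill"]),
    ]
)
```

With a layout like this, `swift build -c release` compiles each target as its own module, so calls from `FloodFill` into `Stack` cross a module boundary.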


The only thing I can come up with is that `-whole-module-optimization` doesn’t cross the boundaries between these interdependent modules and therefore can’t apply the same optimizations, leading to worse performance. I just didn’t expect such a remarkable contrast. Of course, this is just speculation. The only solid evidence I have supporting it is that when I copy the files from the other two modules into the FloodFillApp (and remove the dependencies, imports, etc.), the performance returns to 0.04s.


Is there a way I can improve this performance and still maintain separate modules? Other thoughts?

I don’t believe what I’m about to post is a solution, but I came across it and wanted to share. It is very likely a contributing cause of my pain, and one that (hopefully!) will be addressed soon.


I’ve found a way to dramatically improve the performance while maintaining separate modules, albeit at a significant cost. The `Stack` module uses generics and is the only module that does so. I hadn’t realized that this would play a role in finding a partial solution, but generic code called across a module boundary appears to be a contributing factor, if not the root cause.
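To make the setup concrete, here is a minimal sketch of what a generic stack and an iterative flood fill built on it might look like. The actual source is not shown in this post, so every name below (`Stack`, `floodFill`, the grid representation) is an assumption, and the two pieces are shown in one file only so the example is self-contained.

```swift
// Hypothetical stand-in for the Stack module.
public struct Stack<T> {
    private var items: [T] = []

    public init() {}

    public mutating func push(_ value: T) {
        items.append(value)
    }

    public mutating func pop() -> T? {
        return items.popLast()
    }
}

// Hypothetical stand-in for the FloodFill module: an iterative 4-way fill.
// Every push and pop in the loop is a call into generic code.
public func floodFill(_ grid: inout [[Int]], row: Int, col: Int, with newValue: Int) {
    guard grid.indices.contains(row), grid[row].indices.contains(col) else { return }
    let target = grid[row][col]
    guard target != newValue else { return }

    var pending = Stack<(Int, Int)>()
    pending.push((row, col))
    while let (r, c) = pending.pop() {
        // Skip cells that are out of bounds or not part of the region.
        guard grid.indices.contains(r),
              grid[r].indices.contains(c),
              grid[r][c] == target else { continue }
        grid[r][c] = newValue
        pending.push((r + 1, c))
        pending.push((r - 1, c))
        pending.push((r, c + 1))
        pending.push((r, c - 1))
    }
}
```

When `Stack` lives in its own module, the `push` and `pop` calls in the inner loop are the ones that cross the module boundary.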


I stumbled across an attribute called `@_specialize` which, according to the Swift docs/Generics.rst file, aids in instances such as these. From the Generics.rst file:


“@_specialize currently acts as a hint to the optimizer, which generates type checks and code to dispatch to the specialized routine without affecting the signature of the generic function. The intention is to support efforts at evaluating the performance of specialized code. The performance impact is not guaranteed and is likely to change with the optimizer. This attribute should only be used in conjunction with rigorous performance analysis. Eventually, a similar attribute could be defined in the language, allowing it to be exposed as part of a function's API. That would allow direct dispatch to specialized code without type checks, even across modules.”


This is an internal attribute that takes an explicit concrete type. It does increase the performance significantly, but only for the specified concrete type. For instance, if I have a method `push(_ value: T)`, I can annotate it with `@_specialize(Int)`, and if my Stack's element type is `Int`, the performance gains take effect. If any other type is stored in the Stack, or the concrete type the Stack is used with changes, the same terrible performance characteristics return. The algorithm now executes in 0.09s, over double the time of the fully optimized single-module version, but this does allow for separate modules. This is the performance I had initially expected from compiling the release build as separate modules.
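As an illustration, the attribute might be applied as in the sketch below. The `Stack` shown is a hypothetical stand-in, not my actual source. Note also that the spelling here uses a `where` clause, which is the form newer compilers accept; the `@_specialize(Int)` form quoted above is the older spelling. Since this is an underscored, unofficial attribute, its syntax and behaviour may change again.

```swift
// Hypothetical generic stack with specialization hints for Int.
public struct Stack<T> {
    private var items: [T] = []

    public init() {}

    // Hint: also emit a version of push specialized for T == Int.
    @_specialize(where T == Int)
    public mutating func push(_ value: T) {
        items.append(value)
    }

    // Hint: also emit a version of pop specialized for T == Int.
    @_specialize(where T == Int)
    public mutating func pop() -> T? {
        return items.popLast()
    }
}
```

A `Stack<Int>` can then be dispatched to the specialized routines, while a `Stack<String>` or a stack of any other type still works but goes through the unspecialized generic code.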


Again, I don’t believe this to be an adequate solution in this instance, as I want to repurpose modules like the Stack elsewhere without such constraints. I’d still love to hear any thoughts or suggestions as to how best to proceed in cases like this; or if there is something obvious that I’m missing. :-)
