MPSNNReduceFeatureChannelsArgumentMax Catalyst issue

I'm using CoreML for image segmentation. I have a VNCoreMLRequest to run a model that returns an MLMultiArray. To accelerate processing of the model output, I use MPSNNReduceFeatureChannelsArgumentMax to reduce the multiarray output to a 2D array, which I then convert to a grayscale image.
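
For context, here's roughly how the data gets from the model output into the MPSImages used below. This is a minimal sketch, not my exact code: it assumes the model's MLMultiArray output (softmax in my code) is float32 and laid out [channels, height, width], with probs wrapping the probabilities and classes as the single-channel destination for the argmax.

Code Block
import MetalPerformanceShaders

// shapes from the model's [channels, height, width] output
let channels = softmax.shape[0].intValue
let height = softmax.shape[1].intValue
let width = softmax.shape[2].intValue

// wrap the class probabilities in an MPSImage as the reduction's source
let probsDescriptor = MPSImageDescriptor(channelFormat: .float32,
                                         width: width,
                                         height: height,
                                         featureChannels: channels)
let probs = MPSImage(device: self.device, imageDescriptor: probsDescriptor)
probs.writeBytes(softmax.dataPointer,
                 dataLayout: .featureChannelsxHeightxWidth,
                 imageIndex: 0)

// a single feature channel to receive the per-pixel argmax indices
let classesDescriptor = MPSImageDescriptor(channelFormat: .float32,
                                           width: width,
                                           height: height,
                                           featureChannels: 1)
let classes = MPSImage(device: self.device, imageDescriptor: classesDescriptor)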

This works great on iOS, but when running on the Mac as a Catalyst build, the output 2D array is all zeros.

I'm running Xcode 12.2 beta 2 (12B5025f) on an iMac Pro. I'm not seeing any runtime errors; MPSNNReduceFeatureChannelsArgumentMax simply appears not to work under Mac Catalyst.
I'm able to reduce the channels directly on the CPU by looping over all the array dimensions (sketched below), but it's very slow. That confirms the model output itself is correct; only the Metal reduction fails.
Is anyone else using CoreML with Catalyst?
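
For reference, here's roughly what the CPU fallback looks like. It's a minimal sketch under the same assumptions as above (softmax is the float32 [channels, height, width] model output); the real code then maps the class codes to RGB afterwards.

Code Block
// brute-force argmax over the channel axis on the CPU
let channels = softmax.shape[0].intValue
let height = softmax.shape[1].intValue
let width = softmax.shape[2].intValue
let src = softmax.dataPointer.assumingMemoryBound(to: Float32.self)

let argmax = try! MLMultiArray(shape: [1, softmax.shape[1], softmax.shape[2]],
                               dataType: .float32)
let out = argmax.dataPointer.assumingMemoryBound(to: Float32.self)

for y in 0..<height {
    for x in 0..<width {
        var bestClass = 0
        var bestScore = -Float.greatestFiniteMagnitude
        for c in 0..<channels {
            // [channels, height, width] layout: index = (c * height + y) * width + x
            let score = src[(c * height + y) * width + x]
            if score > bestScore {
                bestScore = score
                bestClass = c
            }
        }
        out[y * width + x] = Float32(bestClass)
    }
}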

Here's the bit of code that doesn't work:

Code Block
// create a buffer and pass the inputs through the filter to the outputs
let buffer = self.queue.makeCommandBuffer()
let filter = MPSNNReduceFeatureChannelsArgumentMax(device: self.device)
filter.encode(commandBuffer: buffer!, sourceImage: probs, destinationImage: classes)

// add a callback to handle the buffer's completion, then commit the buffer
buffer?.addCompletedHandler({ (_buffer) in
    // copy the reduced single-channel image back into an MLMultiArray
    let argmax = try! MLMultiArray(shape: [1, softmax.shape[1], softmax.shape[2]],
                                   dataType: .float32)
    classes.readBytes(argmax.dataPointer,
                      dataLayout: .featureChannelsxHeightxWidth,
                      imageIndex: 0)

    // unmap the discrete segmentation to RGB pixels
    guard let mask = codesToMask(argmax) else {
        return
    }

    // display image in view
    DispatchQueue.main.async {
        self.imageView.image = mask
    }
})
buffer?.commit()