Keras on Mac (M4) is giving inconsistent results compared to running on NVIDIA GPUs

I have seen inconsistent results for my Colab machine learning notebooks when running them locally on a Mac M4, compared to running the same notebook code on either a T4 (in Colab) or an RTX 3090 locally.

To illustrate the problem, I have set up a notebook that implements two simple CNN models that solve the Fashion-MNIST problem: https://colab.research.google.com/drive/11BhtHhN079-BWqv9QvvcSD9U4mlVSocB?usp=sharing For the good model with 2M parameters I get the following results:

    1. T4 (Colab, JAX): Test accuracy: 0.925
    2. RTX 3090 (Local PC via ssh tunnel, JAX): Test accuracy: 0.925
    3. Mac M4 (Local, JAX): Test accuracy: 0.893
    4. Mac M4 (Local, TensorFlow): Test accuracy: 0.893

That is, I see a significant drop in test accuracy (0.925 to 0.893) when I run on the Mac M4 compared to the NVIDIA machines, and it seems to be independent of the backend. However, I do not know how to pinpoint whether the cause is Keras or Apple’s Metal implementation. I have reported this to Keras: https://colab.research.google.com/drive/11BhtHhN079-BWqv9QvvcSD9U4mlVSocB?usp=sharing but as this could be (and likely is?) an Apple Metal issue, I wanted to report it here as well.
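
One possible way to narrow this down is to compare raw model outputs rather than final accuracy. The sketch below assumes the same trained weights are loaded on both machines and that the predictions for the same test inputs have been saved as NumPy arrays. The helper name `compare_predictions` and the file names in the comments are hypothetical, not part of the notebook.

```python
import numpy as np


def compare_predictions(ref, other):
    """Compare two (n_samples, n_classes) prediction arrays from different devices."""
    ref = np.asarray(ref, dtype=np.float64)
    other = np.asarray(other, dtype=np.float64)
    if ref.shape != other.shape:
        raise ValueError(f"shape mismatch: {ref.shape} vs {other.shape}")
    # Largest element-wise deviation between the two sets of outputs.
    max_abs_diff = float(np.max(np.abs(ref - other)))
    # Fraction of samples whose predicted class differs between devices.
    label_disagreement = float(np.mean(ref.argmax(axis=1) != other.argmax(axis=1)))
    return {"max_abs_diff": max_abs_diff, "label_disagreement": label_disagreement}


# Hypothetical usage, assuming each machine saved its predictions with np.save:
# ref = np.load("preds_rtx3090.npy")
# other = np.load("preds_m4.npy")
# print(compare_predictions(ref, other))
```

If the outputs differ substantially with identical weights, that would suggest the discrepancy is in the forward-pass kernels. If they agree closely, the difference is more likely to arise during training.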

On the Mac I am running the following Python libraries:

keras 3.9.1
tensorflow 2.19.0
tensorflow-metal 1.2.0
jax 0.5.3
jax-metal 0.1.1
jaxlib 0.5.3
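
For anyone trying to reproduce this, here is a minimal sketch for printing the installed versions of the same packages on another machine. It uses only the standard library. The package list is copied from above, and the function name `installed_versions` is just an illustration.

```python
from importlib import metadata

# Package names as listed in this post.
PACKAGES = ["keras", "tensorflow", "tensorflow-metal", "jax", "jax-metal", "jaxlib"]


def installed_versions(packages=PACKAGES):
    """Return a dict mapping each package name to its version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name} {version or 'not installed'}")
```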