GPU clock speed stays at about 450 MHz when pegged at 100% using tensorflow-metal on an M1 Pro

I am running a test model on my MacBook Pro (M1 Pro), and the GPU clock speed never goes above ~450 MHz even though the GPU cores are at 100%. With other apps that peg the GPU, I can see the clock speed reach about 1.3 GHz.

Is this an issue with tensorflow-metal, or am I doing something wrong?

FR

Answers

Could you please share the training script you are using?

Please note that as I increase the batch size, the clock frequency goes up and batches finish faster (obviously).

import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
from tensorflow.keras import layers
from tensorflow.keras import models

# Small convolutional network for 28x28 grayscale MNIST digits
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))

# Load MNIST, add a channel dimension, and scale pixel values to [0, 1]
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255
test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255

# One-hot encode the labels to match categorical_crossentropy
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train with a small batch size, then evaluate on the held-out test set
model.fit(train_images, train_labels, epochs=5, batch_size=32)
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(test_acc)
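
To put the batch-size observation in numbers, here is a small sketch of how many training steps per epoch the script above runs for a few batch sizes. The helper name steps_per_epoch and the constant NUM_TRAIN_SAMPLES are made up for illustration, and the batch sizes other than 32 are arbitrary examples. Whether the step count is what keeps the clock low is only a guess on my part, not something confirmed for tensorflow-metal.

```python
import math

# MNIST training set size, matching the reshape((60000, 28, 28, 1)) in the script above
NUM_TRAIN_SAMPLES = 60000


def steps_per_epoch(batch_size, num_samples=NUM_TRAIN_SAMPLES):
    """Number of training steps in one epoch (the last batch may be partial)."""
    return math.ceil(num_samples / batch_size)


# Hypothetical batch sizes to compare; only 32 comes from the original script
for batch_size in (32, 128, 512, 2048):
    print(f"batch_size={batch_size:5d} -> {steps_per_epoch(batch_size):5d} steps per epoch")
```

With batch_size=32 there are 1875 steps per epoch, each handing the GPU only 32 small images, so it may be worth re-checking the clock speed at a larger batch size before concluding anything about the plugin.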