Recurrent layers on tensorflow-met… | Apple Developer Forums

Recurrent layers on tensorflow-metal on M1

I'm getting the following error, which kills the kernel, when trying to train a simple LSTM layer with Keras on the GPU of an M1. It happens with the latest release and with the previous one too.

2021-09-06 10:24:43.223160: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)

2021-09-06 10:24:43.223669: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz

Epoch 1/10

2021-09-06 10:24:43.672828: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.

2021-09-06 10:24:43.826331: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.

The code:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Embedding, Dense
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing import sequence

max_features = 10000
maxlen = 500
batch_size = 32

print('Loading data...')
(input_train, y_train), (input_test, y_test) = imdb.load_data(num_words=max_features)

print('Pad sequences (samples x time)')
input_train = sequence.pad_sequences(input_train, maxlen=maxlen)
input_test = sequence.pad_sequences(input_test, maxlen=maxlen)

model = Sequential()
model.add(Embedding(max_features, 32))
model.add(LSTM(32))
model.add(Dense(1, activation='sigmoid'))
model.summary()

model.compile(optimizer='rmsprop', loss='binary_crossentropy',
              metrics=['acc'])

history = model.fit(input_train, y_train,
                    epochs=10,
                    batch_size=128,
                    validation_split=0.2)

I tried with Python 3.8 and 3.9 (in separate environments) and the latest release:

numpy                     1.19.5           py39h1f3b974_2    conda-forge
python                    3.9.7           h54d631c_0_cpython    conda-forge
tensorflow-macos          2.5.0                    pypi_0    pypi  
tensorflow-metal          0.1.2                    pypi_0    pypi
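
For anyone comparing setups, here is a small sketch that prints the same package versions from inside Python instead of from the conda listing. The helper name installed_versions is made up for this example; it only relies on the standard library's importlib.metadata, and it reports None for anything that is not installed in the active environment.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is absent from the active environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names taken from the listing above.
    wanted = ["numpy", "tensorflow-macos", "tensorflow-metal"]
    for name, version in installed_versions(wanted).items():
        print(f"{name:20s} {version or 'not installed'}")
```

Running it in each environment (3.8 and 3.9) makes it easy to paste matching version info into a reply.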

Running on the CPU, by adding with tf.device('/CPU:0'): just before the fit call, works well.
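
A minimal sketch of that CPU workaround, for reference. It uses the same Embedding, LSTM and Dense stack as the code above, but the function name train_on_cpu and the small random arrays are assumptions made for this example so that it runs quickly without downloading IMDB. Building the model inside the device scope as well is an extra precaution, not something the original code did.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import LSTM, Dense, Embedding
from tensorflow.keras.models import Sequential


def train_on_cpu(x, y, max_features=10000, **fit_kwargs):
    """Build and fit the Embedding -> LSTM -> Dense model pinned to the CPU."""
    # Everything created or run inside this scope is placed on the CPU,
    # so the Metal GPU plugin is not used for the recurrent layer.
    with tf.device('/CPU:0'):
        model = Sequential()
        model.add(Embedding(max_features, 32))
        model.add(LSTM(32))
        model.add(Dense(1, activation='sigmoid'))
        model.compile(optimizer='rmsprop', loss='binary_crossentropy',
                      metrics=['acc'])
        return model.fit(x, y, **fit_kwargs)


if __name__ == "__main__":
    # Synthetic stand-in data (assumption): 64 sequences of 20 token ids.
    x = np.random.randint(1, 1000, size=(64, 20))
    y = np.random.randint(0, 2, size=(64,)).astype("float32")
    history = train_on_cpu(x, y, max_features=1000, epochs=1, batch_size=32)
    print(history.history["loss"])
```

With the real data, the call would be train_on_cpu(input_train, y_train, epochs=10, batch_size=128, validation_split=0.2).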

I also tried a SimpleRNN layer and got the same problem.

Does anyone else have the same problem?