Upgrading tensorflow-macos and tensorflow-metal breaks Conv2D with the groups argument

Today I upgraded tensorflow-macos to 2.9.0 and tensorflow-metal to 0.5.0, and found that an old notebook of mine no longer runs.

It ran well with tensorflow-macos 2.8.0 and tensorflow-metal 0.4.0.
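For anyone comparing environments, the installed versions of the two packages can be printed with a short script. This is only a minimal sketch using the standard library's importlib.metadata; the helper names installed_version and report_versions are made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def report_versions(packages=("tensorflow-macos", "tensorflow-metal")):
    """Map each package name to its installed version (or None)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, ver in report_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Running it before and after the upgrade makes it easy to confirm which pair of versions a given result belongs to.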

Specifically, I found that the groups argument of the Conv2D layer is the cause.

Here is a demo:

import tensorflow as tf
from tensorflow import keras as tfk

# tf.config.set_visible_devices([], 'GPU')

Xs = tf.random.normal((32, 64, 48, 4))
ys = tf.random.normal((32,))

tf.random.set_seed(0)

model = tfk.Sequential([
    tfk.layers.Conv2D(
        filters=16,
        kernel_size=(4, 3),
        groups=4,  # groups arg
        activation='relu',
    ),
    tfk.layers.Flatten(),
    tfk.layers.Dense(1, activation='sigmoid'),
])

model.compile(
    loss=tfk.losses.BinaryCrossentropy(),
    metrics=[
        tfk.metrics.BinaryAccuracy(),
    ],
)

model.fit(Xs, ys, epochs=2, verbose=1)

The error is:

W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at xla_ops.cc:296 : UNIMPLEMENTED: Could not find compiler for platform METAL: NOT_FOUND: could not find registered compiler for platform METAL -- check target linkage

Removing the groups argument makes the code run again.

Training on the CPU, by uncommenting the tf.config.set_visible_devices([], 'GPU') call (line 4 of the demo), gives a different error:

'apple-m1' is not a recognized processor for this target (ignoring processor)

LLVM ERROR: 64-bit code requested on a subtarget that doesn't support it!

Removing the groups argument also makes training on the CPU work. However, I did not test training on the CPU before the upgrade.
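Since the failure only appears when groups is set, one possible workaround is to emulate the grouped convolution with ordinary Conv2D layers: split the input channels into groups, convolve each slice separately, and concatenate the results. The sketch below assumes that only the grouped code path is affected; it has not been verified against tensorflow-macos 2.9.0 and tensorflow-metal 0.5.0, and GroupedConv2D is a hypothetical helper class, not part of Keras.

```python
import tensorflow as tf
from tensorflow import keras as tfk


class GroupedConv2D(tfk.layers.Layer):
    """Hypothetical stand-in for Conv2D(groups=...) built from plain Conv2D layers."""

    def __init__(self, filters, kernel_size, groups, activation=None, **kwargs):
        super().__init__(**kwargs)
        if filters % groups != 0:
            raise ValueError("filters must be divisible by groups")
        self.groups = groups
        # One ordinary (groups=1) convolution per channel group.
        self.convs = [
            tfk.layers.Conv2D(filters // groups, kernel_size, activation=activation)
            for _ in range(groups)
        ]

    def call(self, inputs):
        # Split the channel axis into equal slices, one per group.
        slices = tf.split(inputs, self.groups, axis=-1)
        # Convolve each slice independently, then rejoin along the channel axis.
        outputs = [conv(s) for conv, s in zip(self.convs, slices)]
        return tf.concat(outputs, axis=-1)


layer = GroupedConv2D(filters=16, kernel_size=(4, 3), groups=4, activation='relu')
out = layer(tf.random.normal((2, 64, 48, 4)))
print(out.shape)
```

In the demo, this layer would take the place of the Conv2D layer inside the Sequential model. It computes the same kind of operation as a grouped convolution, but the weights are stored as separate kernels, so checkpoints saved from a model that uses groups=4 would not load into it directly.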

My device is a 14-inch MacBook Pro running macOS 12.4.


Replies

Hi @wangcheng

Thanks for reporting the problem. I am seeing the same behavior locally and will investigate what has caused this issue.