Issues and errors when running TensorFlow on GPU, but not CPU

Question

windlove OP

Created Sep ’22

I was following the text classification tutorial below, using tensorflow-macos 2.9.0 on an M1 MacBook.

https://www.tensorflow.org/text/tutorials/text_classification_rnn

However, I ran into three issues:

With the GPU enabled, model fitting was extremely slow; disabling the GPU made it faster.
A warning appeared when fitting the model with the GPU enabled. The model kept running after printing the following messages, but very slowly.

W tensorflow/core/common_runtime/forward_type_inference.cc:332] Type inference failed. This indicates an invalid graph that escaped type checking. Error message: INVALID_ARGUMENT: expected compatible input types, but input 1:
type_id: TFT_OPTIONAL
args {
  type_id: TFT_PRODUCT
  args {
    type_id: TFT_TENSOR
    args {
      type_id: TFT_INT32
    }
  }
}
 is neither a subtype nor a supertype of the combined inputs preceding it:
type_id: TFT_OPTIONAL
args {
  type_id: TFT_PRODUCT
  args {
    type_id: TFT_TENSOR
    args {
      type_id: TFT_FLOAT
    }
  }
}

Two results that are supposed to be identical matched only when the GPU was disabled. With the GPU enabled, they differed.

When GPU is enabled

To confirm that this works as expected, evaluate a sentence twice. First, alone so there's no padding to mask:

1/1 [==============================] - 1s 1s/step
[0.00808082]

Now, evaluate it again in a batch with a longer sentence. The result should be identical:

1/1 [==============================] - 16s 16s/step
[-0.01561341]

When GPU is disabled

(First run as above)
1/1 [==============================] - 1s 1s/step
[-0.0032991]

(Second run)
1/1 [==============================] - 0s 71ms/step
[-0.0032991]
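
For anyone reproducing this, the "should be identical" check can be made explicit instead of comparing the printed numbers by eye. Below is a minimal sketch using the four values from the output above; the helper name `predictions_match` and the tolerances are assumptions chosen for illustration, not part of the tutorial.

```python
import math

def predictions_match(alone, batched, rel_tol=1e-5, abs_tol=1e-6):
    """Return True if two scalar predictions agree within a small tolerance."""
    return math.isclose(alone, batched, rel_tol=rel_tol, abs_tol=abs_tol)

# Values copied from the output above (illustrative tolerances, see lead-in).
gpu_alone, gpu_batched = 0.00808082, -0.01561341
cpu_alone, cpu_batched = -0.0032991, -0.0032991

print("GPU consistent:", predictions_match(gpu_alone, gpu_batched))
print("CPU consistent:", predictions_match(cpu_alone, cpu_batched))
```

In a real run, the two arguments would be the model's prediction for the sentence evaluated alone and for the same sentence evaluated inside a padded batch.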

Answer 1

vasileiosgk OP

Oct ’22

I'm facing exactly the same problem, as I mentioned in my post: https://developer.apple.com/forums/thread/709142

Still waiting for this issue to be addressed.

Answer 2

gianrond OP

Nov ’22

Hello,

I have the exact same issue when training any RNN model (I tried both LSTMs and GRUs) on my MBP16 with M1 Max 32C.

I get the same exception/warning, and the performance with the GPU is horrible. Disabling the GPU results in no warning and better performance. I don't have the same issue with CNNs.

All of this is with the latest tensorflow-macos (2.10) and tensorflow-metal 0.6.
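
One way to run the CPU-only comparison is to hide the GPU from TensorFlow at startup. The sketch below assumes it runs before TensorFlow has initialized its devices (that is, before any model or tensor is created); `hide_gpus` is a hypothetical helper name, and the import guard is only there so the snippet also runs on a machine without TensorFlow.

```python
import importlib.util

def hide_gpus():
    """Hide all GPUs from TensorFlow and return the visible device types.

    Returns None when TensorFlow is not installed. Call this before
    building any model, otherwise TensorFlow may raise a RuntimeError.
    """
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf

    # An empty list for the "GPU" device type makes every GPU invisible.
    tf.config.set_visible_devices([], "GPU")
    return [d.device_type for d in tf.config.get_visible_devices()]

visible = hide_gpus()
print("Visible device types:", visible)
```

Commenting out the `hide_gpus()` call restores the default device selection, which makes it easy to time both configurations from the same script.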

Answer 3

shrinix OP

Aug ’24

Hi Apple support,

Checking to see if there is any resolution to this problem.

I have the same problem (see below for details) when training an LSTM-based seq2seq model on an M3 Pro with tensorflow 2.15.1 and tensorflow-metal 1.1.0.

The error message goes away if I upgrade to tensorflow 2.16 or 2.17, but the training accuracy drops drastically, so I had to switch back to 2.15.

With tensorflow 2.15, training accuracy and inference are much better on the CPU (i.e. after uninstalling tensorflow-metal) than on the GPU, but training takes 3x longer on the CPU.

There is no benefit to using the M3 Pro GPU if it is faster but produces poor results. Could the warning below be responsible for the poor results?

2024-08-06 15:27:22.619169: W tensorflow/core/common_runtime/type_inference.cc:339] Type inference failed. This indicates an invalid graph that escaped type checking. Error message: INVALID_ARGUMENT: expected compatible input types, but input 1: type_id: TFT_OPTIONAL args { type_id: TFT_PRODUCT args { type_id: TFT_TENSOR args { type_id: TFT_INT32 } } } is neither a subtype nor a supertype of the combined inputs preceding it: type_id: TFT_OPTIONAL args { type_id: TFT_PRODUCT args { type_id: TFT_TENSOR args { type_id: TFT_FLOAT } } } for Tuple type infernce function 0 while inferring type of node 'cond_36/output/_23'
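
For readers trying to decode the warning: the two type trees it prints differ only in their innermost element type. The sketch below pulls those innermost types out of the message text. The helper name `innermost_types` is made up for illustration, and the snippet only shows what the message reports, not why the graph ends up that way.

```python
import re

# Message body copied from the warning above (timestamp and node name omitted).
WARNING = (
    "expected compatible input types, but input 1: type_id: TFT_OPTIONAL "
    "args { type_id: TFT_PRODUCT args { type_id: TFT_TENSOR args { "
    "type_id: TFT_INT32 } } } is neither a subtype nor a supertype of the "
    "combined inputs preceding it: type_id: TFT_OPTIONAL args { type_id: "
    "TFT_PRODUCT args { type_id: TFT_TENSOR args { type_id: TFT_FLOAT } } }"
)

def innermost_types(message):
    """Return each type_id that is immediately followed by a closing brace."""
    return re.findall(r"type_id:\s*(TFT_\w+)\s*\}", message)

print(innermost_types(WARNING))  # ['TFT_INT32', 'TFT_FLOAT']
```

The same pattern also works on the multi-line form of the message, since `\s` matches line breaks.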
