Hi, I am trying to run my TensorFlow model on the GPU of my M1 Max MacBook Pro, and I am getting the following exception:
InvalidArgumentError: Cannot assign a device for operation transformer/encoder/embedding/embedding_lookup: Could not satisfy explicit device specification '' because the node {{colocation_node transformer/encoder/embedding/embedding_lookup}} was colocated with a group of nodes that required incompatible device '/job:localhost/replica:0/task:0/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:GPU:0].
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_name_index_=2 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
AssignSubVariableOp: GPU CPU
RealDiv: GPU CPU
Sqrt: GPU CPU
AssignVariableOp: GPU CPU
UnsortedSegmentSum: GPU CPU
Identity: GPU CPU
StridedSlice: CPU
Const: GPU CPU
NoOp: GPU CPU
Mul: GPU CPU
Shape: GPU CPU
_Arg: GPU CPU
ResourceScatterAdd: GPU CPU
Unique: GPU CPU
ReadVariableOp: GPU CPU
AddV2: GPU CPU
ResourceGather: GPU CPU
Colocation members, user-requested devices, and framework assigned devices, if any:
transformer_encoder_embedding_embedding_lookup_24108 (_Arg) framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
adam_adam_update_readvariableop_resource (_Arg) framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
adam_adam_update_readvariableop_2_resource (_Arg) framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
transformer/encoder/embedding/embedding_lookup (ResourceGather)
transformer/encoder/embedding/embedding_lookup/Identity (Identity)
Adam/Adam/update/Unique (Unique) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/Shape (Shape) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/strided_slice/stack (Const) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/strided_slice/stack_1 (Const) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/strided_slice/stack_2 (Const) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/strided_slice (StridedSlice) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/UnsortedSegmentSum (UnsortedSegmentSum) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/mul (Mul) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/ReadVariableOp (ReadVariableOp)
Adam/Adam/update/mul_1 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/AssignVariableOp (AssignVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/ResourceScatterAdd (ResourceScatterAdd) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/ReadVariableOp_1 (ReadVariableOp)
Adam/Adam/update/mul_2 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/mul_3 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/ReadVariableOp_2 (ReadVariableOp)
Adam/Adam/update/mul_4 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/AssignVariableOp_1 (AssignVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/ResourceScatterAdd_1 (ResourceScatterAdd) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/ReadVariableOp_3 (ReadVariableOp)
Adam/Adam/update/Sqrt (Sqrt) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/mul_5 (Mul) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/add (AddV2) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/truediv (RealDiv) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/AssignSubVariableOp (AssignSubVariableOp) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/group_deps/NoOp (NoOp) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/group_deps/NoOp_1 (NoOp) /job:localhost/replica:0/task:0/device:GPU:0
Adam/Adam/update/group_deps (NoOp) /job:localhost/replica:0/task:0/device:GPU:0
[[{{node transformer/encoder/embedding/embedding_lookup}}]] [Op:__inference_train_function_27918]
The code I am running is from here (it is a transformer chatbot demo from TensorFlow itself): https://github.com/tensorflow/examples/blob/tflmm/v0.2.4/community/en/transformer_chatbot.ipynb
I suspect the issue is here:
StridedSlice: CPU
However, other stack traces posted for similar issues show this operation as supported on the GPU, so I am not sure what the issue is.
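For reference, here is a minimal sketch of how the op table in the colocation debug info could be scanned for op types that list only CPU. It uses plain Python and no TensorFlow calls. The function name `ops_without_gpu` and the shortened `OP_TABLE` excerpt are illustrative assumptions, not part of any library.

```python
import re

# Shortened excerpt of the op table from the colocation debug info
# (illustrative subset; the full trace lists more op types).
OP_TABLE = """\
AssignSubVariableOp: GPU CPU
StridedSlice: CPU
ResourceGather: GPU CPU
Unique: GPU CPU
"""


def ops_without_gpu(debug_text):
    """Return op types whose supported-device list does not include GPU."""
    missing = []
    for line in debug_text.splitlines():
        # Only consider lines of the form "OpName: DEV [DEV ...]".
        match = re.fullmatch(r"\s*(\w+):((?:\s+(?:GPU|CPU))+)\s*", line)
        if match is None:
            continue
        op_name = match.group(1)
        devices = match.group(2).split()
        if "GPU" not in devices:
            missing.append(op_name)
    return missing


print(ops_without_gpu(OP_TABLE))
```

On the excerpt above this reports only `StridedSlice`, which is why I suspect that op.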
Thanks!