How to fix this error when training a CNN model?

Question

Created Apr ’22

Replies 1


Participants 2

Here is the error:

InvalidArgumentError: Cannot assign a device for operation model_1/conv2d_1/Conv2D/ReadVariableOp: Could not satisfy explicit device specification '' because the node {{colocation_node model_1/conv2d_1/Conv2D/ReadVariableOp}} was colocated with a group of nodes that required incompatible device '/job:localhost/replica:0/task:0/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:GPU:0].
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_name_index_=2 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
ResourceApplyAdaMax: CPU
ReadVariableOp: GPU CPU
_Arg: GPU CPU

Colocation members, user-requested devices, and framework assigned devices, if any:
  model_1_conv2d_1_conv2d_readvariableop_resource (_Arg) framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
  adamax_adamax_update_resourceapplyadamax_m (_Arg) framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
  adamax_adamax_update_resourceapplyadamax_v (_Arg) framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
  model_1/conv2d_1/Conv2D/ReadVariableOp (ReadVariableOp)
  Adamax/Adamax/update/ResourceApplyAdaMax (ResourceApplyAdaMax) /job:localhost/replica:0/task:0/device:GPU:0

 [[{{node model_1/conv2d_1/Conv2D/ReadVariableOp}}]] [Op:__inference_train_function_5897]

Answer 1

Frameworks Engineer OP

Apple

Jun ’22

Hi @Blooo513

Thanks for reporting the issue! The error shows that the ResourceApplyAdaMax op does not currently have a GPU registration, so the Adamax update step cannot be placed on the GPU alongside the variables it updates. We are working on adding this to tensorflow-metal and will update this thread once it is available. Until then, you can work around the issue by using a different optimizer or by running training on the CPU.
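The two workarounds above might look roughly like the sketch below. It is a minimal illustration, not the poster's code: the tiny model, the random data, the `build_model` helper and the `USE_CPU_ONLY` flag are all hypothetical, and Adam is only one example of "another optimizer" (the thread does not say which optimizers have GPU kernels in a given tensorflow-metal release). Placing the Adamax model under `tf.device("/CPU:0")` is one assumed way of "running on the CPU"; hiding the GPU with `tf.config.set_visible_devices` before any op runs is another.

```python
import numpy as np
import tensorflow as tf

# Hypothetical switch: set to True to hide the GPU from TensorFlow entirely.
# This must happen before any op runs, otherwise TensorFlow raises RuntimeError.
USE_CPU_ONLY = False
if USE_CPU_ONLY:
    tf.config.set_visible_devices([], "GPU")


def build_model(optimizer):
    """Tiny stand-in CNN (hypothetical; not the model from the question)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(8, 8, 1)),
        tf.keras.layers.Conv2D(4, 3, activation="relu"),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(2),
    ])
    model.compile(
        optimizer=optimizer,
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )
    return model


# Random placeholder data, only so the sketch runs end to end.
x = np.random.rand(16, 8, 8, 1).astype("float32")
y = np.random.randint(0, 2, size=(16,))

# Workaround 1: swap Adamax for a different optimizer (Adam as an example).
model_adam = build_model(tf.keras.optimizers.Adam())
history_adam = model_adam.fit(x, y, epochs=1, batch_size=8, verbose=0)

# Workaround 2: keep Adamax, but build and train the model on the CPU.
with tf.device("/CPU:0"):
    model_adamax = build_model(tf.keras.optimizers.Adamax())
    history_adamax = model_adamax.fit(x, y, epochs=1, batch_size=8, verbose=0)
```

If the device scope alone does not avoid the error in a particular setup, setting `USE_CPU_ONLY = True` at the very top of the script is the more blunt option, at the cost of running everything on the CPU.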
