Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.

Device: MacBook Pro 16" (M1 Max, 64 GB) running macOS 12.0.1.

I tried setting up GPU-accelerated TensorFlow on my Mac using the following steps:

  1. Setup: Xcode CLI tools / Homebrew / Miniforge
  2. Conda Env: Python 3.9.5
  3. conda install -c apple tensorflow-deps
  4. python -m pip install tensorflow-macos
  5. python -m pip install tensorflow-metal
  6. brew install libjpeg
  7. conda install -y matplotlib jupyterlab
  8. In Jupyter Lab, I try to execute this code:
from tensorflow.keras import layers
from tensorflow.keras import models
# Small convnet for 28x28x1 (MNIST-style) inputs
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
model.summary()

The code executes, but I get the warning below, which suggests no GPU acceleration is available since the device is created with 0 MB of memory:

Metal device set to: Apple M1 Max
2021-10-27 08:23:32.872480: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2021-10-27 08:23:32.872707: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)

Does anyone have any idea how to fix this? I came across a bunch of posts here about the same issue, but none with a solid fix. I created a new question because I found the other questions less descriptive of the issue and wanted to depict it comprehensively. Any fix would be of much help.
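
For anyone reproducing this, a quick sanity check that the Metal plugin registered a GPU device at all (my own addition; standard TensorFlow calls):

import tensorflow as tf

print(tf.__version__)
# On a working tensorflow-macos + tensorflow-metal install this should print
# [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
print(tf.config.list_physical_devices('GPU'))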

  • Metal device set to: Apple M1

    systemMemory: 16.00 GB
    maxCacheSize: 5.33 GB

    2021-12-13 19:59:56.135942: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
    2021-12-13 19:59:56.136049: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)


Replies

ME TOO!!!!

me too....

# pip uninstall tensorflow-metal

  • Well, yes, that got rid of the error message on my MacBook Air M1.

  • Well, then you aren't using the GPU, only CPU TensorFlow. :P Of course it gets rid of the error.


macOS with an AMD GPU here. I have been using tensorflow-metal since it launched, with GPU acceleration. Sometimes I get the same message (Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.), but it still uses the GPU. You can check that by opening Activity Monitor and pressing Cmd + 3 and Cmd + 4, which show you GPU and CPU usage.
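
If you'd rather check programmatically than watch Activity Monitor, here is a small sketch (my own addition; standard TensorFlow API) that logs which device each op lands on:

import tensorflow as tf

# Print the placement of every op; GPU placements confirm Metal is in use.
tf.debugging.set_log_device_placement(True)

a = tf.random.uniform((1024, 1024))
b = tf.random.uniform((1024, 1024))
c = tf.matmul(a, b)
print(c.device)  # e.g. /job:localhost/replica:0/task:0/device:GPU:0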

One potential problem for you, assuming it still isn't benefiting from acceleration, could be that you are using Python 3.9. If I recall correctly, tensorflow-metal requires Python 3.8. That is what I am using, without any problems.

Hope the above can help you & others!

  • it works on 3.9 too (not sure if it worked on 3.9 when you posted this a month ago. But as of today, it works)

  • The documentation states "Python 3.8 or later".


ME TOO! And my kernel in Jupyter also dies.

Okay, I don't know if you guys faced this issue, but for me the kernel also died and the GPU wasn't being used. I found the cause and fixed it. I'm on the M1 MacBook Air.

This arises with the latest tensorflow-metal package, version 0.2.0.

Just install the previous version, 0.1.2, of tensorflow-metal and the GPU will be utilized. You will still get the warning, but training will run using the GPU.
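
Pinning the version looks like this (assuming pip can still resolve that release):

python -m pip uninstall -y tensorflow-metal
python -m pip install tensorflow-metal==0.1.2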

  • Update: with tensorflow-metal version 0.1.1 that warning also vanishes.

    Hopefully this will be fixed in coming versions of metal.

  • Another Update: tensorflow-metal v0.2.0 is built for macOS 12 Monterey, so that version should work fine after updating to Monterey, as reported by others.

  • But with the latest version, 0.6.0, the same warning appears and the GPU still doesn't work.

I have this same issue with my new MacBook Pro 14 with M1 Max fully loaded.

I've tried creating clean Python 3.8 and 3.9 installations following instructions here and elsewhere, and tried downgrading my tensorflow-metal package. Just about every possible combination in clean environments and new installations.

Bottom line is that while the GPU "works", it runs about 5x slower than pure CPU. That is, if I uninstall the tensorflow-metal package, the same training that took, say, 11 seconds with the package installed takes only about 2.5 seconds without it. You can replicate the same results by forcing TensorFlow to run on the CPU with the metal package installed.
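
For anyone who wants to reproduce the comparison, a rough timing harness along these lines (my own sketch; matrix size and iteration count are arbitrary):

import time
import tensorflow as tf

def bench(device, n=2048, iters=10):
    """Time a few matmuls pinned to the given device."""
    with tf.device(device):
        a = tf.random.uniform((n, n))
        b = tf.random.uniform((n, n))
        tf.matmul(a, b).numpy()  # warm-up; also forces lazy initialization
        start = time.perf_counter()
        for _ in range(iters):
            c = tf.matmul(a, b)
        c.numpy()  # block until the last result is ready
        return time.perf_counter() - start

print("CPU:", bench("/CPU:0"))
print("GPU:", bench("/GPU:0"))  # needs tensorflow-metal for a GPU device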

Looking at Activity Monitor during a run suggests that the M1 Max GPU is in fact loaded when the package is installed. It just performs horribly, in fact so badly as to be unusable. My working assumption is that this is not the intended performance, but a bug.

What's concerning is that no maintainer in any of the forums, whether tensorflow/keras or Apple's, has really acknowledged that this is a bug. Perhaps there's confusion between the different manifestations of the bug on Intel vs. older M1 vs. newer M1 Pro/Max, as well as the different operating systems involved.

So let me be unambiguously clear: none of the stuff listed above, or in other threads, involving reinstallation, downgrading packages, etc., makes this work properly on my M1 Max.

  • Models running faster on CPU than GPU is a very common occurrence; it's not Mac-specific. If there is a bug, identify it. Saying that a CPU can be faster than a GPU is not a bug; it's extremely common. It depends on the model, the CPU, the GPU, the input pipeline, etc. I've also had the M1 CPU run faster than a badass Nvidia Quadro or even an Nvidia P100. The M1 CPU is surprisingly good at this.

  • Completely agree with @jsvnyc. Running on a Mac mini M1 under Big Sur, my application (DeepLabCut) was running at 97% GPU utilization. Once I updated to Monterey it runs at 50% at best. I have tried various TensorFlow versions (2.7, 2.6, 2.5, etc.) to no avail; GPU usage is now at best half of what it was, and this is directly reflected in the execution times.

  • I believe that when you run on the CPU, you also run on the ANE. When you run on the GPU, you just get the GPU. MLCompute.set_mlc_device can be set to 'cpu', 'gpu', or 'any'; 'any' seems to be the fastest, using the CPU, GPU, and ANE. However, I think this only works for inference. (A sketch of that call follows below.)
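
As far as I can tell, the call mentioned in the comment above came from Apple's earlier tensorflow_macos fork (the pre-Monterey alpha), not from tensorflow-macos + tensorflow-metal; treat this as a historical sketch:

# Old tensorflow_macos (alpha) fork only; current tensorflow-macos installs
# control placement with tf.device(...) instead.
from tensorflow.python.compiler.mlcompute import mlcompute

mlcompute.set_mlc_device(device_name='any')  # 'cpu', 'gpu', or 'any'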


It is not working on my M1 Max either, although I have upgraded to the latest version of macOS 12 Monterey.

I got this same error on the M1 MacBook Air and solved it by changing the tensorflow-metal version to 0.1.1.

I also faced this issue. My setup: Monterey (12.1), M1 Max, 64 GB RAM. I could not solve the problem by reinstalling.

It's a perfectly normal and harmless message on an M1. I have it too, and my model and code work just fine.

2021-12-20 23:19:04.025952: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2021-12-20 23:19:04.026364: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
Metal device set to: Apple M1

systemMemory: 8.00 GB
maxCacheSize: 2.67 GB

__________________________________________________________________________________________________
2021-12-20 23:19:04.413489: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
Epoch 1/10
2021-12-20 23:19:04.723827: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
32/32 [==============================] - ETA: 0s - loss: 0.0256 - accuracy: 0.9605 - mae: 0.0933 - mse: 0.0256
2021-12-20 23:19:24.073636: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
32/32 [==============================] - 20s 608ms/step - loss: 0.0256 - accuracy: 0.9605 - mae: 0.0933 - mse: 0.0256 - val_loss: 0.0100 - val_accuracy: 0.9855 - val_mae: 0.0650 - val_mse: 0.0100
Epoch 2/10
32/32 [==============================] - 19s 585ms/step - loss: 0.0079 - accuracy: 0.9787 - mae: 0.0568 - mse: 0.0079 - val_loss: 0.0063 - val_accuracy: 0.9869 - val_mae: 0.0534 - val_mse: 0.0063
Epoch 3/10
32/32 [==============================] - 18s 575ms/step - loss: 0.0060 - accuracy: 0.9700 - mae: 0.0506 - mse: 0.0060 - val_loss: 0.0045 - val_accuracy: 0.9776 - val_mae: 0.0438 - val_mse: 0.0045
Epoch 4/10
....
  • How long did training this take (I assume this is classic MNIST) on the GPU? I'm curious because mine took approximately 1 minute to finish... Also, I don't quite understand how to use the CPU only and force the GPU off the job.

  • tf.config.list_physical_devices() lists the available devices; you'll see something like "/physical_device:CPU:0" in the list. Shorten that to "/CPU:0" and then run your code inside with tf.device("/CPU:0"):. I managed to speed up my training 20x this way, even though by default TF appeared to be using the CPU anyway (as seen in the TensorBoard profiler logs). I guess explicitly opting in to CPU-only saved TF some decision-making about allocation. This in itself sounds like a bug: explicitly stating the device works much faster than just the default training. (A minimal sketch follows these comments.)

  • It would be nice if when engineers thought up error messages they considered that people might waste hours trying to solve what looks like a problem.
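
A minimal sketch of that suggestion (my own; toy model with hypothetical shapes, and the fit call is commented out since it needs your data):

import tensorflow as tf

# Everything built and executed inside the block is pinned to the CPU,
# even with tensorflow-metal installed.
with tf.device("/CPU:0"):
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(784,)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(x_train, y_train, epochs=10)  # substitute your own data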


Metal device set to: AMD Radeon Pro 5600M

2022-01-13 17:02:36.447465: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-01-13 17:02:36.448221: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2022-01-13 17:02:36.448581: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)

Prior to running my model:

print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))

Num GPUs Available: 1

I was excited to see tensorflow-macos version 2.7, but it STILL does not work.

This issue still persists.

Same issue here.

I have the same issue with M1. My NN code was working on my boyfriend's MacBook Pro with M1 Max (but not faster than a Huawei MateBook). I AM SO ANGRY, MY PROJECT DEVELOPMENT HAS STOPPED.