Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.

Question

suprateembanerjee OP

Created Oct ’21

Device: MacBook Pro 16 M1 Max, 64GB, running macOS 12.0.1.

I tried setting up GPU Accelerated TensorFlow on my Mac using the following steps:

Setup: Xcode CLI / Homebrew / Miniforge
Conda Env: Python 3.9.5
conda install -c apple tensorflow-deps
python -m pip install tensorflow-macos
python -m pip install tensorflow-metal
brew install libjpeg
conda install -y matplotlib jupyterlab
In Jupyter Lab, I try to execute this code:

from tensorflow.keras import layers
from tensorflow.keras import models
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
model.summary()

The code executes, but I get the following output, which seems to indicate that no GPU acceleration can be used because it defaults to a 0 MB GPU:

Metal device set to: Apple M1 Max
2021-10-27 08:23:32.872480: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2021-10-27 08:23:32.872707: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)

Does anyone have any idea how to fix this? I came across a bunch of posts around here about the same issue, but none with a solid fix. I created a new question because I found the other questions less descriptive of the issue and wanted to depict it comprehensively. Any fix would be of much help.

Answer 1

MikeBee OP

Jul ’22

Why can't this be marked as solved so the answer can appear at the top? The M1 is a UMA device, not a NUMA one. A central point of this SoC is its unified memory. This message just says that TensorFlow could not identify a NUMA node for the GPU, which is expected on a unified-memory system, so it defaults to node 0. How does this forum help if answers are buried in noise? Stack Overflow does a better job.
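
To illustrate why the message is harmless: in TensorFlow's C++ log format, the letter after the timestamp is the severity (I, W, E, F for info, warning, error, fatal), and the NUMA line carries an I. Below is a minimal Python sketch that reads the severity off a log line. The regular expression and the names LOG_LINE and severity are illustrative assumptions, not part of TensorFlow.

```python
import re

# Severity letters used in the prefix of TensorFlow's C++ log lines.
SEVERITIES = {"I": "INFO", "W": "WARNING", "E": "ERROR", "F": "FATAL"}

# Illustrative pattern for lines such as:
# 2021-10-27 08:23:32.872480: I tensorflow/.../file.cc:305] message
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<level>[IWEF]) (?P<source>\S+):(?P<line>\d+)\] (?P<message>.*)$"
)


def severity(log_line):
    """Return the severity name of a TensorFlow C++ log line, or None."""
    match = LOG_LINE.match(log_line.strip())
    if match is None:
        return None  # not a C++ log line, e.g. "Metal device set to: ..."
    return SEVERITIES[match.group("level")]


numa_line = (
    "2021-10-27 08:23:32.872480: I tensorflow/core/common_runtime/"
    "pluggable_device/pluggable_device_factory.cc:305] Could not identify "
    "NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not "
    "have been built with NUMA support."
)
print(severity(numa_line))  # INFO
```

Anything reported with I is informational only; a real failure would show up as E or F, or as a Python exception.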

Answer 2

karbapi OP

Jul ’22

For an M1 Ultra (128GB RAM, 20c CPU, 64c GPU) on macOS 12.5, I get the following message:

Metal device set to: Apple M1 Ultra
systemMemory: 128.00 GB
maxCacheSize: 48.00 GB

2022-07-22 16:44:43.488061: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.

2022-07-22 16:44:43.488273: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)

My question is: why does this message appear at all? Why NUMA? Moreover, the GPU has 0 MB of memory? How is this possible?

Python: 3.9.13
tensorflow-macos: 2.9.2
tensorflow-metal: 0.5.0

Please help. Thanks, Bapi

Answer 3

karbapi OP

Aug ’22

I intended to speed up the training process. Now what is this (I got it during training with workers=8, use_multiprocessing=True)? Strange! I never got it on my MBP 13 (2017, Core i5, 16GB RAM) with the same code.

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/bapikar/miniforge3/envs/tf28_python38/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/Users/bapikar/miniforge3/envs/tf28_python38/lib/python3.8/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "/Users/bapikar/miniforge3/envs/tf28_python38/lib/python3.8/multiprocessing/synchronize.py", line 110, in __setstate__
    self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory

Answer 4

te_neon OP

Sep ’22

I bought a new M1 MacBook Pro and was hoping I could use it for machine learning on the GPU (with Metal). But it doesn't work on the GPU at all! It gets stuck on model.fit with the famous message "NUMA node of platform GPU ID 0". If I switch to the CPU, it does work, but the point of deep learning is to use the GPU to make training faster, right? Apple, please help us: fix this issue and take it seriously.

Answer 5

wsimpson2019 OP

Oct ’22

I was having the same issue: the script would crash with a bus error after the message. I'm using a Mac Pro (Late 2013) with an AMD FirePro D500 3 GB.

What fixed it for me was changing the plugin version in my tensorflow-metal virtual env:

$ pip install -Iv tensorflow-metal==0.6.0

Then, inside my script, I set the following after importing os:

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

After this I ran the script and it worked, and in Activity Monitor I could see that it uses the GPU.
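
A note on the environment variable above: TF_CPP_MIN_LOG_LEVEL only filters TensorFlow's C++ log output (0 shows everything, 1 hides INFO, 2 also hides WARNING, 3 also hides ERROR), and it needs to be set before TensorFlow is imported. It hides the NUMA line but presumably does not change whether the GPU is used, so the version change is the more likely fix. A minimal sketch, with the TensorFlow import guarded so the snippet also runs where TensorFlow is absent:

```python
import os

# Must be set before TensorFlow is imported; setting it afterwards
# does not affect messages the C++ runtime has already been configured for.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"  # hide INFO and WARNING lines

try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow not installed in this environment

if tf is not None:
    # The device list is unaffected by the log level; only the output is quieter.
    print(tf.config.list_physical_devices("GPU"))
```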

Answer 6

ClementTong OP

Jan ’23

Same problem on an M1 MacBook Air with macOS 12.6 (Monterey):

2023-01-24 01:32:00.798327: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-01-24 01:32:00.798694: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
WARNING:tensorflow:AutoGraph could not transform <function normalize_img at 0x16a452cb0> and will run it as-is.
Cause: Unable to locate the source code of <function normalize_img at 0x16a452cb0>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert

Answer 7

gruntdev OP

Jan ’23

The NUMA message on an Apple silicon computer is benign and can be ignored. Apple silicon memory is UMA (unified memory architecture), not NUMA.
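
One way to confirm that the GPU is usable despite the message is to run a small operation on it. Below is a minimal sketch, assuming TensorFlow 2.x in eager mode. gpu_smoke_test is a hypothetical helper name, and it returns None when TensorFlow is not installed.

```python
def gpu_smoke_test():
    """Return True if a matmul ran on a GPU, False if no GPU is visible,
    or None if TensorFlow is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None

    # With tensorflow-metal installed, the Metal device is listed as a GPU.
    gpus = tf.config.list_physical_devices("GPU")
    if not gpus:
        return False

    with tf.device("/GPU:0"):
        a = tf.random.uniform((512, 512))
        b = tf.random.uniform((512, 512))
        product = tf.matmul(a, b)

    # The result tensor records the device it was placed on.
    return "GPU" in product.device


print(gpu_smoke_test())
```

If this prints True while the NUMA line still appears in the log, the message was only informational.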

Answer 8

JakeRussell OP

Feb ’23

Experiencing the same issue February 2023, running MacOS Ventura 13.2 on 2021 M1 Pro.

Answer 9

Maximiliami OP

Feb ’23

Same here M1 Pro Ventura 13.1

Answer 10

Caeru95 OP

Feb ’23

For me, the fix was matching the version of tensorflow-macos with the version of tensorflow-metal. If you scroll down at https://developer.apple.com/metal/tensorflow-plugin/ you'll reach the "Release" section, which shows the compatible tensorflow-macos and tensorflow-metal versions. For example, to use version 0.5 of the Metal plugin, you have to manually set the tensorflow-macos version to 2.9.0:

python -m pip install tensorflow-macos==2.9.0
pip install tensorflow-metal==0.5.0

Answer 11

jurajw OP

Feb ’23

A new tensorflow-metal, 0.7.1, was released on 8 Feb 2023!

I regret to tell you that it solves nothing: the annoying and useless NUMA messages are still there, and TF training fails with "OP_REQUIRES failed at xla_ops.cc".

Answer 12

xrois OP

Feb ’23

[SOLUTION]

You should uninstall tensorflow-macos and tensorflow-metal, because the main reason for this problem is a version conflict between them. After uninstalling, reinstall tensorflow-macos at version 2.9 and tensorflow-metal at version 0.5.0 to fix the issue.

pip uninstall tensorflow-macos
pip uninstall tensorflow-metal
python -m pip install tensorflow-macos==2.9
pip install tensorflow-metal==0.5.0

Thanks 😇

Answer 13

johngage OP

Feb ’23

Thanks, xrois, I did that, but tensorflow still fails.

Here's the result:

I'm trying to follow the script in Jupyter Notebooks from VSCode, which many people must also be trying, as they learn VSCode...and tensorflow fails.

Here's the script from code.visualstudio.com: https://code.visualstudio.com/docs/datascience/data-science-tutorial#_prepare-the-data

But when I run this, following each step, tensorflow on my MacBook Air, M1, 2020, FAILS.

I followed xrois's suggestion to install 2.9 and 0.5.0, but it still failed. I'll reinstall the just-released tensorflow-metal 0.7.1, but I don't expect any resolution.

I'll report if it works.

Has anyone had any success running this?

How can I report this in a verbose mode to tensorflow? Do they listen?

-------Here's the sequence----------

!python3 -m pip install tensorflow-macos==2.9
!pip3 install tensorflow-metal==0.5.0

Run !pip3 list | grep tensor

tensorboard 2.9.1
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
tensorflow-estimator 2.9.0
tensorflow-macos 2.9.0
tensorflow-metal 0.5.0
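
The same listing can be produced from inside the notebook without a shell pipe. Below is a minimal sketch using the standard library's importlib.metadata. The helper names installed_packages and filter_packages are illustrative.

```python
from importlib import metadata


def installed_packages():
    """Map each installed distribution name to its version."""
    packages = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta["Name"] if meta is not None else None
        if name:  # skip distributions with missing or broken metadata
            packages[name] = dist.version
    return packages


def filter_packages(packages, needle):
    """Keep only packages whose name contains needle (case-insensitive)."""
    needle = needle.lower()
    return {
        name: version
        for name, version in sorted(packages.items())
        if needle in name.lower()
    }


for name, version in filter_packages(installed_packages(), "tensor").items():
    print(f"{name:<28}{version}")
```

This reports the versions in the environment the notebook kernel actually runs in, which may differ from the environment a separate shell would see.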

Then set up model:

# run model again
from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(5, kernel_initializer='uniform', activation='relu', input_dim=5))
model.add(Dense(5, kernel_initializer='uniform', activation='relu'))
model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))

Everything looks good:

Model: "sequential_3"
Layer (type)        Output Shape    Param #
dense_3 (Dense)     (None, 5)       30
dense_4 (Dense)     (None, 5)       30
dense_5 (Dense)     (None, 1)       6
=================================================================
Total params: 66
Trainable params: 66
Non-trainable params: 0

Then, FAILURE.
Now compile the model: will accuracy be 61%?

model.compile(optimizer="adam", loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, batch_size=32, epochs=50)

RESULT: And it goes on with many code failures

Epoch 1/50
WARNING:tensorflow:AutoGraph could not transform <function Model.make_train_function.<locals>.train_function at 0x298e87910> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output.
Cause: closure mismatch, requested ('self', 'step_function'), but source function had ()
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING: AutoGraph could not transform <function Model.make_train_function.<locals>.train_function at 0x298e87910> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output.
Cause: closure mismatch, requested ('self', 'step_function'), but source function had ()
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
Unexpected exception formatting exception. Falling back to standard exception

Answer 14

Freddie2046 OP

Aug ’23

Guys, look into this article about UMA: https://www.cgdirector.com/apple-unified-memory-guide/ As some people here have said, UMA is an advantage, not the cause of any problem, provided we implement our code correctly. In fact, performance should be even better, on the condition that you have enough memory for your project.

Answer 15

MikeCroswell OP

Sep ’23

I'm getting the warning after calling Sequential() from tensorflow.keras.models (that is, "...could not identify NUMA node on GPU...").

Still, TensorFlow seems to be working in Jupyter Notebook. I'm running on an Apple M2, Ventura 13.5.1.

For example, next I can compile the model from the Sequential call, and fit it:

history = model.fit(X_train, y_train, batch_size=100, epochs=10, verbose=2, validation_data=(X_test, y_test))

I do get a warning on the first two epochs but the subsequent others continue on without the warnings:

...
25/25 - 2s - loss: 0.6618 - accuracy: 0.5917 - val_loss: 0.6444 - val_accuracy: 0.7475 - 2s/epoch - 73ms/step
Epoch 2/10
2023-09-22 15:01:03.984920: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled.
...

My versions, according to conda list:

python 3.11.4
tensorboard 2.13.0
tensorboard-data-server 0.7.1
tensorflow 2.13.0
tensorflow-estimator 2.13.0
tensorflow-macos 2.13.0
tensorflow-metal 1.0.1

Hope it helps.

Answer 16

gluque OP

Sep ’23

Apple M1/M2 processors have a unified memory architecture (UMA), i.e. memory shared by all components, including the GPU. They are not NUMA (non-uniform memory access) systems, and they do not have separate memory spaces such as distinct RAM for the CPU and the GPU --> https://en.wikipedia.org/wiki/Apple_M1

Answer 17

ppimas OP

Feb ’24

Switching from Python 3.9 to Python 3.11 worked for me on an M2 Pro.

Answer 18

daleboy OP

Jun ’24

You can just add an import of ResNet50, or of some other package that uses the GPU, at the top of your code file. That works fine for me.

Answer 19

Anushanga OP

Jul ’24

Since Apple silicon uses UMA (Unified Memory Architecture), you can just ignore this warning. But make sure TensorFlow works fine. In my case it didn't work because of version mismatches between packages. The combination below has been working fine so far.

python_version ==3.11.9
tensorflow ==2.15.0
tensorflow-macos ==2.15.0
tensorflow-metal ==1.1.0
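
The combinations reported as working in this thread can be collected into a small lookup for checking a planned install. This is only a sketch: REPORTED_PAIRS contains just the pairings posted above (2.9 with 0.5.0, 2.13 with 1.0.1, 2.15 with 1.1.0), not Apple's official compatibility table, and the function names are illustrative.

```python
# tensorflow-macos major.minor -> tensorflow-metal version,
# as reported by posters in this thread (not an official table).
REPORTED_PAIRS = {
    "2.9": "0.5.0",
    "2.13": "1.0.1",
    "2.15": "1.1.0",
}


def major_minor(version):
    """Reduce a version such as '2.9.0' to its major.minor part, '2.9'."""
    return ".".join(version.split(".")[:2])


def matches_reported_pair(tf_macos_version, tf_metal_version):
    """True if this pairing is one of the combinations reported above."""
    expected = REPORTED_PAIRS.get(major_minor(tf_macos_version))
    return expected is not None and expected == tf_metal_version


print(matches_reported_pair("2.15.0", "1.1.0"))  # True
print(matches_reported_pair("2.9.0", "0.7.1"))   # False
```

A False result only means nobody here reported that pairing; the release table on Apple's plugin page remains the authoritative source.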