Post not yet marked as solved
The problem was caused by version 12.0.0 of ld, which lived in my Anaconda virtual environment. ld 13.1.6 did not have the issue.
% ld -v
@(#)PROGRAM:ld PROJECT:ld64-764
BUILD 11:29:01 May 17 2022
configured to support archs: armv6 armv7 armv7s arm64 arm64e arm64_32 i386 x86_64 x86_64h armv6m armv7k armv7m armv7em
LTO support using: LLVM version 13.1.6, (clang-1316.0.21.2.5) (static support for 28, runtime is 28)
TAPI support using: Apple TAPI version 13.1.6 (tapi-1316.0.7.3)
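To check whether a second ld is shadowing the system linker, it can help to list every ld on PATH in lookup order. The sketch below is only an illustration: find_executables is a hypothetical helper name, and it looks at PATH only, so it will not see a linker that a build system selects by absolute path.

```python
import os


def find_executables(name, path=None):
    """Return every executable called `name` on PATH, in lookup order."""
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        # Keep only regular files that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches


if __name__ == "__main__":
    # The first entry printed is the one the shell would run.
    for candidate in find_executables("ld"):
        print(candidate)
```

If the first entry sits inside an Anaconda environment directory, that copy is the one a plain `ld` invocation will use.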
I am getting this error when trying to compile AI-Feynman:
ld: unsupported tapi file type '!tapi-tbd' in YAML file '/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib/libSystem.tbd' for architecture x86_64
I tried to generate a new .tbd file from libSystem.dylib with 'tapi stubify ...', but I can't locate the libSystem.B.dylib file.
The only copies inside Xcode belong to the simulator runtimes, so they are not the right ones:
% locate libSystem.B.dylib
/Applications/Xcode.app/Contents/Developer/Platforms/AppleTVOS.platform/Library/Developer/CoreSimulator/Profiles/Runtimes/tvOS.simruntime/Contents/Resources/RuntimeRoot/usr/lib/libSystem.B.dylib
/Applications/Xcode.app/Contents/Developer/Platforms/WatchOS.platform/Library/Developer/CoreSimulator/Profiles/Runtimes/watchOS.simruntime/Contents/Resources/RuntimeRoot/usr/lib/libSystem.B.dylib
/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Library/Developer/CoreSimulator/Profiles/Runtimes/iOS.simruntime/Contents/Resources/RuntimeRoot/usr/lib/libSystem.B.dylib
Any ideas on how to generate a replacement .tbd file from a 'virtual' shared library which lives in a cache?
% otool -L /Applications/Xcode.app/Contents/Developer/Platforms/AppleTVOS.platform/Library/Developer/CoreSimulator/Profiles/Runtimes/tvOS.simruntime/Contents/Resources/RuntimeRoot/usr/lib/libSystem.dylib
/Applications/Xcode.app/Contents/Developer/Platforms/AppleTVOS.platform/Library/Developer/CoreSimulator/Profiles/Runtimes/tvOS.simruntime/Contents/Resources/RuntimeRoot/usr/lib/libSystem.dylib (architecture x86_64):
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.100.3)
/usr/lib/system/libcache.dylib (compatibility version 1.0.0, current version 85.0.0)
/usr/lib/system/libcommonCrypto.dylib (compatibility version 1.0.0, current version 60191.100.1)
/usr/lib/system/libcompiler_rt.dylib (compatibility version 1.0.0, current version 103.1.0)
/usr/lib/system/libcopyfile.dylib (compatibility version 1.0.0, current version 1.0.0)
/usr/lib/system/libcorecrypto.dylib (compatibility version 1.0.0, current version 1218.100.47)
/usr/lib/system/libdispatch.dylib (compatibility version 1.0.0, current version 1325.100.36)
/usr/lib/system/libdyld.dylib (compatibility version 1.0.0, current version 1.0.0)
/usr/lib/system/libmacho.dylib (compatibility version 1.0.0, current version 994.0.0)
/usr/lib/system/libremovefile.dylib (compatibility version 1.0.0, current version 60.0.0)
/usr/lib/system/libsystem_asl.dylib (compatibility version 1.0.0, current version 392.100.2)
/usr/lib/system/libsystem_blocks.dylib (compatibility version 1.0.0, current version 79.1.0)
/usr/lib/system/libsystem_c.dylib (compatibility version 1.0.0, current version 1507.100.9)
/usr/lib/system/libsystem_collections.dylib (compatibility version 1.0.0, current version 1507.100.9)
/usr/lib/system/libsystem_configuration.dylib (compatibility version 1.0.0, current version 1163.100.19)
/usr/lib/system/libsystem_containermanager.dylib (compatibility version 1.0.0, current version 1.0.0)
/usr/lib/system/libsystem_coreservices.dylib (compatibility version 1.0.0, current version 133.0.0)
/usr/lib/system/libsystem_darwin.dylib (compatibility version 1.0.0, current version 1.0.0)
/usr/lib/system/libsystem_dnssd.dylib (compatibility version 1.0.0, current version 1557.103.1)
/usr/lib/system/libsystem_featureflags.dylib (compatibility version 1.0.0, current version 56.0.0)
/usr/lib/system/libsystem_info.dylib (compatibility version 1.0.0, current version 1.0.0)
/usr/lib/system/libsystem_m.dylib (compatibility version 1.0.0, current version 3204.80.2)
/usr/lib/system/libsystem_malloc.dylib (compatibility version 1.0.0, current version 374.100.5)
/usr/lib/system/libsystem_networkextension.dylib (compatibility version 1.0.0, current version 1.0.0)
/usr/lib/system/libsystem_notify.dylib (compatibility version 1.0.0, current version 301.0.0)
/usr/lib/system/libsystem_product_info_filter.dylib (compatibility version 1.0.0, current version 10.0.0)
/usr/lib/system/libsystem_sandbox.dylib (compatibility version 1.0.0, current version 1657.103.1)
/usr/lib/system/libsystem_sim_kernel.dylib (compatibility version 1.0.0, current version 238.100.1)
/usr/lib/system/libsystem_sim_platform.dylib (compatibility version 1.0.0, current version 238.100.1)
/usr/lib/system/libsystem_sim_pthread.dylib (compatibility version 1.0.0, current version 238.100.1)
/usr/lib/system/libsystem_trace.dylib (compatibility version 1.0.0, current version 1375.100.9)
/usr/lib/system/libunwind.dylib (compatibility version 1.0.0, current version 202.2.0)
...
(base) davidlaxer@x86_64-apple-darwin13 iot-inspector-client % ls -l /usr/lib/system
total 1720
drwxr-xr-x 4 root wheel 128 May 9 14:30 introspection
-rwxr-xr-x 1 root wheel 1617536 May 9 14:30 libsystem_kernel.dylib
-rwxr-xr-x 1 root wheel 512560 May 9 14:30 libsystem_platform.dylib
-rwxr-xr-x 1 root wheel 656656 May 9 14:30 libsystem_pthread.dylib
-rwxr-xr-x 1 root wheel 150080 May 9 14:30 wordexp-helper
Any ideas on what the linker doesn't like about:
file type '!tapi-tbd' in YAML file '/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib/libSystem.tbd' for architecture x86_64
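Since the older ld rejected this file while ld 13.1.6 accepted it, a first check is which YAML tag the .tbd file actually declares. The following is a minimal sketch, assuming the tag sits on the document-start line (`--- !tag`); tbd_yaml_tag and read_tbd_tag are hypothetical helper names, not part of tapi or ld.

```python
import re


def tbd_yaml_tag(text):
    """Return the YAML tag on the first document-start line, or None."""
    for line in text.splitlines():
        match = re.match(r"^---\s+(!\S+)", line)
        if match:
            return match.group(1)
        if line.startswith("---"):
            return None  # document start without an explicit tag
    return None


def read_tbd_tag(path):
    """Read a .tbd file and return the tag it declares."""
    with open(path, "r", encoding="utf-8") as handle:
        return tbd_yaml_tag(handle.read())
```

Calling read_tbd_tag on the libSystem.tbd path from the error message shows the tag that the linker is being asked to parse.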
I tried uninstalling and reinstalling the Command Line Tools:
% ls -ag /Library/Developer
total 0
drwxr-xr-x 4 wheel 128 May 31 18:18 .
drwxr-xr-x 72 wheel 2304 May 17 12:01 ..
drwxr-xr-x 6 wheel 192 May 31 18:17 CommandLineTools
drwxr-xr-x 8 admin 256 May 17 01:44 PrivateFrameworks
% xcrun --show-sdk-platform-path
xcrun: error: unable to lookup item 'PlatformPath' from command line tools installation
xcrun: error: unable to lookup item 'PlatformPath' in SDK '/Library/Developer/CommandLineTools/SDKs/MacOSX12.3.sdk'
(AI-Feynman) davidlaxer@x86_64-apple-darwin13 AI-Feynman % xcode-select -p
/Library/Developer/CommandLineTools
(AI-Feynman) davidlaxer@x86_64-apple-darwin13 AI-Feynman % xcrun --show-sdk-path --sdk macosx
/Library/Developer/CommandLineTools/SDKs/MacOSX12.3.sdk
(AI-Feynman) davidlaxer@x86_64-apple-darwin13 AI-Feynman % xcrun --sdk macosx10.13 --show-sdk-path
xcrun: error: SDK "macosx10.13" cannot be located
xcrun: error: SDK "macosx10.13" cannot be located
xcrun: error: unable to lookup item 'Path' in SDK 'macosx10.13'
Why the reference to macosx10.13? How do I delete the old SDK reference?
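One place an old SDK name can come from is the shell environment that the build inherits. The sketch below simply searches environment variables for a string such as "10.13". It is an assumption that the reference lives in the environment at all; env_vars_mentioning is a hypothetical helper name, and build files or tool configuration are not inspected.

```python
import os


def env_vars_mentioning(needle, environ=None):
    """Return {name: value} for variables whose name or value contains `needle`."""
    if environ is None:
        environ = os.environ
    needle = needle.lower()
    return {
        name: value
        for name, value in environ.items()
        if needle in name.lower() or needle in value.lower()
    }


if __name__ == "__main__":
    for name, value in sorted(env_vars_mentioning("10.13").items()):
        print(f"{name}={value}")
```

Searching for "sdk" instead of "10.13" lists variables such as SDKROOT if they happen to be set.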
The exception is raised while building a list of document vectors from input documents that were not part of model training, e.g.:
document_vectors.append(self.embed(train_corpus[current:current + batch_size]))
The Python 3.8 process grows to 100 GB of memory and then raises the OOM exception.
import numpy as np  # needed for np.array / np.vstack below

def _embed_documents(self, train_corpus):
    self._check_import_status()
    self._check_model_status()

    # Embed documents in small batches.
    batch_size = 5
    document_vectors = []
    current = 0
    batches = len(train_corpus) // batch_size
    extra = len(train_corpus) % batch_size

    for ind in range(0, batches):
        try:
            # The OOM exception is raised from this call.
            document_vectors.append(self.embed(train_corpus[current:current + batch_size]))
        except Exception as e:
            print(e.__doc__)
            print(e)  # Python 3 exceptions have no .message attribute
        current += batch_size

    # Embed the final, partial batch, if any.
    if extra > 0:
        document_vectors.append(self.embed(train_corpus[current:current + extra]))

    document_vectors = self._l2_normalize(np.array(np.vstack(document_vectors)))
    return document_vectors
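The batching in the method above (full batches plus one remainder batch) can also be written with a single range step. This is only a sketch of the slicing logic; iter_batches is a hypothetical helper, and it does not change how much memory self.embed uses per call.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of `items`, the last one possibly shorter."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


if __name__ == "__main__":
    # Twelve items in batches of five give slices of length 5, 5 and 2.
    for batch in iter_batches(list(range(12)), 5):
        print(len(batch), batch)
```

With this helper the loop body reduces to one append per batch, and the separate handling of the remainder goes away.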
The AdamOptimizer is still causing crashes with:
tensorflow-metal 0.3.0
tensorflow-macos 2.7.0
This code crashes with the 'adam' optimizer. It does work with 'SGD'.
I am running Monterey 12.1 beta, and the latest versions of tensorflow-macos and tensorflow-metal from pypi.
import tensorflow as tf
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
])
predictions = model(x_train[:1]).numpy()
tf.nn.softmax(predictions).numpy()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_fn(y_train[:1], predictions).numpy()
model.compile(optimizer = 'adam', loss = loss_fn)
model.fit(x_train, y_train, epochs=100)
Hi,
Your example runs for me on Monterey 12.0.1 with Python 3.8 ... if I replace the ADAM optimizer with SGD.
model.compile( loss='sparse_categorical_crossentropy', optimizer=tf.keras.optimizers.SGD(0.001), metrics=['accuracy'], )
I've seen the Adam optimizer crash the session.
Metal device set to: AMD Radeon Pro 5700 XT
2021-10-25 12:01:51.733970: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-10-25 12:01:51.734526: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2021-10-25 12:01:51.734764: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
2021-10-25 12:01:51.902618: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2021-10-25 12:01:51.902647: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
2021-10-25 12:01:52.021880: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-25 12:01:52.035650: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-25 12:01:52.081019: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-25 12:01:52.099696: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-25 12:01:52.211089: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-25 12:01:52.229341: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-25 12:01:52.237014: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-25 12:01:52.261855: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-25 12:01:52.279544: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
2021-10-25 12:01:52.304527: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-25 12:01:52.324218: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
Train on 469 steps, validate on 79 steps
Epoch 1/12
469/469 [==============================] - ETA: 0s - batch: 234.0000 - size: 1.0000 - loss: 2.2622 - accuracy: 0.1993
/Users/davidlaxer/tensorflow-metal/lib/python3.8/site-packages/keras/engine/training.py:2470: UserWarning: `Model.state_updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
warnings.warn('`Model.state_updates` will be removed in a future version. '
2021-10-25 12:02:06.665054: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
469/469 [==============================] - 15s 19ms/step - batch: 234.0000 - size: 1.0000 - loss: 2.2622 - accuracy: 0.1993 - val_loss: 2.2074 - val_accuracy: 0.4665
Epoch 2/12
469/469 [==============================] - 11s 20ms/step - batch: 234.0000 - size: 1.0000 - loss: 2.1208 - accuracy: 0.3812 - val_loss: 1.9072 - val_accuracy: 0.6792
Epoch 3/12
469/469 [==============================] - 11s 21ms/step - batch: 234.0000 - size: 1.0000 - loss: 1.6169 - accuracy: 0.5601 - val_loss: 1.0289 - val_accuracy: 0.8151
Epoch 4/12
469/469 [==============================] - 12s 22ms/step - batch: 234.0000 - size: 1.0000 - loss: 1.0248 - accuracy: 0.6935 - val_loss: 0.5984 - val_accuracy: 0.8613
Epoch 5/12
469/469 [==============================] - 12s 23ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.7831 - accuracy: 0.7570 - val_loss: 0.4718 - val_accuracy: 0.8799
Epoch 6/12
469/469 [==============================] - 13s 24ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.6629 - accuracy: 0.7937 - val_loss: 0.4055 - val_accuracy: 0.8929
Epoch 7/12
469/469 [==============================] - 13s 24ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.6024 - accuracy: 0.8123 - val_loss: 0.3660 - val_accuracy: 0.9007
Epoch 8/12
469/469 [==============================] - 13s 25ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.5541 - accuracy: 0.8301 - val_loss: 0.3380 - val_accuracy: 0.9073
Epoch 9/12
469/469 [==============================] - 13s 25ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.5244 - accuracy: 0.8397 - val_loss: 0.3181 - val_accuracy: 0.9121
Epoch 10/12
469/469 [==============================] - 13s 24ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.4910 - accuracy: 0.8500 - val_loss: 0.2988 - val_accuracy: 0.9161
Epoch 11/12
469/469 [==============================] - 13s 25ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.4683 - accuracy: 0.8570 - val_loss: 0.2857 - val_accuracy: 0.9186
Epoch 12/12
469/469 [==============================] - 14s 25ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.4562 - accuracy: 0.8600 - val_loss: 0.2736 - val_accuracy: 0.9207
[1]:
<keras.callbacks.History at 0x7f8758fd9310>
On my iMac 27" with Monterey 12.0.1 it crashes with the GPU in tensorflow-metal:
% python muzero.py
2021-10-21 08:36:21.088556: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Metal device set to: AMD Radeon Pro 5700 XT
systemMemory: 128.00 GB
maxCacheSize: 7.99 GB
2021-10-21 08:36:21.089347: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2021-10-21 08:36:21.089966: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
2021-10-21 08:36:21.753689: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
2021-10-21 08:36:21.759239: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-21 08:36:34.888 python[14296:730686] -[MPSGraph adamUpdateWithLearningRateTensor:beta1Tensor:beta2Tensor:epsilonTensor:beta1PowerTensor:beta2PowerTensor:valuesTensor:momentumTensor:velocityTensor:gradientTensor:name:]: unrecognized selector sent to instance 0x600001b26220
zsh: segmentation fault python muzero.py
It runs with the CPU.
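The selector that is not recognized takes a learning rate, beta1, beta2, epsilon, the two beta powers, and the values, momentum, velocity and gradient tensors. Those names match the inputs of the standard Adam update, which is sketched below for a single scalar parameter. This is the textbook formula written in plain Python, not Apple's MPSGraph code, and the mapping from selector arguments to these variables is an assumption based on the names alone.

```python
import math


def adam_step(value, gradient, momentum, velocity, beta1_power, beta2_power,
              learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-7):
    """Apply one Adam update to a scalar and return (value, momentum, velocity).

    beta1_power and beta2_power are beta1**t and beta2**t for step t (t >= 1).
    """
    # Exponential moving averages of the gradient and the squared gradient.
    momentum = beta1 * momentum + (1.0 - beta1) * gradient
    velocity = beta2 * velocity + (1.0 - beta2) * gradient * gradient
    # Bias-corrected step size.
    step_size = learning_rate * math.sqrt(1.0 - beta2_power) / (1.0 - beta1_power)
    value = value - step_size * momentum / (math.sqrt(velocity) + epsilon)
    return value, momentum, velocity


if __name__ == "__main__":
    # First step (t = 1) from value 1.0 with gradient 0.5.
    print(adam_step(1.0, 0.5, 0.0, 0.0, beta1_power=0.9, beta2_power=0.999))
```

SGD needs none of the momentum or velocity state, which is consistent with it not going through this code path.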
% python --version
Python 3.8.5
% pip freeze
absl-py==0.12.0
anyio==3.3.2
appnope==0.1.2
argon2-cffi==21.1.0
asttokens==2.0.5
astunparse==1.6.3
attrs==21.2.0
Babel==2.9.1
backcall==0.2.0
bleach==4.1.0
bokeh==2.3.3
cachetools==4.2.4
certifi==2021.5.30
cffi==1.14.6
charset-normalizer==2.0.6
clang==5.0
cloudpickle==2.0.0
colorama==0.4.4
cycler==0.10.0
Cython==0.29.24
debugpy==1.5.0
decorator==5.1.0
defusedxml==0.7.1
dill==0.3.4
distinctipy==1.1.5
dm-tree==0.1.6
dotmap==1.3.24
entrypoints==0.3
executing==0.8.2
flatbuffers==1.12
future==0.18.2
gast==0.4.0
gensim==3.8.3
google-auth==1.35.0
google-auth-oauthlib==0.4.6
google-pasta==0.2.0
googleapis-common-protos==1.53.0
grpcio==1.41.0
gviz-api==1.9.0
gym==0.21.0
h5py==3.1.0
hdbscan==0.8.27
icecream==2.1.1
idna==3.2
importlib-resources==5.2.2
ipykernel==6.4.1
ipython==7.28.0
ipython-genutils==0.2.0
ipywidgets==7.6.5
jedi==0.18.0
Jinja2==3.0.2
joblib==1.1.0
json5==0.9.6
jsonschema==4.0.1
jupyter-client==7.0.6
jupyter-core==4.8.1
jupyter-server==1.11.1
jupyterlab==3.1.18
jupyterlab-pygments==0.1.2
jupyterlab-server==2.8.2
jupyterlab-widgets==1.0.2
keras==2.6.0
Keras-Preprocessing==1.1.2
kiwisolver==1.3.2
llvmlite==0.37.0
Markdown==3.3.4
MarkupSafe==2.0.1
matplotlib==3.4.3
matplotlib-inline==0.1.3
memory-profiler==0.58.0
mistune==0.8.4
nbclassic==0.3.2
nbclient==0.5.4
nbconvert==6.2.0
nbformat==5.1.3
nest-asyncio==1.5.1
nmslib==2.1.1
notebook==6.4.4
numba==0.54.0
numpy==1.20.3
oauthlib==3.1.1
opt-einsum==3.3.0
packaging==21.0
pandas==1.3.3
pandocfilters==1.5.0
parso==0.8.2
pexpect==4.8.0
pickleshare==0.7.5
Pillow==8.3.2
prometheus-client==0.11.0
promise==2.3
prompt-toolkit==3.0.20
protobuf==3.18.1
psutil==5.8.0
ptyprocess==0.7.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pybind11==2.6.1
pycparser==2.20
Pygments==2.10.0
pynndescent==0.5.4
pyparsing==2.4.7
pyrsistent==0.18.0
python-dateutil==2.8.2
pytz==2021.3
PyYAML==5.4.1
pyzmq==22.3.0
requests==2.26.0
requests-oauthlib==1.3.0
requests-unixsocket==0.2.0
rsa==4.7.2
scikit-learn==1.0
scipy==1.7.1
Send2Trash==1.8.0
six==1.15.0
smart-open==5.2.1
sniffio==1.2.0
tabulate==0.8.9
tensorboard==2.6.0
tensorboard-data-server==0.6.1
tensorboard-plugin-profile==2.5.0
tensorboard-plugin-wit==1.8.0
tensorflow==2.6.0
tensorflow-consciousness==0.1
tensorflow-datasets==4.4.0
tensorflow-estimator==2.6.0
tensorflow-gan==2.1.0
tensorflow-hub==0.12.0
tensorflow-macos==2.6.0
tensorflow-metadata==1.2.0
tensorflow-metal==0.2.0
tensorflow-probability==0.14.1
tensorflow-similarity==0.13.45
tensorflow-text==2.6.0
termcolor==1.1.0
terminado==0.12.1
testpath==0.5.0
threadpoolctl==3.0.0
top2vec==1.0.26
tornado==6.1
tqdm==4.62.3
traitlets==5.1.0
typing-extensions==3.7.4.3
umap-learn==0.5.1
urllib3==1.26.7
wcwidth==0.2.5
webencodings==0.5.1
websocket-client==1.2.1
Werkzeug==2.0.2
widgetsnbextension==3.5.1
wordcloud==1.8.1
wrapt==1.12.1
zipp==3.6.0
Virtual Environment
% pip list
Package Version
------------------------ ---------
absl-py 0.12.0
anyio 3.3.2
appnope 0.1.2
argon2-cffi 21.1.0
astunparse 1.6.3
attrs 21.2.0
Babel 2.9.1
backcall 0.2.0
bleach 4.1.0
bokeh 2.3.3
cachetools 4.2.4
certifi 2021.5.30
cffi 1.14.6
charset-normalizer 2.0.6
clang 5.0
cloudpickle 2.0.0
cycler 0.10.0
Cython 0.29.24
debugpy 1.5.0
decorator 5.1.0
defusedxml 0.7.1
dill 0.3.4
distinctipy 1.1.5
dm-tree 0.1.6
dotmap 1.3.24
entrypoints 0.3
flatbuffers 1.12
future 0.18.2
gast 0.4.0
gensim 3.8.3
google-auth 1.35.0
google-auth-oauthlib 0.4.6
google-pasta 0.2.0
googleapis-common-protos 1.53.0
grpcio 1.41.0
h5py 3.1.0
hdbscan 0.8.27
idna 3.2
importlib-resources 5.2.2
ipykernel 6.4.1
ipython 7.28.0
ipython-genutils 0.2.0
ipywidgets 7.6.5
jedi 0.18.0
Jinja2 3.0.2
joblib 1.1.0
json5 0.9.6
jsonschema 4.0.1
jupyter-client 7.0.6
jupyter-core 4.8.1
jupyter-server 1.11.1
jupyterlab 3.1.18
jupyterlab-pygments 0.1.2
jupyterlab-server 2.8.2
jupyterlab-widgets 1.0.2
keras 2.6.0
Keras-Preprocessing 1.1.2
kiwisolver 1.3.2
llvmlite 0.37.0
Markdown 3.3.4
MarkupSafe 2.0.1
matplotlib 3.4.3
matplotlib-inline 0.1.3
memory-profiler 0.58.0
mistune 0.8.4
nbclassic 0.3.2
nbclient 0.5.4
nbconvert 6.2.0
nbformat 5.1.3
nest-asyncio 1.5.1
nmslib 2.1.1
notebook 6.4.4
numba 0.54.0
numpy 1.20.3
oauthlib 3.1.1
opt-einsum 3.3.0
packaging 21.0
pandas 1.3.3
pandocfilters 1.5.0
parso 0.8.2
pexpect 4.8.0
pickleshare 0.7.5
Pillow 8.3.2
pip 21.2.4
prometheus-client 0.11.0
promise 2.3
prompt-toolkit 3.0.20
protobuf 3.18.1
psutil 5.8.0
ptyprocess 0.7.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pybind11 2.6.1
pycparser 2.20
Pygments 2.10.0
pynndescent 0.5.4
pyparsing 2.4.7
pyrsistent 0.18.0
python-dateutil 2.8.2
pytz 2021.3
PyYAML 5.4.1
pyzmq 22.3.0
requests 2.26.0
requests-oauthlib 1.3.0
requests-unixsocket 0.2.0
rsa 4.7.2
scikit-learn 1.0
scipy 1.7.1
Send2Trash 1.8.0
setuptools 47.1.0
six 1.15.0
smart-open 5.2.1
sniffio 1.2.0
tabulate 0.8.9
tensorboard 2.6.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.0
tensorflow 2.6.0
tensorflow-consciousness 0.1
tensorflow-datasets 4.4.0
tensorflow-estimator 2.6.0
tensorflow-gan 2.1.0
tensorflow-hub 0.12.0
tensorflow-macos 2.6.0
tensorflow-metadata 1.2.0
tensorflow-metal 0.2.0
tensorflow-probability 0.14.1
tensorflow-similarity 0.13.45
tensorflow-text 2.6.0
termcolor 1.1.0
terminado 0.12.1
testpath 0.5.0
threadpoolctl 3.0.0
top2vec 1.0.26
tornado 6.1
tqdm 4.62.3
traitlets 5.1.0
typing-extensions 3.7.4.3
umap-learn 0.5.1
urllib3 1.26.7
wcwidth 0.2.5
webencodings 0.5.1
websocket-client 1.2.1
Werkzeug 2.0.2
wheel 0.37.0
widgetsnbextension 3.5.1
wordcloud 1.8.1
wrapt 1.12.1
zipp 3.6.0
(tensorflow-metal) (base) davidlaxer@x86_64-apple-darwin13 ~ %
This code reproduces the crash:
test.txt
Also, running without Metal (CPU only) is 4x faster with the 'SGD' optimizer. I can't compare the Adam optimizer, since it crashed.
In [2]: import tensorflow as tf
...:
...: mnist = tf.keras.datasets.mnist
...:
...: (x_train, y_train), (x_test, y_test) = mnist.load_data()
...: x_train, x_test = x_train / 255.0, x_test / 255.0
...:
...: model = tf.keras.models.Sequential([
...: tf.keras.layers.Flatten(input_shape=(28, 28)),
...: tf.keras.layers.Dense(128, activation='relu'),
...: tf.keras.layers.Dropout(0.2),
...: tf.keras.layers.Dense(10)
...: ])
...:
...: predictions = model(x_train[:1]).numpy()
...: tf.nn.softmax(predictions).numpy()
...:
...: loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True
...: )
...:
...: loss_fn(y_train[:1], predictions).numpy()
...:
...: model.compile(optimizer = 'adam', loss = loss_fn)
...: model.fit(x_train, y_train, epochs=100)
Epoch 1/100
2021-10-10 10:50:53.503460: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-10 10:50:53.527 python[25080:3485800] -[MPSGraph adamUpdateWithLearningRateTensor:beta1Tensor:beta2Tensor:epsilonTensor:beta1PowerTensor:beta2PowerTensor:valuesTensor:momentumTensor:velocityTensor:gradientTensor:name:]: unrecognized selector sent to instance 0x6000037975a0
zsh: segmentation fault ipython
tensorflow_metal (GPU):
% time python test.py
2021-10-10 11:34:34.602604: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Metal device set to: AMD Radeon Pro 5700 XT
systemMemory: 128.00 GB
maxCacheSize: 7.99 GB
2021-10-10 11:34:34.603850: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2021-10-10 11:34:34.604642: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
2021-10-10 11:34:35.779610: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
Epoch 1/100
2021-10-10 11:34:35.929611: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
1875/1875 [==============================] - 7s 3ms/step - loss: 0.7213
Epoch 2/100
1875/1875 [==============================] - 6s 3ms/step - loss: 0.3865
...
Epoch 100/100
1875/1875 [==============================] - 6s 3ms/step - loss: 0.0473
python test.py 721.48s user 375.56s system 173% cpu 10:31.28 total
(tensorflow-metal) (base) davidlaxer@x86_64-apple-darwin13 ~ %
tensorflow (CPU):
% time python ~/test.py
2021-10-10 11:45:44.111971: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-10-10 11:45:44.487763: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
Epoch 1/100
1875/1875 [==============================] - 1s 460us/step - loss: 0.7210
Epoch 2/100
1875/1875 [==============================] - 1s 459us/step - loss: 0.3874
Epoch 3/100
1875/1875 [==============================] - 1s 459us/step - loss: 0.3233
Epoch 4/100
1875/1875 [==============================] - 1s 460us/step - loss: 0.2884
Epoch 5/100
1875/1875 [==============================] - 1s 471us/step - loss: 0.2608
Epoch 6/100
1875/1875 [==============================] - 1s 462us/step - loss: 0.2400
Epoch 7/100
...
Epoch 99/100
1875/1875 [==============================] - 1s 468us/step - loss: 0.0455
Epoch 100/100
1875/1875 [==============================] - 1s 469us/step - loss: 0.0463
python ~/test.py 181.09s user 48.20s system 246% cpu 1:32.86 total
(ai) davidlaxer@x86_64-apple-darwin13 text %
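The two `time` lines above can be turned into ratios directly. The short calculation below uses only the numbers printed in those lines; elapsed_to_seconds is a hypothetical helper for the zsh `m:ss.ss` format.

```python
def elapsed_to_seconds(text):
    """Convert a zsh `time` total such as '10:31.28' into seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60.0 + float(part)
    return seconds


gpu_wall = elapsed_to_seconds("10:31.28")  # tensorflow-metal (GPU) run
cpu_wall = elapsed_to_seconds("1:32.86")   # tensorflow (CPU) run

wall_ratio = gpu_wall / cpu_wall
user_ratio = 721.48 / 181.09               # user CPU seconds, GPU run vs CPU run

print(f"wall-clock ratio: {wall_ratio:.2f}")
print(f"user-time ratio:  {user_ratio:.2f}")
```

By wall-clock time the CPU run is about 6.8 times faster, and by user CPU time about 4 times, so the two measures should not be mixed when quoting the speedup.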
I installed the latest versions of tensorflow-macos and tensorflow-metal on macOS 11.6.
Now it no longer prints that it's using Metal or my AMD GPU.
% ipython
In [3]: import tensorflow
No supported GPU was found.
I installed the latest versions from PyPI into my existing tensorflow-metal virtual environment with:
% pip install tensorflow-macos==2.6.0
% pip install tensorflow-metal==0.2.0
What's changed? Do I need to recreate the tensorflow-metal virtual environment from scratch?
% pip show tensorflow-metal
Name: tensorflow-metal
Version: 0.2.0
Summary: TensorFlow acceleration for Mac GPUs.
Home-page: https://developer.apple.com/metal/tensorflow-plugin/
Author:
Author-email:
License: MIT License. Copyright © 2020-2021 Apple Inc. All rights reserved.
Location: /Users/davidlaxer/tensorflow-metal/lib/python3.8/site-packages
Requires: wheel, six
Required-by:
(tensorflow-metal) (base) davidlaxer@x86_64-apple-darwin13 Top2Vec % pip show tensorflow-macos
Name: tensorflow-macos
Version: 2.6.0
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
% pip show
tensorboard 2.6.0
tensorboard-data-server 0.6.1
tensorboard-plugin-profile 2.5.0
tensorboard-plugin-wit 1.8.0
tensorflow 2.6.0
tensorflow-consciousness 0.1
tensorflow-datasets 4.3.0
tensorflow-determinism 0.3.0
tensorflow-estimator 2.6.0
tensorflow-gan 2.1.0
tensorflow-hub 0.12.0
tensorflow-macos 2.6.0
tensorflow-metadata 1.1.0
tensorflow-metal 0.2.0
tensorflow-probability 0.13.0
tensorflow-similarity 0.13.45
tensorflow-text 2.6.0
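The same version check can be done from inside the interpreter that actually imports TensorFlow, which rules out pip and python pointing at different environments. This is a minimal sketch using importlib.metadata from the standard library (Python 3.8 and later); installed_versions is a hypothetical helper name.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    packages = ["tensorflow", "tensorflow-macos", "tensorflow-metal"]
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
```

Running it in the same ipython session that prints "No supported GPU was found." shows which of the three distributions that session really sees.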
I was able to profile Keras/TensorFlow example code in a tensorflow-metal virtual environment. Please note that the Profile tab only displays results in Google Chrome. In Safari the Profile tab was empty.
This was fixed in tensorflow-metal version 0.1.2.
I got this error in a (tensorflow-metal) virtualenv on Big Sur with an AMD Radeon 5700 XT GPU:
tensorflow-metal/lib/python3.8/site-packages/tensorflow-plugins/libmetal_plugin.dylib, 6): Symbol not found: _TF_AssignUpdateVariable
Referenced from: /Users/davidlaxer/tensorflow-metal/lib/python3.8/site-packages/tensorflow-plugins/libmetal_plugin.dylib
Expected in: flat namespace
$ nm /Users/davidlaxer/tensorflow-metal/lib/python3.8/site-packages/tensorflow-plugins/libmetal_plugin.dylib | grep _TF_AssignUpdateVariable
U _TF_AssignUpdateVariable
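In nm output the type letter U marks a symbol that the library references but does not define, so another loaded image has to provide it. That matches the "Symbol not found ... Expected in: flat namespace" message above. The sketch below pulls the undefined symbols out of nm text; undefined_symbols is a hypothetical helper, and the second sample line is made up for illustration.

```python
def undefined_symbols(nm_output):
    """Return the names of symbols that nm reports with type 'U'."""
    symbols = []
    for line in nm_output.splitlines():
        fields = line.split()
        # Undefined symbols have no address column: just the type and the name.
        if len(fields) == 2 and fields[0] == "U":
            symbols.append(fields[1])
    return symbols


if __name__ == "__main__":
    sample = (
        "                 U _TF_AssignUpdateVariable\n"
        "0000000000001000 T _hypothetical_defined_symbol\n"
    )
    print(undefined_symbols(sample))
```

Comparing that list against the symbols exported by the installed TensorFlow library shows whether the plugin and TensorFlow versions match.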