I was also having this problem on my 2019 MacBook Pro, and I managed to solve it like this: you can circumvent the problem by creating your own implementation of Adam in Keras and using that.
I have made a very rough and basic implementation while referencing the Adam research paper (https://arxiv.org/abs/1412.6980) and the TensorFlow guide on creating a custom optimizer (https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Optimizer#creating_a_custom_optimizer_2).
Please note that I have not implemented _resource_apply_sparse or any of Adam’s fancier bells and whistles (such as amsgrad). This is a simple and basic implementation of the optimiser as described in the paper I referenced above.
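For reference, these are the update rules from the paper that the code below implements (here g_t is the gradient; in the code, the beta_v slot holds the first-moment estimate m and the beta_s slot holds the second-moment estimate v):

```latex
m_t       = \beta_1 m_{t-1} + (1 - \beta_1) g_t
v_t       = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2
\hat{m}_t = m_t / (1 - \beta_1^t)
\hat{v}_t = v_t / (1 - \beta_2^t)
\theta_t  = \theta_{t-1} - \alpha \hat{m}_t / (\sqrt{\hat{v}_t} + \epsilon)
```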
IMPORTANT NOTE:
The code has to run in eager mode (because of the self.iterations.numpy() calls).
To enable this, add the line tf.config.run_functions_eagerly(True) at the top of the code. (See the side note after the optimiser code for a possible way to drop this requirement.)
Optimiser code:
```python
import tensorflow as tf

# Required because _resource_apply_dense calls self.iterations.numpy().
tf.config.run_functions_eagerly(True)


class CustomAdam(tf.keras.optimizers.Optimizer):
    def __init__(self, learning_rate=0.001, beta1=0.9, beta2=0.999,
                 epsilon=1e-8, name="CustomAdam", **kwargs):
        super().__init__(name, **kwargs)
        self._set_hyper("learning_rate", kwargs.get("lr", learning_rate))
        self._set_hyper("decay", self._initial_decay)
        self._set_hyper("beta_v", beta1)      # decay rate for the 1st moment
        self._set_hyper("beta_s", beta2)      # decay rate for the 2nd moment
        self._set_hyper("epsilon", epsilon)
        self._set_hyper("corrected_v", beta1)
        self._set_hyper("corrected_s", beta2)

    def _create_slots(self, var_list):
        """One slot per model variable for each piece of per-variable state."""
        for var in var_list:
            self.add_slot(var, "beta_v")       # 1st moment estimate (m)
            self.add_slot(var, "beta_s")       # 2nd moment estimate (v)
            self.add_slot(var, "epsilon")
            self.add_slot(var, "corrected_v")  # bias-corrected 1st moment
            self.add_slot(var, "corrected_s")  # bias-corrected 2nd moment

    @tf.function
    def _resource_apply_dense(self, grad, var):
        """Update the slots and perform an optimization step for the model variable."""
        var_dtype = var.dtype.base_dtype
        lr_t = self._decayed_lr(var_dtype)

        momentum_var1 = self.get_slot(var, "beta_v")
        momentum_hyper1 = self._get_hyper("beta_v", var_dtype)

        momentum_var2 = self.get_slot(var, "beta_s")
        momentum_hyper2 = self._get_hyper("beta_s", var_dtype)

        # m_t = beta1 * m_{t-1} + (1 - beta1) * g_t
        momentum_var1.assign(momentum_var1 * momentum_hyper1 + (1. - momentum_hyper1) * grad)

        # v_t = beta2 * v_{t-1} + (1 - beta2) * g_t^2
        momentum_var2.assign(momentum_var2 * momentum_hyper2 + (1. - momentum_hyper2) * (grad ** 2))

        # Bias-corrected estimates; the .numpy() calls are why eager mode is required.
        corrected_v = self.get_slot(var, "corrected_v")
        corrected_v.assign(momentum_var1 / (1 - (momentum_hyper1 ** (self.iterations.numpy() + 1))))

        corrected_s = self.get_slot(var, "corrected_s")
        corrected_s.assign(momentum_var2 / (1 - (momentum_hyper2 ** (self.iterations.numpy() + 1))))

        epsilon_hyper = self._get_hyper("epsilon", var_dtype)

        # theta_t = theta_{t-1} - lr * m_hat / (sqrt(v_hat) + epsilon)
        var.assign_add(-lr_t * (corrected_v / (tf.sqrt(corrected_s) + epsilon_hyper)))

    def _resource_apply_sparse(self, grad, var, indices):
        raise NotImplementedError

    def get_config(self):
        base_config = super().get_config()
        return {
            **base_config,
            "learning_rate": self._serialize_hyperparameter("learning_rate"),
            "decay": self._serialize_hyperparameter("decay"),
            "beta_v": self._serialize_hyperparameter("beta_v"),
            "beta_s": self._serialize_hyperparameter("beta_s"),
            "epsilon": self._serialize_hyperparameter("epsilon"),
        }
```
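Side note: the eager-mode requirement comes entirely from the self.iterations.numpy() calls in the bias-correction step. If you would rather not force eager execution, one possible tweak (just a sketch I have not tested, not part of the original fix) is to keep the step count as a tensor and replace the two corrected_* assignments inside _resource_apply_dense with something like:

```python
# Untested sketch: keep the iteration counter as a tensor so that .numpy()
# (and therefore tf.config.run_functions_eagerly(True)) is not needed.
local_step = tf.cast(self.iterations + 1, var_dtype)
corrected_v.assign(momentum_var1 / (1 - tf.pow(momentum_hyper1, local_step)))
corrected_s.assign(momentum_var2 / (1 - tf.pow(momentum_hyper2, local_step)))
```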
Example usage:
```python
model.compile(optimizer=CustomAdam(),
              loss='mse')

model.fit(X, Y, epochs=10)
```
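If you want something end-to-end to sanity-check the optimiser, here is a minimal self-contained sketch; the toy model and the random X/Y data are just placeholders, not part of the original problem:

```python
import numpy as np
import tensorflow as tf

# Placeholder regression data, only meant to exercise the optimiser.
X = np.random.rand(256, 8).astype("float32")
Y = np.random.rand(256, 1).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(1),
])

model.compile(optimizer=CustomAdam(learning_rate=0.001), loss="mse")
model.fit(X, Y, epochs=10, batch_size=32)
```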