In a tensorflow-metal virtual environment on macOS 12.1:
tensorboard                   2.6.0
tensorboard-data-server       0.6.1
tensorboard-plugin-profile    2.5.0
tensorboard-plugin-wit        1.8.0
tensorflow                    2.6.0
tensorflow-addons             0.14.0
tensorflow-consciousness      0.1
tensorflow-datasets           4.4.0
tensorflow-estimator          2.7.0
tensorflow-gan                2.1.0
tensorflow-hub                0.12.0
tensorflow-io-gcs-filesystem  0.22.0
tensorflow-macos              2.7.0
tensorflow-metadata           1.2.0
tensorflow-metal              0.3.0
tensorflow-probability        0.14.1
tensorflow-similarity         0.13.45
tensorflow-text               2.7.3
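The list above contains both 2.6.x and 2.7.x releases of the core TensorFlow packages. Whether that has anything to do with the error below is not established here; the following is only a small sketch for making such a mix visible. The helper name `group_by_minor` is made up for illustration, and the versions are copied by hand from the list.

```python
from collections import defaultdict


def group_by_minor(versions):
    """Group package names by their major.minor version string."""
    groups = defaultdict(list)
    for name, version in versions.items():
        major_minor = ".".join(version.split(".")[:2])
        groups[major_minor].append(name)
    return dict(groups)


# Core packages and versions copied from the list above.
core = {
    "tensorflow": "2.6.0",
    "tensorflow-macos": "2.7.0",
    "tensorflow-estimator": "2.7.0",
    "tensorboard": "2.6.0",
}

groups = group_by_minor(core)
for minor, names in sorted(groups.items()):
    print(minor, sorted(names))
```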
Running the Top2Vec model: https://github.com/ddangelov/Top2Vec
import numpy as np
import pandas as pd
import json
import os
import ipywidgets as widgets
from IPython.display import clear_output, display
from top2vec import Top2Vec

papers_prepared_df = pd.read_feather("/Users/davidlaxer/Downloads/archive/covid19_papers_processed.feather")

top2vec_trained = Top2Vec(documents=papers_prepared_df.text.tolist(),
                          embedding_model="universal-sentence-encoder",
                          use_embedding_model_tokenizer=True,
                          embedding_model_path="/Users/davidlaxer/Downloads/universal-sentence-encoder_4/",
                          workers=4)

2021-12-20 06:30:52,188 - top2vec - INFO - Pre-processing documents for training
/Users/davidlaxer/tensorflow-metal/lib/python3.8/site-packages/sklearn/utils/deprecation.py:87: FutureWarning: Function get_feature_names is deprecated; get_feature_names is deprecated in 1.0 and will be removed in 1.2. Please use get_feature_names_out instead.
  warnings.warn(msg, category=FutureWarning)
2021-12-20 06:31:57,351 - top2vec - INFO - Loading universal-sentence-encoder model at /Users/davidlaxer/Downloads/universal-sentence-encoder_4
2021-12-20 06:31:57.488459: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-12-20 06:31:57.489288: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2021-12-20 06:31:57.489490: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
Metal device set to: AMD Radeon Pro 5700 XT
2021-12-20 06:31:59.447260: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-12-20 06:32:00,841 - top2vec - INFO - Creating joint document/word embedding
2021-12-20 06:32:00.923838: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.

Some resource has been exhausted.
For example, this error might be raised if a per-user quota is exhausted, or perhaps the entire file system is out of space.
@@__init__
2 root error(s) found.
  (0) RESOURCE_EXHAUSTED: OOM when allocating tensor with shape[114389,320] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator Simple allocator
     [[{{node EncoderDNN/EmbeddingLookup/EmbeddingLookupUnique/GatherV2}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
     [[StatefulPartitionedCall/StatefulPartitionedCall/EncoderDNN/EmbeddingLookup/EmbeddingLookupUnique/Reshape_1/_188]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
  (1) RESOURCE_EXHAUSTED: OOM when allocating tensor with shape[114389,320] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator Simple allocator
     [[{{node EncoderDNN/EmbeddingLookup/EmbeddingLookupUnique/GatherV2}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
...
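For scale, the tensor named in the error can be sized by hand. This is a rough sketch that assumes "type float" means 32-bit floats (4 bytes per element); the helper name `tensor_bytes` is made up for illustration.

```python
from math import prod


def tensor_bytes(shape, bytes_per_element=4):
    """Return the raw size in bytes of a dense tensor with the given shape."""
    return prod(shape) * bytes_per_element


# Shape taken from the OOM message: shape[114389,320], type float.
size = tensor_bytes((114389, 320))
print(size, "bytes")                   # 146417920 bytes
print(round(size / 2**20, 1), "MiB")   # 139.6 MiB
```

So the single allocation that fails is roughly 140 MiB, while the log reports the GPU device as created "with 0 MB memory".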
I tried adjusting the batch size (e.g., 500, 100, 50, 10, 5).
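To be explicit about what I mean by batch size: the documents are passed to the encoder in fixed-size chunks instead of all at once. The sketch below illustrates that idea only. It is not Top2Vec's internal code; `iter_batches`, `embed_in_batches`, and `fake_encode` are names made up for illustration, and `fake_encode` is a stand-in for the real encoder so the example runs without TensorFlow.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(documents, encode_fn, batch_size):
    """Call `encode_fn` on one batch at a time and collect all vectors."""
    embeddings = []
    for batch in iter_batches(documents, batch_size):
        embeddings.extend(encode_fn(batch))
    return embeddings


def fake_encode(batch):
    # Stand-in encoder (assumption): one 1-dimensional "vector" per text.
    return [[float(len(text))] for text in batch]


docs = ["a", "bb", "ccc", "dddd", "eeeee"]
batch_sizes = [len(b) for b in iter_batches(docs, 2)]
vectors = embed_in_batches(docs, fake_encode, batch_size=2)
print(batch_sizes)
print(vectors)
```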