Hello,
I got a brand new MacBook M3 Pro and am trying to configure TensorFlow with GPU support. I followed the instructions at https://developer.apple.com/metal/tensorflow-plugin/ step by step. Unfortunately, even after repeatedly recreating the environment and uninstalling and reinstalling TensorFlow, the problem is not resolved: Python crashes on import, so I cannot get far enough to try a Jupyter notebook. Here are the error and the package versions in my "tf" environment. I already spent all of Saturday on this with no progress. Can someone tell me what is going on?
Python 3.11.7 (main, Dec 4 2023, 18:10:11) [Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2024-01-07 11:44:04.893581: F tensorflow/c/experimental/stream_executor/stream_executor.cc:743] Non-OK-status: stream_executor::MultiPlatformManager::RegisterPlatform( std::move(cplatform)) status: INTERNAL: platform is already registered with name: "METAL"
[1] 1797 abort /opt/homebrew/bin/python3
❯ python -m pip list | grep tensorflow
tensorflow 2.15.0
tensorflow-estimator 2.15.0
tensorflow-io-gcs-filesystem 0.34.0
tensorflow-macos 2.15.0
tensorflow-metal 1.1.0
❯ python --version
Python 3.11.7
OS is Sonoma 14.2.1
Thanks
Sohail
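The fatal message above says a platform named "METAL" is already registered, which would be consistent with the Metal plugin being picked up more than once. The abort line also names /opt/homebrew/bin/python3 rather than an interpreter inside the "tf" environment. A minimal diagnostic sketch follows. It assumes that tensorflow-metal installs its library into a "tensorflow-plugins" folder inside site-packages (an assumption to verify on your machine), and the helper name find_plugin_dirs is hypothetical. It does not import TensorFlow, so it runs even when the import crashes.

```python
import sys
from pathlib import Path


def find_plugin_dirs(search_paths):
    """Return every 'tensorflow-plugins' directory found directly under the given paths.

    Assumption: tensorflow-metal places its plugin library in a folder with this name.
    """
    found = []
    for entry in search_paths:
        candidate = Path(entry) / "tensorflow-plugins"
        if candidate.is_dir() and candidate not in found:
            found.append(candidate)
    return found


# Show which interpreter is actually running and where plugin folders are visible.
print("interpreter:", sys.executable)
for plugin_dir in find_plugin_dirs(sys.path):
    print("plugin dir:", plugin_dir)
```

If more than one plugin directory is printed, or the interpreter path is not the one from the intended environment, that would be worth cleaning up before reinstalling anything.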
Hello
I use a Mac Pro M2 with 16 GB of memory.
This is my code. It is very basic.
model = Sequential()
model.add(LSTM(units=50, input_shape=(X_train.shape[1], X_train.shape[2])))
model.add(Dense(units=1))
model.compile(optimizer='adam', loss='mse')
model.fit(X_train, y_train, epochs=50, batch_size=16)
train_predict = model.predict(X_train)
test_predict = model.predict(X_test)
train_predict = scaler.inverse_transform(train_predict)
y_train = scaler.inverse_transform(y_train)
test_predict = scaler.inverse_transform(test_predict)
y_test = scaler.inverse_transform(y_test)
When I try to execute this code, Anaconda gives the following error:
I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M2
I metal_plugin/src/device/metal_device.cc:296] systemMemory: 16.00 GB
I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 5.33 GB
I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: )
I can't find any solution. Could you help me?
Thank you
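Each log line quoted above starts with the letter I. In TensorFlow's C++ log format that leading letter is the severity (I for info, W for warning, E for error, F for fatal), so the quoted lines are informational rather than errors. A small sketch for sorting such lines by severity is below. The regular expression and the helper name severity_of are illustrative assumptions, not part of TensorFlow.

```python
import re

# Severity letters used at the start of TensorFlow's C++ log lines.
SEVERITY = {"I": "INFO", "W": "WARNING", "E": "ERROR", "F": "FATAL"}

# Optional "YYYY-MM-DD HH:MM:SS.ffffff: " prefix, then the severity letter,
# then a source location such as "metal_plugin/src/device/metal_device.cc:1154]".
LOG_LINE = re.compile(r"^(?:\d{4}-\d{2}-\d{2} [\d:.]+: )?([IWEF]) \S+:\d+\]")


def severity_of(line):
    """Return the severity name for a log line, or None if it does not look like one."""
    match = LOG_LINE.match(line.strip())
    return SEVERITY[match.group(1)] if match else None


info_line = "I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M2"
fatal_line = ("2024-01-07 11:44:04.893581: F tensorflow/c/experimental/stream_executor/"
              "stream_executor.cc:743] Non-OK-status")
print(severity_of(info_line))
print(severity_of(fatal_line))
```

If the script really stops, the actual failure is likely in a later W, E, or F line or in a Python traceback that is not included in the post.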
I am working on a design that requires connecting an iOS device to two audio output devices, specifically headphones and a speaker. I want the audio driver to switch between output devices without user action. Is this possible with the iOS SDK?
Hi,
There seems to be a difference in behavior when running inference on a trained Keras model using the model's __call__ method vs. using the predict or predict_on_batch methods. This only happens when using the GPU for inference, and it seems that for certain sequences of operations and float types the 'relu' activation doesn't work as expected and does nothing.
I can replicate the problem with the following code (it would only fail with 'relu' activation and tf.float16 and tf.float32 types, while it works fine with tf.float64).
import tensorflow as tf
import numpy as np
DATA_LENGTH = 16
DENSE_WIDTH = 16
BATCH_SIZE = 8
DTYPE = tf.float32
ACTIVATION = 'relu'
def TestModel():
    inputs = tf.keras.Input(DATA_LENGTH, dtype=DTYPE)
    u = tf.keras.layers.Dense(DENSE_WIDTH, activation=ACTIVATION, dtype=DTYPE)(inputs)
    # u = tf.maximum(u, 0.0)
    output = u*tf.constant(1.0, dtype=DTYPE)
    model = tf.keras.Model(inputs, output, name="TestModel")
    return model
model = TestModel()
model.compile()
x = np.random.uniform(size=(BATCH_SIZE, DATA_LENGTH)).astype(DTYPE.as_numpy_dtype)
with tf.device('/GPU:0'):
    out_gpu_call = model(x, training=False)
    out_gpu_predict = model.predict_on_batch(x)
with tf.device('/CPU:0'):
    out_cpu_call = model(x, training=False)
    out_cpu_predict = model.predict_on_batch(x)
print(f'\nDTYPE {DTYPE}, ACTIVATION: {ACTIVATION}')
print("\tMean Abs. Difference GPU (__call__ vs. predict):", np.mean(np.abs(out_gpu_call - out_gpu_predict)))
print("\tMean Abs. Difference CPU (__call__ vs. predict):", np.mean(np.abs(out_cpu_call - out_cpu_predict)))
print("\tMean Abs. Difference GPU-CPU __call__:", np.mean(np.abs(out_gpu_call - out_cpu_call)))
print("\tMean Abs. Difference GPU-CPU predict():", np.mean(np.abs(out_gpu_predict - out_cpu_predict)))
The code above produces for example the following output:
DTYPE <dtype: 'float32'>, ACTIVATION: relu
Mean Abs. Difference GPU (__call__ vs. predict): 0.1955472
Mean Abs. Difference CPU (__call__ vs. predict): 0.0
Mean Abs. Difference GPU-CPU __call__: 1.3573299e-08
Mean Abs. Difference GPU-CPU predict(): 0.1955472
And the results for the GPU are:
out_gpu_call
<tf.Tensor: shape=(8, 16), dtype=float32, numpy=
array([[0.1496982 , 0. , 0. , 0.73772687, 0.26131183,
0.27757105, 0. , 0. , 0. , 0. ,
0. , 0.4164225 , 1.0367445 , 0. , 0.5860609 ,
0. ], ...
out_gpu_predict
array([[ 1.49698198e-01, -3.48425686e-01, -2.44667321e-01,
7.37726867e-01, 2.61311829e-01, 2.77571052e-01,
-2.26729304e-01, -1.06500387e-01, -3.66294265e-01,
-2.93850392e-01, -4.51043218e-01, 4.16422486e-01,
1.03674448e+00, -1.39347658e-01, 5.86060882e-01,
-2.05334812e-01], ...
Upon inspection of the results it seems that the problem is that the 'relu' activation is not setting the values < 0 to 0 when calling predict_on_batch.
When uncommenting the # u = tf.maximum(u, 0.0) line after the Dense layer there is no difference between the two calls (as should be expected).
Removing the multiplication by a constant after the Dense layer, output = u*tf.constant(1.0, dtype=DTYPE), also makes the problem disappear (even when leaving the # u = tf.maximum(u, 0.0) line commented out).
This is running with the following setup:
MacBook Pro, Apple M2 Max chip, macOS Sonoma 14.2
tf version 2.15.0
tensorflow-metal 1.1.0
Python 3.10.13
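The observation that predict_on_batch skips the ReLU can be checked directly on the numbers quoted above. The sketch below copies the first row of each GPU result from the post, applies a plain-Python ReLU to the predict_on_batch row, and compares it with the __call__ row. It needs neither TensorFlow nor a GPU, and the variable and function names are only illustrative.

```python
# First row of each GPU result, copied from the output shown in the post.
out_gpu_call = [
    0.1496982, 0.0, 0.0, 0.73772687, 0.26131183, 0.27757105, 0.0, 0.0,
    0.0, 0.0, 0.0, 0.4164225, 1.0367445, 0.0, 0.5860609, 0.0,
]
out_gpu_predict = [
    1.49698198e-01, -3.48425686e-01, -2.44667321e-01, 7.37726867e-01,
    2.61311829e-01, 2.77571052e-01, -2.26729304e-01, -1.06500387e-01,
    -3.66294265e-01, -2.93850392e-01, -4.51043218e-01, 4.16422486e-01,
    1.03674448e+00, -1.39347658e-01, 5.86060882e-01, -2.05334812e-01,
]


def relu(values):
    """Reference ReLU: clamp every negative value to zero."""
    return [max(v, 0.0) for v in values]


# A correct ReLU output can never contain negative values.
negatives = sum(1 for v in out_gpu_predict if v < 0)

# After applying ReLU by hand, the predict row should match the __call__ row
# up to the rounding of the printed values.
max_gap = max(abs(a - b) for a, b in zip(relu(out_gpu_predict), out_gpu_call))

print("negative entries in predict_on_batch row:", negatives)
print("largest gap after applying ReLU by hand:", max_gap)
```

For this row, 9 of the 16 entries are negative, and clamping them reproduces the __call__ output to within printing precision, which supports the reading that the pre-activation values are being returned.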
Training a CNN with tensorflow-metal gives an increasing loss.
But the same code runs correctly after pip uninstall tensorflow-metal.
I'm going to the U.S. to buy a vision pro, does anyone have any information about where they sell it? Will it be sold in Hawaii by any chance? For now, I'm thinking about New York.
I'm currently building an iOS app that requires the ability to detect a person's height from a live video stream. The new VNDetectHumanBodyPose3DRequest is exactly what I need, but the observations I'm getting back are very inconsistent and unreliable. By inconsistent, I mean the values never seem to settle and can fluctuate anywhere from 5'4" to 10'1" (I'm about 6'0"). By unreliable, I mean I have once seen a value that closely matches my height, but I rarely see any values that are close enough (within an inch) to the ground truth.
In terms of my code, I'm not doing anything fancy. I'm first opening a LiDAR stream on my iPhone 14 Pro:
guard let videoDevice = AVCaptureDevice.default(.builtInLiDARDepthCamera, for: .video, position: .back) else { return }
guard let videoDeviceInput = try? AVCaptureDeviceInput(device: videoDevice) else { return }
guard captureSession.canAddInput(videoDeviceInput) else { return }
captureSession.addInput(videoDeviceInput)
I'm then creating an output synchronizer so I can get both image and depth data at the same time:
videoDataOutput = AVCaptureVideoDataOutput()
captureSession.addOutput(videoDataOutput)
depthDataOutput = AVCaptureDepthDataOutput()
depthDataOutput.isFilteringEnabled = true
captureSession.addOutput(depthDataOutput)
outputVideoSync = AVCaptureDataOutputSynchronizer(dataOutputs: [depthDataOutput, videoDataOutput])
Finally, my delegate function that handles the synchronizer is roughly:
fileprivate func perform3DPoseRequest(cmSampleBuffer: CMSampleBuffer, depthData: AVDepthData) {
    let imageRequestHandler = VNImageRequestHandler(cmSampleBuffer: cmSampleBuffer, depthData: depthData, orientation: .up)
    let request = VNDetectHumanBodyPose3DRequest()
    do {
        // Perform the body pose request.
        try imageRequestHandler.perform([request])
        if let observation = request.results?.first {
            if (observation.heightEstimation == .measured) {
                print("Body height (ft) \(formatter.string(fromMeters: Double(observation.bodyHeight))) (m): \(observation.bodyHeight)")
                ...
I'd appreciate any help determining how to get accurate results from the observation's bodyHeight. Thanks!
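One generic way to tame per-frame fluctuation like this is to report a rolling median of recent measurements instead of each raw value. This is not something the Vision framework is known to do for you, and it will not help if most frames are wrong. The sketch below is written in Python only to show the idea. The class name, the window size, and the sample values (heights in metres) are all hypothetical.

```python
from collections import deque
from statistics import median


class RollingMedian:
    """Keeps the last `window` samples and reports their median."""

    def __init__(self, window=15):
        self.samples = deque(maxlen=window)

    def update(self, value):
        # Add the newest measurement; the oldest one drops out automatically.
        self.samples.append(value)
        return median(self.samples)


# Hypothetical bodyHeight readings in metres, including one wild outlier.
smoother = RollingMedian(window=3)
for reading in (1.83, 3.07, 1.80):
    smoothed = smoother.update(reading)
    print(f"raw {reading:.2f} m -> smoothed {smoothed:.2f} m")
```

A median is used rather than a mean so that a single outlier frame, such as a 10-foot reading, does not pull the reported height.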
I created a new environment on Conda and then installed TensorFlow using the command "pip install TensorFlow" on my Mac M1 Pro machine.
But TensorFlow is not working.
Hi,
I'm a French developer and I downloaded the Recognizing Speech in Live Audio sample code from the Apple Developer website. I tried to run the data generator command after changing the locale identifier from 'en_US' to 'fr' in the data generator's main file, but when I ran the command in Xcode, I got this error message: "Identifier 'fr' does not parse into two elements."
I checked the XML files associated with the bin archive file, and the identifiers are not correct (they keep the 'en-US' value).
Thanks for your help !
Running grouped convolutions on an M2 with the metal plugin I get an error. Example code:
Using TF2.11 and no metal plugin I get
import tensorflow as tf
tf.keras.layers.Conv1D(5,1,padding="same", kernel_initializer="ones", groups=5)(tf.ones((1,1,5)))
# displays
<tf.Tensor: shape=(1, 1, 5), dtype=float32, numpy=array([[[1., 1., 1., 1., 1.]]], dtype=float32)>
On TF2.14 with the plugin I received
import tensorflow as tf
tf.keras.layers.Conv1D(5,1,padding="same", kernel_initializer="ones", groups=5)(tf.ones((1,1,5)))
# displays
...
NotFoundError: Exception encountered when calling layer 'conv1d_3' (type Conv1D).
could not find registered platform with id: 0x104d8f6f0 [Op:__inference__jit_compiled_convolution_op_78]
Call arguments received by layer 'conv1d_3' (type Conv1D):
• inputs=tf.Tensor(shape=(1, 1, 5), dtype=float32)
could not find registered platform with id
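For reference, the value the layer should produce can be worked out by hand. With groups=5, five input channels, and five filters, each output channel only sees its own input channel, so an all-ones kernel on an all-ones input gives 1 per channel, as in the TF 2.11 result above. The pure-Python sketch below reproduces that arithmetic for a single position with a kernel size of 1. The function name is hypothetical, and the bias is taken as zero (the Keras default initializer).

```python
def grouped_pointwise_conv(x, kernel, groups):
    """Grouped convolution with kernel size 1 at a single position.

    x      -- list of input-channel values
    kernel -- kernel[o] holds the weights of output channel o; each output
              channel has len(x) // groups weights (its group's inputs)
    """
    in_per_group = len(x) // groups
    out_per_group = len(kernel) // groups
    out = []
    for o, weights in enumerate(kernel):
        group = o // out_per_group            # which group this output channel belongs to
        start = group * in_per_group          # first input channel of that group
        window = x[start:start + in_per_group]
        out.append(sum(w * v for w, v in zip(weights, window)))
    return out


x = [1.0] * 5
grouped = grouped_pointwise_conv(x, [[1.0]] * 5, groups=5)      # one input per output
ungrouped = grouped_pointwise_conv(x, [[1.0] * 5] * 5, groups=1)  # all inputs per output
print("groups=5:", grouped)
print("groups=1:", ungrouped)
```

The groups=1 case sums all five inputs per output channel, which makes it easy to tell from the numbers alone whether a backend silently ignored the groups argument.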
I run a MiDaS Core ML model on device.
It runs well on the Vision Pro simulator and on a real iOS device.
But it crashes on a Vision Pro device.
Crash message:
/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Utility/MPSLibrary.mm:550: failed assertion `MPSKernel MTLComputePipelineStateCache unable to load function ndArrayConvolution2DA14.
Crashlog_com.moemiku.VisionMagicPhoto_2024-01-21-16-01-07.txt
Crashlog_com.moemiku.VisionMagicPhoto_2024-01-21-16-00-39.txt
I have a neural network that should run on my device with 3 different input shapes. When converting it to mlmodel or mlpackage files with fixed input size it runs on ANE.
But when I convert it with EnumeratedShapes, it runs only on the CPU.
Why?
I think the problematic layer is the slice (which is converted to SliceStatic in the flexible model), but I don't understand why, or whether there is any way to solve it and run the enumerated model on the ANE.
Here is my code
import torch
import coremltools as ct

class TestModel(torch.nn.Module):
    def __init__(self):
        super(TestModel, self).__init__()
        self.dw1 = torch.nn.Conv2d(in_channels=641, out_channels=641, kernel_size=(5,4), groups=641)
        self.pw1 = torch.nn.Conv2d(in_channels=641, out_channels=512, kernel_size=(1,1))
        self.relu = torch.nn.ReLU()
        self.pw2 = torch.nn.Conv2d(in_channels=512, out_channels=641, kernel_size=(1,1))
        self.dw2 = torch.nn.Conv2d(in_channels=641, out_channels=641, kernel_size=(5,1), groups=641)
        self.pw3 = torch.nn.Conv2d(in_channels=641, out_channels=512, kernel_size=(1,1))
        self.block1_dw = torch.nn.Conv2d(in_channels=512, out_channels=512, kernel_size=(5,1), groups=512)
        self.block1_pw = torch.nn.Conv2d(in_channels=512, out_channels=512, kernel_size=(1,1))

    def forward(self, inputs):
        x = self.dw1(inputs)
        x = self.pw1(x)
        x = self.relu(x)
        x = self.pw2(x)
        x = self.dw2(x)
        x = self.pw3(x)
        x = self.relu(x)
        y = self.block1_dw(x)
        y = self.block1_pw(y)
        y = self.relu(y)
        # Residual add: drop the first 4 rows of x so its height matches y
        z = x[:,:,4:,:] + y
        return z

ex_input = torch.rand(1, 641, 44, 4)
traced_model = torch.jit.trace(TestModel().eval(), [ex_input,])
# enum_shape (the enumerated input shapes) is defined elsewhere and is not shown in this post
ct_enum_inputs = [ct.TensorType(name='inputs', shape=enum_shape)]
ct_outputs = [ct.TensorType(name='out')]
mlmodel_enum = ct.convert(traced_model, inputs=ct_enum_inputs, outputs=ct_outputs, convert_to="neuralnetwork")
mlmodel_enum.save(...)
Thanks.
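To see why the slice in forward starts at row 4, it helps to trace the spatial sizes through the network. Each depthwise convolution uses stride 1 and no padding (the torch.nn.Conv2d defaults), so a kernel of height 5 removes 4 rows, and the slice x[:,:,4:,:] trims x to the height of y. The sketch below does only that arithmetic, in plain Python, without PyTorch or coremltools. The function names are hypothetical, and the second height (64) is an invented value, since the post does not list its enumerated shapes.

```python
def conv_out(size, kernel):
    """Output size of a convolution with stride 1 and no padding."""
    return size - kernel + 1


def trace_shapes(height, width):
    """Return (x, y, sliced_x) spatial shapes for the model in the post."""
    h, w = conv_out(height, 5), conv_out(width, 4)  # dw1, kernel (5, 4)
    # pw1 and pw2 are 1x1 convolutions and leave the spatial size unchanged
    h, w = conv_out(h, 5), conv_out(w, 1)           # dw2, kernel (5, 1)
    x_shape = (h, w)                                # pw3 is 1x1 as well
    y_shape = (conv_out(h, 5), conv_out(w, 1))      # block1_dw, kernel (5, 1)
    sliced_x = (h - 4, w)                           # x[:, :, 4:, :]
    return x_shape, y_shape, sliced_x


# 44 x 4 is the example input from the post; 64 x 4 is a made-up second shape.
for height in (44, 64):
    x_shape, y_shape, sliced_x = trace_shapes(height, 4)
    print(f"input H={height}: x={x_shape}, y={y_shape}, x[:, :, 4:, :]={sliced_x}")
```

The slice offset is the same constant for every input height, so the arithmetic itself does not depend on which enumerated shape is selected. Why the converter still keeps the flexible model off the ANE is not answered by this sketch.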
I did a clean install of Python (v. 3.10), then TensorFlow and tensorflow-metal, following exactly the process described on Apple's plugin support page. Now, every time I run ANY Python code with TensorFlow, it crashes at the model.fit call. It does not matter what I feed into it, even code that used to run perfectly on my previous MacBook (Intel). I've researched this ad nauseam, but Apple washes its hands, saying it is a TensorFlow problem, and TensorFlow does the same. The fact is that exactly the same code runs flawlessly on my Windows NVIDIA PC setup.
I purchased the M3 laptop hoping to train my neural networks "on the go". Now I have spent $5,000 USD, I can't make it work, and it is a total disaster.
I am extremely competent in Python development and have been developing neural networks for years. So if you are going to comment, please avoid suggestions like "check your Python version". This is DEFINITELY due to the M3 Mac. The exact same setup works fine on an M1 Ultra Mac Studio. It is just not portable.
Does anyone have any specific advice on how to make a proper setup of Tensorflow for the Mac M3??
WWDC22 video "Explore the machine learning development experience" provides Python code for an interesting application (real-time ML image colorization), but doesn't provide the complete Xcode project, and assumes viewer knows how to do Python in Xcode (haven't heard of such in 10 years of iOS development!).
Any pointers to either the video's example Xcode project, or how to create a suitable Xcode project capable of running Python code?
On tf version 2.11.0.
I have tried to follow a fairly standard NN example and convert the result to a Core ML model. However, I cannot get this to work, and I'm not clear where it is going wrong. It would seem to be a fairly standard task, a toy example, and I can't see why the conversion would fail.
Any help would be appreciated. I have tried the different approaches listed below, but it seems the conversion should just work.
I have also tried running the same code pinned to:
tensorflow==2.6.2
scikit-learn==0.19.2
pandas==1.1.1
And get a different sequence of errors.
The Python code I used mostly comes from this example:
https://lnwatson.co.uk/posts/intro_to_nn/
import pandas as pd
import numpy as np
import tensorflow as tf
import torch
from sklearn.model_selection import train_test_split
from tensorflow import keras
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'
np.bool = np.bool_
np.int = np.int_
print("tf version", tf.__version__)
csv_url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
col_names = ['Sepal_Length','Sepal_Width','Petal_Length','Petal_Width','Class']
df = pd.read_csv(csv_url, names = col_names)
labels = df.pop('Class')
labels = pd.get_dummies(labels)
X = df.values
y = labels.values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.05)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2)
model = keras.Sequential()
model.add(keras.layers.Dense(16, activation='relu', input_shape=(4,)))
model.add(keras.layers.Dense(3, activation='softmax'))
model.summary()
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train,
          batch_size=12,
          epochs=200,
          validation_data=(X_val, y_val))
import coremltools as ct
# Pass in `tf.keras.Model` to the Unified Conversion API
mlmodel = ct.convert(model, convert_to="mlprogram")
# mlmodel = ct.convert(model, source="tensorflow")
# mlmodel = ct.convert(model, convert_to="neuralnetwork")
# mlmodel = ct.convert(
# model,
# source="tensorflow",
# inputs=[ct.TensorType(name="input")],
# outputs=[ct.TensorType(name="output")],
# minimum_deployment_target=ct.target.iOS14,
# )
When using any of these three:
mlmodel = ct.convert(model, convert_to="mlprogram")
mlmodel = ct.convert(model, source="tensorflow")
mlmodel = ct.convert(model, convert_to="neuralnetwork")
I get:
mlmodel2 = ct.convert(model, source="tensorflow")
ValueError: Const node 'sequential_5/dense_10/MatMul/ReadVariableOp' cannot have no value
ERROR:root:sequential_5/dense_11/BiasAdd/ReadVariableOp:0
ERROR:root:[ 0.34652767 0.16202268 -0.3554725 ]
Running TensorFlow Graph Passes: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 28.76 passes/s]
Converting Frontend ==> MIL Ops: 8%|█████████████████ | 1/12 [00:00<00:00, 16710.37 ops/s]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
File ~/Documents/CoreML Basic Models/NN_Keras_Iris.py:142
130 import coremltools as ct
131 # Pass in `tf.keras.Model` to the Unified Conversion API
132 # mlmodel = ct.convert(model, convert_to="mlprogram")
133
(...)
140
141 # ct.convert(mymodel(), source="tensorflow")
--> 142 mlmodel2 = ct.convert(model, source="tensorflow")
144 mlmodel = ct.convert(
145 model,
146 source="tensorflow",
(...)
153 minimum_deployment_target=ct.target.iOS14,
154 )
....
File ~/opt/anaconda3/envs/coreml_env/lib/python3.8/site-packages/coremltools/converters/mil/frontend/tensorflow/ops.py:430, in Const(context, node)
427 @register_tf_op
428 def Const(context, node):
429 if node.value is None:
--> 430 raise ValueError("Const node '{}' cannot have no value".format(node.name))
431 mode = get_const_mode(node.value.val)
432 x = mb.const(val=node.value.val, mode=mode, name=node.name)
ValueError: Const node 'sequential_5/dense_10/MatMul/ReadVariableOp' cannot have no value
Second Approach:
A different approach I tried was specifying the input type with TensorType.
However, when specifying the input and outputs I get a different error. I have tried variations on this initialiser but all produce the same error.
The variations revolve around adding input_shape, dtype=np.float32
mlmodel = ct.convert(
    model,
    source="tensorflow",
    inputs=[ct.TensorType(name="input")],
    outputs=[ct.TensorType(name="output")],
    minimum_deployment_target=ct.target.iOS14,
)
File ~/opt/anaconda3/envs/coreml_env/lib/python3.8/site-packages/coremltools/converters/mil/frontend/tensorflow/load.py:106, in <listcomp>(.0)
104 logging.debug(msg.format(outputs))
105 outputs = outputs if isinstance(outputs, list) else [outputs]
--> 106 outputs = [i.split(":")[0] for i in outputs]
107 if _get_version(tf.__version__) < _StrictVersion("1.13.1"):
108 return tf.graph_util.extract_sub_graph(graph_def, outputs)
AttributeError: 'TensorType' object has no attribute 'split'
Hello! I'm implementing a mechanism for cropping an object out of an image.
@MainActor static func detectObjectOnImage(image: UIImage) async throws -> UIImage {
    let analyser = ImageAnalyzer()
    let interaction = ImageAnalysisInteraction()
    let configuration = ImageAnalyzer.Configuration([.visualLookUp])
    let analysis = try await analyser.analyze(image, configuration: configuration)
    interaction.analysis = analysis
    return try await interaction.image(for: interaction.subjects)
}
My app supports iOS 16 and the compiler doesn't complain about the code. However, when I run it on a simulator with iOS 16, I get a "symbol not found" error at app launch. Does anybody know what the issue could be?
Kia ora,
Been having heaps of trouble recently trying to get TensorFlow working. It just suddenly stopped, and now the kernel crashes every time I try to import tf.
I've tried just about everything, e.g. a fresh install of Python and reinstalling the Xcode dev tools.
Below are the relevant lines of pip freeze, using Python 3.10.13 btw
tensorboard==2.15.1
tensorboard-data-server==0.7.2
tensorboard-plugin-wit==1.8.1
tensorflow==2.15.0
tensorflow-estimator==2.15.0
tensorflow-io-gcs-filesystem==0.34.0
tensorflow-macos==2.15.0
tensorflow-metal==0.5.0
Below is the cell in question that is killing the kernel
import tensorflow as tf
import matplotlib.pyplot as plt
import tensorflow_datasets as tfds
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dense, Flatten, InputLayer, BatchNormalization, Dropout
from tensorflow.keras.losses import BinaryCrossentropy
from tensorflow.keras.optimizers.legacy import Adam
I'll be around all day so if you have anything that can help, I'll be sure to give it a go as soon as you post it and get back to you!
Looking forward to your replies.
Nga mihi,
Kane
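The pip freeze output above pairs tensorflow-metal 0.5.0 with tensorflow 2.15.0, while another post in this listing pairs tensorflow-metal 1.1.0 with the same TensorFlow version, so a plugin/TensorFlow mismatch may be worth ruling out against the compatibility table on the tensorflow-metal PyPI page. The sketch below only reports what is installed in the running environment, without importing TensorFlow (so it works even when the import kills the kernel). It makes no claim about which pairs are compatible, and the helper name installed_versions is hypothetical.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


PACKAGES = ["tensorflow", "tensorflow-macos", "tensorflow-metal"]
for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

Running this inside the notebook kernel, rather than in a terminal, confirms that the kernel uses the same environment that pip freeze was run in.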
After training on my dataset, the training, validation, and testing sets all show 0% detection accuracy, and all my test photos come back as false negatives. The dataset has 1032 photos and 2 classes, and I used Roboflow for the image annotation. For the network, I chose the full network. Is there any way to fix this?
Is there a way to extract the list of words recognized by the Speech framework?
I'm trying to filter out words that won't appear in the transcription output, but to do that I'll need a list of words that can appear. SFSpeechLanguageModel.Configuration can be initialized with a vocabulary, but there doesn't seem to be a way to read it, and while there are ways to create custom vocabularies, I have yet to find a way to retrieve it.
I added the Natural Language tag in case the framework might contribute to a solution
On an Apple M1 with Ventura 13.6.
I followed the steps on the Get started with tensorflow-metal page here:
https://developer.apple.com/metal/tensorflow-plugin/
python3 -m venv ~/venv-metal
source ~/venv-metal/bin/activate
python -m pip install -U pip
python -m pip install tensorflow
python -m pip install tensorflow-metal
Starting from a clean environment, I also tried pinning the TensorFlow version:
python -m pip install tensorflow==2.13.0
The install then reported: Successfully installed tensorflow-metal-1.0.0
The table here suggested this should work.
https://pypi.org/project/tensorflow-metal/
But I got the same error...
Running Python code without the tensorflow import was not a problem. I found forum threads about similar errors on M1 Macs, but none of the proposed solutions worked.
Are there suggested steps to get the "Get started" tutorial working?
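The post does not quote the error itself, so the following is only one thing to rule out: on Apple silicon, a Python interpreter built for x86_64 (running under Rosetta) is a different setup from a native arm64 one, and it is worth knowing which is in use inside the venv. The sketch below reports the interpreter's platform without importing TensorFlow. The helper name looks_native_apple_silicon is hypothetical.

```python
import platform
import sys


def looks_native_apple_silicon(system, machine):
    """True when the interpreter reports macOS on a native arm64 build."""
    return system == "Darwin" and machine == "arm64"


system, machine = platform.system(), platform.machine()
print("interpreter:", sys.executable)
print("python:", platform.python_version())
print("system/machine:", system, machine)
print("native Apple silicon build:", looks_native_apple_silicon(system, machine))
```

Run it with the venv activated (after source ~/venv-metal/bin/activate) so that it describes the same interpreter that fails on the import.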