Post not yet marked as solved
Hi everyone, I'm trying to use the Create ML hand action classifier to detect some simple actions. I'm having some trouble because the model only detects one hand at a time in the scene (even in the model's preview, without any code), and I need both hands. Is this a bug, or am I doing something wrong?
Thank you in advance
Post not yet marked as solved
Describe the bug
When I convert (with the coremltools framework) a scripted model that uses torch.nn.functional.upsample_bilinear() in its forward() function, I get RuntimeError: PyTorch convert function for op 'uninitialized' not implemented.
What should I do to resolve this error? Please help.
Trace
% python3 pytorch_sandbox.py
Converting Frontend ==> MIL Ops: 47%|██▎ | 20/43 [00:00<00:00, 25123.11 ops/s]
Traceback (most recent call last):
  File "pytorch_sandbox.py", line 22, in <module>
    coreml_model = ct.convert(
  File "/Users/user/Projects/project/lib/python3.8/site-packages/coremltools/converters/_converters_entry.py", line 326, in convert
    mlmodel = mil_convert(
  File "/Users/user/Projects/project/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 182, in mil_convert
    return _mil_convert(model, convert_from, convert_to, ConverterRegistry, MLModel, compute_units, **kwargs)
  File "/Users/user/Projects/project/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 209, in _mil_convert
    proto, mil_program = mil_convert_to_proto(
  File "/Users/user/Projects/project/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 300, in mil_convert_to_proto
    prog = frontend_converter(model, **kwargs)
  File "/Users/user/Projects/project/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 104, in __call__
    return load(*args, **kwargs)
  File "/Users/user/Projects/project/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 50, in load
    return _perform_torch_convert(converter, debug)
  File "/Users/user/Projects/project/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 95, in _perform_torch_convert
    raise e
  File "/Users/user/Projects/project/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 87, in _perform_torch_convert
    prog = converter.convert()
  File "/Users/user/Projects/project/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/converter.py", line 240, in convert
    convert_nodes(self.context, self.graph)
  File "/Users/user/Projects/project/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 74, in convert_nodes
    raise RuntimeError(
RuntimeError: PyTorch convert function for op 'uninitialized' not implemented.
To Reproduce
import torch
import torch.nn as nn
import torch.nn.functional as F
import coremltools as ct


class M(nn.Module):
    def __init__(self):
        super(M, self).__init__()

    def forward(self, x):
        return F.upsample_bilinear(x, size=512)


m = M()
scripted_m = torch.jit.script(m)
example_input = torch.rand(1, 1, 64, 64)
image_input = ct.ImageType(name="input_1", shape=example_input.shape)
coreml_model = ct.convert(
    scripted_m,
    source='pytorch',
    inputs=[image_input]
)
System environment (please complete the following information):
coremltools version: 5.1.0
OS: MacOS
macOS version: 12.1
XCode version : 13.1
How you install python: system + venv
python version: 3.8.10
any other relevant information:
torch version: 1.9.0
torchvision version: 0.10.0
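A possible workaround, sketched below, is to trace the model instead of scripting it (the 'uninitialized' op typically comes from TorchScript's scripted control flow) and to use F.interpolate, the non-deprecated equivalent of upsample_bilinear. Whether this avoids the error in every coremltools version is an assumption on my part:

```python
import torch
import torch.nn.functional as F


class M(torch.nn.Module):
    def forward(self, x):
        # align_corners=True matches the behavior of the deprecated
        # F.upsample_bilinear()
        return F.interpolate(x, size=(512, 512), mode="bilinear", align_corners=True)


example_input = torch.rand(1, 1, 64, 64)
# Tracing records one concrete execution path, so no scripted control
# flow (and no 'uninitialized' op) ends up in the exported graph.
traced_m = torch.jit.trace(M().eval(), example_input)
print(traced_m(example_input).shape)  # torch.Size([1, 1, 512, 512])
```

The traced module can then be passed to ct.convert() in place of scripted_m.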
Hi, I have been following the WWDC21 session on dynamic training on iOS. I have been able to get training working, with the iterations etc. printed to the console as training progresses.
However, I am unable to retrieve the checkpoints or the result/model once training has completed (or while it is in progress); nothing in the callbacks fires.
If I try to create a model from the sessionDirectory, it returns nil (even though training has clearly completed).
Can someone please help or provide pointers on how to access the results/checkpoints so that I can make an MLModel and use it?
var subscriptions = [AnyCancellable]()
let job = try! MLStyleTransfer.train(trainingData: datasource, parameters: trainingParameters, sessionParameters: sessionParameters)

job.result.sink { result in
    print("result ", result)
} receiveValue: { model in
    try? model.write(to: sessionDirectory)
    let compiledURL = try? MLModel.compileModel(at: sessionDirectory)
    let mlModel = try? MLModel(contentsOf: compiledURL!)
}
.store(in: &subscriptions)
This also does not work:
job.checkpoints.sink { checkpoint in
    // Process checkpoint
    let model = MLStyleTransfer(trainingData: checkpoint)
}
.store(in: &subscriptions)
This is the printout in the console:
Using CPU to create model
+--------------+--------------+--------------+--------------+--------------+
| Iteration | Total Loss | Style Loss | Content Loss | Elapsed Time |
+--------------+--------------+--------------+--------------+--------------+
| 1 | 64.9218 | 54.9499 | 9.97187 | 3.92s |
2022-02-20 15:14:37.056251+0000 DynamicStyle[81737:9175431] [ServicesDaemonManager] interruptionHandler is called. -[FontServicesDaemonManager connection]_block_invoke
| 2 | 61.7283 | 24.6832 | 8.30343 | 9.87s |
| 3 | 59.5098 | 27.7834 | 11.7603 | 16.19s |
| 4 | 56.2737 | 16.163 | 10.985 | 22.35s |
| 5 | 53.0747 | 12.2062 | 12.0783 | 28.08s |
+--------------+--------------+--------------+--------------+--------------+
Any help on how to retrieve the models would be appreciated.
Thanks
Post not yet marked as solved
I am training a simple neural network on my M1 Max with the following TensorFlow code:
import tensorflow as tf


def get_and_pad_imdb_dataset(num_words=10000, maxlen=None, index_from=2):
    from tensorflow.keras.datasets import imdb

    # Load the reviews
    (x_train, y_train), (x_test, y_test) = imdb.load_data(
        path='imdb.npz',
        num_words=num_words,
        skip_top=0,
        maxlen=maxlen,
        start_char=1,
        oov_char=2,
        index_from=index_from)

    x_train = tf.keras.preprocessing.sequence.pad_sequences(
        x_train, maxlen=None, padding='pre', truncating='pre', value=0)
    x_test = tf.keras.preprocessing.sequence.pad_sequences(
        x_test, maxlen=None, padding='pre', truncating='pre', value=0)

    return (x_train, y_train), (x_test, y_test)


def get_imdb_word_index(num_words=10000, index_from=2):
    imdb_word_index = tf.keras.datasets.imdb.get_word_index(
        path='imdb_word_index.json')
    imdb_word_index = {key: value + index_from
                       for key, value in imdb_word_index.items()
                       if value <= num_words - index_from}
    return imdb_word_index


(x_train, y_train), (x_test, y_test) = get_and_pad_imdb_dataset(maxlen=25)
imdb_word_index = get_imdb_word_index()

max_index_value = max(imdb_word_index.values())
embedding_dim = 16

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=max_index_value + 1,
                              output_dim=embedding_dim, mask_zero=True),
    tf.keras.layers.LSTM(units=16),
    tf.keras.layers.Dense(units=1, activation='sigmoid')
])

model.compile(loss='binary_crossentropy', metrics=['accuracy'], optimizer='adam')
history = model.fit(x_train, y_train, epochs=3, batch_size=32)
I ran this code on Google Colab and it works perfectly fine without any problem at all.
However, on my M1 Max it just gets stuck at the very first epoch and does not progress at all (even after a couple of hours).
This is all I get from the output after calling the .fit method:
2022-02-15 23:44:20.097795: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
Epoch 1/3
2022-02-15 23:44:22.461438: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
I installed TensorFlow on my machine following this guide: https://developer.apple.com/metal/tensorflow-plugin/
I am using a Conda environment with Miniforge, and the TensorFlow-related packages (obtained with conda list) are:
tensorboard 2.6.0 pyhd8ed1ab_1 conda-forge
tensorboard-data-server 0.6.0 py39hfb8cd70_1 conda-forge
tensorboard-plugin-wit 1.8.0 pyh44b312d_0 conda-forge
tensorflow-deps 2.7.0 0 apple
tensorflow-estimator 2.7.0 pypi_0 pypi
tensorflow-macos 2.7.0 pypi_0 pypi
tensorflow-metal 0.3.0 pypi_0 pypi
My python version is 3.9.0
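If the hang comes from the Metal GPU plugin (an assumption on my part), one quick diagnostic is to hide the GPU before training so everything runs on the CPU. A minimal sketch:

```python
import tensorflow as tf

# Must run before any op touches the GPU; afterwards the Metal plugin
# is bypassed and all work is placed on the CPU.
tf.config.set_visible_devices([], "GPU")
print(tf.config.get_visible_devices("GPU"))  # []
```

If training then progresses normally, the problem is in the GPU path rather than in the model code.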
Post not yet marked as solved
Since the existing YOLOv3 Core ML model has only 80 class labels, I want to add some more categories to use in my app. Can I do that, and if so, how?
Post not yet marked as solved
I am a student from a non-English speaking country. I would like to get the sample code from the "Make apps smarter with Natural Language" session on dynamic word embeddings and custom sentence embeddings. I can't find relevant examples on Google to keep learning about it. I hope the developer community can share the full sample code for the Nosh and Merch apps shown in the video. Thanks, Vivek and Doug.
Post not yet marked as solved
Good day people!
I'm currently working on my master's thesis in media informatics. I'd really appreciate discussing my topic with you, as I may get some interesting ideas or new information.
The goal is to implement an app specifically designed for places like museums, where the environment isn't ideal for AR tracking (darkness, no network connection, maybe exhibits made of glass...).
Therefore, I'd like to develop a neural network for the new iPad Pro that takes RGB-D data and predicts a pose estimate for an object in a scene, so that it matches the real-world object perfectly. The placed object will be a perfect 3D model replica of the real object (hand-modeled, or scanned and revised).
This should allow me to place AR content precisely over the real-world object, even in difficult lighting and the like. Maybe it will improve occlusion, too. I can imagine that the neural network may also detect structures, edges, and semantic coherences better than the usual approach.
My first thought was to work with Core ML, Metal, maybe Vision, and ARKit. I will also be trying out Xcode for the first time.
Maybe you have interesting ideas for improvement or can guide me a little, since I feel a bit lost at the moment.
Would you rather use point clouds or the raw depth buffer to train the model? Would you also train with edge-filtered images and the like? Why or why not?
Thanks in advance, it would mean the world to me!
Kind regards, Miri :-)
Post not yet marked as solved
The tensorflow-macos repo has been archived and the last commit redirects users to the plugin page; however, this page still instructs users to install the now-archived fork. Are these instructions still up to date? Also, what is the long-term plan for Metal/M1 acceleration in TensorFlow? Will the necessary changes eventually be upstreamed, if they haven't been already?
Post not yet marked as solved
I want to detect an image of a dart target (https://commons.wikimedia.org/wiki/File:WA_80_cm_archery_target.svg) in my iOS app.
For that, I am creating an object detector with Create ML. I am using the transfer learning algorithm and 114 annotated images; the validation data is set to Auto.
After 2000 iterations I got the following stats: 96% training and 0% validation.
As I understand it, the percentages are the IoU@50% scores (the percentage of predicted bounding boxes whose intersection-over-union with the ground truth exceeds 50%).
If the validation data is automatically chosen from the set of images, how can its score be 0%?
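For reference, the intersection-over-union behind that threshold can be sketched in a few lines (the (x_min, y_min, x_max, y_max) box convention is my assumption; Create ML's internal representation may differ):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp to zero so disjoint boxes contribute no intersection area
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


# Two unit squares overlapping in a 0.5 x 1 strip: 0.5 / 1.5 ≈ 0.333
print(iou((0, 0, 1, 1), (0.5, 0, 1.5, 1)))
```

A 0% validation score would mean no predicted box on any validation image reaches 0.5 by this measure, which is why it looks suspicious.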
Post not yet marked as solved
Hello! I’m having an issue with retrieving the trained weights from MLCLSTMLayer in ML Compute when training on a GPU. I maintain references to the input-weights, hidden-weights, and biases tensors and use the following code to extract the data post-training:
extension MLCTensor {
    func dataArray<Scalar>(as _: Scalar.Type) throws -> [Scalar] where Scalar: Numeric {
        let count = self.descriptor.shape.reduce(into: 1) { (result, value) in
            result *= value
        }
        var array = [Scalar](repeating: 0, count: count)
        self.synchronizeData() // This *should* copy the latest data from the GPU to memory that's accessible by the CPU
        _ = try array.withUnsafeMutableBytes { (pointer) in
            guard let data = self.data else {
                throw DataError.uninitialized // A custom error that I declare elsewhere
            }
            data.copyBytes(to: pointer)
        }
        return array
    }
}
The issue is that when I call dataArray(as:) on a weights or biases tensor for an LSTM layer that has been trained on a GPU, the values that it retrieves are the same as they were before training began. For instance, if I initialize the biases all to 0 and then train the LSTM layer on a GPU, the biases values seemingly remain 0 post-training, even though the reported loss values decrease as you would expect.
This issue does not occur when training an LSTM layer on a CPU, and it also does not occur when training a fully-connected layer on a GPU. Since both types of layers work properly on a CPU but only MLCFullyConnectedLayer works properly on a GPU, it seems that the issue is a bug in ML Compute’s GPU implementation of MLCLSTMLayer specifically.
For reference, I'm testing my code on an M1 Max.
Am I doing something wrong, or is this an actual bug that I should report in Feedback Assistant?
Post not yet marked as solved
Is it possible to do any of the following:
1. Export a model created using MetalPerformanceShadersGraph to a CoreML file;
2. Failing 1., save a trained MetalPerformanceShadersGraph model in any other way for deployment;
3. Import a CoreML model and use it as part of a MetalPerformanceShadersGraph model.
Thanks!
Post not yet marked as solved
Is it possible to use SNAudioFileAnalyzer with a live HLS (m3u8) stream? Maybe we need to somehow extract the audio from it first?
Also, can SNAudioFileAnalyzer be used with a real remote URL, or only with files in the local file system?
Post not yet marked as solved
It seems this didn't get much attention since it was lost in another thread.
tf.random has been broken since macOS 12.1.
import tensorflow as tf
x = tf.random.uniform((10,))
y = tf.random.uniform((10,))
tf.print(x)
tf.print(y)
[0.178906798 0.8810848 0.384304762 ... 0.162458301 0.64780426 0.0123682022]
[0.178906798 0.8810848 0.384304762 ... 0.162458301 0.64780426 0.0123682022]
It works fine on Colab and worked on macOS 12.0. It also works fine if I disable the GPU with:
tf.config.set_visible_devices([], 'GPU')
WORKAROUND:
g = tf.random.Generator.from_non_deterministic_state()
x = g.uniform((10,))
y = g.uniform((10,))
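Another workaround sketch, assuming the bug sits in the stateful per-op RNG: the stateless API takes explicit seeds, so different seeds should always give different streams:

```python
import tensorflow as tf

# Explicit, different seeds produce independent deterministic streams,
# sidestepping the per-op stateful seeding entirely.
x = tf.random.stateless_uniform((10,), seed=[1, 2])
y = tf.random.stateless_uniform((10,), seed=[3, 4])
tf.print(x)
tf.print(y)
```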
Post not yet marked as solved
WWDC20 session 10673 briefly shows how to visualize optical flow generated by VNGenerateOpticalFlowRequest, and sample code is available through the Developer app. But how can we build the OpticalFlowVisualizer.ci.metallib file from the CI kernel code provided as OpticalFlowVisualizer.cikernel?
Post not yet marked as solved
I just got my new MacBook Pro with the M1 Max chip and am setting up Python. I've tried several combinations of settings to test speed, and now I'm quite confused. First, my questions:
Why is Python running natively on the M1 Max greatly (~100%) slower than on my old MacBook Pro 2016 with an Intel i5?
On the M1 Max, why is there no significant speed difference between the native run (via Miniforge) and the run via Rosetta (via Anaconda), which is supposed to be ~20% slower?
On the M1 Max with a native run, why is there no significant speed difference between conda-installed NumPy and TensorFlow-installed NumPy, which is supposed to be faster?
On the M1 Max, why is running in the PyCharm IDE consistently ~20% slower than running from the terminal? This doesn't happen on my old Intel Mac.
Evidence supporting my questions is as follows:
Here are the settings I've tried:
1. Python installed by
Miniforge-arm64, so that Python runs natively on the M1 Max chip. (Checked in Activity Monitor: the Kind of the python process is Apple.)
Anaconda, in which case Python runs via Rosetta. (Checked in Activity Monitor: the Kind of the python process is Intel.)
2. Numpy installed by
conda install numpy: NumPy from the original conda-forge channel, or pre-installed with Anaconda.
Apple TensorFlow: with Python installed by Miniforge, I install TensorFlow directly, and NumPy is installed along with it. It's said that NumPy installed this way is optimized for the Apple M1 and will be faster. Here are the installation commands:
conda install -c apple tensorflow-deps
python -m pip install tensorflow-macos
python -m pip install tensorflow-metal
3. Run from
Terminal.
PyCharm (Apple Silicon version).
Here is the test code:
import time

import numpy as np

np.random.seed(42)
a = np.random.uniform(size=(300, 300))
runtimes = 10

timecosts = []
for _ in range(runtimes):
    s_time = time.time()
    for i in range(100):
        a += 1
        np.linalg.svd(a)
    timecosts.append(time.time() - s_time)

print(f'mean of {runtimes} runs: {np.mean(timecosts):.5f}s')
and here are the results:
+----------------------+-----------------------+-----------------------+
| Python installed by  | Miniforge (native M1) | Anaconda (Rosetta)    |
+----------------------+-----------+-----------+-----------+-----------+
| Numpy installed by ↓ | Terminal  | PyCharm   | Terminal  | PyCharm   |
+----------------------+-----------+-----------+-----------+-----------+
| Apple Tensorflow     | 4.19151   | 4.86248   | /         | /         |
| conda install numpy  | 4.29386   | 4.98370   | 4.10029   | 4.99271   |
+----------------------+-----------+-----------+-----------+-----------+
This is quite slow. For comparison:
running the same code on my old MacBook Pro 2016 with the i5 chip costs 2.39917s.
another post reports that on an M1 chip (not Pro or Max), miniforge + conda-installed NumPy takes 2.53214s, and miniforge + Apple TensorFlow NumPy takes 1.00613s.
you may also try it on your own.
Here is the CPU information details:
My old i5:
$ sysctl -a | grep -e brand_string -e cpu.core_count
machdep.cpu.brand_string: Intel(R) Core(TM) i5-6360U CPU @ 2.00GHz
machdep.cpu.core_count: 2
My new M1 Max:
% sysctl -a | grep -e brand_string -e cpu.core_count
machdep.cpu.brand_string: Apple M1 Max
machdep.cpu.core_count: 10
I followed the instructions strictly from the tutorials, so why does all of this happen? Is it because of flaws in my installation, or because of the M1 Max chip? Since my work relies heavily on local runs, local speed is very important to me. Any suggestions for a possible solution, or any data points from your own device, would be greatly appreciated :)
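One data point that may help narrow this down: checking which BLAS the installed NumPy actually links against (my assumption being that the speed differences, if any, come from the BLAS backend rather than from the interpreter):

```python
import numpy as np

# Accelerate/vecLib-backed builds mention 'accelerate' or 'vecLib' in
# this report; conda-forge builds typically report OpenBLAS.
np.show_config()
print(np.__version__)
```

Comparing this output between the Miniforge and Anaconda environments would show whether the two runs are even using the same linear algebra library for np.linalg.svd.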
Post not yet marked as solved
Hi all, I've spent some time lately experimenting with the BNNS (Accelerate) LSTM-related APIs, and despite a distinct lack of documentation (though the headers contain quite a few comments) I got most things to a point where I think I know what's going on and I get the expected results.
However, one thing I have not been able to do is get this working when inputSize != hiddenSize.
I am currently only concerned with a simple unidirectional LSTM with a single layer, but none of my permutations of the gate "iw_desc" matrices with various 2D layouts, or of reordering input-size/hidden-size, made any difference; ultimately BNNSDirectApplyLSTMBatchTrainingCaching always returns -1 as an indication of error.
Any help would be greatly appreciated.
PS: The bnns.h framework header claims that "When a parameter is invalid or an internal error occurs, an error message will be logged. Some combinations of parameters may not be supported. In that case, an info message will be logged.", and yet I've not been able to find any such messages in NSLog() output, stderr, or Console. Is there a magic environment variable I need to set to get more verbose logging?
Post not yet marked as solved
I'm using Vision to perform some OCR on a live camera feed. I've set up my VNRecognizeTextRequest as follows:
let request = VNRecognizeTextRequest(completionHandler: recognizeTextCompletionHandler)
request.recognitionLevel = .accurate
request.usesLanguageCorrection = false
And I handle the results as follows:
guard let observations = request.results as? [VNRecognizedTextObservation] else { return }
for observation in observations {
    if let recognizedText = observation.topCandidates(1).first {
        guard recognizedText.confidence >= self.confidenceLimit, // set to 0.5
              let foundText = validateRegexPattern(text: recognizedText.string, regexPattern: self.regexPattern),
              let foundDecimal = Double(foundText) else { continue }
        // ... use foundDecimal ...
    }
}
This is actually working great and yielding very accurate results, but the confidence values I'm receiving are generally either 0.5 or 1.0, and rarely 0.3. I find these to be pretty nonsensical confidence values, and I'm wondering if this is the intended result or some sort of bug. Conversely, recognitionLevel = .fast yields more realistic and varied confidence values, but much less accurate results overall. (Even though .fast is recommended for OCR from a live camera feed, I've had significantly better results using the .accurate recognition level, which is why I've been using it.)
Post not yet marked as solved
Greetings.
I was shopping around for an external GPU and a machine learning + GPU solution for my Mac.
Are there any suggestions? It looks like Keras 2.4 has dropped multi-backend support, which is worrying.
I'm trying to make sure I make a purchase that will work with what I have.
I'm using a Mac Mini (2018) with 64GB RAM. I have multiple Thunderbolt 3 ports. This is an Intel chipset machine, not an M1.
Is this feasible?
Post not yet marked as solved
How can I convert an existing mlmodel to an mlprogram without the source code?
I have verified that coremltools.convert does not accept a coremltools.models.MLModel as its source.
Post not yet marked as solved
As above.
I found that the M1 Pro chip has extremely bad performance training RNNs (158s vs. 6h). Does anyone know why it is so unfriendly to RNNs? Also, how can I run my code on the CPU only, given that the mlcompute package is somehow not recognized?
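On recent tensorflow-macos builds the mlcompute module has been removed, which would explain it not being recognized. A sketch of how to pin work to the CPU with the standard device API instead (how much this helps RNN training speed is an assumption to be measured):

```python
import tensorflow as tf

# Replaces the removed mlcompute.set_mlc_device(device_name='cpu'):
# every op created inside the context is placed on the CPU.
with tf.device("/CPU:0"):
    a = tf.random.uniform((64, 64))
    b = tf.linalg.matmul(a, a)

print(b.device)  # the device string ends in 'CPU:0'
```

Wrapping model construction and model.fit(...) in the same tf.device context keeps the whole training loop off the GPU.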