Apple Developer Forums

Tensorflow on M1 Macbook Pro, error when model fit executes

Whether I install miniforge or mamba, directly or through brew, I always get this error when I try to fit the sample model from https://developer.apple.com/metal/tensorflow-plugin/, or even a simple sequential model. Is there any workaround for this? I'd appreciate any help, thanks!

2022-12-10 11:18:19.941623: W tensorflow/tsl/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
2022-12-10 11:18:20.427283: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled.
2022-12-10 11:18:21.222950: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:418 : NOT_FOUND: could not find registered platform with id: 0x28edf1f90
2022-12-10 11:18:21.223003: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:418 : NOT_FOUND: could not find registered platform with id: 0x28edf1f90
2022-12-10 11:18:21.363366: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:418 : NOT_FOUND: could not find registered platform with id: 0x28edf1f90
2022-12-10 11:18:21.364757: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:418 : NOT_FOUND: could not find registered platform with id: 0x28edf1f90
2022-12-10 11:18:21.388739: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:418 : NOT_FOUND: could not find registered platform with id: 0x28edf1f90
2022-12-10 11:18:21.388757: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:418 : NOT_FOUND: could not find registered platform with id: 0x28edf1f90

NotFoundError Traceback (most recent call last)
Cell In[25], line 2
      1 model = create_model()
----> 2 history = model.fit(Xf_train, yf_train, epochs=3, batch_size=64);

File /opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/keras/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67 filtered_tb = _process_traceback_frames(e.__traceback__)
     68 # To get the full stack trace, call:
     69 # tf.debugging.disable_traceback_filtering()
---> 70 raise e.with_traceback(filtered_tb) from None
     71 finally:
     72 del filtered_tb

File /opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/tensorflow/python/eager/execute.py:52, in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     50 try:
     51 ctx.ensure_initialized()
---> 52 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
     53 inputs, attrs, num_outputs)
     54 except core._NotOkStatusException as e:
     55 if name is not None:

NotFoundError: Graph execution error:

Detected at node 'StatefulPartitionedCall_4' defined at (most recent call last):
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/ipykernel_launcher.py", line 17, in <module>
    app.launch_new_instance()
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/traitlets/config/application.py", line 992, in launch_instance
    app.start()
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/ipykernel/kernelapp.py", line 711, in start
    self.io_loop.start()
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/tornado/platform/asyncio.py", line 215, in start
    self.asyncio_loop.run_forever()
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
    self._run_once()
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/asyncio/base_events.py", line 1899, in _run_once
    handle._run()
...
File "/var/folders/f9/bp40pn0d401d974fy48dxm8h0000gn/T/ipykernel_63636/3393788193.py", line 2, in <module>
    history = model.fit(Xf_train, yf_train, epochs=3, batch_size=64);
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 65, in error_handler
    return fn(*args, **kwargs)
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/keras/engine/training.py", line 1650, in fit
    tmp_logs = self.train_function(iterator)
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/keras/engine/training.py", line 1249, in train_function
    return step_function(self, iterator)
......
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/keras/engine/training.py", line 1222, in run_step
    outputs = model.train_step(data)
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/keras/engine/training.py", line 1027, in train_step
    self.optimizer.minimize(loss, self.trainable_variables, tape=tape)
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 527, in minimize
    self.apply_gradients(grads_and_vars)
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1140, in apply_gradients
    return super().apply_gradients(grads_and_vars, name=name)
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 634, in apply_gradients
    iteration = self._internal_apply_gradients(grads_and_vars)
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1166, in _internal_apply_gradients
    return tf.__internal__.distribute.interim.maybe_merge_call(
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1216, in _distributed_apply_gradients_fn
    distribution.extended.update(
File "/opt/homebrew/Caskroom/miniforge/base/envs/tf/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1211, in apply_grad_to_update_var
    return self._update_step_xla(grad, var, id(self._var_key(var)))
Node: 'StatefulPartitionedCall_4'
could not find registered platform with id: 0x28edf1f90
[[{{node StatefulPartitionedCall_4}}]] [Op:__inference_train_function_1241]

Machine Learning & AI General ML Compute tensorflow-metal
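For the `model.fit` failure in the post above: the traceback ends in `_update_step_xla` inside `optimizer_experimental/optimizer.py`, which suggests the error comes from the optimizer's XLA-compiled update step rather than from the model itself. One workaround that may be worth trying, offered here as an unverified sketch and not a confirmed fix, is to use the older optimizer classes under `tf.keras.optimizers.legacy`. The helper names `optimizer_path` and `build_optimizer` and the version bounds are assumptions made for illustration.

```python
def optimizer_path(tf_version: str, name: str = "Adam") -> str:
    """Return the dotted path of the optimizer class to try for a TF version.

    Assumption: from TF 2.11 up to (but not including) 2.16 the default
    Keras optimizers are the newer implementation seen in the traceback,
    while tf.keras.optimizers.legacy still provides the older one.
    """
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    if (2, 11) <= (major, minor) < (2, 16):
        return f"tf.keras.optimizers.legacy.{name}"
    return f"tf.keras.optimizers.{name}"


def build_optimizer(name: str = "Adam", **kwargs):
    """Instantiate the optimizer chosen above (TensorFlow is imported lazily)."""
    import tensorflow as tf

    obj = tf
    # Walk the dotted path, skipping the leading "tf" component.
    for attr in optimizer_path(tf.__version__, name).split(".")[1:]:
        obj = getattr(obj, attr)
    return obj(**kwargs)


print(optimizer_path("2.11.0"))
print(optimizer_path("2.10.1", "SGD"))
```

With TensorFlow installed, the result of `build_optimizer("Adam", learning_rate=1e-3)` could be passed to `model.compile(optimizer=...)` in place of the default optimizer.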

29

7

18k

Dec ’22

Core ML Model Deployment cannot upload mlarchive file, InvalidArgumentError: Unable to unzip MLArchive

I have tried many times. Whether I change the file or re-create it, it shows a 404 error:

{
  "code": 400,
  "message": "InvalidArgumentError: Unable to unzip MLArchive",
  "reason": "There was a problem with your request.",
  "detailedMessage": "InvalidArgumentError: Unable to unzip MLArchive",
  "requestUuid": "699afb97-8328-4a83-b186-851f797942aa"
}
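Since the server complains that it cannot unzip the archive, one quick local check is whether the file opens as a ZIP container at all before uploading. The sketch below assumes, based only on the wording of the error message, that an `.mlarchive` file is a ZIP archive; the function name `check_archive` is hypothetical.

```python
import os
import tempfile
import zipfile


def check_archive(path: str) -> str:
    """Report whether `path` can be read as a ZIP container."""
    if not zipfile.is_zipfile(path):
        return "not a zip archive"
    with zipfile.ZipFile(path) as archive:
        bad_member = archive.testzip()  # name of the first corrupt member, or None
        if bad_member is not None:
            return f"corrupt member: {bad_member}"
        return f"ok ({len(archive.namelist())} entries)"


# Small demonstration with a throwaway file standing in for a real archive.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "sample.mlarchive")
    with zipfile.ZipFile(sample, "w") as archive:
        archive.writestr("model/placeholder.txt", "placeholder")
    print(check_archive(sample))
```

If the check reports anything other than `ok`, the file was probably damaged or repackaged after it was exported, which would be worth ruling out before retrying the upload.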

Machine Learning & AI Core ML Core ML

4

0

1.2k

Jan ’23

Keras with tensorflow-metal freezes during training with image augmentation

I am trying to train an image classification network in Keras with tensorflow-metal. The training freezes after the first 2-3 epochs if image augmentation layers are used (RandomFlip, RandomContrast, RandomBrightness). The system appears to use both the GPU and the CPU (as indicated by Activity Monitor). Also, warnings appear both in Jupyter and in Terminal (see below). When the image augmentation layers are removed (i.e. we only rebuild the head and feed images from disk), the CPU appears to be idle, no warnings appear, and training completes successfully.

Versions: python 3.8, tensorflow-macos 2.11.0, tensorflow-metal 0.7.1

Sample code:

    # Imports added so the snippet runs on its own
    import tensorflow as tf
    from tensorflow.keras import Model, Sequential, layers

    img_augmentation = Sequential(
        [
            layers.RandomFlip(),
            layers.RandomBrightness(factor=0.2),
            layers.RandomContrast(factor=0.2)
        ],
        name="img_augmentation",
    )
    inputs = layers.Input(shape=(384, 384, 3))
    x = img_augmentation(inputs)
    model = tf.keras.applications.EfficientNetV2S(include_top=False, input_tensor=x, weights='imagenet')
    model.trainable = False
    x = tf.keras.layers.GlobalAveragePooling2D(name="avg_pool")(model.output)
    x = tf.keras.layers.BatchNormalization()(x)
    top_dropout_rate = 0.2
    x = tf.keras.layers.Dropout(top_dropout_rate, name="top_dropout")(x)
    outputs = tf.keras.layers.Dense(179, activation="softmax", name="pred")(x)
    newModel = Model(inputs=model.input, outputs=outputs, name="EfficientNet_DF20M_species")
    reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_accuracy', factor=0.9, patience=2, verbose=1, min_lr=0.000001)
    optimizer = tf.keras.optimizers.legacy.SGD(learning_rate=0.01, momentum=0.9)
    newModel.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
    history = newModel.fit(x=train_ds, validation_data=val_ds, epochs=30, verbose=2, callbacks=[reduce_lr])

During training with image augmentation, Jupyter prints the following warnings while training the first epoch:

WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting StatelessRandomUniformV2 cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting RngReadAndSkip cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting Bitcast cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting StatelessRandomUniformFullIntV2 cause there is no registered converter for this op.
WARNING:tensorflow:Using a while_loop for converting StatelessRandomGetKeyCounter cause there is no registered converter for this op.
...

During training with image augmentation, Terminal keeps printing the following message:

2023-02-21 23:13:38.958633: I metal_plugin/src/kernels/stateless_random_op.cc:282] Note the GPU implementation does not produce the same series as CPU implementation.
2023-02-21 23:13:38.958920: I metal_plugin/src/kernels/stateless_random_op.cc:282] Note the GPU implementation does not produce the same series as CPU implementation.
2023-02-21 23:13:38.959071: I metal_plugin/src/kernels/stateless_random_op.cc:282] Note the GPU implementation does not produce the same series as CPU implementation.
2023-02-21 23:13:38.959115: I metal_plugin/src/kernels/stateless_random_op.cc:282] Note the GPU implementation does not produce the same series as CPU implementation.
2023-02-21 23:13:38.959359: I metal_plugin/src/kernels/stateless_random_op.cc:282] Note the GPU implementation does not produce the same series as CPU implementation.
...

Any suggestions?

Machine Learning & AI General Machine Learning Core ML tensorflow-metal

3

0

1.7k

Feb ’23

ANE-Optimized Layer Norm Fails on ANE

In the ml-ane-transformers repo, there is a custom LayerNorm implementation for the Neural Engine-optimized shape of (B,C,1,S). The coremltools documentation makes it sound like the layer_norm MIL op would support this natively. In fact, the following code works on CPU:

    # Imports added so the snippet runs on its own
    import torch
    from coremltools.converters.mil import Builder as mb

    B,C,S = 1,768,512
    g,b = 1, 0

    @mb.program(input_specs=[mb.TensorSpec(shape=(B,C,1,S)),])
    def ln_prog(x):
        gamma = (torch.ones((C,), dtype=torch.float32) * g).tolist()
        beta = (torch.ones((C), dtype=torch.float32) * b).tolist()
        return mb.layer_norm(x=x, axes=[1], gamma=gamma, beta=beta, name="y")

However, it fails when run on the Neural Engine, giving results that are scaled by an incorrect value. Should this work on the Neural Engine?
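To pin down what "scaled by an incorrect value" means, it can help to have a reference result that does not depend on any compute unit. The following is a minimal pure-Python sketch of layer normalization over axis 1 of a (B, C, 1, S) tensor, written for small test shapes only. The function name `layer_norm_channels` and the `eps` default of 1e-5 are assumptions, not values taken from the post.

```python
import math


def layer_norm_channels(x, gamma, beta, eps=1e-5):
    """Normalize a nested-list tensor of shape (B, C, 1, S) over axis 1 (C)."""
    B, C, S = len(x), len(x[0]), len(x[0][0][0])
    out = [[[[0.0] * S] for _ in range(C)] for _ in range(B)]
    for b in range(B):
        for s in range(S):
            # Collect the C values that share this batch index and position.
            column = [x[b][c][0][s] for c in range(C)]
            mean = sum(column) / C
            var = sum((v - mean) ** 2 for v in column) / C
            scale = 1.0 / math.sqrt(var + eps)
            for c in range(C):
                out[b][c][0][s] = (column[c] - mean) * scale * gamma[c] + beta[c]
    return out


# Tiny example: B=1, C=4, S=2, with gamma=1 and beta=0 as in the post.
x = [[[[1.0, 2.0]], [[2.0, 4.0]], [[3.0, 6.0]], [[4.0, 8.0]]]]
y = layer_norm_channels(x, gamma=[1.0] * 4, beta=[0.0] * 4)
print([round(y[0][c][0][0], 4) for c in range(4)])
```

Dividing the Neural Engine output elementwise by such a reference would show whether the discrepancy is a single constant factor or varies by position, which may help narrow down the cause.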

Machine Learning & AI General Machine Learning Core ML

2

0

830

Apr ’23

Create ML activity training problem

Hi everyone! I’m trying to train an activity classification model with 3 classes. The problem is that only one class has precision and recall > 0 after training. Even with 2 classes the result is the same. At first I thought there was a problem with my data, but when I swapped the “left” and “right” labels the results were the same: only “left”-labeled data gets non-zero precision and recall.

Machine Learning & AI General Machine Learning Create ML

1

0

1.1k

Apr ’23

CreateML tabular regression ... SVM model?

Does anyone know if the Create ML app has a way to build Support Vector Machine models for tabular regression? I see only the attached options. Xcode 14.2.

Machine Learning & AI Create ML Create ML

1

0

897

May ’23

Updatable model using built-in Create ML classifiers

Is it possible to create an updatable sound classifier model that uses Apple's built-in MLSoundClassifier, available via Create ML, and that can be trained or personalized on device using Core ML? I have searched in quite a few places for a long while. I know that when on-device training was first announced in 2019, updatable models were restricted to non-built-in classifiers, but any information that may have come out since then has been hard to find.

Machine Learning & AI General Machine Learning Core ML Create ML

3

0

974

May ’23

Jax-metal - whisper-jax

I am testing out https://developer.apple.com/metal/jax/, mainly to try running whisper-jax (https://github.com/sanchit-gandhi/whisper-jax/tree/main) on my M2. The jax-metal plugin seems to install without issues, and the basic test code runs fine. However, the whisper-jax code fails when trying to encode a file, with the following error:

error: failed to legalize operation 'mhlo.convolution'
/Users/pere/jax-metal/lib/python3.10/site-packages/whisper_jax/layers.py:1236:0: note: see current operation:
%111 = "mhlo.convolution"(%110, <<UNKNOWN SSA VALUE>>) {batch_group_count = 1 : i64, dimension_numbers = #mhlo.conv<[b, 0, f]x[0, i, o]->[b, 0, f]>, feature_group_count = 1 : i64, lhs_dilation = dense<1> : tensor<1xi64>, padding = dense<1> : tensor<1x2xi64>, precision_config = [#mhlo<precision DEFAULT>, #mhlo<precision DEFAULT>], rhs_dilation = dense<1> : tensor<1xi64>, window_reversal = dense<false> : tensor<1xi1>, window_strides = dense<1> : tensor<1xi64>} : (tensor<1x3000x80xf32>, tensor<3x80x384xf32>) -> tensor<1x3000x384xf32>
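The attributes in the error describe what looks like an ordinary 1-D convolution: input length 3000 with 80 features, kernel width 3 mapping 80 to 384 features, stride 1, and one element of padding on each side. As a sanity check on that reading, the standard output-length formula can be evaluated with the values from the message. The function name `conv1d_output_length` is hypothetical, and the sketch only reproduces the shape arithmetic; it says nothing about why the Metal backend rejects the op.

```python
def conv1d_output_length(length, kernel, stride=1, pad_low=0, pad_high=0,
                         lhs_dilation=1, rhs_dilation=1):
    """Output length of a 1-D convolution along its single spatial axis."""
    dilated_input = (length - 1) * lhs_dilation + 1
    dilated_kernel = (kernel - 1) * rhs_dilation + 1
    padded = dilated_input + pad_low + pad_high
    return (padded - dilated_kernel) // stride + 1


# Values read from the error: tensor<1x3000x80> * tensor<3x80x384>, padding 1/1.
print(conv1d_output_length(3000, 3, stride=1, pad_low=1, pad_high=1))  # 3000
```

The result matches the `tensor<1x3000x384xf32>` output in the message, so the shapes themselves appear consistent.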

Machine Learning & AI General tensorflow-metal

3

4

2.4k

Jun ’23

AudioFeaturePrint Create ML Components Property Limits

Hey, are there any limits on the windowDuration property of the AudioFeaturePrint transformer, such as a minimum or maximum value? If we create a model with the Create ML app and select AudioFeaturePrint as the feature extractor, we cannot go below 0.5 seconds for the window duration. Is the limit the same if we create the model programmatically using AudioFeaturePrint?

Machine Learning & AI General Machine Learning Core ML Create ML

1

0

795

Jun ’23

Find Semantic Similarity Between Models?

In the video "Explore Natural Language multilingual models" (https://developer.apple.com/videos/play/wwdc2023/10042/), it is said at 6:24 that there are three models. I wonder if it is possible to measure semantic similarity across models. For example, English and Japanese belong to different models (Latin and CJK); can we compare the vectors produced by the different models to find out whether two sentences have similar meanings?
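For reference, comparing two sentence vectors usually means computing their cosine similarity, as in the sketch below. The vectors are made-up placeholders, not real model output, and the function name `cosine_similarity` is an illustration. Whether vectors from two different models live in a comparable space is exactly the open question here; the arithmetic only yields a meaningful score if they do.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder vectors standing in for two sentence embeddings.
english_vector = [0.2, 0.7, 0.1]
japanese_vector = [0.25, 0.65, 0.05]
print(round(cosine_similarity(english_vector, japanese_vector), 3))
```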

Machine Learning & AI General Natural Language Machine Learning wwdc2023-10042

1

0

1k

Jun ’23

Failure of speech recognition when "supportsOnDeviceRecognition" is set to "True".

I am using SFSpeechRecognizer to perform speech recognition, but I am getting the following error:

[SpeechFramework] -[SFSpeechRecognitionTask localSpeechRecognitionClient:speechRecordingDidFail:]_block_invoke Ignoring subsequent local speech recording error: Error Domain=kAFAssistantErrorDomain Code=1101 "(null)"

Setting requiresOnDeviceRecognition to False works correctly, but previously it worked with True without any error. The value of supportsOnDeviceRecognition is True, so the device reports that it supports on-device recognition. Device: iPad Pro 11-inch, iOS 16.5. Is this expected behavior?

Machine Learning & AI General Speech

2

0

1.5k

Jun ’23

Can we incorporate Memoji into our apps?

I need a simple text-to-speech avatar in my iOS app. iOS already has Memojis ready to go, but I cannot find anything in the developer docs on how to access Memojis for use as a tool in app development. Am I missing something? Also, can anyone point me to any resources besides the Apple docs for using AVSpeechSynthesizer?

Machine Learning & AI General Speech Developer Program wwdc2023-10033

2

3

1.6k

Jun ’23

Siri enters loop of requesting parameter when running AppIntent

I want to add Shortcuts and Siri support using the new AppIntents framework. Running my intent from Shortcuts or from Spotlight works fine, as the touch-based UI for the disambiguation is shown. However, when I ask Siri to perform this action, she gets into a loop of asking me the question to set the parameter.

My AppIntent is implemented as follows:

    struct StartSessionIntent: AppIntent {
        static var title: LocalizedStringResource = "start_recording"

        @Parameter(title: "activity", requestValueDialog: IntentDialog("which_activity"))
        var activity: ActivityEntity

        @MainActor
        func perform() async throws -> some IntentResult & ProvidesDialog {
            let activityToSelect: ActivityEntity = self.activity
            guard let selectedActivity = Activity[activityToSelect.name] else {
                return .result(dialog: "activity_not_found")
            }
            ...
            return .result(dialog: "recording_started \(selectedActivity.name.localized())")
        }
    }

The ActivityEntity is implemented like this:

    struct ActivityEntity: AppEntity {
        static var typeDisplayRepresentation = TypeDisplayRepresentation(name: "activity")
        typealias DefaultQuery = MobilityActivityQuery
        static var defaultQuery: MobilityActivityQuery = MobilityActivityQuery()

        var id: String
        var name: String
        var icon: String

        var displayRepresentation: DisplayRepresentation {
            DisplayRepresentation(title: "\(self.name.localized())", image: .init(systemName: self.icon))
        }
    }

    struct MobilityActivityQuery: EntityQuery {
        func entities(for identifiers: [String]) async throws -> [ActivityEntity] {
            Activity.all()?.compactMap({ activity in
                identifiers.contains(where: { $0 == activity.name }) ? ActivityEntity(id: activity.name, name: activity.name, icon: activity.icon) : nil
            }) ?? []
        }

        func suggestedEntities() async throws -> [ActivityEntity] {
            Activity.all()?.compactMap({ activity in
                ActivityEntity(id: activity.name, name: activity.name, icon: activity.icon)
            }) ?? []
        }
    }

Does anyone have an idea what might be causing this and how I can fix this behavior? Thanks in advance

Machine Learning & AI General Siri and Voice App Intents

3

1k

Jun ’23

Memory Leak Using TensorFlow-Metal

Hi, I've found a memory leak issue when using the tensorflow-metal plugin to run a deep learning model on a Mac with the M1 chip. Here are the details of my system:

System Information
macOS version: 13.4
TensorFlow (macos) version: 2.12.0, 2.13.0-rc1, tf-nightly==2.14.0.dev20230616
TensorFlow-Metal Plugin Version: 0.8, 1.0.0, 1.0.1

Model Details
I've implemented a custom model architecture using TensorFlow's Keras API. The model has a dynamic Input, and the images are resized in a Resizing layer. The data is passed to the model through a data generator class, using model.fit().

Problem Description
When I train this model using the GPU on an M1 Mac, I observe a continuous increase in memory usage, leading to a memory leak. The increase is more prominent with larger image inputs. For smaller or average-sized images (1024x128), the increase is smaller but still continuous, so the leak becomes apparent after several epochs. On the other hand, when I switch to using the CPU for training (tf.config.set_visible_devices([], 'GPU')), the memory leak is resolved and I observe normal memory usage. In addition, I've tested the model with different image sizes and various layer configurations. The memory leak appears to be present only when using the GPU.

I hope this information is helpful in identifying and resolving the issue. If you need any further details, please let me know. The project code is private, but I can try to provide pseudocode if necessary.
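A report like this is easier to act on with numbers attached, for example memory sampled once per epoch on GPU and on CPU. The sketch below is one assumed way to collect and summarize such samples with the standard library only; the names `peak_rss_mib` and `growth_per_epoch` are hypothetical. Note that `ru_maxrss` is the peak resident set size, so it never decreases, and it may not reflect every allocation made on the GPU side. The `resource` module is available on Unix-like systems only.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Peak resident set size of this process in MiB.

    ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    """
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def growth_per_epoch(samples):
    """Least-squares slope of per-epoch memory samples (MiB per epoch)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


print(f"current peak RSS: {peak_rss_mib():.1f} MiB")
print(growth_per_epoch([100.0, 110.0, 120.0, 130.0]))  # made-up samples
```

Calling `peak_rss_mib()` at the end of each epoch and comparing the slope between the GPU and CPU runs would give a concrete figure to include in a bug report.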

Machine Learning & AI General tensorflow-metal

5

2

1.7k

Jun ’23

CTCLossV2 Op Not Supported on MacOS M1

System Information
macOS version: 13.4
TensorFlow (macos) version: tf-nightly==2.14.0.dev20230616
TensorFlow-Metal Plugin Version: 1.0.1

Problem Description
I'm trying to compute the CTC Loss using TensorFlow's tf.nn.ctc_loss on an M1 Mac, but an error is thrown indicating that no OpKernel was registered to support the CTCLossV2 operation. However, when using the CPU or even tf.keras.backend.ctc_batch_cost, it works fine.

The error stack trace is as follows:

tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:

Detected at node CTCLossV2 defined at (most recent call last):
<stack traces unavailable>
No OpKernel was registered to support Op 'CTCLossV2' used by {{node CTCLossV2}} with these attrs: [ctc_merge_repeated=true, preprocess_collapse_repeated=false, ignore_longer_outputs_than_inputs=false]
Registered devices: [CPU, GPU]
Registered kernels:
  <no registered kernels>

[[CTCLossV2]]
[[ctc_loss_func/PartitionedCall]] [Op:__inference_train_function_13095]

Machine Learning & AI General tensorflow-metal

3

0

594

Jun ’23

TTS problem iOS 17 beta

I see a lot of crashes on iOS 17 beta related to "Text To Speech". Does anybody have a clue why TTS crashes? Is anybody else seeing the same problem?

Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Subtype: KERN_INVALID_ADDRESS at 0x000000037f729380
Exception Codes: 0x0000000000000001, 0x000000037f729380
VM Region Info: 0x37f729380 is not in any region. Bytes after previous region: 3748828033 Bytes before following region: 52622617728
      REGION TYPE                 START - END       [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
      MALLOC_NANO              280000000-2a0000000  [512.0M] rw-/rwx SM=PRV
--->  GAP OF 0xd20000000 BYTES
      commpage (reserved)      fc0000000-1000000000 [  1.0G] ---/--- SM=NUL  ...(unallocated)
Termination Reason: SIGNAL 11 Segmentation fault: 11
Terminating Process: exc handler [36389]
Triggered by Thread: 9
.....
Thread 9 name:
Thread 9 Crashed:
0 libobjc.A.dylib 0x000000019eeff248 objc_retain_x8 + 16
1 AudioToolboxCore 0x00000001b2da9d80 auoop::RenderPipeUser::~RenderPipeUser() + 112 (AUOOPRenderPipePool.mm:400)
2 AudioToolboxCore 0x00000001b2e110b4 -[AUAudioUnit_XPC internalDeallocateRenderResources] + 92 (AUAudioUnit_XPC.mm:904)
3 AVFAudio 0x00000001bfa4cc04 AUInterfaceBaseV3::Uninitialize() + 60 (AUInterface.mm:524)
4 AVFAudio 0x00000001bfa894bc AVAudioEngineGraph::PerformCommand(AUGraphNodeBaseV3&, AVAudioEngineGraph::ENodeCommand, void*, unsigned int) const + 772 (AVAudioEngineGraph.mm:3317)
5 AVFAudio 0x00000001bfa93550 AVAudioEngineGraph::_Uninitialize(NSError**) + 132 (AVAudioEngineGraph.mm:1469)
6 AVFAudio 0x00000001bfa4b50c AVAudioEngineImpl::Stop(NSError**) + 396 (AVAudioEngine.mm:1081)
7 AVFAudio 0x00000001bfa4b094 -[AVAudioEngine stop] + 48 (AVAudioEngine.mm:193)
8 TextToSpeech 0x00000001c70b3c5c __55-[TTSSynthesisProviderAudioEngine renderSpeechRequest:]_block_invoke + 1756 (TTSSynthesisProviderAudioEngine.m:613)
9 libdispatch.dylib 0x00000001ae4b0740 _dispatch_call_block_and_release + 32 (init.c:1519)
10 libdispatch.dylib 0x00000001ae4b2378 _dispatch_client_callout + 20 (object.m:560)
11 libdispatch.dylib 0x00000001ae4b990c _dispatch_lane_serial_drain + 748 (queue.c:3885)
12 libdispatch.dylib 0x00000001ae4ba470 _dispatch_lane_invoke + 432 (queue.c:3976)
13 libdispatch.dylib 0x00000001ae4c5074 _dispatch_root_queue_drain_deferred_wlh + 288 (queue.c:6913)
14 libdispatch.dylib 0x00000001ae4c48e8 _dispatch_workloop_worker_thread + 404 (queue.c:6507)
...
Thread 9 crashed with ARM Thread State (64-bit):
    x0: 0x0000000283309360   x1: 0x0000000000000000   x2: 0x0000000000000000   x3: 0x00000002833093c0
    x4: 0x00000002833093c0   x5: 0x0000000101737740   x6: 0x0000000000000013   x7: 0x00000000ffffffff
    x8: 0x0000000283309360   x9: 0x3c788942d067009a  x10: 0x0000000101547000  x11: 0x0000000000000000
   x12: 0x00000000000007fb  x13: 0x00000000000007fd  x14: 0x000000001ee24020  x15: 0x0000000000000020
   x16: 0x0000b1037f729360  x17: 0x000000037f729360  x18: 0x0000000000000000  x19: 0x0000000000000000
   x20: 0x00000001016a8de8  x21: 0x0000000283e21d00  x22: 0x0000000283b3f1f8  x23: 0x0000000283098000
   x24: 0x00000001bfb4fc35  x25: 0x00000001bfb4fc43  x26: 0x000000028033a688  x27: 0x0000000280c93090
   x28: 0x0000000000000000   fp: 0x000000016fc86490   lr: 0x00000001b2da9d80
    sp: 0x000000016fc863e0   pc: 0x000000019eeff248 cpsr: 0x1000
   esr: 0x92000006 (Data Abort) byte read Translation fault

Machine Learning & AI General Speech Accessibility wwdc2023-10033

20

2

6.1k

Jun ’23

tensorflow-metal 1.0.1 crashes in protobuf init code

I have a custom model that runs just fine on the CPU under tensorflow-2.13.0rc1. I'd go to the stable 1.12.0, however there's no pip version that can be installed on an ARM-based computer. If I install tensorflow-metal 1.0.1, the same model crashes in what appears to be a protobuf initialization:

0 libtensorflow_framework.2.dylib 0x31e103724 google::protobuf::Message::InitializationErrorString() const + 88
1 libarrow.600.dylib 0x168dc95c0 google::protobuf::MessageLite::ParseFromArray(void const*, int) + 276
2 libmetal_plugin.dylib 0x371c0b400 metal_plugin::P_Optimize(void*, TF_Buffer const*, TF_GrapplerItem const*, TF_Buffer*, TSL_Status*) + 88
3 libtensorflow_cc.2.dylib 0x359f5196c tensorflow::grappler::CGraphOptimizer::Optimize(tensorflow::grappler::Cluster*, tensorflow::grappler::GrapplerItem const&, tensorflow::GraphDef*) + 116

This is on an M2 Ultra with macOS 13.4.1. Can anyone from the tensorflow-metal team look at what's going on? Thanks!

Machine Learning & AI General tensorflow-metal

3

0

781

Jun ’23

jnp.where is not working with Metal GPU backend

XlaRuntimeError Traceback (most recent call last)
Cell In[49], line 4
      1 arr = jnp.array( [7, 8, 9])
      3 # Find indices where the condition is True
----> 4 indices = jnp.where(arr > 1)
      6 print(indices)

XlaRuntimeError: UNKNOWN

Machine Learning & AI General tensorflow-metal

7

2

889

Jun ’23

tensorflow

My Python version is 3.8.6. Running

pip install --upgrade tensorflow

fails with:

ERROR: Could not find a version that satisfies the requirement tensorflow (from versions: none)
ERROR: No matching distribution found for tensorflow

How can I fix it?
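The message "from versions: none" generally means pip found no package file compatible with the running interpreter, which depends on the Python version, the operating system, and the CPU architecture. A small, hedged diagnostic like the one below prints those facts so they can be compared with the files listed for the package on PyPI. The function name `interpreter_report` is an illustration, and the sketch does not determine which of these factors is the cause here.

```python
import platform
import struct
import sys


def interpreter_report() -> dict:
    """Collect the interpreter details that decide which wheels pip accepts."""
    return {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "system": platform.system(),
        "machine": platform.machine(),
        "bits": struct.calcsize("P") * 8,  # pointer size: 32 or 64
    }


for key, value in interpreter_report().items():
    print(f"{key}: {value}")
```

Running `pip --version` as well shows which Python installation pip itself belongs to, which may differ from the one used to run scripts.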

Machine Learning & AI General tensorflow-metal

1

0

525

Jul ’23

Create ML Multi-Label Image Classifier will not train, throws error: "Unexpected Error"

Hello everyone, I am new to using Create ML, and I am running into a problem where the error is not descriptive and I cannot figure out what might be causing it. I am fairly sure my data is formatted properly, because the Create ML app detects the images and shows me a bar graph of how many images belong to each label. But the moment I press the "Train" button, it shows an error with the message "Unexpected Error".

I have also attempted to create and train the model programmatically, and that actually works! The framework requires that the JSON be named "annotations.json" instead of "annotation.json" and that the key holding the image name be changed from "image" to "filename", but other than that, the data is the same. I tried to use the app with the changes I made to the JSON for the framework, but then it won't even parse the data, so I am fairly sure that my original data is formatted correctly for the app.

I would prefer to use the app rather than do everything programmatically, because it presents the data in a much more digestible way. Has anyone else come up against this issue or a similar one? I should note that I am running the latest beta of macOS Sonoma and Xcode.
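The two differences described above (file name and key name) can be applied mechanically, which makes it easy to keep one source file and generate the other. The sketch below assumes the top level of the JSON is a list of objects and leaves every other field untouched; the function name `convert_annotations` and the `annotations` field in the sample entry are placeholders, since the post does not show the actual file contents.

```python
import json
import tempfile
from pathlib import Path


def convert_annotations(src: Path, dst_dir: Path) -> Path:
    """Copy app-style annotation.json to framework-style annotations.json.

    Renames the "image" key to "filename" in every entry; other keys are kept.
    """
    entries = json.loads(src.read_text(encoding="utf-8"))
    converted = []
    for entry in entries:
        entry = dict(entry)  # work on a copy, leave the input untouched
        if "image" in entry:
            entry["filename"] = entry.pop("image")
        converted.append(entry)
    dst = dst_dir / "annotations.json"
    dst.write_text(json.dumps(converted, indent=2), encoding="utf-8")
    return dst


# Demonstration with a made-up entry in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    source = Path(folder) / "annotation.json"
    source.write_text(json.dumps([{"image": "a.jpg", "annotations": ["cat"]}]),
                      encoding="utf-8")
    result = convert_annotations(source, Path(folder))
    print(result.name, json.loads(result.read_text(encoding="utf-8")))
```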

Machine Learning & AI Create ML Create ML

7

0

1.4k

Jul ’23

Machine Learning & AI

Post

Replies

Boosts

Views

Activity