Crash when trying to use TensorFlow in a macOS VM

I created a macOS 14 VM using https://github.com/s-u/macosvm, which uses the Virtualization framework. I want to check whether paravirtualized graphics can be used for TensorFlow workloads.

I followed the steps from https://developer.apple.com/metal/tensorflow-plugin/, but when I run the script from step 4 (Verify), I get a segmentation fault (log and crashed-thread backtrace below).

Has anyone tried to get this kind of GPU compute working in a VM and succeeded?

/Users/teuf/venv-metal/lib/python3.9/site-packages/urllib3/__init__.py:34: NotOpenSSLWarning: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: https://github.com/urllib3/urllib3/issues/3020
  warnings.warn(
2023-11-20 07:41:11.723578: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple Paravirtual device
2023-11-20 07:41:11.723620: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 10.00 GB
2023-11-20 07:41:11.723626: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 0.50 GB
2023-11-20 07:41:11.723700: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-11-20 07:41:11.723968: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
zsh: segmentation fault  python3 ./tensorflow-test.py


Thread 0 Crashed::  Dispatch queue: metal gpu stream
0   MPSCore                       	       0x1999598f8 MPSDevice::GetMPSLibrary_DoNotUse(MPSLibraryInfo const*) + 92
1   MPSCore                       	       0x19995c544 0x199927000 + 218436
2   MPSCore                       	       0x19995c908 0x199927000 + 219400
3   MetalPerformanceShadersGraph  	       0x1fb696a58 0x1fb583000 + 1129048
4   MetalPerformanceShadersGraph  	       0x1fb6f0cc8 0x1fb583000 + 1498312
5   MetalPerformanceShadersGraph  	       0x1fb6ef2dc 0x1fb583000 + 1491676
6   MetalPerformanceShadersGraph  	       0x1fb717ea0 0x1fb583000 + 1658528
7   MetalPerformanceShadersGraph  	       0x1fb717ce4 0x1fb583000 + 1658084
8   MetalPerformanceShadersGraph  	       0x1fb6edaac 0x1fb583000 + 1485484
9   MetalPerformanceShadersGraph  	       0x1fb7a85e0 0x1fb583000 + 2250208
10  MetalPerformanceShadersGraph  	       0x1fb7a79f0 0x1fb583000 + 2247152
11  MetalPerformanceShadersGraph  	       0x1fb6602b4 0x1fb583000 + 905908
12  MetalPerformanceShadersGraph  	       0x1fb65f7b0 0x1fb583000 + 903088
13  libmetal_plugin.dylib         	       0x1156dfdcc invocation function for block in metal_plugin::runMPSGraph(MetalStream*, MPSGraph*, NSDictionary*, NSDictionary*) + 164
14  libdispatch.dylib             	       0x18e79b910 _dispatch_client_callout + 20
15  libdispatch.dylib             	       0x18e7aacc4 _dispatch_lane_barrier_sync_invoke_and_complete + 56
16  libmetal_plugin.dylib         	       0x1156dfd14 metal_plugin::runMPSGraph(MetalStream*, MPSGraph*, NSDictionary*, NSDictionary*) + 108
17  libmetal_plugin.dylib         	       0x115606634 metal_plugin::MPSStatelessRandomUniformOp<float>::ProduceOutput(metal_plugin::OpKernelContext*, metal_plugin::Tensor*) + 876
18  libmetal_plugin.dylib         	       0x115607620 metal_plugin::MPSStatelessRandomOpBase::Compute(metal_plugin::OpKernelContext*) + 620
19  libmetal_plugin.dylib         	       0x1156061f8 void metal_plugin::ComputeOpKernel<metal_plugin::MPSStatelessRandomUniformOp<float>>(void*, TF_OpKernelContext*) + 44
20  libtensorflow_framework.2.dylib	       0x10b807354 tensorflow::PluggableDevice::Compute(tensorflow::OpKernel*, tensorflow::OpKernelContext*) + 148
21  libtensorflow_framework.2.dylib	       0x10b7413e0 tensorflow::(anonymous namespace)::SingleThreadedExecutorImpl::Run(tensorflow::Executor::Args const&) + 2100
22  libtensorflow_framework.2.dylib	       0x10b70b820 tensorflow::FunctionLibraryRuntimeImpl::RunSync(tensorflow::FunctionLibraryRuntime::Options, unsigned long long, absl::lts_20230125::Span<tensorflow::Tensor const>, std::__1::vector<tensorflow::Tensor, std::__1::allocator<tensorflow::Tensor>>*) + 420
23  libtensorflow_framework.2.dylib	       0x10b715668 tensorflow::ProcessFunctionLibraryRuntime::RunMultiDeviceSync(tensorflow::FunctionLibraryRuntime::Options const&, unsigned long long, std::__1::vector<std::__1::variant<tensorflow::Tensor, tensorflow::TensorShape>, std::__1::allocator<std::__1::variant<tensorflow::Tensor, tensorflow::TensorShape>>>*, std::__1::function<absl::lts_20230125::Status (tensorflow::ProcessFunctionLibraryRuntime::ComponentFunctionData const&, tensorflow::ProcessFunctionLibraryRuntime::InternalArgs*)>) const + 1336
24  libtensorflow_framework.2.dylib	       0x10b71a8a4 tensorflow::ProcessFunctionLibraryRuntime::RunSync(tensorflow::FunctionLibraryRuntime::Options const&, unsigned long long, absl::lts_20230125::Span<tensorflow::Tensor const>, std::__1::vector<tensorflow::Tensor, std::__1::allocator<tensorflow::Tensor>>*) const + 848
25  libtensorflow_cc.2.dylib      	       0x2801b5008 tensorflow::KernelAndDeviceFunc::Run(tensorflow::ScopedStepContainer*, tensorflow::EagerKernelArgs const&, std::__1::vector<std::__1::variant<tensorflow::Tensor, tensorflow::TensorShape>, std::__1::allocator<std::__1::variant<tensorflow::Tensor, tensorflow::TensorShape>>>*, tsl::CancellationManager*, std::__1::optional<tensorflow::EagerFunctionParams> const&, std::__1::optional<tensorflow::ManagedStackTrace> const&, tsl::CoordinationServiceAgent*) + 572
26  libtensorflow_cc.2.dylib      	       0x28016613c tensorflow::EagerKernelExecute(tensorflow::EagerContext*, absl::lts_20230125::InlinedVector<tensorflow::TensorHandle*, 4ul, std::__1::allocator<tensorflow::TensorHandle*>> const&, std::__1::optional<tensorflow::EagerFunctionParams> const&, tsl::core::RefCountPtr<tensorflow::KernelAndDevice> const&, tensorflow::GraphCollector*, tsl::CancellationManager*, absl::lts_20230125::Span<tensorflow::TensorHandle*>, std::__1::optional<tensorflow::ManagedStackTrace> const&) + 452
27  libtensorflow_cc.2.dylib      	       0x2801708ec tensorflow::ExecuteNode::Run() + 396
28  libtensorflow_cc.2.dylib      	       0x2801b0118 tensorflow::EagerExecutor::SyncExecute(tensorflow::EagerNode*) + 244
29  libtensorflow_cc.2.dylib      	       0x280165ac8 tensorflow::(anonymous namespace)::EagerLocalExecute(tensorflow::EagerOperation*, tensorflow::TensorHandle**, int*) + 2580
30  libtensorflow_cc.2.dylib      	       0x2801637a8 tensorflow::DoEagerExecute(tensorflow::EagerOperation*, tensorflow::TensorHandle**, int*) + 416
31  libtensorflow_cc.2.dylib      	       0x2801631e8 tensorflow::EagerOperation::Execute(absl::lts_20230125::Span<tensorflow::AbstractTensorHandle*>, int*) + 132
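
The backtrace passes through MPSStatelessRandomUniformOp (frames 17 to 19), so a single stateless random op might be a smaller reproducer than the full verify script. Below is a minimal sketch of how that could be checked, not something taken from the plugin documentation. The two snippets and the `run_isolated` helper are hypothetical names, and whether `tf.random.stateless_uniform` reaches the same MPS code path is an assumption. Each snippet runs in a child interpreter, so a segmentation fault is reported as a result instead of taking down the calling process.

```python
import signal
import subprocess
import sys

# Hypothetical snippets (not from the original post): the same stateless
# random-uniform op, pinned once to the CPU and once to the Metal GPU device.
CPU_SNIPPET = (
    "import tensorflow as tf\n"
    "with tf.device('/CPU:0'):\n"
    "    print(tf.random.stateless_uniform((4, 4), seed=(1, 2)).shape)\n"
)
GPU_SNIPPET = CPU_SNIPPET.replace("/CPU:0", "/GPU:0")


def run_isolated(snippet, timeout=300):
    """Run `snippet` in a child interpreter and describe how it ended."""
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child died from a signal.
        return "killed by " + signal.Signals(-proc.returncode).name
    if proc.returncode != 0:
        return "exited with status %d" % proc.returncode
    return "ok"


if __name__ == "__main__":
    for label, snippet in (("CPU", CPU_SNIPPET), ("GPU", GPU_SNIPPET)):
        print(label, "->", run_isolated(snippet))
```

If the CPU run ends normally and the GPU run is killed by SIGSEGV, that would point at the paravirtualized Metal device path rather than at the TensorFlow installation itself.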

Replies

Can you please file a bug report using Feedback Assistant? The crash report would be useful for debugging Metal Performance Shaders.

I've created https://feedbackassistant.apple.com/feedback/13423851