[Question] Has the namespace in the tutorial code changed?

Hi all,

I am new to PyTorch on Metal, and I am trying to build the demo code for custom ops in PyTorch (Apple's "Customizing a PyTorch Operation" sample).

However, it seems the torch namespace no longer has mps: torch::mps cannot be found when I try to compile the .mm file as a PyTorch C++ extension.

After some digging, it looks like everybody is using the ATen namespace with at::. How can I use the mps functions and make this demo code work?

Thanks in advance.

Error message


In file included from /Users/ethan/Downloads/CustomizingAPyTorchOperation/CustomSoftshrink.mm:10:
/Users/ethan/Downloads/CustomizingAPyTorchOperation/CustomSoftshrink.h:11:30: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings]
static char *CUSTOM_KERNEL = R"MPS_SOFTSHRINK(
                             ^
/Users/ethan/Downloads/CustomizingAPyTorchOperation/CustomSoftshrink.mm:43:53: error: no member named 'mps' in namespace 'torch'
        id<MTLCommandBuffer> commandBuffer = torch::mps::get_command_buffer();
                                             ~~~~~~~^
/Users/ethan/Downloads/CustomizingAPyTorchOperation/CustomSoftshrink.mm:47:47: error: no member named 'mps' in namespace 'torch'
        dispatch_queue_t serialQueue = torch::mps::get_dispatch_queue();
                                       ~~~~~~~^
/Users/ethan/Downloads/CustomizingAPyTorchOperation/CustomSoftshrink.mm:76:20: error: no member named 'mps' in namespace 'torch'
            torch::mps::commit();
            ~~~~~~~^
1 warning and 3 errors generated.
ninja: build stopped: subcommand failed.

CustomSoftshrink.mm code


/*
See the LICENSE.txt file for this sample’s licensing information.

Abstract:
The code that registers a PyTorch custom operation.
*/


#include <torch/extension.h>
#include "CustomSoftshrink.h"

#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

// Helper function to retrieve the `MTLBuffer` from a `torch::Tensor`.
static inline id<MTLBuffer> getMTLBufferStorage(const torch::Tensor& tensor) {
  return __builtin_bit_cast(id<MTLBuffer>, tensor.storage().data());
}

torch::Tensor& dispatchSoftShrinkKernel(const torch::Tensor& input, torch::Tensor& output, float lambda) {
    @autoreleasepool {
        id<MTLDevice> device = MTLCreateSystemDefaultDevice();
        NSError *error = nil;

        // Set the number of threads equal to the number of elements within the input tensor.
        int numThreads = input.numel();

        // Load the custom soft shrink shader.
        id<MTLLibrary> customKernelLibrary = [device newLibraryWithSource:[NSString stringWithUTF8String:CUSTOM_KERNEL]
                                                                  options:nil
                                                                    error:&error];
        TORCH_CHECK(customKernelLibrary, "Failed to to create custom kernel library, error: ", error.localizedDescription.UTF8String);

        std::string kernel_name = std::string("softshrink_kernel_") + (input.scalar_type() == torch::kFloat ? "float" : "half");
        id<MTLFunction> customSoftShrinkFunction = [customKernelLibrary newFunctionWithName:[NSString stringWithUTF8String:kernel_name.c_str()]];
        TORCH_CHECK(customSoftShrinkFunction, "Failed to create function state object for ", kernel_name.c_str());

        // Create a compute pipeline state object for the soft shrink kernel.
        id<MTLComputePipelineState> softShrinkPSO = [device newComputePipelineStateWithFunction:customSoftShrinkFunction error:&error];
        TORCH_CHECK(softShrinkPSO, error.localizedDescription.UTF8String);

        // Get a reference to the command buffer for the MPS stream.
        id<MTLCommandBuffer> commandBuffer = torch::mps::get_command_buffer();
        TORCH_CHECK(commandBuffer, "Failed to retrieve command buffer reference");

        // Get a reference to the dispatch queue for the MPS stream, which encodes the synchronization with the CPU.
        dispatch_queue_t serialQueue = torch::mps::get_dispatch_queue();

        dispatch_sync(serialQueue, ^(){
            // Start a compute pass.
            id<MTLComputeCommandEncoder> computeEncoder = [commandBuffer computeCommandEncoder];
            TORCH_CHECK(computeEncoder, "Failed to create compute command encoder");

            // Encode the pipeline state object and its parameters.
            [computeEncoder setComputePipelineState:softShrinkPSO];
            [computeEncoder setBuffer:getMTLBufferStorage(input) offset:input.storage_offset() * input.element_size() atIndex:0];
            [computeEncoder setBuffer:getMTLBufferStorage(output) offset:output.storage_offset() * output.element_size() atIndex:1];
            [computeEncoder setBytes:&lambda length:sizeof(float) atIndex:2];

            MTLSize gridSize = MTLSizeMake(numThreads, 1, 1);

            // Calculate a thread group size.
            NSUInteger threadGroupSize = softShrinkPSO.maxTotalThreadsPerThreadgroup;
            if (threadGroupSize > numThreads) {
                threadGroupSize = numThreads;
            }
            MTLSize threadgroupSize = MTLSizeMake(threadGroupSize, 1, 1);

            // Encode the compute command.
            [computeEncoder dispatchThreads:gridSize
                      threadsPerThreadgroup:threadgroupSize];

            [computeEncoder endEncoding];

            // Commit the work.
            torch::mps::commit();
        });
    }

    return output;
}
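// Note: torch::mps::commit() only commits the command buffer to the MPS
// stream; it does not wait for the GPU. If the CPU must observe the results
// immediately, the C++ frontend also exposes torch::mps::synchronize()
// (the counterpart of torch.mps.synchronize() in Python) to block until
// the stream finishes.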

// C++ op dispatching the Metal soft shrink shader.
torch::Tensor mps_softshrink(const torch::Tensor &input, float lambda = 0.5) {
    // Check whether the input tensor resides on the MPS device and whether it's contiguous.
    TORCH_CHECK(input.device().is_mps(), "input must be a MPS tensor");
    TORCH_CHECK(input.is_contiguous(), "input must be contiguous");

    // Check the supported data types for soft shrink.
    TORCH_CHECK(input.scalar_type() == torch::kFloat ||
                input.scalar_type() == torch::kHalf, "Unsupported data type: ", input.scalar_type());

    // Allocate the output, same shape as the input.
    torch::Tensor output = torch::empty_like(input);

    return dispatchSoftShrinkKernel(input, output, lambda);
}

// Create Python bindings for the Objective-C++ code.
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
    m.def("mps_softshrink", &mps_softshrink);
}
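// For reference: the sample's Python side (softshrink.py) builds this file as
// a PyTorch C++ extension and calls it as compiled_lib.mps_softshrink(input, lambd).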
  • My GPU is an AMD 560X.

Replies

Hi @Waxpple. Thanks for the report. You would need a pytorch-nightly build in order for the sample to work; the torch::mps C++ APIs the sample calls (get_command_buffer, get_dispatch_queue, commit) are not in the 2.0 releases. To install a nightly, you can use:

pip3 install --pre --force-reinstall torch --index-url https://download.pytorch.org/whl/nightly/cpu

To run the sample:

python3 run_sample.py

Let me know if you're seeing any issues with the nightly build.

Hi,

Thanks for the fast reply! After running pip3 install --pre --force-reinstall torch --index-url https://download.pytorch.org/whl/nightly/cpu, the PyTorch version switched from 2.0.1 to 2.1.0 and the above issue is gone. I also checked MPS availability with the following code:

import torch
if torch.backends.mps.is_available():
    mps_device = torch.device("mps")
    x = torch.ones(1, device=mps_device)
    print(x)
else:
    print("MPS device not found.")

The output is correct: tensor([1.], device='mps:0').

However, I still cannot run the example code successfully. The error output says I don't have Metal language version 2.2 or above.

(base) ethan@liuyanfudeMacBook-Pro CustomizingAPyTorchOperation % python3 run_sample.py
Traceback (most recent call last):
  File "/Users/ethan/Downloads/CustomizingAPyTorchOperation/run_sample.py", line 59, in <module>
    test_softshrink()
  File "/Users/ethan/Downloads/CustomizingAPyTorchOperation/run_sample.py", line 55, in test_softshrink
    test_correctness()
  File "/Users/ethan/Downloads/CustomizingAPyTorchOperation/run_sample.py", line 49, in test_correctness
    output_custom_softshrink_op = custom_softshrink(input_data)
  File "/Users/ethan/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/ethan/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1514, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/ethan/Downloads/CustomizingAPyTorchOperation/softshrink.py", line 27, in forward
    return compiled_lib.mps_softshrink(input, self.lambd)
RuntimeError: Failed to to create custom kernel library, error: program_source:18:3: error: 'host_name' attribute requires Metal language standard macos-metal2.2 or higher
[[host_name("softshrink_kernel_half")]]
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
program_source:25:3: error: 'host_name' attribute requires Metal language standard macos-metal2.2 or higher
[[host_name("softshrink_kernel_float")]]
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

My Xcode version is 14.3.1, running on Ventura 13.4.1.

I am using an Intel/AMD MacBook Pro; could this be the problem? The sample's requirements do include Macs with AMD GPUs, though.
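Could requesting the language standard explicitly make a difference? A sketch of my own guess (not from the sample; MTLCompileOptions.languageVersion is standard Metal API), replacing the options:nil call in CustomSoftshrink.mm:

// Sketch: explicitly request the macos-metal2.2 language standard when
// compiling the shader source, instead of passing options:nil and relying
// on whatever default the system picks.
MTLCompileOptions *options = [MTLCompileOptions new];
options.languageVersion = MTLLanguageVersion2_2; // available since macOS 10.15

id<MTLLibrary> customKernelLibrary =
    [device newLibraryWithSource:[NSString stringWithUTF8String:CUSTOM_KERNEL]
                         options:options
                           error:&error];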

Thanks!

  • I think the AMD 560X GPU is not supported... (does Metal 2.2 need the Metal 3 API?)

    Another question!

    Why doesn't PyTorch with MPS support the Apple Neural Engine (ANE) for training? I have seen examples that use the ANE to train a model with the Create ML framework. Since the ANE can train models from Swift, why can't PyTorch use it?
