Using the Apple Neural Engine for MLTensor operations

Question

Created Feb ’25

Based on the documentation, it appears that MLTensor operations can run on the ANE (Apple Neural Engine) when they are wrapped in withMLTensorComputePolicy with an MLComputePolicy initialized with MLComputeUnits.cpuAndNeuralEngine. (The policy can also be initialized with MLComputeUnits.all to let the OS spread the load between the Neural Engine, GPU, and CPU.)

However, the Instruments app shows that the tensor operations never execute on the Neural Engine. Could someone explain the correct way to ensure that the Neural Engine performs the tensor operations directly (not as part of a Core ML model file)?

Based on this example, I wrote a short program to try it:

import Foundation
import CoreML

print("Starting...")
let semaphore = DispatchSemaphore(value: 0)
Task {
    await withMLTensorComputePolicy(.init(MLComputeUnits.cpuAndNeuralEngine)) {
        let v1 = MLTensor([1.0, 2.0, 3.0, 4.0])
        let v2 = MLTensor([5.0, 6.0, 7.0, 8.0])
        let v3 = v1.matmul(v2)
        await v3.shapedArray(of: Float.self) // is 70.0


        let m1 = MLTensor(shape: [2, 3], scalars: [
            1, 2, 3,
            4, 5, 6
        ], scalarType: Float.self)
        let m2 = MLTensor(shape: [3, 2], scalars: [
             7,  8,
             9, 10,
            11, 12
        ], scalarType: Float.self)
        let m3 = m1.matmul(m2)
        let result = await m3.shapedArray(of: Float.self) // is [[58, 64], [139, 154]]


        // Supports broadcasting
        let m4 = MLTensor(randomNormal: [3, 1, 1, 4], scalarType: Float.self)
        let m5 = MLTensor(randomNormal: [4, 2], scalarType: Float.self)
        let m6 = m4.matmul(m5)
        
        print("Done")
        return result;
    }
    semaphore.signal()
}
semaphore.wait()
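
The expected values in the comments above can be checked independently of Core ML. The following is a minimal sketch in plain Swift; `naiveMatmul` is a hypothetical helper written for this check, not an MLTensor API, and it says nothing about which compute unit MLTensor uses.

```swift
// Hypothetical helper (not part of Core ML): naive matrix product on nested arrays.
func naiveMatmul(_ a: [[Float]], _ b: [[Float]]) -> [[Float]] {
    let inner = b.count
    let columns = b.first?.count ?? 0
    var result = [[Float]](repeating: [Float](repeating: 0, count: columns), count: a.count)
    for i in 0..<a.count {
        precondition(a[i].count == inner, "inner dimensions must match")
        for j in 0..<columns {
            for k in 0..<inner {
                result[i][j] += a[i][k] * b[k][j]
            }
        }
    }
    return result
}

// Dot product of v1 and v2, written as a 1x4 by 4x1 product.
let dot = naiveMatmul([[1, 2, 3, 4]], [[5], [6], [7], [8]])
// Product of the 2x3 matrix m1 and the 3x2 matrix m2.
let product = naiveMatmul([[1, 2, 3], [4, 5, 6]], [[7, 8], [9, 10], [11, 12]])

print(dot)      // [[70.0]]
print(product)  // [[58.0, 64.0], [139.0, 154.0]]
```

Both results agree with the values noted in the comments of the MLTensor code.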

Here's what I get in the Instruments app: Screenshot 2025-03-01 at 2.50.25.png

Notice how the Neural Engine line shows no usage. I've run this test on an M1 Max MacBook Pro.

Answer 1

Xcode-K OP

Mar ’25

I tried the code. It produces a few warnings:

"Result of call to 'withMLTensorComputePolicy' is unused"
"Result of call to 'shapedArray(of:)' is unused"
"Initialization of immutable value 'm6' was never used; consider replacing with assignment to '_' or removing it"

The purpose of the code is simply to put the ANE to use, following the documentation example "Instance Method matmul(_:)".

import Foundation
import CoreML

print("Starting ANE demonstration...")
let semaphore = DispatchSemaphore(value: 0)

Task {
    
    await withMLTensorComputePolicy(.init(MLComputeUnits.all)) {
        // Vector-vector multiplication (dot product)
        let v1 = MLTensor([1.0, 2.0, 3.0, 4.0])
        let v2 = MLTensor([5.0, 6.0, 7.0, 8.0])
        let v3 = v1.matmul(v2)
        let dotProduct = await v3.shapedArray(of: Float.self)
        print("Dot product result: \(dotProduct)")  // 70.0
        
        // Matrix-matrix multiplication
        let m1 = MLTensor(shape: [2, 3], scalars: [
            1, 2, 3,
            4, 5, 6
        ], scalarType: Float.self)
        
        let m2 = MLTensor(shape: [3, 2], scalars: [
            7,  8,
            9, 10,
            11, 12
        ], scalarType: Float.self)
        
        let m3 = m1.matmul(m2)
        let matrixResult = await m3.shapedArray(of: Float.self)
        print("Matrix multiplication result: \(matrixResult)")  // Result [[58, 64], [139, 154]]
        
        
        let m4 = MLTensor(randomNormal: [3, 1, 1, 4], scalarType: Float.self)
        let m5 = MLTensor(randomNormal: [4, 2], scalarType: Float.self)
        _ = m4.matmul(m5)  // result intentionally discarded
        
        print("ANE operations completed successfully")
    }
    semaphore.signal()
}

semaphore.wait()

Instruments shows the same as yours: Screenshot 2025-03-03 at 23.11.33.png. powermetrics shows the same as well:

sudo powermetrics --samplers ane_power

Screenshot 2025-03-03 at 23.12.53.png

Answer 2

BBfat OP

Mar ’25

Same problem.
