I'm trying to use the randomTensor function from MPSGraph to initialize the weights of a fully connected layer. I can create the graph and run inference using the randomly initialized values, but when I try to train and update these randomly initialized weights, I hit a crash:
Assertion failed: (isa<To>(Val) && "cast<Ty>() argument of incompatible type!"), function cast, file Casting.h, line 578.
I can train the graph if I instead initialize the weights myself on the CPU, but I thought using the randomTensor functions would be faster and would allow initialization to happen on the GPU.
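For reference, the CPU-side fallback looks roughly like this (a sketch, not my exact code; the helper name and the uniform range are my own, and `graph` is the MPSGraph instance shown further down):

```swift
import Foundation
import MetalPerformanceShadersGraph

// Hypothetical helper: generate uniform random weights on the CPU and wrap
// them in an MPSGraph variable, which trains without crashing.
func makeRandomWeightVariable(graph: MPSGraph, rows: Int, cols: Int) -> MPSGraphTensor {
    // Draw the initial values host-side (range chosen arbitrarily here).
    let values = (0..<(rows * cols)).map { _ in Float.random(in: -0.5...0.5) }
    // Pack them into a Data buffer for the variable's initial contents.
    let data = values.withUnsafeBufferPointer { Data(buffer: $0) }
    return graph.variable(with: data,
                          shape: [NSNumber(value: rows), NSNumber(value: cols)],
                          dataType: .float32,
                          name: nil)
}
```

This works for training, but the random draw happens on the CPU, which is what I was hoping randomTensor would avoid.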
Here's my code for building the graph including both methods of weight initialization:
func buildGraph(variables: inout [MPSGraphTensor]) -> (MPSGraphTensor, MPSGraphTensor, MPSGraphTensor, MPSGraphTensor) {
    let inputPlaceholder = graph.placeholder(shape: [2], dataType: .float32, name: nil)
    let labelPlaceholder = graph.placeholder(shape: [1], name: nil)

    // This works for inference but not training
    let descriptor = MPSGraphRandomOpDescriptor(distribution: .uniform, dataType: .float32)!
    let weightTensor = graph.randomTensor(withShape: [2, 1], descriptor: descriptor, seed: 2, name: nil)

    // This works for inference and training
    // let weights = [Float](repeating: 1, count: 2)
    // let weightTensor = graph.variable(with: Data(bytes: weights, count: 2 * MemoryLayout<Float32>.size), shape: [2, 1], dataType: .float32, name: nil)

    variables += [weightTensor]

    let output = graph.matrixMultiplication(primary: inputPlaceholder, secondary: weightTensor, name: nil)
    // ("reuctionType" is the argument label as it actually appears in the Swift API.)
    let loss = graph.softMaxCrossEntropy(output, labels: labelPlaceholder, axis: -1, reuctionType: .sum, name: nil)
    return (inputPlaceholder, labelPlaceholder, output, loss)
}
And to run the graph I have the following in my sample view controller:
override func viewDidLoad() {
    super.viewDidLoad()

    var variables: [MPSGraphTensor] = []
    let (inputPlaceholder, labelPlaceholder, output, loss) = buildGraph(variables: &variables)
    let gradients = graph.gradients(of: loss, with: variables, name: nil)

    let learningRate = graph.constant(0.001, dataType: .float32)
    var updateOps: [MPSGraphOperation] = []
    for (key, value) in gradients {
        let updates = graph.stochasticGradientDescent(learningRate: learningRate, values: key, gradient: value, name: nil)
        let assign = graph.assign(key, tensor: updates, name: nil)
        updateOps += [assign]
    }

    let commandBuffer = MPSCommandBuffer(commandBuffer: Self.commandQueue.makeCommandBuffer()!)
    let executionDesc = MPSGraphExecutionDescriptor()
    executionDesc.completionHandler = { (resultsDictionary, _) in
        for (_, value) in resultsDictionary {
            var output: [Float] = [0]
            value.mpsndarray().readBytes(&output, strideBytes: nil)
            print(output)
        }
    }

    let inputDesc = MPSNDArrayDescriptor(dataType: .float32, shape: [2])
    let input = MPSNDArray(device: Self.device, descriptor: inputDesc)
    var inputArray: [Float] = [1, 2]
    input.writeBytes(&inputArray, strideBytes: nil)
    let source = MPSGraphTensorData(input)

    let labelMPSArray = MPSNDArray(device: Self.device, descriptor: MPSNDArrayDescriptor(dataType: .float32, shape: [1]))
    var labelArray: [Float] = [1]
    labelMPSArray.writeBytes(&labelArray, strideBytes: nil)
    let label = MPSGraphTensorData(labelMPSArray)

    // This runs inference and works
    // graph.encode(to: commandBuffer, feeds: [inputPlaceholder: source], targetTensors: [output], targetOperations: [], executionDescriptor: executionDesc)
    // commandBuffer.commit()
    // commandBuffer.waitUntilCompleted()

    // This trains but crashes
    graph.encode(
        to: commandBuffer,
        feeds: [inputPlaceholder: source, labelPlaceholder: label],
        targetTensors: [],
        targetOperations: updateOps,
        executionDescriptor: executionDesc)
    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()
}
And a few other relevant variables are created at the class scope:
let graph = MPSGraph()
static let device = MTLCreateSystemDefaultDevice()!
static let commandQueue = device.makeCommandQueue()!
How can I use these randomTensor functions on MPSGraph to randomly initialize weights for training?