Unpredictable return time for currentDrawable

Question

Created Feb ’17


Hi,

I’m writing a view to plot real-time data in Metal. I’m drawing the samples as point primitives, and I’m triple buffering both the vertex data and the uniform data. The issue I’m having is that the time it takes for a call to currentDrawable to return seems to be unpredictable. It’s almost as if sometimes no drawable is ready, and I have to wait a whole frame for one to become available. Usually currentDrawable returns in ~0.07 ms (which is about what I would expect), but other times it takes a full 1/60 s (16.67 ms). This blocks the whole main thread, which is, to say the least, not very desirable.

I’m seeing this issue on an iPhone 6S Plus and an iPad Air. I have not yet seen this behavior on the Mac (I have a 2016 MBP with an AMD 460 GPU). My guess is that this somehow has to do with the fact that the GPUs in iOS devices are TBDR-based. I don’t think I’m bandwidth constrained, because I get the exact same behavior no matter how many or how few samples I’m drawing.

To illustrate the issue I wrote a minimal example that draws a static sine wave. This is a simplified example: normally I would memcpy the samples into the current vertex buffer, just like I do with the uniforms, which is why I’m triple buffering the vertex data as well as the uniforms. It’s still enough to illustrate the problem though. Just set this view as your base view in a storyboard, and run. On some runs it works just fine. Other times currentDrawable starts out with a return time of 16.67 ms, then after a few seconds jumps to 0.07 ms, then after a while goes back to 16.67 ms. For some reason it seems to jump from 16.67 ms to 0.07 ms if you rotate the device.

import MetalKit

let N = 500

class MetalGraph: MTKView {
    typealias Vertex = Int32

    struct Uniforms {
        var offset: UInt32
        var numSamples: UInt32
    }

    // Data
    var uniforms = Uniforms(offset: 0, numSamples: UInt32(N))

    // Buffers
    var vertexBuffers  = [MTLBuffer]()
    var uniformBuffers = [MTLBuffer]()
    var inflightBufferSemaphore = DispatchSemaphore(value: 3)
    var inflightBufferIndex = 0

    // Metal State
    var commandQueue: MTLCommandQueue!
    var pipeline: MTLRenderPipelineState!


    // Setup

    override func awakeFromNib() {
        super.awakeFromNib()

        device = MTLCreateSystemDefaultDevice()
        commandQueue = device?.makeCommandQueue()
        colorPixelFormat = .bgra8Unorm

        setupPipeline()
        setupBuffers()
    }

    func setupPipeline() {
        let library = device?.newDefaultLibrary()

        let descriptor = MTLRenderPipelineDescriptor()
        descriptor.colorAttachments[0].pixelFormat = .bgra8Unorm
        descriptor.vertexFunction   = library?.makeFunction(name: "vertexFunction")
        descriptor.fragmentFunction = library?.makeFunction(name: "fragmentFunction")

        pipeline = try! device?.makeRenderPipelineState(descriptor: descriptor)
    }

    func setupBuffers() {
        // Produces a dummy sine wave with N samples, 2 periods, with a range of [0, 1000]
        let vertices: [Vertex] = (0..<N).map {
            let periods = 2.0
            let scaled = Double($0) / (Double(N)-1) * periods * 2 * .pi
            let value = (sin(scaled) + 1) * 500 // Transform from range [-1, 1] to [0, 1000]
            return Vertex(value)
        }

        let vertexBytes  = MemoryLayout<Vertex>.size * vertices.count
        let uniformBytes = MemoryLayout<Uniforms>.size

        for _ in 0..<3 {
            vertexBuffers .append(device!.makeBuffer(bytes: vertices,  length: vertexBytes))
            uniformBuffers.append(device!.makeBuffer(bytes: &uniforms, length: uniformBytes))
        }
    }



    // Drawing

    func updateUniformBuffers() {
        uniforms.offset = (uniforms.offset + 1) % UInt32(N)

        memcpy(
            uniformBuffers[inflightBufferIndex].contents(),
            &uniforms,
            MemoryLayout<Uniforms>.size
        )
    }

    override func draw(_ rect: CGRect) {
        _ = inflightBufferSemaphore.wait(timeout: .distantFuture)

        updateUniformBuffers()

        let start = CACurrentMediaTime()
        guard let drawable = currentDrawable else { return }
        print(String(format: "Grab Drawable: %.3f ms", (CACurrentMediaTime() - start) * 1000))

        guard let passDescriptor = currentRenderPassDescriptor else { return }

        passDescriptor.colorAttachments[0].loadAction = .clear
        passDescriptor.colorAttachments[0].storeAction = .store
        passDescriptor.colorAttachments[0].clearColor = MTLClearColorMake(0.2, 0.2, 0.2, 1)

        let commandBuffer = commandQueue.makeCommandBuffer()

        let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: passDescriptor)
        encoder.setRenderPipelineState(pipeline)
        encoder.setVertexBuffer(vertexBuffers[inflightBufferIndex],  offset: 0, at: 0)
        encoder.setVertexBuffer(uniformBuffers[inflightBufferIndex], offset: 0, at: 1)
        encoder.drawPrimitives(type: .point, vertexStart: 0, vertexCount: N)
        encoder.endEncoding()

        commandBuffer.addCompletedHandler { _ in
            self.inflightBufferSemaphore.signal()
        }
        commandBuffer.present(drawable)
        commandBuffer.commit()

        inflightBufferIndex = (inflightBufferIndex + 1) % 3
    }
}

#include <metal_stdlib>
using namespace metal;

struct VertexIn {
    int32_t value;
};

struct VertexOut {
    float4 pos [[position]];
    float pointSize [[point_size]];
};

struct Uniforms {
    uint32_t offset;
    uint32_t numSamples;
};

vertex VertexOut vertexFunction(device   VertexIn *vertices [[buffer(0)]],
                                constant Uniforms *uniforms [[buffer(1)]],
                                uint vid [[vertex_id]])
{
    // I'm using the vertex index to evenly spread the
    // samples out in the x direction
    float xIndex = float((vid + (uniforms->numSamples - uniforms->offset)) % uniforms->numSamples);
    float x = (float(xIndex) / float(uniforms->numSamples - 1)) * 2.0f - 1.0f;

    // Transforming the values from the range [0, 1000] to [-1, 1]
    float y = (float)vertices[vid].value / 500.0f - 1.0f ;

    VertexOut vOut;
    vOut.pos = {x, y, 1, 1};
    vOut.pointSize = 3;

    return vOut;
}

fragment half4 fragmentFunction() {
    return half4(1, 1, 1, 1);
}

Possibly related to this: In all the examples I’ve seen, inflightBufferIndex is incremented inside the command buffer’s completion handler, just before the semaphore is signaled (which makes sense to me). When I have that line there I get a weird jittering effect, almost as if the framebuffers are being displayed out of order. Moving this line to the bottom of the draw function fixes the issue, although it doesn’t make a lot of sense to me. I’m not sure if this is related to currentDrawable’s return time being so unpredictable, but I have a feeling these two issues are emerging from the same underlying problem.
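
The toy model below is one way to picture why the placement of that increment could matter. It is an illustrative assumption, not a confirmed diagnosis of the jitter: the function name indicesUsed, the fixed one-draw-per-tick schedule, and the latency value are all made up for the sketch. It only shows that if the index advances when the GPU finishes a frame, two back-to-back draws can pick the same buffer while the earlier frame is still in flight.

```swift
/// Simulates `frames` draws issued at ticks 0, 1, 2, ... where each frame's
/// GPU work completes `latency` ticks after it was submitted.
/// Returns the buffer index that each draw used.
func indicesUsed(frames: Int, latency: Double, incrementOnCompletion: Bool) -> [Int] {
    var index = 0
    var completed = 0          // completion handlers that have run so far
    var used: [Int] = []

    for frame in 0..<frames {
        let now = Double(frame)
        if incrementOnCompletion {
            // Run every completion handler that has fired by this tick.
            while completed < frame && Double(completed) + latency <= now {
                index = (index + 1) % 3
                completed += 1
            }
        }
        used.append(index)
        if !incrementOnCompletion {
            // Variant from the posted code: advance at the end of draw.
            index = (index + 1) % 3
        }
    }
    return used
}

let onCompletion = indicesUsed(frames: 6, latency: 1.5, incrementOnCompletion: true)
let atEndOfDraw  = indicesUsed(frames: 6, latency: 1.5, incrementOnCompletion: false)
print(onCompletion)  // [0, 0, 1, 2, 0, 1]
print(atEndOfDraw)   // [0, 1, 2, 0, 1, 2]
```

In the first result, draws 0 and 1 both write to buffer 0 even though frame 0 has not completed yet. In the second, every draw gets a buffer different from the previous two.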

Any help would be very much appreciated!

Answer 1

MikeAlpha OP

Feb ’17

Hi vegather

I don't use Swift yet, and I don't have the time (or the means) right now to test run your code, so take what I write with a grain of salt, ok? But I read it several times, and there is at least one problem I see. In your draw function, you first acquire your semaphore by wait()ing on it. Then you run through the rendering setup, and if either currentDrawable or currentRenderPassDescriptor is nil, you return. Just like that. Unless there is some Swift magic involved that I haven't caught, the semaphore is never signalled on that path! You're basically leaking one semaphore count each time a drawable is not available, which may be caused by the system being busy, for example. That would explain why device rotation has a noticeable effect: the timing of the draw() calls is affected then, I think.
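
A minimal sketch of that leak, using only Dispatch so it runs without Metal. FrameLimiter, its draw(drawable:signalOnEarlyExit:) method, and freeSlots() are hypothetical stand-ins for the view's draw function, and the optional String stands in for currentDrawable. In the posted code the signal comes from the completed handler; here it is issued inline to keep the example synchronous.

```swift
import Dispatch

final class FrameLimiter {
    // Created at 0 and raised to 3 by signalling: libdispatch traps when a
    // semaphore is destroyed while its value is below its initial value,
    // which is exactly the state a leaked count leaves behind.
    let inflight = DispatchSemaphore(value: 0)

    init() {
        for _ in 0..<3 { inflight.signal() }
    }

    /// Returns true if a frame was "rendered".
    func draw(drawable: String?, signalOnEarlyExit: Bool) -> Bool {
        inflight.wait()
        guard drawable != nil else {
            // Without this signal the slot taken by wait() is never returned.
            if signalOnEarlyExit { inflight.signal() }
            return false
        }
        // Encoding and commit would go here; the completed handler
        // would signal in the real code.
        inflight.signal()
        return true
    }

    /// Counts the free slots without blocking, then gives them back.
    func freeSlots() -> Int {
        var count = 0
        while inflight.wait(timeout: .now()) == .success { count += 1 }
        for _ in 0..<count { inflight.signal() }
        return count
    }
}

let leaky = FrameLimiter()
_ = leaky.draw(drawable: nil, signalOnEarlyExit: false)
_ = leaky.draw(drawable: nil, signalOnEarlyExit: false)
print(leaky.freeSlots())  // 1

let fixed = FrameLimiter()
_ = fixed.draw(drawable: nil, signalOnEarlyExit: true)
_ = fixed.draw(drawable: nil, signalOnEarlyExit: true)
print(fixed.freeSlots())  // 3
```

After a third leaked count, the next wait() would block indefinitely, so every early return that follows the wait needs a matching signal.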

So your code runs fine if, and only if, the drawable and currentRenderPassDescriptor are both non-nil every time, which is probably what happens on that powerful MBP of yours. Experimenting with synchronization like you did may cause all kinds of weird effects, really, but there is no place for random "what will happen if I do this" changes in code if you want to write good software. It may happen to work for a while, then you'll change some other code (changing timings slightly), or change the device, or the OS version, and everything fails again.

And finally... it may be just a matter of personal preference, and again I am no Swift programmer, but your style is... weird. Take the shader, for example. Why something like:

float xIndex = float((vid + (uniforms->numSamples - uniforms->offset)) % uniforms->numSamples);
float x = (float(xIndex) / float(uniforms->numSamples - 1)) * 2.0f - 1.0f;
float y = (float)vertices[vid].value / 500.0f - 1.0f ;

There are unnecessary casts, and putting an index in a float value is strange and probably dangerous with bigger data sets. It may be my personal preference, but it is easier for me to think of the X positions as "constant" and rotate their Y values than to put the same Y value in various X places.
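
To make the "bigger data sets" concern concrete, here is a small Swift sketch (the helper name survivesFloatRoundTrip is made up for illustration). A 32-bit float has a 24-bit significand, so integers above 2^24 are not all exactly representable. In the shader as posted the conversion happens after the modulo, so this would only bite once numSamples itself exceeds that limit.

```swift
/// True when converting `value` to a 32-bit float and back yields the same integer.
func survivesFloatRoundTrip(_ value: Int) -> Bool {
    return Int(Float(value)) == value
}

print(survivesFloatRoundTrip(499))            // true
print(survivesFloatRoundTrip(1 << 24))        // true
print(survivesFloatRoundTrip((1 << 24) + 1))  // false
```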

What I'd do would be something like:

float x = float( vid ) / ( uniforms->numSamples - 1 ) * 2.0f - 1.0f;
uint index = ( vid + uniforms->offset ) % uniforms->numSamples;
float y = vertices[ index ].value / 500.0f - 1.0f;
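
As a sanity check that the two formulations place the same sample at the same X position, the index arithmetic can be replayed on the CPU. This is a sketch in Swift rather than shader code, and the function names originalSlot and suggestedSample are invented for the comparison.

```swift
/// Original shader: sample `v` is drawn at X slot (v + (count - offset)) % count.
func originalSlot(forSample v: Int, offset: Int, count: Int) -> Int {
    return (v + (count - offset)) % count
}

/// Suggested shader: X slot `s` draws sample (s + offset) % count.
func suggestedSample(forSlot s: Int, offset: Int, count: Int) -> Int {
    return (s + offset) % count
}

let count = 500
var mappingsAgree = true
for offset in 0..<count {
    for v in 0..<count {
        let slot = originalSlot(forSample: v, offset: offset, count: count)
        // The suggested mapping must send that slot back to the same sample.
        if suggestedSample(forSlot: slot, offset: offset, count: count) != v {
            mappingsAgree = false
        }
    }
}
print(mappingsAgree)  // true
```

The two mappings are inverses of each other, so the rendered picture is the same. The difference is only whether the vertex ID selects the sample or the X position.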

Hope that helps

Michal

Answer 2

vegather OP

Feb ’17

Hi Michal, thanks for your reply!

You are absolutely right that just returning when currentDrawable or currentRenderPassDescriptor is nil could hang (the semaphore never being signalled). However, from what I've seen, this never happens (I tried printing in the else block). The call only blocks until a drawable is available. I wanted to keep my example fairly simple, so I left the signalling out.

My impression was that there were 3 drawables in rotation, and the semaphore would make sure I wait until one is ready. This does not appear to be the case, which is why I'm confused. I absolutely want to understand what is going on, and make sure my code is deterministic.

It's been a while since I've written any C/C++ style code. On second look, some of my casts are probably redundant and xIndex should probably be called something else (as it's not really an index).

Thanks for your time!
