MTAudioProcessingTap with kMTAudioProcessingTapCreationFlag_PostEffects not reflecting AVAudioMix volume

I am trying to build level metering for AVPlayer. I am doing this with an MTAudioProcessingTap that gets passed to an AVAudioMix, which in turn gets passed to the AVPlayerItem. The MTAudioProcessingTap is created with the kMTAudioProcessingTapCreationFlag_PostEffects flag.
Technical Q&A QA1783 has the following to say about the PreEffects and PostEffects flags:

When you create a "pre-effects" audio tap using the kMTAudioProcessingTapCreationFlag_PreEffects flag, the tap will be called before any effects specified by AVAudioMixInputParameters are applied; when you create a "post-effects" tap by using the kMTAudioProcessingTapCreationFlag_PostEffects flag, the tap will be called after those effects are applied. Currently the only "effect" that AVAudioMixInputParameters supports is a linear volume ramp.
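
To make the expectation concrete, here is a minimal sketch of what a linear volume ramp would do to the samples if a post-effects tap saw them after the ramp. LinearVolumeRamp and applyRamp are hypothetical names made up for this illustration, not AVFoundation API, and holding the start and end volumes outside the ramp interval is an assumption of the sketch.

```swift
import Foundation

/// Hypothetical model of a linear volume ramp (not an AVFoundation type).
struct LinearVolumeRamp {
    let startVolume: Float
    let endVolume: Float
    let startTime: Double   // seconds
    let duration: Double    // seconds, assumed > 0

    /// Gain at time `t`. Assumption: the start volume is held before the
    /// ramp begins and the end volume is held after it finishes.
    func gain(at t: Double) -> Float {
        let progress = min(max((t - startTime) / duration, 0), 1)
        return startVolume + (endVolume - startVolume) * Float(progress)
    }
}

/// Scales mono samples by the ramp gain at each sample's timestamp.
func applyRamp(_ ramp: LinearVolumeRamp, to samples: [Float], sampleRate: Double) -> [Float] {
    return samples.enumerated().map { index, sample in
        sample * ramp.gain(at: Double(index) / sampleRate)
    }
}

// Fade from full volume to silence over one second, at 4 samples per second.
let ramp = LinearVolumeRamp(startVolume: 1, endVolume: 0, startTime: 0, duration: 1)
let faded = applyRamp(ramp, to: [1, 1, 1, 1, 1], sampleRate: 4)
print(faded)  // [1.0, 0.75, 0.5, 0.25, 0.0]
```

With a constant volume of 0 (start and end both 0), every output sample would be 0, which is what I would expect a post-effects tap to receive.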

The problem:
When the tap is created with kMTAudioProcessingTapCreationFlag_PostEffects, I would expect the samples received by the MTAudioProcessingTap to reflect the volume or volume ramps set on the AVAudioMixInputParameters. For example, if I set the volume to 0, I would expect to get all-zero samples. However, the samples I receive seem to be totally unaffected by the volume or volume ramps.

Am I doing something wrong?

Here is a quick and dirty playground that illustrates the problem. The example sets the volume directly, but I observed the same problem when using volume ramps. Tested on both macOS and iOS:


import Foundation
import PlaygroundSupport
import AVFoundation
import Accelerate

// Keep the playground alive while the player runs.
PlaygroundPage.current.needsIndefiniteExecution = true


let assetURL = Bundle.main.url(forResource: "sample", withExtension: "mp3")!


let asset = AVAsset(url: assetURL)
let playerItem = AVPlayerItem(asset: asset)
var audioMix = AVMutableAudioMix()


// The volume. Set to > 0 to hear something.
let kVolume: Float = 0.0


var parameterArray: [AVAudioMixInputParameters] = []


for assetTrack in asset.tracks(withMediaType: .audio) {


    let parameters = AVMutableAudioMixInputParameters(track: assetTrack)
    parameters.setVolume(kVolume, at: .zero)
    parameterArray.append(parameters)


    // Omitting most callbacks to keep sample short:
    var callbacks = MTAudioProcessingTapCallbacks(
        version: kMTAudioProcessingTapCallbacksVersion_0,
        clientInfo: nil,
        init: nil,
        finalize: nil,
        prepare: nil,
        unprepare: nil,
        process: { (tap, numberFrames, flags, bufferListInOut, numberFramesOut, flagsOut) in


            guard MTAudioProcessingTapGetSourceAudio(tap, numberFrames, bufferListInOut, flagsOut, nil, numberFramesOut) == noErr else {
                preconditionFailure()
            }


            // Assume 32-bit float format, native endian:
            let buffers = UnsafeMutableAudioBufferListPointer(bufferListInOut)

            for (i, buffer) in buffers.enumerated() {

                guard let data = buffer.mData, buffer.mNumberChannels > 0 else { continue }

                // Interleaved channels share one buffer, so the stride between
                // successive samples of one channel is the channel count.
                let channelCount = Int(buffer.mNumberChannels)
                let stride = vDSP_Stride(channelCount)
                let numElements = Int(buffer.mDataByteSize) / MemoryLayout<Float>.stride
                let framesPerChannel = vDSP_Length(numElements / channelCount)
                let samples = data.bindMemory(to: Float.self, capacity: numElements)

                for j in 0..<channelCount {

                    // Use vDSP_maxmgv to find the maximum amplitude of channel j.
                    // Pointer arithmetic counts elements, not bytes.
                    var magnitude: Float = 0
                    vDSP_maxmgv(samples + j, stride, &magnitude, framesPerChannel)

                    DispatchQueue.main.async {
                        print("buff: \(i), chan: \(j), max: \(magnitude)")
                    }
                }
            }
        }
    )


    var tap: Unmanaged<MTAudioProcessingTap>?


    guard MTAudioProcessingTapCreate(kCFAllocatorDefault, &callbacks, kMTAudioProcessingTapCreationFlag_PostEffects, &tap) == noErr else {
        preconditionFailure()
    }


    // MTAudioProcessingTapCreate returns a +1 reference; take ownership to avoid leaking the tap.
    parameters.audioTapProcessor = tap?.takeRetainedValue()


}


audioMix.inputParameters = parameterArray


playerItem.audioMix = audioMix


let player = AVPlayer(playerItem: playerItem)
player.rate = 1.0
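
For reference, this is a plain-Swift sketch of the value the process callback is meant to compute with vDSP_maxmgv: the peak magnitude of each channel in an interleaved buffer. peakMagnitudes is a hypothetical helper written for this illustration and uses no Accelerate calls, so it can serve as a cross-check for the vectorized version.

```swift
/// Hypothetical helper: peak absolute value per channel of an interleaved buffer.
/// Sample `frame * channelCount + channel` belongs to `channel`.
func peakMagnitudes(interleaved samples: [Float], channelCount: Int) -> [Float] {
    precondition(channelCount > 0, "channelCount must be positive")
    var peaks = [Float](repeating: 0, count: channelCount)
    let frameCount = samples.count / channelCount
    for frame in 0..<frameCount {
        for channel in 0..<channelCount {
            let magnitude = abs(samples[frame * channelCount + channel])
            peaks[channel] = max(peaks[channel], magnitude)
        }
    }
    return peaks
}

// Three stereo frames: left = 0.1, -0.5, 0.3 and right = -0.9, 0.2, 0.4.
let stereo: [Float] = [0.1, -0.9, -0.5, 0.2, 0.3, 0.4]
let peaks = peakMagnitudes(interleaved: stereo, channelCount: 2)
print(peaks)  // [0.5, 0.9]
```

In vDSP terms this corresponds to starting at element j, stepping by the channel count, and processing one element per frame.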

Replies

I multiply the samples by the AVPlayer's volume level and then do an RMS calculation on the resulting samples.

I send the leftVol and rightVol values off to a pair of NSLevelIndicator instances, on top of which I use a pair of clever NSView overlays to get a lagging indication of peak volume levels.

Code Block
    float scalar = player.volume;
    // Scale every buffer in place by the player's volume.
    for (NSInteger i = 0; i < bufferListInOut->mNumberBuffers; i++)
        vDSP_vsmul(bufferListInOut->mBuffers[i].mData, 1, &scalar,
                   bufferListInOut->mBuffers[i].mData, 1,
                   bufferListInOut->mBuffers[i].mDataByteSize / sizeof(float));
    // RMS level per buffer: buffer 0 is treated as left, buffer 1 as right.
    float leftVol = 0.0f, rightVol = 0.0f;
    for (NSInteger i = 0; i < bufferListInOut->mNumberBuffers; i++) {
        AudioBuffer *pBuffer = &bufferListInOut->mBuffers[i];
        float rms = 0.0f;
        vDSP_rmsqv(pBuffer->mData, 1, &rms, numberFrames * pBuffer->mNumberChannels);
        if (i == 0)
            leftVol = rms;
        if (i == 1)
            rightVol = rms;
    }
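
For anyone working in Swift, the same idea might look roughly like the sketch below. It is not a translation of the snippet above: it computes the volume-scaled RMS on a copy of the samples rather than scaling the buffer in place, and it uses plain Swift instead of vDSP. scaledRMS and PeakHold are hypothetical names, and the per-update decay factor is an assumed way of getting a lagging peak indication, not a description of the NSView overlays mentioned above.

```swift
import Foundation

/// RMS of `samples` after scaling by `volume`. The input is not modified.
func scaledRMS(_ samples: [Float], volume: Float) -> Float {
    guard !samples.isEmpty else { return 0 }
    let sumOfSquares = samples.reduce(Float(0)) { $0 + ($1 * volume) * ($1 * volume) }
    return (sumOfSquares / Float(samples.count)).squareRoot()
}

/// Hypothetical lagging peak indicator: jumps up to new peaks immediately
/// and decays by a fixed factor on every update otherwise.
struct PeakHold {
    private(set) var value: Float = 0
    let decay: Float   // assumed to lie in 0..<1

    init(decay: Float) { self.decay = decay }

    @discardableResult
    mutating func update(with level: Float) -> Float {
        value = max(level, value * decay)
        return value
    }
}

let left: [Float] = [0.5, -0.5, 0.5, -0.5]
let level = scaledRMS(left, volume: 0.5)   // 0.25
var peak = PeakHold(decay: 0.5)
peak.update(with: level)                   // peak jumps to 0.25
let held = peak.update(with: 0)            // silence: peak decays to 0.125
print(level, held)  // 0.25 0.125
```

The level value would feed the level indicator directly, while the held value would drive the lagging peak marker.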