Structure

MPSNNPaddingMethod

Options that define a graph's padding.

Declaration

struct MPSNNPaddingMethod

Overview

The MPSNNGraph must make automatic decisions about how big to make the result of each filter node. The result size is typically determined by a combination of the input image size, the size of the filter window (for example, convolution weights), the filter stride, and a description of how much extra space beyond the edges of the image the filter is allowed to read. Knowing these properties of the filter, you can infer the size of the result image. Most of this information is known to the MPSNNGraph as part of its normal operation. However, the amount of padding to add and where to add it is a matter of choice left to you.

Different neural network frameworks, such as TensorFlow and Caffe, make different choices here. Depending on where your network was trained, you will need to adjust the policies used by MPS during inference. If the padding method is not adequately described by these options, you may provide your own custom policy definition by overriding the destinationImageDescriptor(forSourceImages:sourceStates:for:suggestedDescriptor:) method in a custom MPSNNPadding child class.

The following common values influence the size of the result image by adjusting the amount of padding added to the source images:

  • validOnly Result values are only produced for the area that is guaranteed to have all of its input values defined (i.e. not off the edge). This produces the smallest result image.

  • sizeSame The result image is the same size as the input image. If the stride is not 1, then the result is scaled accordingly.

  • sizeFull Result values are produced for any position for which at least one input value is defined (i.e. not off the edge).

  • custom The sizing and centering policy is given by the destinationImageDescriptor(forSourceImages:sourceStates:for:suggestedDescriptor:) method.
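
The three built-in size policies can be illustrated with a little arithmetic along one dimension. The sketch below uses the formulas commonly quoted for valid, same, and full padding. The SizePolicy enum, the DestSizeForward name, and the rounding at strides other than 1 are assumptions made for illustration; they are not values or behavior taken from MPS.

```c
#include <assert.h>

/* Hypothetical stand-ins for the three size policies (not the MPS constants). */
typedef enum { kSizeValidOnly, kSizeSame, kSizeFull } SizePolicy;

/* Result size along one dimension for a forward filter. */
static int DestSizeForward(int sourceSize, int stride, int filterWindowSize, SizePolicy policy)
{
    assert(sourceSize > 0 && stride > 0 && filterWindowSize > 0);
    switch (policy) {
    case kSizeValidOnly:
        /* Every window position must lie fully inside the source. */
        if (sourceSize < filterWindowSize) return 0;
        return (sourceSize - filterWindowSize) / stride + 1;
    case kSizeSame:
        /* Same as the source, scaled down by the stride and rounded up. */
        return (sourceSize - 1) / stride + 1;
    case kSizeFull:
        /* Any window position that overlaps at least one source value. */
        return (sourceSize + filterWindowSize - 2) / stride + 1;
    }
    return 0;
}
```

For a source of size 7 and a filter window of size 3, this sketch gives 5, 7, and 9 at stride 1, and 3, 4, and 5 at stride 2, for valid-only, same, and full respectively.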

Except possibly when custom is used, the area within the source image that is read will be centered on the source image. Even so, at times the area cannot be perfectly centered because the source image has an odd size and the region read has an even size, or vice versa. In such cases, you may use the following values to select where to put the extra padding:

  • topLeft Leftover padding is added to the top or left side of the image as appropriate.

  • addRemainderToBottomRight Leftover padding is added to the bottom or right side of the image as appropriate.
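
The effect of the remainder policy can be sketched for one dimension as follows. The total padding is whatever the chosen result size requires beyond the source size; when that total is odd, one leftover position has to go to one side. The PadSplit type, the SplitPadding name, and the remainderToBottomRight flag are hypothetical names for this illustration, and the arithmetic is an assumption rather than the MPS implementation.

```c
#include <assert.h>

/* before = top or left padding, after = bottom or right padding. */
typedef struct { int before; int after; } PadSplit;

/* Split the padding needed along one dimension between the two sides.
   remainderToBottomRight is nonzero to put an odd leftover on the
   bottom/right side, zero to put it on the top/left side. */
static PadSplit SplitPadding(int sourceSize, int destSize, int stride,
                             int filterWindowSize, int remainderToBottomRight)
{
    assert(sourceSize > 0 && destSize > 0 && stride > 0 && filterWindowSize > 0);

    /* Extent covered by all window positions, minus what the source provides. */
    int needed = (destSize - 1) * stride + filterWindowSize - sourceSize;
    if (needed < 0) needed = 0;

    PadSplit split = { needed / 2, needed / 2 };
    if (needed % 2 != 0) {
        if (remainderToBottomRight) split.after += 1;
        else                        split.before += 1;
    }
    return split;
}
```

With a source of size 8, a result of size 4, stride 2, and a window of size 3, one padding position is needed in total. The top-left choice yields (before 1, after 0) and the bottom-right choice yields (before 0, after 1), so the two policies read different source values.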

Here again, different external frameworks may use different policies. 


In some cases, Caffe introduces the notion of a region beyond the padding which is invalid. This can happen when the padding is set to a width narrower than what is needed for a destination size. In such cases, MPSNNPaddingMethodExcludeEdges is used to adjust normalization factors for filter weights (particularly in pooling) such that invalid regions beyond the padding are not counted towards the filter area. Currently, only pooling supports this feature. Other filters ignore it.
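
A rough sketch of the normalization adjustment for average pooling along one dimension is shown below. It assumes that padding positions hold zero and count towards the filter area, while positions beyond the padding are the invalid region. AveragePoolWindow and its parameters are hypothetical names for this illustration, and the code is not the MPS or Caffe implementation.

```c
#include <assert.h>

/* Average over one pooling window along one dimension.
   src holds n values and is surrounded by `pad` zero-valued padding
   positions on each side. Positions beyond the padding are invalid.
   When excludeEdges is nonzero, invalid positions are left out of the
   area used to normalize the sum. */
static float AveragePoolWindow(const float *src, int n, int pad,
                               int windowStart, int windowSize, int excludeEdges)
{
    assert(src != 0 && n > 0 && pad >= 0 && windowSize > 0);

    float sum = 0.0f;
    int area = 0;
    for (int i = windowStart; i < windowStart + windowSize; i++) {
        if (i >= 0 && i < n) {
            sum += src[i];          /* real source value */
            area += 1;
        } else if (i >= -pad && i < n + pad) {
            area += 1;              /* zero padding: adds 0, still counts */
        } else if (!excludeEdges) {
            area += 1;              /* invalid position, counted anyway */
        }
    }
    return area > 0 ? sum / (float)area : 0.0f;
}
```

For the source {2, 4, 5, 7} with one position of padding and a window of size 4 starting at index 2, the window covers two real values, one padding position, and one invalid position. The sum is 12, so the result is 4 when the invalid position is excluded and 3 when it is counted.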


The size and add-remainder policies always appear together in the MPSNNPaddingMethod. There is no provision for a size policy without a remainder policy or vice versa. MPSNNPaddingMethod is, in practice, used as a bit field.
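
The bit-field usage can be sketched as follows. The bit positions and the PAD_ names are hypothetical and chosen only to show how a size policy and a remainder policy share one value; the actual MPSNNPaddingMethod constants may be laid out differently.

```c
#include <assert.h>

/* Hypothetical layout for illustration; not the real MPS values. */
enum {
    PAD_REMAINDER_TOP_LEFT     = 0 << 2,
    PAD_REMAINDER_BOTTOM_RIGHT = 3 << 2,
    PAD_REMAINDER_MASK         = 3 << 2,

    PAD_SIZE_VALID_ONLY        = 0 << 4,
    PAD_SIZE_SAME              = 1 << 4,
    PAD_SIZE_FULL              = 2 << 4,
    PAD_SIZE_MASK              = 3 << 4,

    PAD_CUSTOM                 = 1 << 14
};

/* Combine one size policy and one remainder policy into a single value. */
static unsigned MakePaddingMethod(unsigned size, unsigned remainder)
{
    assert((size & ~(unsigned)PAD_SIZE_MASK) == 0);
    assert((remainder & ~(unsigned)PAD_REMAINDER_MASK) == 0);
    return size | remainder;
}

/* Extract each policy again by masking. */
static unsigned SizePolicyOf(unsigned method)      { return method & PAD_SIZE_MASK; }
static unsigned RemainderPolicyOf(unsigned method) { return method & PAD_REMAINDER_MASK; }
static int      IsCustom(unsigned method)          { return (method & PAD_CUSTOM) != 0; }
```

In a layout like this one, both fields are always present: a value of zero still decodes to a definite size policy and a definite remainder policy, which is consistent with the two never appearing separately.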


Most MPS neural network filters are considered forward filters. Some (for example, convolution transpose and unpooling) are considered reverse filters. For the reverse filters, the image stride is measured in destination values rather than source values and has the effect of enlarging the image rather than reducing it. When a reverse filter is used to "undo" the effects of a forward filter, the size policy should be the opposite of the forward padding method. For example, if the forward filter used validOnly | topLeft, the reverse filter should use sizeFull | topLeft. Some consideration of the geometry of inputs and outputs will reveal why this is so. It is usually not important to adjust the centering method because the size of the reverse result generally doesn't suffer from centering asymmetries. That is, the size would usually be given by:


// Result size of a reverse filter along one dimension.
// style is -1 for valid-only, 0 for same, +1 for full.
static int DestSizeReverse( int sourceSize, int stride, int filterWindowSize, int style ) {
    return (sourceSize - 1) * stride + 1 + style * (filterWindowSize - 1);
}


so the result size is exactly the one needed for the source size and there are no centering problems. In some cases where the reverse pass is intended to completely reverse a forward pass, the MPSState object produced by the forward pass should be used to determine the size of the reverse pass result image.
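
A short worked example may make the forward and reverse relationship concrete. The sketch below pairs the DestSizeReverse formula (renamed ReverseSize here) with the commonly quoted valid-only forward size. ForwardValidSize is a hypothetical helper, and its formula is an assumption for illustration rather than something taken from MPS.

```c
#include <assert.h>

/* Forward result size under a valid-only policy (assumed common formula). */
static int ForwardValidSize(int sourceSize, int stride, int filterWindowSize)
{
    assert(stride > 0 && filterWindowSize > 0 && sourceSize >= filterWindowSize);
    return (sourceSize - filterWindowSize) / stride + 1;
}

/* Reverse result size, as in DestSizeReverse.
   style is -1 for valid-only, 0 for same, +1 for full. */
static int ReverseSize(int sourceSize, int stride, int filterWindowSize, int style)
{
    return (sourceSize - 1) * stride + 1 + style * (filterWindowSize - 1);
}
```

With a window of size 3 and stride 2, a forward valid-only pass takes a source of size 7 to a result of size 3, and a reverse pass with the full style takes size 3 back to 7. A source of size 8 also goes forward to size 3, because the integer division discards a remainder, so the reverse pass again yields 7 rather than 8. This is plausibly the kind of case in which state from the forward pass is needed to recover the original size.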


TensorFlow does not appear to provide a full padding method, but instead appears to use its valid-only padding mode for reverse filters to, in effect, achieve what is called sizeFull here.
 


Walkthrough of Operation of Padding Policy

Most MPSCNNKernel objects have two types of encode calls. There is one for which you must pass in a preallocated MPSImage to receive the results. This is for manual configuration. It assumes you know what you are doing, and asks you to set a variety of properties to correctly position image inputs and size results. It does not use the padding policy. You must size the result correctly and set the clipRect, offset, and other properties yourself as needed.

Layered on top of that is usually another flavor of encode call that instead returns the destination image as the function's return value. It is designed to automatically configure itself based on the paddingPolicy. When this more automated encode method is called, it invokes a method in the MPSKernel that looks at the MPSNNPaddingMethod bit field of the policy. Based on the information therein, the size of the input images, and other filter properties, it determines the size of the output, sets the offset property, and returns an appropriate MPSImageDescriptor for the destination image.

If you set the custom bit in the MPSNNPaddingMethod, then the destinationImageDescriptor(forSourceImages:sourceStates:for:suggestedDescriptor:) method is called. The MPSImageDescriptor prepared earlier is passed in as the last parameter. You can use this descriptor or modify it as needed. In addition, you can adjust any properties of the MPSKernel with which it will be used. If, for example, the descriptor does not have the right MPSImageFeatureChannelFormat, you can change it, or make your own MPSImageDescriptor based on the one handed to you. This is your opportunity to customize the configuration of the MPSKernel. In some cases (for example, forTensorflowAveragePooling()) you might change other properties such as the filter edging mode, or adjust the offset that was already set for you. When the kernel is fully configured, return the MPSImageDescriptor.

The MPSImageDescriptor is then passed to the destinationImageAllocator to allocate the image. You might provide such an allocator if you want to use your own custom MTLHeap rather than the MPS internal heap. The allocator can be set either directly in the MPSCNNKernel or through the imageAllocator property.
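
The sequence described above (suggested descriptor, optional custom policy, allocator) can be modeled in a few lines. Everything in this sketch is a hypothetical stand-in: ImageDescriptor, KernelConfig, CustomPolicy, ImageAllocator, kCustomBit, and EncodeWithPolicy are not MPS types or functions, and the control flow is only an approximation of the steps the text describes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are MPS types. */
typedef struct { int width, height, featureChannels; } ImageDescriptor;
typedef struct { int offsetX, offsetY; } KernelConfig;
typedef void (*CustomPolicy)(KernelConfig *kernel, ImageDescriptor *descriptor);
typedef int  (*ImageAllocator)(const ImageDescriptor *descriptor);

enum { kCustomBit = 1 << 14 };  /* illustrative bit position only */

/* Stand-in for the internal heap: returns the number of values reserved. */
static int DefaultAllocator(const ImageDescriptor *d)
{
    return d->width * d->height * d->featureChannels;
}

/* Example custom policy: round the channel count up to a multiple of 4
   and adjust the offset that was already set on the kernel. */
static void RoundChannelsUpToFour(KernelConfig *kernel, ImageDescriptor *descriptor)
{
    descriptor->featureChannels = (descriptor->featureChannels + 3) / 4 * 4;
    kernel->offsetX += 1;
}

/* Model of the automated encode path. `suggested` and the kernel offset
   are assumed to have been derived from the padding method already. */
static int EncodeWithPolicy(unsigned method, KernelConfig *kernel,
                            ImageDescriptor suggested,
                            CustomPolicy custom, ImageAllocator allocator)
{
    assert(kernel != NULL);

    /* If the custom bit is set, the custom policy may edit both the
       suggested descriptor and the kernel configuration. */
    if ((method & kCustomBit) != 0 && custom != NULL)
        custom(kernel, &suggested);

    /* The final descriptor goes to the allocator, or to the default one. */
    return (allocator != NULL ? allocator : DefaultAllocator)(&suggested);
}
```

Passing a function pointer for the allocator mirrors the idea of supplying your own heap: the encode path does not need to know where the image memory comes from, only that it matches the final descriptor.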

It is intended that most of the time, default values for padding method and destination image allocator should be good enough. Only minimal additional configuration should be required, apart from occasional adjustments to set the MPSNNPaddingMethod when something other than default padding for the object is needed. If you find yourself encumbered by frequent adjustments of this kind, you might find it to your advantage to subclass MPSNNFilterNode or MPSCNNKernel objects to adjust the default padding policy and allocator at initialization time.

Relationships

Conforms To