Max pooling 2D

This block reduces the size of the data, which lowers the number of parameters and the amount of computation needed downstream, and it also helps control overfitting.
2D max pooling is similar to scaling down an image.

The 2D Max pooling block represents a max pooling operation. This block outputs a smaller tensor than its input, so downstream blocks need fewer parameters and less computation; it also helps control overfitting.

How does max pooling work?

Figure 1. 2D Max pooling block

The 2D Max pooling block moves a rectangle (window) over the incoming data and computes the maximum at each window position. The size of the window is determined by the Horizontal pooling factor and the Vertical pooling factor; the size of the steps the window takes is determined by the Horizontal stride and the Vertical stride.
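
The sliding-window behavior described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the block's actual implementation: the function name max_pool_2d and its arguments are hypothetical, it handles a single channel, and it assumes no padding.

```python
import numpy as np

def max_pool_2d(x, pool_h, pool_w, stride_h, stride_w):
    """Max pooling over a single-channel 2D array, assuming no padding."""
    in_h, in_w = x.shape
    # Number of window positions that fit entirely inside the input.
    out_h = (in_h - pool_h) // stride_h + 1
    out_w = (in_w - pool_w) // stride_w + 1
    out = np.empty((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride_h, j * stride_w
            window = x[top:top + pool_h, left:left + pool_w]
            out[i, j] = window.max()  # keep only the largest value in the window
    return out

x = np.array([[1, 3, 2, 0],
              [4, 6, 5, 1],
              [7, 2, 9, 8],
              [0, 1, 3, 4]])

# 2x2 window, stride 2 in both directions: four non-overlapping windows.
pooled = max_pool_2d(x, pool_h=2, pool_w=2, stride_h=2, stride_w=2)
print(pooled)
# [[6 5]
#  [7 9]]
```

With a stride of 1 instead of 2, the same 2x2 window would overlap its neighbors and the 4x4 input would give a 3x3 output.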

Max pooling blocks are inserted after one or more convolutional blocks; they help deeper convolutional blocks receive information from a bigger portion of the original image (for example, with a 2x2 pooling window and a stride of 2, a 3x3 filter after pooling is influenced by a 6x6 portion of the tensor before pooling).
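
The 6x6 figure can be checked with a little arithmetic. The sketch below assumes a 2x2 pooling window moved with a stride of 2, which is the setting that produces 6x6; the helper name receptive_field_after_pooling is hypothetical.

```python
def receptive_field_after_pooling(filter_size, pool_size, pool_stride):
    """Extent (in pre-pooling positions) seen by a filter applied after pooling."""
    # The filter covers `filter_size` pooled values along one axis. Neighboring
    # pooled values start `pool_stride` positions apart, and each one spans
    # `pool_size` positions of the tensor before pooling.
    return (filter_size - 1) * pool_stride + pool_size

# A 3x3 filter after 2x2 pooling with stride 2 sees a 6x6 region.
print(receptive_field_after_pooling(filter_size=3, pool_size=2, pool_stride=2))  # 6
```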

If we see convolutional blocks as detectors of a specific feature, max pooling keeps only the “strongest” value of that feature inside the pooling rectangle. Each channel (hence each feature) is treated separately.
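
To illustrate that channels do not interact, here is a small NumPy sketch for a (height, width, channels) tensor. It assumes non-overlapping square windows and input dimensions divisible by the window size; max_pool_channels is a hypothetical name used only for this example.

```python
import numpy as np

def max_pool_channels(x, pool=2):
    """Non-overlapping pool x pool max pooling on a (height, width, channels) array.

    Assumes height and width are divisible by `pool`.
    """
    h, w, c = x.shape
    # Split each spatial axis into (window index, position inside window).
    blocks = x.reshape(h // pool, pool, w // pool, pool, c)
    # Take the maximum inside each window; the channel axis is left untouched.
    return blocks.max(axis=(1, 3))

x = np.zeros((4, 4, 2))
x[0, 1, 0] = 5.0  # strong response of feature 0 in the top-left window
x[3, 3, 1] = 7.0  # strong response of feature 1 in the bottom-right window

y = max_pool_channels(x)
print(y.shape)    # (2, 2, 2)
print(y[..., 0])  # 5.0 appears only in channel 0, top-left
print(y[..., 1])  # 7.0 appears only in channel 1, bottom-right
```

The strong value in one channel never shows up in the other, which matches the idea that each feature is pooled on its own.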

Output size

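When each stride equals the corresponding pooling factor and the input dimensions are divisible by the pooling factors, the output size is (input width / horizontal pooling factor) x (input height / vertical pooling factor) x (input channels). In general, the output width and height depend on the pooling factors, the strides, and the padding.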
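
For other combinations of stride and padding, the output size can be estimated with the helper below. It follows the convention used by common deep learning libraries such as Keras and TensorFlow; that this block uses exactly the same rounding is an assumption, and the function names are hypothetical.

```python
import math

def pooled_length(input_len, pool, stride, padding="valid"):
    """Output length along one axis (assumed Keras/TensorFlow-style convention)."""
    if padding == "same":
        # The input is padded so that every stride step produces an output.
        return math.ceil(input_len / stride)
    # "valid": only windows that fit entirely inside the input are used.
    return (input_len - pool) // stride + 1

def output_size(width, height, channels, pool_w, pool_h,
                stride_w, stride_h, padding="valid"):
    """Output (width, height, channels); the channel count is unchanged."""
    return (pooled_length(width, pool_w, stride_w, padding),
            pooled_length(height, pool_h, stride_h, padding),
            channels)

# Stride equal to the pooling factor, evenly divisible input: width and height halve.
print(output_size(28, 28, 16, 2, 2, 2, 2))          # (14, 14, 16)
# Odd input size: "valid" drops the last row/column, "same" pads it.
print(output_size(7, 7, 16, 2, 2, 2, 2, "valid"))   # (3, 3, 16)
print(output_size(7, 7, 16, 2, 2, 2, 2, "same"))    # (4, 4, 16)
```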

Parameters

Horizontal pooling factor: The width of the rectangle within which the maximum is computed.

Vertical pooling factor: The height of the rectangle within which the maximum is computed.

Horizontal stride: Horizontal distance between the left edge of consecutive pooling windows.

Vertical stride: Vertical distance between the top edge of consecutive pooling windows.

Padding: Same (the input is padded at the edges; with a stride of 1, the output height x width is the same as the input) or valid (no padding is added, so the output height x width is smaller than the input).
