Pooling
Pooling has no learnable weights, only hyperparameters: filter size, stride, and type (max or average).
- Max Pooling → Take the max within each filter
Break the input into smaller regions and keep the maximum value of each region.
Intuition: As long as a feature is detected anywhere within a region, it is preserved in the output of max pooling.
- Average Pooling → Take the average within each filter
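Both pooling types can be sketched in NumPy. This is a minimal illustration rather than a framework implementation: the helper name `pool2d` and the 4×4 input are made up for the example, and no padding is assumed, so each output dimension is floor((n - f) / s) + 1 for input size n, filter size f, and stride s.

```python
import numpy as np

def pool2d(x, f=2, s=2, mode="max"):
    """Pool a 2-D array with an f x f window and stride s (no padding).

    Hypothetical helper, for illustration only.
    """
    h, w = x.shape
    out_h = (h - f) // s + 1
    out_w = (w - f) // s + 1
    reduce_fn = np.max if mode == "max" else np.mean
    out = np.empty((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            # Slice out the current window and reduce it to one number.
            window = x[i * s:i * s + f, j * s:j * s + f]
            out[i, j] = reduce_fn(window)
    return out

x = np.array([[1, 3, 2, 1],
              [2, 9, 1, 1],
              [1, 3, 2, 3],
              [5, 6, 1, 2]])

max_out = pool2d(x, mode="max")   # [[9, 2], [6, 3]]
avg_out = pool2d(x, mode="avg")   # [[3.75, 1.25], [3.75, 2.0]]
```

The 9 in the top-left region survives max pooling no matter where in that region it sits, which is the intuition above.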
Pooling is one form of downsampling, which reduces the size or resolution of a signal (e.g., an image).
Downsampling
- In CNNs (e.g., U-Net):
  - Max Pooling: Keeps the largest value in a region (e.g., 2×2).
  - Average Pooling: Takes the average of values in the region.
  - Strided Convolution: A convolution with a stride > 1 (e.g., 2), so the kernel is evaluated only at every second position instead of at every pixel.
- In signal/image processing:
  - Decimation: Low-pass filter the signal to avoid aliasing, then keep only every n-th sample.
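A small sketch of why the low-pass filter matters when decimating. The 3-tap kernel `[0.25, 0.5, 0.25]` is an assumption chosen for simplicity, not a properly designed anti-aliasing filter, and `decimate` is a hypothetical helper name; for real work a designed filter (e.g., `scipy.signal.decimate`) is the safer choice.

```python
import numpy as np

def decimate(x, n, kernel=(0.25, 0.5, 0.25)):
    """Low-pass filter x, then keep every n-th sample.

    Hypothetical helper; the default kernel is a crude stand-in
    for a real anti-aliasing filter.
    """
    smoothed = np.convolve(x, np.asarray(kernel), mode="same")
    return smoothed[::n]

# Fastest possible oscillation: alternates sign on every sample.
x = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)

naive = x[::2]             # [1, 1, 1, 1]: the oscillation aliases to a constant
filtered = decimate(x, 2)  # [0.25, 0, 0, 0]: removed, apart from an edge effect
```

Filtering and then subsampling is also what a stride-2 convolution computes, except that a CNN learns the kernel instead of fixing it.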
Upsampling
- In CNNs:
  - Transposed Convolution (a.k.a. Deconvolution, although it is not a true inverse of convolution): Upsamples using learned kernels.
  - Nearest Neighbor: Copies the closest existing pixel into each new pixel.
  - Bilinear/Bicubic Interpolation: Computes each new pixel as a weighted combination of its neighbors.
  - Unpooling: Reverses max pooling by putting each value back at the index stored during pooling and filling the remaining positions with zeros (less common).
- In signal/image processing:
  - Zero Insertion + Filtering: Insert zeros between samples, then apply a low-pass filter to reconstruct the signal.
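Two of these methods sketched in 1-D with NumPy, for an upsampling factor of 2. The input values and the triangular kernel `[0.5, 1.0, 0.5]` are assumptions picked for the example; with that kernel, zero insertion plus filtering reduces to linear interpolation.

```python
import numpy as np

x = np.array([1.0, 3.0, 2.0])

# Nearest neighbor: repeat each sample.
nearest = np.repeat(x, 2)        # [1, 1, 3, 3, 2, 2]

# Zero insertion: put a zero after every original sample...
stuffed = np.zeros(2 * len(x))
stuffed[::2] = x                 # [1, 0, 3, 0, 2, 0]

# ...then low-pass filter to fill in the gaps.
kernel = np.array([0.5, 1.0, 0.5])
linear = np.convolve(stuffed, kernel, mode="same")
# [1, 2, 3, 2.5, 2, 1]; the final 1 is an edge effect of zero padding
```

A stride-2 transposed convolution follows the same pattern of zero insertion followed by convolution, with the kernel learned rather than fixed.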