Convolution

Pooling

Pooling has no learnable weights, only hyperparameters: filter size, stride, and type (max or average).
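Filter size and stride together determine the output size. A minimal sketch, assuming no padding; the function name `pooled_size` and the symbols `n`, `f`, `s` are illustrative and not from the notes:

```python
def pooled_size(n: int, f: int, s: int) -> int:
    """Output length along one axis for input length n,
    filter size f, and stride s (assumes no padding)."""
    return (n - f) // s + 1


print(pooled_size(4, 2, 2))  # 4-wide input, 2x2 filter, stride 2
print(pooled_size(7, 3, 2))  # 7-wide input, 3x3 filter, stride 2
```

The same formula applies independently to height and width.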

  • Max Pooling: Take the max within each filter window

Take your input and break it down into smaller regions. Max pooling just takes the maximum value in each region.

Intuition: As long as a feature is detected anywhere in a region, it is preserved in the output of max pooling.

  • Average Pooling: Take the average within each filter window
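Both pooling types can be sketched with plain NumPy loops. This is an illustrative, unoptimized version that assumes a single-channel 2-D input and no padding; `pool2d` and its parameters are hypothetical names, not a library API:

```python
import numpy as np


def pool2d(x, f=2, s=2, mode="max"):
    """Pool a 2-D array with an f x f window and stride s (no padding)."""
    h, w = x.shape
    out_h = (h - f) // s + 1
    out_w = (w - f) // s + 1
    reduce = np.max if mode == "max" else np.mean
    out = np.empty((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            # Slice out the current window and reduce it to one number.
            window = x[i * s:i * s + f, j * s:j * s + f]
            out[i, j] = reduce(window)
    return out


x = np.array([[1, 3, 2, 1],
              [2, 9, 1, 1],
              [1, 3, 2, 3],
              [5, 6, 1, 2]])

max_out = pool2d(x, mode="max")  # largest value in each 2x2 quadrant
avg_out = pool2d(x, mode="avg")  # mean of each 2x2 quadrant
print(max_out)
print(avg_out)
```

With a 2×2 filter and stride 2, the 4×4 input is split into four non-overlapping quadrants, so the strong activation 9 in the top-left quadrant survives max pooling regardless of where it sits inside that quadrant.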

Downsampling

Downsampling reduces the size or resolution of a signal (e.g., an image):

  1. In CNNs (e.g., U-Net):
    • Max Pooling: Keeps the largest value in a region (e.g., 2×2).
    • Average Pooling: Takes the average of values in the region.
    • Strided Convolution: A convolution with a stride > 1 (e.g., 2), so the kernel is evaluated only at every second position and the output is smaller.
  2. In signal/image processing:
    • Decimation: Low-pass filter the signal to avoid aliasing, then keep only every n-th sample.
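Decimation can be sketched in a few lines of NumPy. The moving-average kernel here is an assumption chosen for brevity and is only a crude low-pass filter; a properly designed anti-aliasing filter (for example, the one used by `scipy.signal.decimate`) would be preferable in practice. The name `decimate` below is a local, hypothetical helper:

```python
import numpy as np


def decimate(x, n):
    """Smooth x with a length-n moving average, then keep every n-th sample."""
    kernel = np.ones(n) / n                          # crude low-pass filter (assumption)
    smoothed = np.convolve(x, kernel, mode="same")   # same length as x, zero-padded edges
    return smoothed[::n]                             # keep samples 0, n, 2n, ...


x = np.arange(12, dtype=float)
y = decimate(x, 3)
print(y.shape)
print(y)
```

The first output sample is pulled toward zero because `mode="same"` pads the edges with zeros; interior samples of this linear ramp pass through unchanged.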

Upsampling

  1. In CNNs
    • Transpose Convolution (often loosely called Deconvolution): Upsamples using learned kernels.
    • Nearest Neighbor: Copies the closest pixel to fill in the new pixels.
    • Bilinear/Bicubic Interpolation: Interpolates new pixel values based on neighbors.
    • Unpooling: Approximately reverses max pooling by placing each value back at its stored index and filling the rest with zeros (less common).
  2. In signal/image processing:
    • Zero Insertion + Filtering: Insert zeros between samples, then apply a low-pass filter to interpolate the missing values.
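Two of the non-learned upsampling methods above can be sketched in NumPy. Both helper names are hypothetical, and the triangular kernel is an assumed choice of low-pass filter that makes zero insertion equivalent to linear interpolation:

```python
import numpy as np


def nearest_neighbor_upsample(x, factor):
    """Repeat every pixel of a 2-D array `factor` times along both axes."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)


def zero_insert_upsample(x, factor):
    """Insert factor-1 zeros between 1-D samples, then low-pass filter.

    The triangular kernel (an assumption) fills each zero with a
    linear blend of its two nearest original samples.
    """
    stuffed = np.zeros(len(x) * factor)
    stuffed[::factor] = x                              # originals at 0, factor, 2*factor, ...
    ramp = np.arange(1, factor + 1) / factor           # e.g. [0.5, 1.0] for factor 2
    kernel = np.concatenate([ramp, ramp[-2::-1]])      # e.g. [0.5, 1.0, 0.5]
    return np.convolve(stuffed, kernel, mode="same")


img = np.array([[1, 2],
                [3, 4]])
up_img = nearest_neighbor_upsample(img, 2)

sig = np.array([1.0, 3.0, 5.0])
up_sig = zero_insert_upsample(sig, 2)
print(up_img)
print(up_sig)
```

The last sample of `up_sig` is an edge effect: there is no original sample to its right, so it blends with the implicit zero padding.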