SSD

Original Paper: https://arxiv.org/abs/1512.02325

https://developers.arcgis.com/python/guide/how-ssd-works/

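Summary: SSD (Single Shot MultiBox Detector) is an object detector that uses a single deep neural network. It places a set of default boxes with different scales and aspect ratios at each feature map location. At prediction time, it scores each default box for every object category and predicts offsets that adjust the box to fit the object better.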
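The box adjustment mentioned in the summary can be sketched as follows. This is a minimal illustration that assumes the center-size offset parameterization described in the paper; the function name `decode_box` and the tuple layout are hypothetical, and many implementations also scale the offsets by extra "variance" constants, which are omitted here.

```python
import math

def decode_box(default_box, offsets):
    """Apply predicted offsets to one default box (hypothetical helper).

    Both arguments are (cx, cy, w, h) style tuples in normalized image
    coordinates: center x, center y, width, height.
    """
    d_cx, d_cy, d_w, d_h = default_box
    t_cx, t_cy, t_w, t_h = offsets
    cx = d_cx + t_cx * d_w   # shift the center, scaled by the box width
    cy = d_cy + t_cy * d_h   # shift the center, scaled by the box height
    w = d_w * math.exp(t_w)  # resize in log space so the width stays positive
    h = d_h * math.exp(t_h)
    return (cx, cy, w, h)

# Zero offsets leave the default box unchanged.
print(decode_box((0.5, 0.5, 0.2, 0.4), (0.0, 0.0, 0.0, 0.0)))
```

Predicting offsets relative to the default box size, rather than absolute coordinates, keeps the regression targets on a similar numeric range for small and large boxes.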

The key innovation with SSD is that it eliminates the region proposal step and the subsequent pixel/feature resampling stage, so detection runs in a single forward pass.

Key Points:

  1. Multiple layers handle different scales
  2. Different filters predict boxes of different shapes/sizes
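
The two key points can be made concrete with a small sketch of how default box sizes might be derived. It assumes the linear scale rule from the paper (scales spaced evenly between 0.2 and 0.9 of the image size across the feature maps) and the aspect ratios 1, 2, 3, 1/2, 1/3. The function names are hypothetical, and the paper's extra box for aspect ratio 1 is left out for brevity.

```python
import math

def default_box_scales(num_maps, s_min=0.2, s_max=0.9):
    """One scale per feature map, evenly spaced from s_min to s_max."""
    if num_maps == 1:
        return [s_min]
    step = (s_max - s_min) / (num_maps - 1)
    return [s_min + step * k for k in range(num_maps)]

def default_box_shapes(scale, aspect_ratios=(1.0, 2.0, 3.0, 1 / 2, 1 / 3)):
    """(width, height) pairs for one scale; each box has area scale**2."""
    return [(scale * math.sqrt(a), scale / math.sqrt(a)) for a in aspect_ratios]

# Early (high-resolution) maps get small boxes, later maps get large ones.
for scale in default_box_scales(6):
    print(round(scale, 2), default_box_shapes(scale)[:2])
```

Each aspect ratio changes the shape of the box while keeping its area fixed, so the scale alone controls how large an object a given feature map is responsible for.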

Architecture

Image -> Feature Extraction -> Detection Heads -> Non-Maximum Suppression (NMS)

  1. Feature Extraction: builds feature maps from the input image.
  2. Detection Heads: convert the feature maps into predictions about boxes.
  3. Non-Maximum Suppression (NMS): removes repeated detections of the same object.
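
The NMS step can be sketched in plain Python as a greedy filter. This is an illustrative version for a single class, not the code of any particular library; the helper names and the corner-format boxes `(x1, y1, x2, y2)` are assumptions, and the default threshold of 0.45 follows the overlap value reported in the paper.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.45):
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # the second box overlaps the first and is dropped
```

In a full detector this filter would typically run once per class, after low-confidence boxes have been discarded.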

Key lessons:

Data augmentation is crucial: it makes the model more robust to various object sizes and shapes.