SSD

Original Paper: https://arxiv.org/abs/1512.02325

https://developers.arcgis.com/python/guide/how-ssd-works/

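Summary: SSD (Single Shot MultiBox Detector) is an object detector that uses a single deep neural network. It tiles each feature map location with a set of default boxes that have different scales and aspect ratios. At prediction time, it scores every default box for each object category and predicts adjustments to the box so that it fits the object better.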
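The box adjustment mentioned in the summary can be sketched as follows. This is a minimal illustration, assuming boxes are stored in center form `(cx, cy, w, h)` and that the network outputs four offsets per default box, following the center/log-size parameterization in the paper. The name `decode_box` is hypothetical, and the variance scaling that many implementations apply to the offsets is left out.

```python
import math

def decode_box(default_box, offsets):
    """Apply predicted offsets to one default box (center form: cx, cy, w, h)."""
    d_cx, d_cy, d_w, d_h = default_box
    t_cx, t_cy, t_w, t_h = offsets
    # Center offsets are expressed relative to the default box size.
    cx = d_cx + t_cx * d_w
    cy = d_cy + t_cy * d_h
    # Size offsets are log-scale factors on the default width and height.
    w = d_w * math.exp(t_w)
    h = d_h * math.exp(t_h)
    return (cx, cy, w, h)

# Hypothetical numbers: shift the box slightly right and up, keep its size.
adjusted = decode_box((0.5, 0.5, 0.2, 0.2), (0.1, -0.1, 0.0, 0.0))
unchanged = decode_box((0.5, 0.5, 0.2, 0.2), (0.0, 0.0, 0.0, 0.0))
```

With all offsets at zero the default box is returned unchanged, which is why the defaults act as the starting guess for every prediction.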

The key innovation with SSD is that it eliminates region proposal and the subsequent pixel/feature resampling stage.

Key Points:

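Multiple layers → predictions are made from feature maps of different resolutions, so objects at different scales are handled
Different filters → separate small convolutional predictors output boxes of different shapes/sizes (one per default box aspect ratio)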
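To make the scale and shape idea concrete, here is a rough sketch of how default boxes could be generated. It assumes the linear scale rule from the paper (scales spaced evenly between `s_min = 0.2` and `s_max = 0.9` across the feature maps) and the aspect ratios 1, 2, 3, 1/2, 1/3, plus one extra square box at an intermediate scale. The function names are hypothetical, and real implementations vary the number of boxes per layer.

```python
import math

def layer_scales(m, s_min=0.2, s_max=0.9):
    """Evenly spaced box scales for m feature maps (fractions of image size)."""
    return [s_min + (s_max - s_min) * k / (m - 1) for k in range(m)]

def default_boxes_for_cell(cx, cy, s_k, s_next,
                           aspect_ratios=(1.0, 2.0, 3.0, 0.5, 1.0 / 3.0)):
    """Default boxes (cx, cy, w, h) centered on one feature map cell."""
    boxes = []
    for a in aspect_ratios:
        # Width grows and height shrinks with the aspect ratio; area stays s_k^2.
        boxes.append((cx, cy, s_k * math.sqrt(a), s_k / math.sqrt(a)))
    # Extra square box whose scale sits between this layer and the next one.
    s_extra = math.sqrt(s_k * s_next)
    boxes.append((cx, cy, s_extra, s_extra))
    return boxes

scales = layer_scales(6)
cell_boxes = default_boxes_for_cell(0.5, 0.5, scales[0], scales[1])
```

Coarser feature maps get larger scales, so each layer specializes in a band of object sizes.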

Architecture

Image → Feature Extraction → Detection Heads → Non-Maximum Suppression (NMS)

Feature Extraction: builds feature maps.
- Uses VGG-16 as the base network
Detection Heads: convert feature maps into predictions about boxes.
Non-Maximum Suppression (NMS): removes repeated detections of the same object.
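The NMS step can be sketched as a greedy loop: keep the highest-scoring box, drop any remaining box that overlaps it too much, and repeat. This is a minimal pure-Python illustration for a single class, assuming boxes in corner form `(x1, y1, x2, y2)`. The names `iou` and `nms` are hypothetical, and the 0.45 default follows the overlap value reported in the SSD paper; treat it as a tunable assumption.

```python
def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) form."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.45):
    """Return indices of boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

# Hypothetical detections: the first two overlap heavily, the third is separate.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

In a full detector this would typically run once per class, after low-confidence boxes have been filtered out.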

Key lessons:

Data augmentation is crucial: it is used to make the model more robust to various object sizes and shapes.
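One piece of this augmentation is random patch sampling. The sketch below is a simplified assumption-laden version: it draws a crop whose width and height are each between 0.1 and 1 of the image and whose aspect ratio is between 1/2 and 2, as the paper describes, but it leaves out the minimum-overlap constraint with ground-truth boxes and the horizontal flip. The name `sample_patch` and the retry limit are hypothetical.

```python
import random

def sample_patch(img_w, img_h, rng, max_tries=50):
    """Sample a random crop (x1, y1, x2, y2) inside an img_w x img_h image."""
    for _ in range(max_tries):
        w = rng.uniform(0.1, 1.0) * img_w
        h = rng.uniform(0.1, 1.0) * img_h
        # Reject patches that are too elongated.
        if not 0.5 <= w / h <= 2.0:
            continue
        x = rng.uniform(0.0, img_w - w)
        y = rng.uniform(0.0, img_h - h)
        return (x, y, x + w, y + h)
    # Fall back to the whole image if no valid patch was found.
    return (0.0, 0.0, float(img_w), float(img_h))

rng = random.Random(0)  # seeded only to make the example repeatable
patches = [sample_patch(300, 300, rng) for _ in range(100)]
```

Cropping small patches and resizing them to the network input has the effect of enlarging objects, while large patches keep them small, which exposes the model to a wider range of object sizes.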

🛠️ Steven Gong
