Region Proposals

This is an interesting idea. However, Andrew Ng says that the end-to-end technique used by YOLO seems to be much more promising than a two-stage one.

  1. Propose regions (using Semantic Segmentation)
    1. This was previously done using traditional Computer Vision techniques, like Selective Search. However, modern techniques learn these regions.
  2. Make detections on those regions
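The two stages above can be sketched as a simple loop. This is only an illustration of the control flow: `propose_regions` and `score_region` are hypothetical stand-ins (dense fixed-size windows and mean brightness), not Selective Search or a trained classifier.

```python
import numpy as np

def propose_regions(image, size=32, stride=16):
    """Stage 1 (stand-in): return candidate boxes as (x1, y1, x2, y2).

    A real system would use Selective Search or a learned proposer;
    here we just tile the image with fixed-size windows.
    """
    h, w = image.shape[:2]
    return [(x, y, x + size, y + size)
            for y in range(0, h - size + 1, stride)
            for x in range(0, w - size + 1, stride)]

def score_region(crop):
    """Stage 2 (stand-in): mean brightness plays the role of a classifier score."""
    return float(crop.mean())

def detect(image, threshold=0.5):
    """Run both stages: propose regions, then score each region one at a time."""
    detections = []
    for (x1, y1, x2, y2) in propose_regions(image):
        score = score_region(image[y1:y2, x1:x2])
        if score >= threshold:
            detections.append(((x1, y1, x2, y2), score))
    return detections
```

Note that the second stage runs once per proposed region, so the cost of the whole pipeline grows with the number of proposals.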

This Medium article is also a worthwhile read.

Algorithms, noted from here

  • R-CNN: Propose regions (these are not learned) using Selective Search (2000 regions generated). Classify proposed regions one at a time. Output label + bounding box (this is a more accurate bounding box). 47s / image with VGG16, which is not feasible.
    • Ad hoc training objectives
  • Fast R-CNN: Use a convolutional implementation of sliding windows to classify all the proposed regions. See Sliding Window for an explanation of how this is done.
  • Faster R-CNN: Use a CNN to propose regions.
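The difference between classifying regions one at a time and using a convolutional implementation can be illustrated with a toy example. The sketch below assumes a single hand-written 2D convolution layer as a stand-in for the CNN (all function names are hypothetical): convolving each cropped region separately gives the same features as convolving the whole image once and slicing the shared output.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 'valid' 2D cross-correlation (the operation CNN layers compute)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def per_region_features(image, kernel, regions):
    """R-CNN style: run the convolution separately on every cropped region."""
    return [conv2d_valid(image[y1:y2, x1:x2], kernel)
            for (x1, y1, x2, y2) in regions]

def shared_features(image, kernel, regions):
    """Fast R-CNN style: convolve the full image once, then slice per region."""
    kh, kw = kernel.shape
    full = conv2d_valid(image, kernel)
    return [full[y1:y2 - kh + 1, x1:x2 - kw + 1]
            for (x1, y1, x2, y2) in regions]

rng = np.random.default_rng(0)
image = rng.random((16, 16))
kernel = rng.random((3, 3))
regions = [(0, 0, 8, 8), (4, 2, 12, 10)]  # (x1, y1, x2, y2), overlapping

slow = per_region_features(image, kernel, regions)
fast = shared_features(image, kernel, regions)
print(all(np.allclose(a, b) for a, b in zip(slow, fast)))
```

The two results should agree, but the shared version performs the convolution once, while the per-region version repeats work wherever regions overlap. With around 2000 proposals per image, that repeated work dominates the runtime.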

Region Proposal Network (RPN)