Visual SLAM

Uses cameras to estimate the camera's trajectory and construct a map of the environment at the same time. Relies on feature extraction and matching between frames; see ORB-SLAM.

Tesla kind of does SLAM by fusing multiple camera views into a bird's-eye view.

The basic idea: figure out how features align across frames, and infer the camera motion from that alignment.
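As a toy sketch of what "aligning features" means: a brute-force matcher compares binary descriptors (ORB's are 256-bit, i.e. 32 bytes) by Hamming distance and keeps mutual nearest neighbors. This NumPy version is illustrative; the distance threshold and cross-check policy are my assumptions, not a fixed standard.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, max_distance=64):
    """Brute-force match binary descriptors by Hamming distance.

    desc_a, desc_b: (N, 32) uint8 arrays (256-bit descriptors, as in ORB).
    Returns (index_a, index_b) pairs for mutual nearest neighbors.
    """
    # Hamming distance = popcount of the XOR of the descriptor bytes.
    xor = desc_a[:, None, :] ^ desc_b[None, :, :]      # (Na, Nb, 32)
    dist = np.unpackbits(xor, axis=-1).sum(axis=-1)    # (Na, Nb)

    best_ab = dist.argmin(axis=1)  # nearest b for each a
    best_ba = dist.argmin(axis=0)  # nearest a for each b

    matches = []
    for i, j in enumerate(best_ab):
        # Keep only mutual ("cross-checked") matches under the threshold.
        if best_ba[j] == i and dist[i, j] <= max_distance:
            matches.append((i, int(j)))
    return matches
```

In practice you would use something like OpenCV's `cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)` rather than rolling your own.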



Classical Visual SLAM Stack

Typical Visual SLAM Workflow

A typical visual SLAM workflow includes the following steps:

  1. Sensor data acquisition: cameras, and optionally motor encoders, IMUs, etc.
  2. Visual odometry (frontend): estimates the camera motion between adjacent frames (ego-motion) and generates a rough local map.
  3. Backend filtering/optimization: receives poses from VO and loop closing, then applies optimization (e.g. Bundle Adjustment) to generate a fully optimized trajectory and map.
  4. Loop closing: determines whether the robot has returned to a previously visited position in order to reduce accumulated drift. If a loop is detected, it provides that information to the backend for further optimization.
  5. Reconstruction (optional): constructs a task-specific map based on the estimated camera trajectory.

The frontend is closer to computer vision research (image feature extraction and matching); the backend is closer to state estimation research.
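As a minimal illustration of what the backend optimizes (not real Bundle Adjustment, which also optimizes 3D landmarks, just a 1D pose-graph least-squares toy with made-up numbers): odometry alone accumulates 0.3 m of drift over a loop, and a single loop-closure constraint lets the optimizer spread that error over the whole trajectory.

```python
import numpy as np

n = 5                       # poses x_0 .. x_4
u = [1.1, 1.1, 1.1, -3.0]   # hypothetical noisy VO steps; true loop sums to 0

rows, rhs = [], []

def add_constraint(i, j, value):
    """Add a relative constraint x_j - x_i = value as one least-squares row."""
    row = np.zeros(n)
    row[j], row[i] = 1.0, -1.0
    rows.append(row)
    rhs.append(value)

# Anchor the first pose at the origin (removes the global gauge freedom).
anchor = np.zeros(n)
anchor[0] = 1.0
rows.append(anchor)
rhs.append(0.0)

# Odometry constraints from the frontend.
for i, step in enumerate(u):
    add_constraint(i, i + 1, step)

# Loop-closure constraint: place recognition says pose 4 is back at pose 0.
add_constraint(0, n - 1, 0.0)

A, b = np.vstack(rows), np.array(rhs)
x, *_ = np.linalg.lstsq(A, b, rcond=None)

dead_reckoned = sum(u)   # ~0.3 m drift if you just chain the odometry
```

Here the optimum shaves 0.06 m off each of the five constraints instead of leaving all 0.3 m of error at the loop closure; real backends solve the same kind of (nonlinear, much larger) least-squares problem.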

SLAM Formalization

See SLAM Formalization.
