Depth Estimation
Monocular Depth Estimation
For monocular, see Monocular Depth Estimation.
Stereo Depth Estimation
Usually, with stereo depth estimation, we first estimate the Disparity between a stereo image pair. The output of these models is a disparity map, and depth is then computed from those disparity values.
Disparity to depth?
For stereo, we can obtain the depth from the Disparity of a rectified stereo camera image pair. Given the baseline $B$ between the two cameras and the focal length $f$ (in pixels), the depth of a pixel with disparity $d$ is $Z = f B / d$, so larger disparities mean closer points.
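A minimal sketch of the disparity-to-depth conversion for a single pixel, assuming a rectified pair and the usual pinhole relation depth = focal length × baseline / disparity. The function name `depth_from_disparity` and the numbers in the usage note are illustrative assumptions, not values from these notes.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres for one pixel of a rectified stereo pair: Z = f * B / d.

    disparity_px: horizontal pixel shift between the left and right views
    focal_px:     focal length expressed in pixels (assumed known from calibration)
    baseline_m:   distance between the two camera centres, in metres
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity; non-positive
        # values are treated here as "no finite depth" (an assumption).
        return float("inf")
    return focal_px * baseline_m / disparity_px
```

For example, `depth_from_disparity(36.0, 720.0, 0.5)` gives 10.0 metres. Halving the disparity doubles the depth, which is why depth resolution degrades for distant points.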
Ways to achieve Depth estimation
- Stereo Matching (simplest way)
- Semi-Global Matching (a more advanced stereo matching algorithm)
- Learning-based depth estimation (e.g. ESS)
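To make the simplest option concrete, here is a rough sketch of local block matching on one scanline of a rectified pair, using a sum-of-absolute-differences (SAD) cost. This is a toy illustration under stated assumptions (1D intensity lists, a hypothetical `match_scanline` helper, winner-takes-all selection). It is not Semi-Global Matching, which additionally adds smoothness penalties aggregated along several image paths.

```python
def match_scanline(left, right, max_disp, half_win=1):
    """Per-pixel disparity for one rectified scanline via SAD block matching.

    left, right: lists of intensities for the same image row
    max_disp:    largest disparity (in pixels) to search
    half_win:    half-width of the matching window
    """
    n = len(left)
    disparities = [0] * n  # border pixels without a full window stay at 0
    for x in range(half_win, n - half_win):
        best_cost, best_d = float("inf"), 0
        # In a rectified pair, a left pixel at x appears at x - d in the right row.
        for d in range(0, min(max_disp, x - half_win) + 1):
            cost = sum(
                abs(left[x + k] - right[x - d + k])
                for k in range(-half_win, half_win + 1)
            )
            if cost < best_cost:  # winner-takes-all: keep the cheapest shift
                best_cost, best_d = cost, d
        disparities[x] = best_d
    return disparities
```

If `right` is `left` shifted two pixels to the left, the recovered disparity is 2 wherever the full search range is available. Real images need 2D windows and handling for occlusions and textureless regions.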
Monocular scale ambiguity (CS231n 2024 Lec 18, slides 13–16)
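A small close object and a large far object project to identical pixels, so absolute depth from a single image is fundamentally ambiguous. The standard fix is a scale-invariant loss (Eigen, Puhrsch, Fergus NeurIPS 2014; Eigen & Fergus ICCV 2015):

$$
D(y, y^*) = \frac{1}{2n^2} \sum_{i,j} \Big( (\log y_i - \log y_j) - (\log y^*_i - \log y^*_j) \Big)^2
$$

Penalizes relative log-depths between pixel pairs while ignoring any constant log-scale offset, so the net is graded on shape consistency rather than absolute metric depth. Equivalent to $D(y, y^*) = \frac{1}{n} \sum_i d_i^2 - \frac{1}{n^2} \big( \sum_i d_i \big)^2$ where $d_i = \log y_i - \log y^*_i$, with $y$ the predicted depth, $y^*$ the ground truth, and $n$ the number of pixels.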
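A small sketch of the scale-invariant loss in plain Python, written two ways so the equivalence can be checked numerically. It assumes the fully scale-invariant form, with d_i = log(pred_i) − log(target_i) over n pixels. The function names are illustrative, and real training code would operate on tensors and mask invalid pixels.

```python
import math


def log_diffs(pred, target):
    """d_i = log(pred_i) - log(target_i); depths must be strictly positive."""
    return [math.log(p) - math.log(t) for p, t in zip(pred, target)]


def scale_invariant_loss(pred, target):
    """Compact form: mean of d_i^2 minus the squared mean of d_i."""
    d = log_diffs(pred, target)
    n = len(d)
    return sum(di * di for di in d) / n - (sum(d) / n) ** 2


def pairwise_loss(pred, target):
    """Pairwise form: compares relative log-depths for every pixel pair (i, j)."""
    d = log_diffs(pred, target)
    n = len(d)
    # (log y_i - log y_j) - (log y*_i - log y*_j) simplifies to d_i - d_j
    return sum((di - dj) ** 2 for di in d for dj in d) / (2 * n * n)
```

Multiplying every prediction by the same constant shifts all d_i by the same amount, which cancels in both forms. A prediction that is correct up to a global scale therefore gets (numerically near) zero loss.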