Depth Estimation

Monocular Depth Estimation

For monocular, see Monocular Depth Estimation.

Stereo Depth Estimation

Usually, with stereo depth estimation, we first estimate the Disparity between the two images of a stereo pair. The models output disparity values, and depth is then computed from that disparity.

Disparity to depth?

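For stereo, we can obtain depth from the Disparity of a rectified stereo image pair. Given the focal length $f$ (in pixels) and the baseline $B$ (the distance between the two camera centers), the depth of a pixel with disparity $d$ is $Z = f B / d$. Depth is inversely proportional to disparity, so nearby points have large disparities and distant points have small ones.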
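As a minimal sketch of the disparity-to-depth conversion, assuming a rectified pair, a focal length in pixels, and a baseline in meters (the function name, parameter names, and numbers are illustrative, not from any particular library):

```python
def disparity_to_depth(disparity_px, focal_length_px, baseline_m):
    """Depth Z = f * B / d for one pixel of a rectified stereo pair.

    disparity_px:    horizontal shift of the pixel between left and right image
    focal_length_px: focal length expressed in pixels (assumed known from calibration)
    baseline_m:      distance between the two camera centers, in meters
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity; negative values
        # are treated the same way here since they are invalid matches.
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera: f = 700 px, B = 0.12 m, measured disparity of 35 px.
depth_m = disparity_to_depth(disparity_px=35.0, focal_length_px=700.0, baseline_m=0.12)
print(f"{depth_m:.2f} m")
```

With these made-up numbers the pixel lies at about 2.4 m. Halving the disparity doubles the depth, which is why small disparity errors matter most for far-away points.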

Ways to achieve Depth estimation

Monocular scale ambiguity (CS231n 2024 Lec 18, slides 13–16)

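A small close object and a large far object project to identical pixels, so absolute depth from a single image is fundamentally ambiguous. The standard fix is a scale-invariant loss (Eigen, Puhrsch, Fergus NeurIPS 2014; Eigen & Fergus ICCV 2015):

$$D(y, y^*) = \frac{1}{2n^2} \sum_{i,j} \left( (\log y_i - \log y_j) - (\log y^*_i - \log y^*_j) \right)^2$$

Penalizes differences in relative log-depth between pixel pairs while ignoring any constant log-scale offset, so the net is graded on shape consistency rather than absolute metric depth. Equivalent to $\frac{1}{n} \sum_i d_i^2 - \frac{1}{n^2} \left( \sum_i d_i \right)^2$, where $d_i = \log y_i - \log y^*_i$, $y$ is the predicted depth, $y^*$ is the ground truth, and $n$ is the number of pixels.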
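The scale-invariant loss can be sketched in plain Python as follows. This assumes depths are given as flat lists of positive numbers and uses the mean-of-squares minus squared-mean form of the per-pixel log differences, alongside the pairwise form for comparison. Function names are illustrative, and this is not the authors' training code.

```python
import math


def scale_invariant_loss(pred, target):
    """mean(d^2) - mean(d)^2 with d_i = log(pred_i) - log(target_i)."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and the same length")
    d = [math.log(p) - math.log(t) for p, t in zip(pred, target)]
    n = len(d)
    return sum(x * x for x in d) / n - (sum(d) / n) ** 2


def pairwise_loss(pred, target):
    """Same quantity written over all pixel pairs (O(n^2), for checking only)."""
    n = len(pred)
    total = 0.0
    for i in range(n):
        for j in range(n):
            rel_pred = math.log(pred[i]) - math.log(pred[j])
            rel_true = math.log(target[i]) - math.log(target[j])
            total += (rel_pred - rel_true) ** 2
    return total / (2 * n * n)


target = [1.0, 2.0, 4.0]
scaled = [3.0 * t for t in target]   # right shape, wrong global scale
warped = [1.0, 3.0, 4.0]             # wrong shape

print(scale_invariant_loss(scaled, target))  # numerically zero
print(scale_invariant_loss(warped, target))  # strictly positive
print(pairwise_loss(warped, target))         # matches the line above
```

Multiplying every predicted depth by a constant adds the same offset to each log difference, which the squared-mean term cancels, so only errors in relative structure are penalized.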