CameraInfo
I didn’t know this was a thing until Ashwin at NVIDIA introduced it to me at work. It holds the camera intrinsics and extrinsics.
http://docs.ros.org/en/melodic/api/sensor_msgs/html/msg/CameraInfo.html
I remember just working with sensor_msgs/Image, but there are actually two message types:
http://docs.ros.org/en/noetic/api/sensor_msgs/html/msg/Image.html
This is just the standard that ROS defines; for example, NVIDIA’s Isaac ROS Argus camera node publishes it:
https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_argus_camera/blob/main/isaac_ros_argus_camera/src/argus_camera_mono_node.cpp
These are the different fields:
std_msgs/Header header
uint32 height
uint32 width
string distortion_model
float64[] D
float64[9] K
float64[9] R
float64[12] P
uint32 binning_x
uint32 binning_y
sensor_msgs/RegionOfInterest roi
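Since K, R, and P arrive as flat row-major arrays, the first thing you usually do is reshape them back into matrices. A minimal sketch in plain Python/NumPy (no ROS needed; all the numbers here are made up for illustration):

```python
import numpy as np

# Flat arrays as they would arrive in a sensor_msgs/CameraInfo message
# (values invented for illustration)
K_flat = [500.0, 0.0, 320.0,
          0.0, 500.0, 240.0,
          0.0, 0.0, 1.0]
D = [0.1, -0.05, 0.001, 0.001, 0.0]   # plumb_bob: k1, k2, t1, t2, k3
P_flat = [500.0, 0.0, 320.0, 0.0,
          0.0, 500.0, 240.0, 0.0,
          0.0, 0.0, 1.0, 0.0]

# The arrays are row-major, so a plain reshape recovers the matrices
K = np.array(K_flat).reshape(3, 3)
P = np.array(P_flat).reshape(3, 4)

print(K)  # fx at [0,0], fy at [1,1], cx/cy in the last column
```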
D is the Distortion Coefficients
# The distortion parameters, size depending on the distortion model.
# For "plumb_bob", the 5 parameters are: (k1, k2, t1, t2, k3).
float64[] D
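As a sketch of what those five numbers do: the plumb_bob model warps an ideal normalized point (x, y) with radial terms (k1, k2, k3) and tangential terms (t1, t2). This is my own NumPy-free restatement of the usual radial/tangential formula, not code from ROS:

```python
def plumb_bob_distort(x, y, k1, k2, t1, t2, k3):
    """Apply plumb_bob distortion to a normalized image point (x, y)."""
    r2 = x * x + y * y
    radial = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    x_d = x * radial + 2 * t1 * x * y + t2 * (r2 + 2 * x * x)
    y_d = y * radial + t1 * (r2 + 2 * y * y) + 2 * t2 * x * y
    return x_d, y_d

# With all coefficients zero the model is a no-op
print(plumb_bob_distort(0.1, 0.2, 0, 0, 0, 0, 0))  # (0.1, 0.2)
```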
K is the Calibration Matrix
# Intrinsic camera matrix for the raw (distorted) images.
#     [fx  0 cx]
# K = [ 0 fy cy]
#     [ 0  0  1]
# Projects 3D points in the camera coordinate frame to 2D pixel
# coordinates using the focal lengths (fx, fy) and principal point
# (cx, cy).
float64[9] K # 3x3 row-major matrix
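So projecting a 3D camera-frame point through K is a matrix multiply followed by the perspective divide (fx/fy/cx/cy values made up for illustration):

```python
import numpy as np

K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])

point = np.array([0.5, -0.2, 2.0])    # (X, Y, Z) in the camera frame, metres
uvw = K @ point                        # homogeneous pixel coordinates
u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]
print(u, v)  # fx*X/Z + cx, fy*Y/Z + cy  ->  approximately (445.0, 190.0)
```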
R is the Rectification Matrix — a rotation aligning the camera coordinate system to the ideal stereo image plane; it is the identity for monocular cameras.
P is the Projection Matrix
# Projection/camera matrix
#     [fx'  0  cx' Tx]
# P = [ 0  fy' cy' Ty]
#     [ 0   0   1   0]
# By convention, this matrix specifies the intrinsic (camera) matrix
# of the processed (rectified) image. That is, the left 3x3 portion
# is the normal camera intrinsic matrix for the rectified image.
# It projects 3D points in the camera coordinate frame to 2D pixel
# coordinates using the focal lengths (fx', fy') and principal point
# (cx', cy') - these may differ from the values in K.
# For monocular cameras, Tx = Ty = 0. Normally, monocular cameras will
# also have R = the identity and P[1:3,1:3] = K.
# For a stereo pair, the fourth column [Tx Ty 0]' is related to the
# position of the optical center of the second camera in the first
# camera's frame. We assume Tz = 0 so both cameras are in the same
# stereo image plane. The first camera always has Tx = Ty = 0. For
# the right (second) camera of a horizontal stereo pair, Ty = 0 and
# Tx = -fx' * B, where B is the baseline between the cameras.
# Given a 3D point [X Y Z]', the projection (x, y) of the point onto
# the rectified image is given by:
# [u v w]' = P * [X Y Z 1]'
# x = u / w
# y = v / w
# This holds for both images of a stereo pair.
float64[12] P # 3x4 row-major matrix
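The Tx = -fx' * B relationship is what makes stereo disparity give you depth: projecting one point through both cameras' P matrices shifts u by fx' * B / Z. A sketch with an invented baseline:

```python
import numpy as np

fx, fy, cx, cy = 500.0, 500.0, 320.0, 240.0  # made-up rectified intrinsics
B = 0.1                                       # made-up baseline, metres

P_left = np.array([[fx, 0.0, cx, 0.0],
                   [0.0, fy, cy, 0.0],
                   [0.0, 0.0, 1.0, 0.0]])
P_right = P_left.copy()
P_right[0, 3] = -fx * B   # Tx = -fx' * B for the right camera

X = np.array([0.5, -0.2, 2.0, 1.0])  # homogeneous 3D point, left camera frame

def project(P, X):
    u, v, w = P @ X
    return u / w, v / w

uL, vL = project(P_left, X)
uR, vR = project(P_right, X)
print(uL - uR)  # disparity = fx * B / Z = 500 * 0.1 / 2.0, approximately 25.0
```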
The projection process is
World coordinates → camera frame (extrinsics) → normalize (divide by Z) → apply the distortion model → apply camera intrinsics (K)
That lands you in the raw (distorted) image; to land in the rectified image, skip the distortion step and use P instead of K. The normalization is just scaling each point by 1/Z, so it commutes with the linear steps — but it has to happen before the distortion model, which is nonlinear.
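The whole chain can be sketched end to end. All the numbers below are invented, and the distortion step is a no-op here (zero coefficients) to keep it short:

```python
import numpy as np

# Extrinsics: world -> camera (identity rotation, small translation, made up)
R_wc = np.eye(3)
t_wc = np.array([0.0, 0.0, 1.0])

# Intrinsics (made up)
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])

p_world = np.array([0.5, -0.2, 1.0])

# 1. World coordinates -> camera frame (extrinsics)
p_cam = R_wc @ p_world + t_wc

# 2. Normalize: divide by depth Z
x, y = p_cam[0] / p_cam[2], p_cam[1] / p_cam[2]

# 3. Distortion model would warp (x, y) here (skipped: zero coefficients)

# 4. Apply camera intrinsics
u = K[0, 0] * x + K[0, 2]
v = K[1, 1] * y + K[1, 2]
print(u, v)  # approximately (445.0, 190.0)
```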