February 23, 2007
As we've mentioned, cameras will be used to determine the positions of our blimps. But to do this, we have to understand a few things about the cameras. Quite obviously, we have to know where the camera is located and where it's facing. But cameras also have certain intrinsic properties, such as resolution and focal length; lens distortion, another intrinsic property, causes straight lines to appear as curves, particularly near the outer edges of an image.
It turns out that we can capture all of these properties (apart from lens distortion, which needs a nonlinear model) in a single 3x4 matrix, P. The matrix is set up such that, when postmultiplied by a 3D coordinate, the product is a 2D coordinate. That is, where x and X are 2D and 3D homogeneous coordinates respectively,
x = PX
Algorithms exist to automatically compute P, but let's start with a manual example. Let's think about a single 3D point X at coordinates [x,y,z]=[1,1,1]. If the camera were sitting at coordinate [1,1,0] and facing along the z-axis, this point should be right in the center of its view. If the camera had a resolution of 640x480, this means the point X should show up in the image at pixel (320,240).
For a moment, let's ignore the camera's intrinsic properties and just figure out how to account for the camera's position C=(1,1,0) and orientation (facing straight along the z-axis). We can do this in two steps: a homogeneous translation, followed by a rotation. So we translate by C. Actually, we translate by -C, because we're moving the point X rather than the camera. What about the rotation? Well, we have already implied that the z-direction is "straight ahead" for the camera. So it turns out the camera is already facing the right direction, and we can use the identity matrix for the rotation, R=I. These two operations produce the matrix equation X' = RTX, where X' is the position of the fixed-frame point X in the camera's reference frame, and R and T are the aforementioned rotation and translation.
In MATLAB, noting that X has an extra "1" scale factor tacked on, we get the result we expect, X' = (0 0 1)', that is, just one unit in the z-direction in front of the camera!
R = eye(3);
T = [eye(3) [-1 -1 0]'];  % 3x4 homogeneous translation by -C
X = [1 1 1 1]';           % homogeneous column vector
X_prime = R*T*X           % = [0 0 1]'
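For readers without MATLAB, the same extrinsic step can be sketched in plain Python (no libraries; the `matvec` helper is my own, not part of the original post):

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# [R | t] with R = I and t = -C = (-1, -1, 0): the 3x4 homogeneous translation
RT = [[1, 0, 0, -1],
      [0, 1, 0, -1],
      [0, 0, 1,  0]]

X = [1, 1, 1, 1]          # the point (1,1,1) in homogeneous coordinates
X_prime = matvec(RT, X)   # the point in the camera's reference frame
print(X_prime)            # [0, 0, 1]: one unit straight ahead of the camera
```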
In the above calculations, we have still ignored the fact that the camera actually projects 3D coordinates onto a 2D plane, and that the camera has other intrinsic properties (like focal length). It turns out that these can all be represented by one more matrix, K, multiplied by the two we have already seen, yielding the full camera matrix P = K*R*T. The camera has a focal length, which we'll just arbitrarily specify as 35.0, and two other intrinsic parameters, k_x and k_y, which we also arbitrarily specify as 1.0 for now. With the principal point at (320,240), the center of our 640x480 image, and the image y-axis flipped so that it points downward, K is
K = [35 1 320; 0 -35 240; 0 0 1];
Now we have everything to fully specify our camera, so we compute the result of the multiplication in MATLAB:
>> P = K*R*T
P =
    35     1   320   -36
     0   -35   240    35
     0     0     1     0
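As a cross-check, composing the intrinsics with the extrinsic translation reproduces the matrix above. This is a plain-Python sketch (the `matmul` helper is my own; K's entries are taken from the values above):

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

K = [[35,   1, 320],   # intrinsics: f = 35, principal point (320, 240),
     [ 0, -35, 240],   # with the image y-axis flipped
     [ 0,   0,   1]]

RT = [[1, 0, 0, -1],   # [R | -C] with R = I, C = (1, 1, 0)
      [0, 1, 0, -1],
      [0, 0, 1,  0]]

P = matmul(K, RT)
print(P)  # [[35, 1, 320, -36], [0, -35, 240, 35], [0, 0, 1, 0]]
```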
Now let's see what this camera can do. Remember, all along we've been trying to take our coordinate X=(1,1,1) and turn it into camera coordinates, which we already know should be (320,240). So we will "take a picture" by premultiplying X by P, and we get precisely that:
>> P*[1 1 1 1]'
ans =
   320
   240
     1
Actually, it may be confusing that we have that extra "1". Again, this is a scale factor, and as long as it is "1," the coordinate is what we would expect. The K matrix is responsible for the camera's projection into 2D, which involves a division that will in general yield a non-unit scale factor. In that case, we simply normalize the coordinate by dividing through by the scale factor, e.g., x = x/x(3) in MATLAB.
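To see that normalization in action, here is a plain-Python sketch projecting an off-center point, X = (2,1,2), a point I've chosen for illustration (it is not from the original example), through the same P:

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

P = [[35,   1, 320, -36],
     [ 0, -35, 240,  35],
     [ 0,   0,   1,   0]]

X = [2, 1, 2, 1]                 # one unit right of the camera, two units ahead
x = matvec(P, X)                 # homogeneous image coordinate, scale factor != 1
print(x)                         # [675, 480, 2]

u, v = x[0] / x[2], x[1] / x[2]  # divide through by the scale factor
print(u, v)                      # 337.5 240.0
```

The normalized result, (337.5, 240), matches intuition: the point sits half a unit (35 * 1/2 pixels) to the right of the image center at the same height as the camera.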
Next, I'll apply the camera matrix to visualize something other than a single point, and then we'll show how the camera matrix is crucial in using multiple 2D images to reconstruct 3D scenes.
Posted by jrpowers at February 23, 2007 10:06 AM
Jeff, have I mentioned that you're REALLY the right guy to have working on this?
Posted by: rasputin at March 3, 2007 09:47 AM