An Idiot, a digital camera, and a PC: Homogeneous coordinates: more warping

Now we all had fun with those 2×2 matrices in the previous entry, right? ;) (boo....) Before you think to yourself that that's the end of matrices, lemme warn you that this entry will involve even more. *SIGH* But, I promise you I won't turn this into a math blog!! =) In all seriousness, I assure you it's nothing too complicated.

So! As simple and nice as those 2×2 matrices were, the math geeks were not satisfied. Why not, you may ask. Well, the problem is that although we can do all the mirroring, shearing, and rotating till the sun goes down, we cannot, for the love of God, move the damn image from one position to another. I mean... We defined image warping as a general notion of moving pixels around, yet we can't do simple horizontal/vertical sliding around of images?! That's just wrong, no?? Well, so there's a trick to doing this, and it's called homogeneous coordinates.

Although I won't go into details on what exactly homogeneous coordinates are, or in what other context they're used, I'll at least say that as far as we're concerned, it's just a kludge that will allow us to represent image translation (sliding it horizontally and vertically) in matrix form.

In real simple terms, what we're going to do is add the number "1" as a new entry in our coordinate vector like so:

[x
 y
 1]

Now, what that does is it forces us to turn our transformation matrices into 3×3 matrices, just to be able to get the matrices to multiply together. So if we had a 3×3 matrix like this one:

[x']   [1  0  t_x] [x]
[y'] = [0  1  t_y] [y]
[1 ]   [0  0  1  ] [1]

and solved the equation, like so:

x' = 1 × x + 0 × y + t_x × 1
y' = 0 × x + 1 × y + t_y × 1
1  = 0 × x + 0 × y + 1 × 1

You'll end up with:

x' = x + t_x
y' = y + t_y

Boo yah! You've got yourself a translation matrix that lets you add arbitrary scalar values to the x and y coordinates. =)
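If you'd rather watch the arithmetic run than do it by hand, here's a minimal sketch in plain Python. The helper name `apply` and the sample values for t_x and t_y are made up for illustration; nothing here comes from a library.

```python
def apply(matrix, point):
    """Multiply a 3x3 matrix by the homogeneous column vector [x, y, 1]."""
    x, y = point
    vec = (x, y, 1)
    # One dot product per matrix row gives x', y', and the trailing 1.
    xp, yp, _ = (sum(m * v for m, v in zip(row, vec)) for row in matrix)
    return (xp, yp)


t_x, t_y = 5, -2  # hypothetical offsets, chosen only for the demo
translate = [
    [1, 0, t_x],
    [0, 1, t_y],
    [0, 0, 1],
]

moved = apply(translate, (10, 20))
print(moved)  # (15, 18)
```

The point (10, 20) slides 5 to the right and 2 the other way vertically, which is exactly x' = x + t_x and y' = y + t_y.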

For a more in-depth intro to homogeneous coordinates, check out this paper titled "Introduction to Homogeneous Coordinates". Oh, and when they mention some mumbo jumbo about a Euclidean plane, those ~~pedantic fools~~ math scholars are basically referring to a flat, 2-dimensional plane whose coordinates are real numbers.

So now you can take all the 2×2 matrices we've shown in the previous entry, and turn them into 3×3 matrices that use homogeneous coordinates:

Translate

[x']   [1  0  t_x] [x]
[y'] = [0  1  t_y] [y]
[1 ]   [0  0  1  ] [1]

Scale

[x']   [s_x  0    0] [x]
[y'] = [0    s_y  0] [y]
[1 ]   [0    0    1] [1]

Rotate

[x']   [cosθ  -sinθ  0] [x]
[y'] = [sinθ   cosθ  0] [y]
[1 ]   [0      0     1] [1]

Shear

[x']   [1     sh_x  0] [x]
[y'] = [sh_y  1     0] [y]
[1 ]   [0     0     1] [1]
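Here's a rough sketch of those four matrices as plain Python functions, in case seeing them as code helps. The function names (`translation`, `scaling`, `rotation`, `shear`, `transform`) are my own picks for illustration, and the angle is assumed to be in radians.

```python
import math


def translation(t_x, t_y):
    return [[1, 0, t_x], [0, 1, t_y], [0, 0, 1]]


def scaling(s_x, s_y):
    return [[s_x, 0, 0], [0, s_y, 0], [0, 0, 1]]


def rotation(theta):
    # theta is in radians; positive values follow the matrix shown above.
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def shear(sh_x, sh_y):
    return [[1, sh_x, 0], [sh_y, 1, 0], [0, 0, 1]]


def transform(matrix, x, y):
    """Apply a 3x3 matrix to the homogeneous point [x, y, 1]."""
    vec = (x, y, 1)
    xp, yp, _ = (sum(m * v for m, v in zip(row, vec)) for row in matrix)
    return xp, yp


print(transform(translation(5, -2), 10, 20))  # (15, 18)
print(transform(scaling(2, 3), 1, 1))         # (2, 3)
print(transform(shear(2, 0), 1, 3))           # (7, 3)
```

Rotation works the same way, but the results are floats, so expect something like 6e-17 where you were hoping for a clean 0.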

Alright, now before we dive further into our newly found interest in 3×3 matrices, let's first make some notes about the properties of the linear transformations discussed in the previous entry.

origin maps to origin, meaning that the pixel found at the origin (0,0) in the original image can be found at the origin (0,0) in the transformed image
lines map to lines
parallel lines remain parallel
ratios are preserved
closed under composition, meaning you can combine them together to yield a single 2×2 transformation matrix

With that said, let's look at transformations that go beyond purely linear ones. Let's take affine, for example.

Affine

[x']   [a  b  c] [x]
[y'] = [d  e  f] [y]
[w ]   [0  0  1] [w]

This one is a combo of a linear transformation and a translation, and it violates one of the properties of a linear transformation:

origin does not necessarily map to origin
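A quick sketch of that, using made-up numbers for a through f: with c = f = 0 the origin stays put, and as soon as c and f are non-zero the origin gets dragged along. The helper name `transform` is just for illustration.

```python
def transform(matrix, x, y, w=1):
    """Apply a 3x3 matrix to the homogeneous point [x, y, w]."""
    vec = (x, y, w)
    return tuple(sum(m * v for m, v in zip(row, vec)) for row in matrix)


# Hypothetical values: same a, b, d, e in both; only c and f differ.
linear = [[2, 1, 0], [0, 3, 0], [0, 0, 1]]  # c = f = 0
affine = [[2, 1, 4], [0, 3, 5], [0, 0, 1]]  # c = 4, f = 5

print(transform(linear, 0, 0))  # (0, 0, 1)
print(transform(affine, 0, 0))  # (4, 5, 1)
```

The origin lands on (c, f), since the first two columns get multiplied by x = 0 and y = 0 and only the last column survives.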

Next we have the projective transformation, which takes the following form:

Projective

[x']   [a  b  c] [x]
[y'] = [d  e  f] [y]
[w']   [g  h  i] [w]

It's basically affine plus projective warps, and it violates 3 of the properties found in a linear transformation since

origin doesn't necessarily map to origin
parallel lines do not necessarily remain parallel
ratios are not preserved
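Here's a small sketch of the parallel-lines part. One step the text above doesn't spell out: once the bottom row is no longer 0 0 1, the third coordinate stops being 1, and the usual convention is to divide x' and y' by w' to get back to ordinary 2D coordinates. The matrix `H` and the helper names are invented for the demo.

```python
def project(matrix, x, y):
    """Apply a 3x3 matrix to [x, y, 1], then divide by the resulting w'."""
    vec = (x, y, 1)
    xp, yp, wp = (sum(m * v for m, v in zip(row, vec)) for row in matrix)
    return xp / wp, yp / wp


def slope(p, q):
    return (q[1] - p[1]) / (q[0] - p[0])


# Hypothetical projective matrix: identity except g = 0.5, so w' = 0.5x + 1.
H = [[1, 0, 0], [0, 1, 0], [0.5, 0, 1]]

# Two horizontal (parallel) lines in the source image: y = 0 and y = 1.
line_a = (project(H, 0, 0), project(H, 2, 0))
line_b = (project(H, 0, 1), project(H, 2, 1))

print(slope(*line_a))  # 0.0
print(slope(*line_b))  # -0.5
```

Both lines started out with slope 0, and after the warp they no longer agree, so they're not parallel anymore. In this particular example the origin happens to stay at the origin because c = f = 0; that's the "not necessarily" part.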

One thing to note is that since these matrices are closed under composition, we can multiply them together to get a single matrix that will carry out the entire series of transformations. Do keep in mind, however, that order is important.
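To make the "order is important" bit concrete, here's a minimal sketch with a hand-rolled 3×3 multiply. It assumes the column-vector convention used in the matrices above, which means the matrix on the right gets applied first. The names `matmul` and `transform` are just for this example.

```python
def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transform(matrix, x, y):
    vec = (x, y, 1)
    xp, yp, _ = (sum(m * v for m, v in zip(row, vec)) for row in matrix)
    return xp, yp


translate = [[1, 0, 10], [0, 1, 0], [0, 0, 1]]  # slide 10 along x
scale = [[2, 0, 0], [0, 2, 0], [0, 0, 1]]       # double everything

scale_then_translate = matmul(translate, scale)  # scale is applied first
translate_then_scale = matmul(scale, translate)  # translate is applied first

print(transform(scale_then_translate, 1, 1))  # (12, 2)
print(transform(translate_then_scale, 1, 1))  # (22, 2)
```

Same two matrices, different order, different answer: in the second case the 10-pixel slide gets doubled along with everything else.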

So, armed with any of the above transformation matrices, we can iterate through all the pixels in a given image and find out where each one would end up in the transformed image. This is known as forward warping. Watch out, though, cuz it can get interesting when the x, y values come out to be a fraction (i.e. in between two pixels). In this case we can use a technique called splatting, which spreads the pixel's value over the neighboring pixels.
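Here's a bare-bones sketch of forward warping on a tiny image stored as a list of rows. To keep it short it assumes an affine matrix (bottom row 0 0 1) and simply rounds to the nearest pixel instead of doing real splatting. The function name and the 2×2 test image are made up.

```python
def forward_warp(image, matrix, out_w, out_h):
    """Push every source pixel to its destination; None marks untouched pixels."""
    out = [[None] * out_w for _ in range(out_h)]
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            # Affine case only: the third coordinate stays 1, so no divide.
            xp = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
            yp = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
            xi, yi = round(xp), round(yp)  # nearest pixel, not splatting
            if 0 <= xi < out_w and 0 <= yi < out_h:
                out[yi][xi] = value
    return out


image = [[1, 2],
         [3, 4]]
scale2 = [[2, 0, 0], [0, 2, 0], [0, 0, 1]]

warped = forward_warp(image, scale2, 4, 4)
for row in warped:
    print(row)

holes = sum(value is None for row in warped for value in row)
print(holes)  # 12
```

Four source pixels can only ever fill four destination pixels, so scaling a 2×2 image up to 4×4 leaves the other 12 empty.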

Then there is also inverse warping, where you iterate through all the pixels on the transformed image and multiply each one by the inverse of the transformation matrix to get the pixel that it would have originated from, and use that to find the brightness value. Of course, we can, again, run into cases where the originating pixel lands on a non-integer x, y value, in which case we can use various interpolation techniques such as nearest neighbor, bilinear, bicubic, Gaussian, etc. to compute a value from the nearby pixels.
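A rough sketch of inverse warping with bilinear interpolation, again on a tiny list-of-rows image. A few assumptions to keep it small: the inverse matrix is passed in directly rather than computed, it's affine (bottom row 0 0 1), and lookups that fall outside the source image are clamped to the nearest edge. All the names are invented for the example.

```python
import math


def bilinear(image, x, y):
    """Blend the four source pixels surrounding the fractional point (x, y)."""
    h, w = len(image), len(image[0])
    x = min(max(x, 0), w - 1)  # clamp to the image (an assumption)
    y = min(max(y, 0), h - 1)
    x0, y0 = math.floor(x), math.floor(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def inverse_warp(image, inverse, out_w, out_h):
    """For each output pixel, look up where it came from in the source."""
    out = []
    for yp in range(out_h):
        row = []
        for xp in range(out_w):
            x = inverse[0][0] * xp + inverse[0][1] * yp + inverse[0][2]
            y = inverse[1][0] * xp + inverse[1][1] * yp + inverse[1][2]
            row.append(bilinear(image, x, y))
        out.append(row)
    return out


image = [[0, 10],
         [20, 30]]
# Inverse of "scale by 2" is "scale by 0.5".
inverse_scale2 = [[0.5, 0, 0], [0, 0.5, 0], [0, 0, 1]]

warped = inverse_warp(image, inverse_scale2, 3, 3)
for row in warped:
    print(row)
```

Every output pixel gets a value because the loop runs over the output, and the in-between pixels come out as blends of their neighbors (the center pixel is 15.0, the average of all four).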

In general, inverse warping is more commonly used given that it makes sure that all pixels on the transformed image are covered. In a forward warp, you may run into cases where not all pixels on the transformed image are accounted for. That means that you might end up with holes in your transformed image due to the fact that no brightness value has been picked for those pixels. Inverse warping isn't perfect, either. The problem in this case is that you have to assume that the inverse of the transformation matrix can be found. Unfortunately, this is not always the case.
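One way to check whether the inverse exists is to look at the determinant: a square matrix has an inverse only when its determinant is non-zero. Here's a minimal sketch; the function names and the tolerance `eps` are arbitrary choices for the demo.

```python
def det3(m):
    """Determinant of a 3x3 matrix via cofactor expansion along the top row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def is_invertible(m, eps=1e-12):
    # eps guards against float round-off; the exact value is an assumption.
    return abs(det3(m)) > eps


quarter_turn = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]  # rotate by 90 degrees
squash = [[1, 0, 0], [0, 0, 0], [0, 0, 1]]         # scale with s_y = 0

print(is_invertible(quarter_turn))  # True
print(is_invertible(squash))        # False
```

The squash matrix flattens the whole image onto a single line, so there's no way to tell which y a pixel originally came from, and that's exactly why it can't be inverted.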

Alright, so tune in next time for an interesting adventure into the world of image morphing!
