Why do we use homogeneous coordinates?

2 weeks, 6 days ago
Edited by
longtran2904
on July 12, 2021, 3:40 p.m.
Reason: Initial post
1. Why does most of the game industry use a 4D homogeneous coordinate system? Why don't we just use a 3D vector?

2. What does the w in [x, y, z, w] mean?

3. Why does most of the game industry use quaternions instead of Euler angles in the form of a Vector3?

4. What does a quaternion represent? (In an abstract/high-level view)

Why do we use homogeneous coordinates?

2 weeks, 5 days ago
1. Because it means that a matrix multiplication can encode translation, one of the most common operations in games. And with a slightly adjusted matrix you get a nearly free perspective projection by swapping the z and w and re-homogenizing.

2. There is no special meaning to the letter; it just happens to be the 4th-to-last letter of the alphabet. In the math, all homogeneous coordinates are points projected onto the w = 1 plane with the origin as the projection point.

3. Fewer games use quaternions than you'd think. But quaternions can interpolate and compose very easily.

4. It's an axis+angle representation, but instead of representing that as a unit-length axis and an angle in radians, you have the w component be the cosine of the half angle and the axis length be the sine of the half angle. This happens to have all the nice properties that we use quaternions for.

Why do we use homogeneous coordinates?

2 weeks, 5 days ago
Edited by
longtran2904
on July 12, 2021, 6:04 p.m.
Can you elaborate on this a little bit more? Maybe give some examples? Also, when I asked about the 'w', I meant: what does the 'w' encode?

Ben Visness

64 posts

HMN admin, Handmade Math contributor, high school robotics mentor/enthusiast, web developer, etc.

Why do we use homogeneous coordinates?

2 weeks, 5 days ago
As I'm sure you're aware it's really common to use matrices to encode the transformations you do in graphics programming. They have the lovely property where, if you have a bunch of matrices for specific transformations (translation, rotation, scaling, projection), you can multiply the matrices together to get one transformation that does everything in one go.

The catch is that matrices can only encode *linear* transformations, and translation is not a linear transformation. One of the conditions for a linear transformation is that it does not change the origin; in other words, the point [0, 0] must remain at [0, 0] after the transformation. Translation obviously does not do this; a translation of [x, y] moves [0, 0] to [x, y].

There's a neat trick you can do, though - by using an extra dimension (e.g. using 3D vectors to represent 2D points) you can represent translation with a shear, which is indeed a linear transformation. It's easy to visualize in the 2D/3D case, and Wikipedia has a good animation: https://en.wikipedia.org/wiki/File:Affine_transformations.ogv

Indeed, if you look up shear matrices and translation matrices online, you'll see that they have the exact same structure, because they're the exact same thing.
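A minimal sketch of that trick in C (hypothetical Mat3/Vec3 types; 2D points carried as [x, y, 1]):

```c
typedef struct { float v[3]; } Vec3;     // homogeneous 2D point: [x, y, w]
typedef struct { float m[3][3]; } Mat3;  // row-major 3x3 matrix

// A 2D translation by (tx, ty) as a 3x3 matrix: it shears along the w axis,
// which slides the w = 1 plane sideways by (tx, ty).
Mat3 Translation2D(float tx, float ty)
{
    Mat3 t = {{{1, 0, tx},
               {0, 1, ty},
               {0, 0, 1}}};
    return t;
}

Vec3 Mat3MulVec3(Mat3 m, Vec3 p)
{
    Vec3 r;
    for (int i = 0; i < 3; i++)
        r.v[i] = m.m[i][0]*p.v[0] + m.m[i][1]*p.v[1] + m.m[i][2]*p.v[2];
    return r;
}
```

Multiplying the point [1, 2, 1] by Translation2D(3, 4) gives [4, 6, 1], i.e. the 2D point translated to (4, 6) and still sitting on the w = 1 plane.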

So that's one reason to use homogeneous coordinates, but there are other reasons like ratchetfreak alluded to. Perspective projection is another nonlinear transformation, but again one where the properties of homogeneous coordinates are very useful. I don't have a nice visual for that unfortunately; maybe someone else knows of a good one.

If you wanted, you absolutely could just use 3D vectors for everything and not use matrices to do your transformations. But homogeneous coordinates are so darn convenient that they're just the standard.

Ben Visness

Why do we use homogeneous coordinates?

2 weeks, 5 days ago
As for what the w encodes - that's the scaling factor on the rest of the members of the vector.

Say we have the 2D vector [1, 2]. In homogeneous coordinates we have a third member, that scaling factor. The following vectors all represent the same point in homogeneous coordinates:

- [1, 2, 1]
- [2, 4, 2]
- [0.5, 1, 0.5]

With any of these, you can divide the whole vector by that scaling factor to get back to the normalized version where w = 1: [1, 2, 1].

If you play around with this, getting a bunch of random 3D vectors [x, y, z] and then normalizing them this way (divide the whole vector by the last component), you'll see that this appears to "project" those points onto the plane where z = 1.

The projection method actually used in computer graphics is a bit different, but you can hopefully see why homogeneous coordinates are helpful for this kind of thing.
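That divide-by-w step is tiny in code; a sketch with a hypothetical struct:

```c
typedef struct { float x, y, w; } Homogeneous2D;  // 2D point plus scaling factor

// Divide through by w to get back the canonical representative on the
// w = 1 plane; the 2D point itself is then just (x, y).
Homogeneous2D Homogenize(Homogeneous2D p)
{
    Homogeneous2D r = { p.x / p.w, p.y / p.w, 1.0f };
    return r;
}
```

Feeding it [2, 4, 2] or [0.5, 1, 0.5] returns the same [1, 2, 1] from the list above.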

Why do we use homogeneous coordinates?

2 weeks, 5 days ago
Edited by
longtran2904
on July 13, 2021, 7:09 a.m.
What about quaternions? What advantages does a quaternion have over a Vector3?

Regarding homogenous coordinates, I found a clip that talks about this fairly well.

Why do we use homogeneous coordinates?

2 weeks, 5 days ago
Edited by
Guntha
on July 13, 2021, 7:26 a.m.
If by a Vector3 you mean an angle for each axis (yaw, pitch, roll): if you ever test such a rotation, you'll notice the result differs based on the order in which you apply the rotations (yaw->pitch->roll will give a different result than pitch->yaw->roll).

As ratchetfreak said, quaternions are a way of encoding a rotation as (axis + angle).

Before quaternions (or rotators now) were commonly known, some early 3D games could get away with such rotations (for example, if I'm not mistaken, Quake used them, and its entities could only rotate on the up axis).

Why do we use homogeneous coordinates?

2 weeks, 5 days ago
Euler angles have the danger of gimbal lock, where you lose a degree of freedom.

Representing the rotation as a full 3x3 rotation matrix fixes that, but a matrix is more difficult to orthonormalize (to keep it from representing something that isn't a rotation) and doesn't interpolate well.

Why do we use homogeneous coordinates?

2 weeks, 1 day ago
longtran2904

What about quaternions? What advantages does a quaternion have over a Vector3?

Let me try and explain what the problem is with treating 3D vectors as rotations...

So, a rotation is a transformation, i.e. a function that maps vectors to vectors.

For example, which 2D rotation is this?

```c
Vector2D MyRotation(Vector2D input)
{
    Vector2D output;
    output.x = input.y;
    output.y = -input.x;
    return output;
}
```

It's a 90 degree, clockwise rotation, about the origin, right? (You can convince yourself this is true by plugging a few values into the function and see what you get!)

However, this is not a very nice form for rotations to be in. If I gave you an arbitrarily complicated function that does a rotation, there's no easy way to find the inverse rotation (the one that has the opposite effect).

So instead humans assign numbers to 2D rotations, which we call angles! If I want the inverse of a rotation of "angle 30", I negate the number, giving the rotation "angle -30". Furthermore, if I want to find a rotation equivalent to applying two rotations one after another, I add their angles together. And importantly for animation, if I want to find a rotation somewhere between two rotations with angles x and y, I linearly interpolate them, i.e. (y-x)t + x, where t ranges between 0 and 1.
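Those three angle operations are almost too small to be functions; as a sketch:

```c
// With angles, composing two 2D rotations is addition, inverting is
// negation, and interpolating is a plain lerp.
float ComposeRotations(float a, float b)      { return a + b; }
float InvertRotation(float a)                 { return -a; }
float LerpRotation(float x, float y, float t) { return (y - x)*t + x; }
```

So composing 30 and 45 degrees gives 75, and interpolating halfway from 0 to 90 gives 45. These are exactly the properties we lose once three angles get intertwined in 3D, and the properties quaternions give back.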

But then what's the problem with doing this for 3D rotations? For a 3D rotation, we need 3 angles, often called roll, pitch and yaw. But the effects of these angles are intertwined! To apply a 3D rotation, we actually apply the roll, pitch and yaw rotations in succession, one after the other. Therefore, our rotation now looks like a C function again!

```c
Vector3D Apply3DRotation(Vector3D input, Vector3D angles)
{
    Vector3D vector = input;
    vector = ApplyPlanarRotation(vector, angles.x, THE_X_PLANE);
    vector = ApplyPlanarRotation(vector, angles.y, THE_Y_PLANE);
    vector = ApplyPlanarRotation(vector, angles.z, THE_Z_PLANE);
    return vector;
}
```

This means all the nice things that we used to be able to do with angles in 2D, we now can't do =(. Therefore, we have to go back to the drawing board and get quaternions involved -- a 4D generalisation of complex numbers.
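As a sketch of what composition looks like once quaternions are involved: applying one rotation after another becomes a single Hamilton product, the quaternion analogue of adding 2D angles (hypothetical Quat type):

```c
typedef struct { float x, y, z, w; } Quat;

// Hamilton product: the result represents applying one rotation after the
// other, just like adding two angles in 2D.
Quat QuatMul(Quat a, Quat b)
{
    Quat r;
    r.w = a.w*b.w - a.x*b.x - a.y*b.y - a.z*b.z;
    r.x = a.w*b.x + a.x*b.w + a.y*b.z - a.z*b.y;
    r.y = a.w*b.y - a.x*b.z + a.y*b.w + a.z*b.x;
    r.z = a.w*b.z + a.x*b.y - a.y*b.x + a.z*b.w;
    return r;
}
```

The identity rotation is [0, 0, 0, 1], inverting a unit quaternion is just negating its x, y, z, and interpolation (slerp, or normalized lerp) works componentwise on these four numbers, which is exactly the 2D-angle convenience recovered in 3D.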
