Why is OpenGL defined in a weird way, or am I getting something wrong?

Hey guys,

I've never been a programmer; it's just a hobby, so that may be the reason I don't know a lot of stuff. I'm currently trying to create my "own" platform layer on OS X. I previously used CoreGraphics to "blit" my bitmap to the screen. However, to learn more, and due to performance issues, I started using OpenGL for blitting.
I'm currently going through Casey's videos where he explains and describes OpenGL. But when I got to the point where he starts explaining matrices (video 237), my brain shut down. I'm not that bad at math, I think. I'm studying mechanical engineering, so I need to know a lot of math, but I'm used to using matrices in the context of FEM (the finite element method). When designing, say, beams, mechanical engineers have to make sure they will actually carry the load they're designed for. So we use programs like Abaqus that slice 3D models into "finite elements" and simulate stress using the rules of mechanics of materials.
Mechanics of materials makes heavy use of matrices, and the theory itself deals a lot with linear equations, which can be expressed through matrices, a form that's convenient for computers to work with.

In OpenGL, however, anything you draw first gets transformed into the unit cube and then stretched to the screen, with matrix multiplications going on along the way. This seems like a roundabout solution to me. Why not stretch it directly to the screen? Why go through the unit cube and matrix multiplications that produce a result you didn't ask for?
I would understand the matrix multiplications if you opted into them, but they happen by default, which makes no sense to me, as it unnecessarily complicates a task like drawing a bitmap to the screen.

For an example:
glViewport(0, 0, WindowDimension.Width, WindowDimension.Height);
glClearColor(R, 0.0f, 1.0f, 0.0f);
glClear(GL_COLOR_BUFFER_BIT);

glMatrixMode(GL_MODELVIEW);
glLoadIdentity();

glBegin(GL_TRIANGLES);

// first triangle
glVertex2i(-1, -1);
glVertex2i(1, -1);
glVertex2i(1, 1);

// second triangle
glVertex2i(-1, -1);
glVertex2i(-1, 1);
glVertex2i(1, 1);

glEnd();

This code will draw two triangles that fill a rectangle. However, you first have to neutralize some matrix multiplication (the glLoadIdentity call) and then use weird coordinates. To me it would make much more sense to pass actual coordinates to glVertex2i.

Moreover, why matrix multiplications? I can't see the reason for multiplying a vector (consisting of the points I'm passing in via glVertex2i) by a matrix.
Casey said it's used for rotation and scaling, and that the 4th value of a vector is used for displacement, but it still doesn't make a lot of sense to me.
To scale things up, I could also multiply vectors by scalars, right? Why use matrices?

Of course you could also point me to online references.

Thank you.
Hi :)

I'm hardly an expert, but I'll try to answer what parts I understand as I understand them.

Just passing the desired coordinates to glVertex would seem like the most intuitive course of action, and if you already had them, that might be the right way to go. However, the whole point of a graphics card is that you offload the math onto it. If you are going to do a lot of transforms on those coordinates, it makes more sense to hand over some basic data and then describe what you want the graphics card to do with it; the card then has all the information required to do its work. The coordinates you pass to glVertex are usually constant, like points in a mesh loaded from disk or something.
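As a rough illustration (untested, and meshVertices / vertexCount are made-up names standing in for data you'd have loaded yourself), old-style OpenGL lets you point at constant vertex data once and then draw it under different transforms:

// constant vertex data, set up once
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(3, GL_FLOAT, 0, meshVertices);

glMatrixMode(GL_MODELVIEW);

// draw the same mesh in two places by changing only the matrix
glLoadIdentity();
glTranslatef(-2.0f, 0.0f, 0.0f);
glDrawArrays(GL_TRIANGLES, 0, vertexCount);

glLoadIdentity();
glTranslatef(2.0f, 0.0f, 0.0f);
glDrawArrays(GL_TRIANGLES, 0, vertexCount);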

Using matrices is just a formal way to express the math. It's something the driver can easily translate into operations for the graphics card to perform.

Graphics cards are designed to do a lot of heavy work, and the interface does seem a little unintuitive for an easy task like the one you have there. But it makes more sense when you want to rotate an image, or when you want to run a great many points (like when mapping a texture) through the same set of transformations, not just a few.

OpenGL was meant for 3D with a free camera; that's what the matrix is for, so you don't have to do the math for the vertex positions in your own code, because the GPU can do it faster.

Then why target the unit cube instead of the rectangle of the screen? Because the math for attribute interpolation works out better that way. With NDC (normalized device coordinates), a point is on screen if x, y, and z are each between -w and w, and 0 is in the center. (Though having z work that way turned out to be a mistake, because you lose precision with a floating-point depth buffer.) So you can cull triangles without having to do the perspective divide.
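Roughly, in code (my own sketch, not anything from the OpenGL headers), that culling test is just a few comparisons against w, done before the divide:

// returns 1 if the clip-space point (x, y, z, w) lies inside the
// view volume, tested before the perspective divide
int InsideClipVolume(float x, float y, float z, float w)
{
    return (-w <= x && x <= w) &&
           (-w <= y && y <= w) &&
           (-w <= z && z <= w);
}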
In old OpenGL (before 2.0 or 2.1), there was only a fixed rendering pipeline that you could configure to apply the transformations you wanted. Transforming vertices was expensive, and the pipeline was baked into the circuits of the GPU. You could only set a few matrices (GL_MODELVIEW, GL_PROJECTION, GL_TEXTURE) that would be used to transform vertices. The default mode is GL_MODELVIEW, and I believe if you don't want to modify it you can just ignore it, but setting it to the identity ensures that no transformation will happen. Nowadays you can use vertex shaders to do whatever you want with vertices (so we don't use glMatrixMode anymore).
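For example, configuring that fixed pipeline typically looked something like this (just a sketch of common usage, with made-up numbers):

glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glFrustum(-1, 1, -1, 1, 1.0, 100.0);   // perspective projection

glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
glTranslatef(0.0f, 0.0f, -5.0f);       // move the scene in front of the camera
glRotatef(45.0f, 0.0f, 1.0f, 0.0f);    // rotate 45 degrees around the y axis

// every glVertex* call after this point is transformed by
// GL_PROJECTION * GL_MODELVIEW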
adge wrote:
"To me it would make much more sense to pass actual coordinates to glVertex2i."

There are no "actual coordinates". You "define" your coordinate space by setting a matrix that transforms your coordinate into the unit cube.
Matrix multiplication is used to combine operations on vertices. If you multiply all your transformation matrices together (translation, rotation, scale, projection...), you are left with one matrix that does the whole transformation (the multiplication order is important) and is the same for every vertex expressed in the same space. If you set that matrix with glLoadMatrix (or use a vertex shader), the GPU will transform every vertex for you faster than the CPU.
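Something like this (Mat4 and Mat4Multiply are hypothetical helpers here, since OpenGL itself doesn't provide them):

// combine the object's transforms once on the CPU...
Mat4 ModelView = Mat4Multiply(View,
                 Mat4Multiply(Translation,
                 Mat4Multiply(Rotation, Scale)));

// ...then let the GPU apply that single matrix to every vertex
glMatrixMode(GL_MODELVIEW);
glLoadMatrixf(ModelView.E);   // assuming E is a column-major float[16]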
Disclaimer: I haven't actually gotten to the OpenGL HMH episodes, I've just done some OpenGL stuff before.
I think the previous 3 answers collectively have all the information, but I'll add my response as a sort of summary that (hopefully) links them together.

Many of the vectors that you give to OpenGL for vertex positions are constant (vertices in a mesh loaded from disk for a character or level, as Croepha said). Ideally you don't want to upload constant data to the GPU over and over again, so if you could upload it once and then modify it as necessary every time you wanted to use it, that would be great. That's where the matrices come in.
As mrmixer said, there isn't such a thing as "actual" coordinates, just coordinates with respect to some set of axes. What you get when you load a model from disk is usually so-called "model-space coordinates", where all the positions are defined relative to some origin for the model (such as the centre/start of the level, a point between the character's feet, etc.). When you draw that to the screen, though, you're not interested in model-space coordinates; you want to know where that vertex should be on your screen. So you first do a transformation (read: matrix multiplication) to put the model in the correct place relative to the centre of the world. Then you transform it so that it is in the correct place relative to the camera, so that if you turn around in a first-person game, you don't see the objects behind you. Finally, you transform that position into the unit cube using whatever camera projection your application needs. For 3D this is almost always a perspective projection, which makes things smaller the further they are from the camera. This lets you upload the vertex positions to the GPU once, and then just upload the transformation matrices each frame.
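In pseudo-C (Model, View and Projection being the three matrices just described, and Mat4Multiply the same hypothetical helper as above):

// model space -> world space -> camera space -> unit cube,
// composed right-to-left into one matrix per object per frame
Mat4 MVP = Mat4Multiply(Projection, Mat4Multiply(View, Model));

// the GPU then computes clipPosition = MVP * modelPosition for every vertex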

So that's why we need to transform the data, but why matrix multiplications? As others have mentioned, it's something the GPU can do really efficiently, and I suspect that's in part because it lets the graphics card have fewer kinds of operations that it needs to do efficiently.
As you suggest, we could certainly achieve scaling by multiplication with a scalar. But what if we want to scale differently in different directions? Say you have a cylinder of height 1 and radius r, and you want to place a cylinder of height 2 and radius r. To achieve that by scalar multiplication you'd need a different scalar for each direction, which is exactly what you're doing when you specify the scale via a matrix.
Using matrix multiplication also allows us to do translation, scaling and rotation (by independent amounts in each axis) with a "single" operation. The 4th component is used for displacement because you have normal 3D coordinates plus an extra "w" coordinate, which you just set to 1. When you do the matrix multiplication, that 1 gets multiplied by the translation entries in the matrix, which amounts to adding those entries (the vector specifying the translation) to the vertex's position (after it has been scaled/rotated).
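Here's a small self-contained C example of that (my own illustration, not HMH code): a matrix that scales by 2 and translates by (10, 20, 30), applied to the point (1, 1, 1) with w = 1:

#include <stdio.h>

// multiply a 4x4 matrix (row-major) with a homogeneous point
static void TransformPoint(const float M[4][4], const float p[4], float out[4])
{
    for (int row = 0; row < 4; ++row)
    {
        out[row] = M[row][0]*p[0] + M[row][1]*p[1]
                 + M[row][2]*p[2] + M[row][3]*p[3];
    }
}

int main(void)
{
    // scale by (2, 2, 2) and translate by (10, 20, 30)
    float M[4][4] = {
        { 2, 0, 0, 10 },
        { 0, 2, 0, 20 },
        { 0, 0, 2, 30 },
        { 0, 0, 0,  1 },
    };
    float p[4] = { 1, 1, 1, 1 };   // w = 1 so the translation applies
    float out[4];
    TransformPoint(M, p, out);
    printf("%f %f %f %f\n", out[0], out[1], out[2], out[3]);   // 12 22 32 1
    return 0;
}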

So all 3 of these transformations (or any affine transformation, for that matter) can be achieved by matrix multiplication. Each of them has a pretty simple matrix representation, and to get the composition of the three you just multiply the matrices together (though as mrmixer said, order is important) and then give that to the GPU.

I hope that helps (and that I didn't just reiterate what other people have already said :O)