Chapter 02: Math Primer

 

While the end result of 3D graphics programming seems more akin to art than science, the process of creating a 3D image is inherently mathematical. This realization is often what keeps many programmers from seriously pursuing careers as 3D graphics programmers. Even I, at one time, considered the subject nearly unapproachable as a hobby – let alone a career. But with persistence and many late nights with a Linear Algebra book, I started to feel more confident with the math and eventually built up enough experience on my own to land an actual job doing 3D graphics programming.

 

Of course, the math required to simply start learning is not too hair-raising, especially if you still remember basic Algebra. If you’ve ever taken a Linear Algebra course in high school or college, then you’re already halfway there. For the rest of you, I’ll try my best to keep things as simple and straightforward as possible, so we can move on to the fun stuff – 3D graphics.

 

With that said, we’ll be making good use of Direct3D’s math utilities to help us out. Direct3D has an extensive set of custom data types and utility functions, which do a great job of hiding all that scary math, so most of our discussions dealing with math will revolve around the proper usage of these utilities with only the bare necessity of equations thrown in for clarification. I’m sure that’ll ruffle the feathers of more than a few mathematicians, but this is a book about programming – not math. What I show you here is more than enough to get you started, and if you really want the full mathematical treatment on any particular subject, feel free to fire up Google and search online. The Internet is chock-full of introductory texts on 3D mathematics. It must be a course requirement for Computer Science majors to post at least one 3D math tutorial online, because a fresh crop seems to pop up each day.

 

Polygons and the Pieces of a Polygon

 

Before we get too involved with the math stuff, we need to define a few basic terms. Do you remember when you took a Science course in school and the instructor introduced the concept of an atom? Your instructor probably described them as being like little building blocks, which can join together to form larger things. In a similar fashion, the 3D objects that make up a simulation or a game are also created from little building blocks, except in our case, they’re called polygons. Polygons are simply closed geometric shapes composed of at least three straight line segments. In actuality, polygons are made of even smaller parts called vertices and edges, but we’ll treat them like sub-atomic particles because we really can’t do much visually with just vertices and edges. It’s not until we reach the polygon level that things become interesting in 3D graphics.

 


Figure 2-1: Three polygons with vertices and edges labeled

 

Now, when it comes to 3D graphics programming, the most important polygon is the triangle. The triangle is special, as far as polygons go, because it’s the only polygon that’s always guaranteed to satisfy the three following properties of a polygon:

 

  1. Triangles are guaranteed to be convex.
  2. Triangles are guaranteed to have coplanar vertices.
  3. Triangles are guaranteed not to be self-intersecting.

 

So, what do I mean by “convex”? If the vertices and edges that define a polygon are laid out in such a way that a straight line drawn through it crosses at most two sides, and every interior angle is less than 180°, the polygon is convex. The opposite of convex is concave, so if you can draw at least one straight line through it that crosses more than two sides, or it has at least one interior angle that is more than 180°, the polygon is concave. If these mathematical descriptions still leave you wondering, you can simply sum up convex as meaning the polygon has no “dents” in it. The reason we desire this property is fairly simple: concave polygons are computationally more costly to draw and fill with pixels than convex polygons.

 


Figure 2-2: Convex polygon vs. concave polygon
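
If you’d like to see the “no dents” idea in code, here’s a minimal sketch. It’s plain C++ rather than anything from Direct3D, and the Point2 type and the IsConvex and TurnDirection functions are names I made up for illustration. Instead of drawing lines through the polygon, it walks around the outline and checks that every corner turns the same way, which for a simple (non-self-intersecting) 2D polygon amounts to the same thing as every interior angle being less than 180°. The turn test borrows the cross-product, which we’ll meet later in this chapter.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2D point type used only for this sketch.
struct Point2 { float x, y; };

// Z component of the cross-product of edge (a->b) with edge (b->c).
// Positive means the outline turns one way at b, negative means the other.
static float TurnDirection( const Point2 &a, const Point2 &b, const Point2 &c )
{
    return ( b.x - a.x ) * ( c.y - b.y ) - ( b.y - a.y ) * ( c.x - b.x );
}

// Returns true if the outline never switches turn direction.
// Assumes a simple polygon whose vertices are listed in order.
bool IsConvex( const std::vector<Point2> &polygon )
{
    const std::size_t count = polygon.size();
    if( count < 3 )
        return false; // Fewer than three vertices is not a polygon

    bool bTurnedOneWay   = false;
    bool bTurnedOtherWay = false;

    for( std::size_t i = 0; i < count; ++i )
    {
        const float fTurn = TurnDirection( polygon[i],
                                           polygon[( i + 1 ) % count],
                                           polygon[( i + 2 ) % count] );
        if( fTurn > 0.0f ) bTurnedOneWay   = true;
        if( fTurn < 0.0f ) bTurnedOtherWay = true;
    }

    // A "dent" shows up as a mix of both turn directions.
    return !( bTurnedOneWay && bTurnedOtherWay );
}
```

Feed it any triangle and it will always report convex, which is exactly the first guarantee in our list.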

 

The second desirable property of triangles is the fact that they’re guaranteed to be constructed only from coplanar vertices. This means that all the vertices that make up the polygon exist in the same plane. When all the vertices exist in the same plane, the polygon is guaranteed to be completely flat. In other words, if you were to view the polygon edge-on, it would appear to be nothing more than a straight line. If the vertices were not coplanar, the polygon would become problematic to render when seen edge-on. The following figure shows a polygon both face-on and edge-on. Note how the polygon has three coplanar vertices (red) and three others that do not lie in the same plane.

 



Figure 2-3: What appears to be a simple polygon can become very confusing when seen edge-on
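
For the curious, here’s a rough sketch of how you could test whether a fourth vertex lies in the plane defined by three others. The Vec3 type is a hypothetical stand-in for D3DXVECTOR3 so the code compiles without the DirectX SDK, and IsCoplanar and its tolerance value are my own inventions. It leans on the dot-product and cross-product, both of which are explained later in this chapter.

```cpp
#include <cmath>

// Hypothetical stand-in for D3DXVECTOR3 so the sketch compiles anywhere.
struct Vec3 { float x, y, z; };

static Vec3 Subtract( const Vec3 &a, const Vec3 &b )
{
    return { a.x - b.x, a.y - b.y, a.z - b.z };
}

// Returns true if vertex d lies in the plane defined by a, b and c.
// The tolerance is an arbitrary choice for this sketch.
bool IsCoplanar( const Vec3 &a, const Vec3 &b, const Vec3 &c, const Vec3 &d,
                 float fTolerance = 0.0001f )
{
    const Vec3 ab = Subtract( b, a );
    const Vec3 ac = Subtract( c, a );
    const Vec3 ad = Subtract( d, a );

    // The cross-product of two edges is perpendicular to the plane.
    const Vec3 n = { ab.y * ac.z - ab.z * ac.y,
                     ab.z * ac.x - ab.x * ac.z,
                     ab.x * ac.y - ab.y * ac.x };

    // If ad has no component along that perpendicular, d is in the plane.
    // (This is the distance from the plane scaled by the length of n.)
    const float fOffset = n.x * ad.x + n.y * ad.y + n.z * ad.z;
    return std::fabs( fOffset ) <= fTolerance;
}
```

With only three vertices there is no fourth point to go astray, which is why a triangle can never fail this test.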

 

The third and final property that makes triangles so desirable is simplicity. Triangles are the simplest polygon that can be created. Any fewer vertices and it wouldn’t even be considered a polygon, but adding even one more vertex leads to the potential problem of self-intersection in which two or more edges cross each other. As you can clearly see in the following figure, our desire to reject self-intersecting polygons is obvious since it would be difficult for the hardware to interpret how the creator of this polygon intended it to be rendered.

 


Figure 2-4: Simple triangle vs. complicated, self-intersecting polygon

 

In general, these three highly desirable properties allow us to process 3D geometry much faster because we no longer have to test for possible graphical anomalies, which can occur with higher-order polygons. In turn, the graphics hardware can devote more time and effort to simply processing vertices.

 

The 3D Coordinate System

 

It doesn’t matter if you’re rendering a simple cube or the 3D model of a sports car: the geometry that makes up a 3D object is nothing more than a collection of vertices, which have been interconnected by edges. Therefore, the first step to creating a piece of geometry is finding out where to place these vertices so they actually form something when interconnected. To help in this task, 3D graphics programmers use a Cartesian coordinate system to assist them in defining, manipulating, and viewing the vertices that make up a 3D model. You probably remember using a 2D version of the Cartesian coordinate system in Algebra to plot the points of a linear equation.

 


Figure 2-5: 2D Cartesian coordinate system with x = 3, y = 2 plotted

 

The 2D version of the Cartesian coordinate system consists of two axes labeled “X” and “Y”, which are perpendicular or orthogonal (the angle between them is 90 degrees) to each other. Each axis of the coordinate system has a value of zero at the origin or center and is given a “positive” direction in which its values will grow. Movement along the axis in the opposite or “negative” direction causes the values to decrease until they finally become negative values on the opposite side of the origin. A 3D coordinate system is very similar to this except it adds a third axis, labeled “Z”, which is orthogonal to the first two axes.

 


Figure 2-6: 3D Cartesian coordinate system with x = 3, y = 2, and z = 4 plotted

 

With a third axis in place, we can now plot the position of each vertex required for the construction of a model’s geometry by simply plotting the X, Y, and Z value of each vertex in reference to the coordinate system. In the following figure we can see a simple cube, which is defined by eight vertices, plotted on Direct3D’s coordinate system.

 


Figure 2-7: Direct3D’s three-dimensional coordinate system can be used to plot 3D objects like a cube

 

As an example, you’re probably wishing I had plotted the sports car’s vertices instead of a boring cube. Unfortunately, the car’s geometry, no doubt composed of many curved surfaces, would have required thousands of vertices and would have quickly cluttered up the image. The reason for this is simple: you cannot draw a curved line as the edge that connects two vertex points when drawing or rendering 3D geometry. For speed and efficiency, 3D graphics cards render or draw everything using simple triangles. This is fine as long as you draw simple objects with flat faces and sharp edges, but when you need a curved surface you must learn to approximate it by increasing the triangle count in the area requiring curvature. The perfect example of how this approximation works is to render a simple sphere in wireframe mode using different degrees of tessellation. In 3D graphics programming, tessellation refers to the triangle count of a geometric surface. A highly tessellated surface has more triangles, thus it’s easier for it to simulate or approximate curves without appearing too blocky or coarse.

 


Figure 2-8: An increase in tessellation (triangle count) helps to produce smoother surfaces

 

From a quality point of view, it’s best to model everything using highly tessellated surfaces because they simply appear smoother due to their higher triangle count. It’s easy to see this as the sphere with the highest triangle count appears more sphere-like and less blocky. On the other hand, we can’t just go around tessellating everything that has a curve in it. It’s clearly less efficient to render the smooth sphere than the coarser ones because it has so many more vertices. Every vertex that makes up the 3D object will have to be processed by the hardware when rendered and that eats up our draw time. The more vertices a model has, the longer it will take to render it and if we take too long, we won’t be able to manage a decent frame-rate. It’s simple economics.
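
To put some rough numbers on that trade-off, here’s a small sketch that counts the triangles and vertices in a sphere. I’m assuming the sphere is built globe-style from lines of latitude and longitude, with a triangle fan at each pole; other ways of building a sphere will give different counts. The function names and the “slices” and “stacks” parameters are hypothetical names chosen for this example.

```cpp
// Counts for a latitude/longitude sphere (an assumed layout for this sketch).
// slices = number of segments around the equator
// stacks = number of segments from pole to pole
unsigned int SphereTriangleCount( unsigned int slices, unsigned int stacks )
{
    if( slices < 3 || stacks < 2 )
        return 0; // Too coarse to enclose any volume

    // Each pole is a fan of 'slices' triangles. Every band in between
    // is a ring of 'slices' quads, each split into two triangles.
    return 2 * slices + 2 * slices * ( stacks - 2 );
}

unsigned int SphereVertexCount( unsigned int slices, unsigned int stacks )
{
    if( slices < 3 || stacks < 2 )
        return 0;

    // One ring of 'slices' vertices per interior latitude, plus two poles.
    return slices * ( stacks - 1 ) + 2;
}
```

Under these assumptions, doubling both slices and stacks roughly quadruples the triangle count, so smoothness gets expensive quickly.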

 

Vectors & Normals

 

Suppose that you had been driving for a while and you decided to stop at a gas station to use the restroom, but when you asked the clerk for directions, he only responded with, “20 feet”. Sure, the knowledge that the restrooms are only 20 feet away is not completely useless. If you keep your legs crossed, you’ll probably find it before bursting, but wouldn’t it be more useful if he pointed in the restroom’s direction while saying, “20 feet”? This would make the restrooms far easier to find, wouldn’t you agree? In fact, what we really want from the gas station clerk is a vector! A vector is a mathematical entity that combines a direction with a magnitude. In our case the direction is given by the clerk’s arm as he points toward the restroom and the magnitude is the 20 feet required to reach it.

 


Figure 2-9: The direction and distance to the nearest “facilities” can be defined using a vector

 

If you have ever taken Physics, then you have seen vectors at work in a more classical sense because they’re often used to represent such physical quantities as force, velocity and acceleration. For example, take one of those trains that always seem to pop up in math word problems. There’s probably one leaving the station right now for Chicago, heading east at 50 mph to harass another generation of high school students. The direction and speed of the train can be expressed as a vector. In this case, the vector points east and its magnitude or length would be 50 mph.

 

Graphically, a vector is represented by an arrow whose orientation defines the direction and whose length defines the magnitude. Mathematicians and programmers alike often examine vectors and their properties by drawing them in relation to a coordinate system. The following image shows two vectors plotted on Direct3D’s coordinate system.

 


Figure 2-10: Two vectors plotted on Direct3D’s coordinate system

 

It’s hard to get a feel for vectors especially when plotting them on a 2D image like this figure, so bear with me here. The first vector labeled (2,5,0) is plotted at X = 2, Y = 5, and Z = 0 and basically exists in the X/Y plane. The second vector labeled (0,0,4) is plotted at X = 0, Y = 0, and Z = 4 and is a little harder to see since it points in the same direction as the Z axis.

 

Once the heads of the two vectors have been plotted, the direction and magnitude of each vector can be obtained by drawing a line from the coordinate system’s origin to the head point. The vector’s direction points away from the origin and toward the head point while the length of the line segment represents the vector’s magnitude.

 

Of course, not all vectors use the origin as their tail. In many math books vectors are actually defined as two vertices where the first one is the tail and the second one is the head. 3D graphics programmers typically don’t use this method and instead opt to use the origin as a common tail for all vectors in question. This way, they don’t have to waste memory storing an extra vertex point per vector.
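
If you do run into a vector defined by a tail point and a head point, converting it to the origin-based form is just a matter of subtracting the tail from the head (vector subtraction is covered a little later in this chapter). Here’s a minimal sketch; Vec3 is a hypothetical stand-in for D3DXVECTOR3 and VectorFromPoints is a name I made up.

```cpp
// Hypothetical stand-in for D3DXVECTOR3 so the sketch compiles anywhere.
struct Vec3 { float x, y, z; };

// Re-express a vector given as a tail point and a head point so that
// its tail sits at the origin: subtract the tail from the head.
Vec3 VectorFromPoints( const Vec3 &tail, const Vec3 &head )
{
    return { head.x - tail.x, head.y - tail.y, head.z - tail.z };
}
```

For example, a tail at (1,1,0) and a head at (3,6,0) describe the same direction and magnitude as the (2,5,0) vector from the earlier figure.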

 

Multiplying a Scalar by a Vector

 

Probably the simplest operation we can perform on a vector is to multiply it by a scalar. A scalar is just a fancy word for a regular number, which unlike our vector, has no direction - only a magnitude. This operation is often used in 3D graphics programming to increase or decrease a vector’s length. In the code snippet below, a vector that points down the X axis is doubled in length by multiplying it by 2.

 

D3DXVECTOR3 vMyVector( 1.0f, 0.0f, 0.0f );

 

// Before scalar multiplication...

// x = 1.0f

// y = 0.0f

// z = 0.0f

 

vMyVector *= 2.0f;

 

// After scalar multiplication...

// x = 2.0f

// y = 0.0f

// z = 0.0f

 


Figure 2-11: The length of vector ‘a’ can be altered by multiplying it by a scalar value such as 2 or 0.5

 

You should note that while a vector can be multiplied by a scalar – it cannot be multiplied by another vector in the same simple way. The closest things to vector multiplication are the cross-product and dot-product operations, which will be covered later.

 

Vector Addition

 

Vector addition is also fairly straightforward. We simply add the individual components of the two vectors together to create a third vector, which represents their sum. The following code illustrates how two vectors can be added to create a third vector using the + operator defined by D3DXVECTOR3.

 

D3DXVECTOR3 vVector1( 1.0f, 0.0f, 0.0f );

D3DXVECTOR3 vVector2( 0.0f, 2.0f, 0.0f );

D3DXVECTOR3 vResult;

 

vResult = vVector1 + vVector2;

 

// vResult is now equal to...

// x = 1.0f

// y = 2.0f

// z = 0.0f

 

The best way to understand what’s happening during vector addition is to plot two simple 2D vectors on paper and then use the “head-to-tail” method to find the third resultant vector, which is their sum.

 


Figure 2-12: Vector addition using the “head-to-tail” method

 

As the name implies, the resultant vector can be found by placing the head of the first vector on the tail of the second and then drawing a directed line segment between the open ends. The direction of our new vector goes from the first vector’s tail toward the second vector’s head.

 

Vector Subtraction

 

Vector subtraction is performed in the same manner as addition except we subtract the individual components of the two vectors to create a third vector, which represents their difference.

 

Again, we’ll use the same two vectors from our discussion concerning addition to demonstrate subtraction, but this time around we’ll use the - operator defined by D3DXVECTOR3 to assist us.

 

D3DXVECTOR3 vVector1( 1.0f, 0.0f, 0.0f );

D3DXVECTOR3 vVector2( 0.0f, 2.0f, 0.0f );

D3DXVECTOR3 vResult;

 

vResult = vVector1 - vVector2;

 

// vResult is now equal to...

// x =  1.0f

// y = -2.0f

// z =  0.0f

 

Like addition, there’s also a way to graphically plot the subtraction of two vectors. This method is called the “tail-to-tail” method and creates the resultant vector by placing the tails of the two vectors together and drawing a directed line segment between their heads. The direction of our new vector goes from the second vector’s head towards the first vector’s head.

 


Figure 2-13: Vector subtraction using the “tail-to-tail” method

 

Finding the Length of a Vector

 

Since one of the properties of a vector is its magnitude or length, it would be nice if we could find out what it is so we can make use of it. In some cases finding the length or magnitude of a vector is easy. For example, if two components of a vector are 0, such as X = 1, Y = 0, and Z = 0, we know the length is 1 because the Y and Z components contribute nothing, leaving only the distance along the X axis. On the other hand, if a vector is set to X = 1, Y = 1, and Z = 0 we know that the length is more than 1, but we’re not sure how much more. If we were plotting the vector on actual paper we could pull out a ruler and measure it, but this is obviously not an option when programming 3D graphics. To help us out, Direct3D has been gracious enough to give us the D3DXVec3Length utility function, which will take a vector as an argument and will return its length. The following code snippet demonstrates how to use it.

 

D3DXVECTOR3 vMyVector( 1.0f, 1.0f, 0.0f );

 

float fLength = D3DXVec3Length( &vMyVector );

 

// fLength is now equal to 1.41421

 

Unfortunately, even with functions like D3DXVec3Length at our disposal, old habits die hard, and many programmers continue to code length calculations the manual way using the actual equation:

 

|v| = √(x² + y² + z²)

 

I only mention this so you’ll be prepared when you see it in someone else’s code. This often confuses programmers who are new to 3D graphics since seasoned programmers seldom comment it. The following demonstrates how this equation is typically coded:

 

D3DXVECTOR3 vMyVector( 1.0f, 1.0f, 0.0f );

float fLength;

 

fLength = sqrtf( vMyVector.x * vMyVector.x +

                 vMyVector.y * vMyVector.y +

                 vMyVector.z * vMyVector.z );

 

// Again... fLength is equal to 1.41421

 

Normals

 

A normalized vector, also known as a unit vector, is simply a vector with a length or magnitude of one. Strictly speaking, a normal is a vector that is perpendicular to a surface, but since normals are almost always stored at unit length, you’ll often hear the two terms used interchangeably. A normalized vector can be very beneficial in certain equations since the original equation can be dramatically simplified if the vectors involved are of unit length. Lighting is probably the best example of where normals can save us considerable overhead since most lighting models require at least one normal per vertex to be passed for correct lighting. We’ll cover this more in the chapter on Lighting and Illumination.

 

The following code snippet demonstrates how to use the utility function D3DXVec3Normalize to normalize a vector called vMyVector.

 

D3DXVECTOR3 vMyVector( 5.0f, 1.0f, 2.0f );

 

float fLength = D3DXVec3Length( &vMyVector );

 

// fLength is equal to 5.47723...

 

D3DXVec3Normalize( &vMyVector, &vMyVector );

 

fLength = D3DXVec3Length( &vMyVector );

 

// vMyVector has been normalized by shrinking its components...

// x = 0.912871

// y = 0.182574

// z = 0.365148

// So, fLength is now equal to 1.0.

 

It’s important to understand that the vector’s direction has not been altered by the normalization process – only the length or magnitude has been altered. Of course, accumulated rounding errors due to floating-point imprecision could throw us off a bit if we’re not careful.
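
Under the hood, normalizing a vector just means dividing each component by the vector’s length. The sketch below shows the idea in plain C++. Vec3 is a hypothetical stand-in for D3DXVECTOR3, and the decision to hand back very short vectors unchanged is my own choice for this example, not a description of how D3DXVec3Normalize behaves.

```cpp
#include <cmath>

// Hypothetical stand-in for D3DXVECTOR3 so the sketch compiles anywhere.
struct Vec3 { float x, y, z; };

float Length( const Vec3 &v )
{
    return std::sqrt( v.x * v.x + v.y * v.y + v.z * v.z );
}

// Returns a unit-length copy of v. A vector that is (nearly) zero-length
// has no direction to preserve, so this sketch returns it unchanged
// rather than dividing by zero.
Vec3 Normalize( const Vec3 &v )
{
    const float fLength = Length( v );
    if( fLength < 0.000001f )
        return v;

    const float fInverse = 1.0f / fLength;
    return { v.x * fInverse, v.y * fInverse, v.z * fInverse };
}
```

Because every component is scaled by the same amount, the direction stays put and only the length changes.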

 

Dot-Product

 

The dot-product is by far one of the most useful formulas in 3D graphics programming and can be found in some of the field’s most important algorithms. One of the earliest uses of the dot-product in 3D graphics was back-face culling. The technique of back-face culling allows the hardware to simply skip the costly processing of triangles that face away from the viewer and therefore can’t possibly be visible; a huge time savings considering the potential number of triangles in a scene. Even a simple object such as a sphere has only about half of its triangles visible at any given time. It would be a shame to process the hidden half when it contributes nothing to the final scene. Other uses for the dot-product include lighting and collision detection, which are crucial for creating realistic scenes both visually and physically, so we’ll definitely see the dot-product again when we cover lighting.

 

Now, from a mathematical point of view, a dot-product (often represented by the ∙ symbol) is the cosine of the angle between two vectors, scaled by the lengths of the vectors. In simpler terms, a dot-product supplies a measure of the difference between the directions in which two vectors point. Here’s the actual equation:

 

a ∙ b = |a| |b| cos(theta)

 

Where:

 

a = vector one

b = vector two

∙ = the dot-product symbol

|a| = magnitude or length of vector one

|b| = magnitude or length of vector two

cos(theta) = cosine of the angle between the two vectors

 

At first, this equation may not look too bad, but evaluating it as written is expensive: finding each length takes a square root, and we usually don’t know the angle (or its cosine) ahead of time anyway. Fortunately for us, the same value can be computed directly from the components of the two vectors, so after dropping all that nasty baggage, we can re-write the equation like so:

a ∙ b = ( a_x b_x ) + ( a_y b_y ) + ( a_z b_z )

I know it looks a little longer on paper, but trust me - it’s a whole lot faster on the CPU. This form works for any two vectors, normalized or not. The payoff of normalizing comes when we interpret the result: if both vectors are of unit length, |a| and |b| are both 1, so the dot-product is exactly the cosine of the angle between them. I’m not going to go into the gory details of why the two forms are equivalent, but if you’re really interested, the derivation can be found online easily enough. We have better things to do, so we’re going to let Direct3D help us out again by using its special dot-product function called D3DXVec3Dot. Here’s a very simple example of its usage:

 

D3DXVECTOR3 vVector1( 1.0f, 2.0f, 0.0f );

D3DXVECTOR3 vVector2( 2.0f, 1.0f, 0.0f );

 

D3DXVec3Normalize( &vVector1, &vVector1 );

D3DXVec3Normalize( &vVector2, &vVector2 );

 

float fDotProduct = D3DXVec3Dot( &vVector1, &vVector2 );

 

// fDotProduct is equal to 0.8
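
In case you’re wondering what a dot-product looks like without Direct3D’s help, here’s a minimal sketch in plain C++. Vec3 is a hypothetical stand-in for D3DXVECTOR3, and Dot, Length, and AngleBetween are names I picked for illustration. The last function shows how the two forms of the equation fit together by recovering the angle between two vectors that haven’t been normalized.

```cpp
#include <cmath>

// Hypothetical stand-in for D3DXVECTOR3 so the sketch compiles anywhere.
struct Vec3 { float x, y, z; };

// Component form of the dot-product. Works for any two vectors.
float Dot( const Vec3 &a, const Vec3 &b )
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

float Length( const Vec3 &v )
{
    return std::sqrt( Dot( v, v ) );
}

// Angle between two vectors in radians, from a . b = |a| |b| cos(theta).
// Assumes neither vector has zero length.
float AngleBetween( const Vec3 &a, const Vec3 &b )
{
    float fCosine = Dot( a, b ) / ( Length( a ) * Length( b ) );

    // Rounding can push the value slightly outside [-1, 1], so clamp it.
    if( fCosine >  1.0f ) fCosine =  1.0f;
    if( fCosine < -1.0f ) fCosine = -1.0f;

    return std::acos( fCosine );
}
```

Notice that Dot itself never touches a square root; the expensive parts only show up when we actually need the angle.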

 


Figure 2-14: The dot-product

 

Now, it’s not necessarily the value of the dot-product that makes it important, it’s really the sign of the value that we care about (at least in most cases). If the dot-product’s sign is positive, the two vectors lie on the same side of a plane and if the sign is negative, the vectors lie on opposite sides of a plane. And, obviously, if the dot-product is 0 (within some tolerance), the two vectors are perpendicular. The plane in question is defined as being perpendicular to the equation’s first vector and can be visualized along with the vectors being operated on by passing it through the first vector’s tail point (the origin in our case) at a right angle.

 


Figure 2-15: It’s easy to check if two vectors exist on the same side of plane using the dot-product operation
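
The sign test described above boils down to just a few lines of code. Here’s a rough sketch; the Vec3 type, the Side enumeration, the ClassifySide function, and the tolerance value are all hypothetical names and numbers chosen for this example.

```cpp
// Hypothetical stand-in for D3DXVECTOR3 so the sketch compiles anywhere.
struct Vec3 { float x, y, z; };

enum Side { SAME_SIDE, OPPOSITE_SIDE, PERPENDICULAR };

// Classifies vector b against the plane that passes through the origin
// at a right angle to vector a, using only the sign of the dot-product.
Side ClassifySide( const Vec3 &a, const Vec3 &b, float fTolerance = 0.0001f )
{
    const float fDot = a.x * b.x + a.y * b.y + a.z * b.z;

    if( fDot >  fTolerance ) return SAME_SIDE;     // Less than 90 degrees apart
    if( fDot < -fTolerance ) return OPPOSITE_SIDE; // More than 90 degrees apart
    return PERPENDICULAR;                          // Zero, within tolerance
}
```

Since only the sign matters here, neither vector needs to be normalized first.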

 

Cross-Product

 

Like the dot-product, the cross-product also defines a relationship between two vectors, but instead of producing a simple scalar value – the cross-product produces a completely new vector, which is orthogonal or perpendicular to the two operand vectors. A perfect example of a vector, which is orthogonal to two other vectors, is the X axis of our very own coordinate system since the X axis is orthogonal to the Y and Z axes and vice versa.

 

Of course, the purpose of the cross-product is not the double-checking of our coordinate system’s main axes – the orthogonal or perpendicular nature of the main axes is a given. The cross-product’s main purpose is to assist us in the creation of surface normals. Surface normals are normalized vectors, perpendicular to a surface, which are required when lighting 3D objects. For an object to be lit correctly, each triangle that makes up the object must be assigned a surface normal, which accurately identifies the triangle’s orientation in space. The orientation of a triangle is crucial in lighting since it’s the orientation of each triangle that determines how much light a 3D object will receive along its geometric surfaces.

 

To create a surface normal for a triangle, we start off by defining two temporary vectors using the triangle’s own vertices. Once we have these two vectors, we can use the cross-product operation on them to create a third vector, which we then normalize.

 

Since this new vector or normal is orthogonal to the temporary vectors, and the temporary vectors were defined using actual vertices of the triangle, which are guaranteed to be coplanar, the new vector’s orientation truly depicts the triangle’s orientation in 3D space. Again, this will be covered in more detail in the chapter on lighting, but here’s a figure to help you out till then:

 


Figure 2-16: The cross-product operation can be used to calculate the surface normal of a triangle
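
Here’s a minimal sketch of the recipe just described, written in plain C++ so it compiles without the DirectX SDK. Vec3 is a hypothetical stand-in for D3DXVECTOR3, and Subtract, Cross, and SurfaceNormal are names I made up. In real Direct3D code you would normally reach for D3DXVec3Cross and D3DXVec3Normalize instead of writing these by hand.

```cpp
#include <cmath>

// Hypothetical stand-in for D3DXVECTOR3 so the sketch compiles anywhere.
struct Vec3 { float x, y, z; };

Vec3 Subtract( const Vec3 &a, const Vec3 &b )
{
    return { a.x - b.x, a.y - b.y, a.z - b.z };
}

// The cross-product: a new vector perpendicular to both a and b.
Vec3 Cross( const Vec3 &a, const Vec3 &b )
{
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

// Surface normal of the triangle (v0, v1, v2).
// Assumes the triangle is not degenerate (its vertices are not all in a
// line). Which side the normal points toward depends on the vertex order.
Vec3 SurfaceNormal( const Vec3 &v0, const Vec3 &v1, const Vec3 &v2 )
{
    // Two temporary vectors built from the triangle's own vertices
    const Vec3 edge1 = Subtract( v1, v0 );
    const Vec3 edge2 = Subtract( v2, v0 );

    // A third vector perpendicular to both, then normalized
    const Vec3 n = Cross( edge1, edge2 );
    const float fLength = std::sqrt( n.x * n.x + n.y * n.y + n.z * n.z );

    return { n.x / fLength, n.y / fLength, n.z / fLength };
}
```

Swapping the order of the two edges in the call to Cross flips the normal to point out the other side of the triangle, so the vertex order matters.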

 

Vectors vs. Vertices

 

Before we move on, you should take note that since vertices, like vectors, have three components (X, Y, and Z), you’ll often see the D3DXVECTOR3 data type being used for both in code. This is especially true in Direct3D as many samples use the D3DXVECTOR3 data type to represent the positional portion of an application’s vertex structure. The declaration of vertex structures will be covered more in a later chapter, but here’s a brief example to get the point across. Note how VertexType_2 uses a D3DXVECTOR3 to hold the vertex’s position instead of using three floats.

 

struct VertexType_1

{

    float x, y, z; // Vertex position (stored as three floats labeled x,y,z)

    DWORD color;   // Vertex color

};

struct VertexType_2

{

    D3DXVECTOR3 position; // Vertex position (stored in x,y,z of D3DXVECTOR3)

    DWORD color;          // Vertex color

};

 

VertexType_1 v1;

 

v1.x = 1.0f;

v1.y = 2.0f;

v1.z = 3.0f;

 

VertexType_2 v2;

 

v2.position.x = 1.0f;

v2.position.y = 2.0f;

v2.position.z = 3.0f;

 

Of course, the mathematical operators of the D3DXVECTOR3 data type will be meaningless if you’re treating the declared variable as a vertex’s position instead of an actual vector.

 

The Matrix

 

As much as I would like to conjure up some helpful analogy to enlighten you about matrices… none come to mind. That’s probably because matrices are inherently mysterious and it takes some time before most people can wrap their heads around the concept. Most newbies to 3D graphics programming tend to treat matrices like black boxes; they drop a few numbers into the right places, do the math, and get back what they hope is the right answer. For the mathematically challenged, this approach works reasonably well and is easy to do with Direct3D since it has a built-in matrix data type called D3DXMATRIX. Of course, even though D3DXMATRIX can auto-magically create valid matrices for you, I would be pretty negligent if I didn’t at least arm you with the basics.

 

In simplest terms, a matrix is a two-dimensional array of numbers used primarily in Linear Algebra to represent a linear system of equations. This ability to encode a linear system of equations into one mathematical entity is ideal for use in 3D graphics programming since we often need to apply several equations to the same X, Y, and Z components of each vertex in our 3D models. The following figure contains three matrices of varying size or dimension along with a brief description.

 


Figure 2-17: Assorted matrices

 

As you can see, matrices can come in a variety of dimensions, but for 3D graphics work, the most popular type of matrix is a square 4x4 matrix of floats. You’ll occasionally see matrices of size 3x3, but the 4x4 size is preferred for general use since it’s the smallest matrix that supports both the rotation and translation of vertices simultaneously.

 


Figure 2-18: A 4x4 matrix of floats

 

While the values contained within a 4x4 float matrix can be set to anything considered valid for the float data type, there is a special type of matrix called an identity matrix, which is worth noting. The identity matrix contains all zeros except for a line of ones, which cross it diagonally. It acts just like the scalar value of 1 during multiplication since multiplying any matrix by an identity matrix results in a new matrix, which is identical to the original matrix. Because of this behavior, you’ll often see matrices being initialized to the identity state prior to setting them up for other operations like rotation and translation.

 


Figure 2-19: Multiplication by an identity matrix results in no change to the original matrix
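
If you’d like to convince yourself that the identity matrix really does leave things alone, here’s a small sketch of 4x4 matrix multiplication in plain C++. The Matrix4 type and the Identity and Multiply functions are hypothetical stand-ins invented for this example; they aren’t part of Direct3D.

```cpp
// Hypothetical 4x4 matrix type for this sketch, indexed as m[row][column].
struct Matrix4 { float m[4][4]; };

// Builds an identity matrix: all zeros except ones along the diagonal.
Matrix4 Identity()
{
    Matrix4 result = {}; // Every element starts at zero
    for( int i = 0; i < 4; ++i )
        result.m[i][i] = 1.0f;
    return result;
}

// Standard matrix multiplication: each element of the result is the
// dot-product of a row from 'a' with a column from 'b'.
Matrix4 Multiply( const Matrix4 &a, const Matrix4 &b )
{
    Matrix4 result = {};
    for( int row = 0; row < 4; ++row )
        for( int col = 0; col < 4; ++col )
            for( int k = 0; k < 4; ++k )
                result.m[row][col] += a.m[row][k] * b.m[k][col];
    return result;
}
```

Multiplying any matrix by the result of Identity(), on either side, hands back a matrix with exactly the same values you started with.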

 

D3DXMATRIX

 

As I mentioned earlier, Direct3D gives us access to a special data type called D3DXMATRIX, which wraps around and abstracts the most common matrix operations required by 3D graphics programmers. In C++, D3DXMATRIX is built on top of a smaller data structure called D3DMATRIX, which holds the actual matrix values and is simply a two-dimensional array of floats.

 

typedef struct _D3DMATRIX {

    union {

        struct {

            float        _11, _12, _13, _14;

            float        _21, _22, _23, _24;

            float        _31, _32, _33, _34;

            float        _41, _42, _43, _44;

 

        };

        float m[4][4];

    };

} D3DMATRIX;

 


Figure 2-20: Layout of named elements contained by the D3DMATRIX structure

 

Each element of the matrix array has a unique name which allows it to be accessed individually. The element names are written in row-column order, so if we’re trying to manually set a particular element’s value within the matrix’s array, we access it by specifying its row and column like so:

 

D3DXMATRIX mMyMatrix;

D3DXMatrixIdentity( &mMyMatrix ); // A new D3DXMATRIX is uninitialized, so start from identity

 

mMyMatrix._43 = 5.0f;

 

This sets the value stored at row 4, column 3 to 5.0f.

 


Figure 2-21: An identity matrix with its element at row 4 and column 3 set to 5.0

 

Of course, it’s probably more intuitive to set up the entire matrix by declaring and initializing an actual instance of the D3DMATRIX structure and then assigning it to our D3DXMATRIX variable. The following example creates an identity matrix with the element at row 4, column 3 set to 5.0f.

 

D3DXMATRIX mMyMatrix;

 

D3DMATRIX translate =

{

       1.0f, 0.0f, 0.0f, 0.0f,

       0.0f, 1.0f, 0.0f, 0.0f,

       0.0f, 0.0f, 1.0f, 0.0f,

       0.0f, 0.0f, 5.0f, 1.0f

};

 

mMyMatrix = translate;

 

If this sounds confusing, just remember that D3DMATRIX (with no ‘X’ in its name) holds the actual matrix values while D3DXMATRIX (with an ‘X’) is the actual extended matrix data type that defines all the C++ operators (e.g. *, +, -, *=), which make working with matrices easier.

 

In actuality, you will seldom need to manually set matrices when using or operating on them. The calculation and loading of matrix values are abstracted by utility functions defined by Direct3D. Occasionally, you may hear individuals bad-mouthing these built-in utilities by claiming they’re too slow for commercial use, but this accusation couldn’t be further from the truth. Microsoft, with the help of hardware vendors like nVIDIA and ATI, has spent considerable time and effort optimizing these routines for use on the most popular video cards. With that said, I think you would be hard-pressed to write a better one yourself.

 

Transformations

 

While matrices are used to solve a wide variety of problems in mathematics, 3D graphics programmers typically use them for just one thing: moving or transforming model vertices from one place in the 3D coordinate system to another. In other words, if we want to move a 3D object from point A to point B (and you know you want to), we have to literally move each and every vertex point which makes up the object’s geometry from A to B before rendering it again.

 

The act of moving vertices from one place to another with a matrix is called transformation and vertices can be transformed in a variety of ways; the three most important being translation, rotation, and scaling.

 

Translation – Vertices change location by sliding or translating along one or more of the three main axes of the coordinate system (X, Y, or Z). An example of translation is pushing an object across the floor.

 

Rotation – Vertices change location by either rotating around one of the three main axes of the coordinate system (X, Y, or Z) or around an arbitrary axis defined by the user. An example of rotation is turning an object upside down.

 

Scaling – Vertices change location by having the distance between them either increased or decreased along one or more of the three main axes of the coordinate system (X, Y, or Z). An example of scaling is changing the size of an object by shrinking it.

 

To understand these transformations better, let’s suppose you were sitting at a table and you placed a drinking glass in front of you. The glass will represent our 3D model and the table will give us a point of reference for moving or “transforming” our glass. Of course, to keep track of the glass, or more importantly, the vertices that make up the glass model, we’ll need a 3D coordinate system. So let’s pretend that the coordinate system used by Direct3D is aligned with the table’s surface in such a way that the origin is sitting exactly on the table’s surface, the positive Y axis is pointing straight up from the table, the positive X axis is pointing to the right, and the positive Z axis is pointing away from where we are sitting.

 


Figure 2-22: A table with a drinking glass located at the origin

 

Translation

 

A translation transformation allows us to move an object from one place to another without altering its orientation. For example, we could move our glass away from us or towards us by either translating its vertices along the Z axis by some positive distance (away from us), or translating its vertices along the Z axis by some negative distance (towards us). To do this in code we would create a matrix using the D3DXMATRIX data type and then set it up for translation by passing it to a utility function defined by Direct3D called D3DXMatrixTranslation.

 

D3DXMATRIX *D3DXMatrixTranslation( D3DXMATRIX *pOut, FLOAT x, FLOAT y, FLOAT z );

 

The following demonstrates how we would move the glass away from us by translating it along the Z axis by 5 units:

 

D3DXMATRIX mMyMatrix;

D3DXMatrixTranslation( &mMyMatrix, 0.0f, 0.0f, 5.0f );

g_pd3dDevice->SetTransform( D3DTS_WORLD, &mMyMatrix );

 

// Render drinking glass here...

 

The D3DXMatrixTranslation function takes a pointer to a D3DXMATRIX along with the amount of translation to apply along the X, Y, and Z axes, and fills in the matrix accordingly. The SetTransform method basically hands the matrix off to our application’s Direct3D device, which then makes the new matrix current for all vertices about to be rendered. The current matrix will continue to be used until a new one is set. The proper usage of enumerations like D3DTS_WORLD will be covered in a later chapter.
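If you’re curious what a translation matrix actually holds, here is a minimal sketch written in plain C++ so it can be compiled without the DirectX SDK. The Vec3 and Mat4 types and the makeTranslation and transformPoint helpers are hypothetical stand-ins invented for this illustration; they are not part of Direct3D. The sketch assumes Direct3D’s row-vector convention, in which the translation amounts sit in the fourth row of the matrix.

```cpp
#include <cassert>

// Hypothetical stand-ins for D3DXVECTOR3 and D3DXMATRIX (illustration only).
struct Vec3 { float x, y, z; };
struct Mat4 { float m[4][4]; };

// Builds an identity matrix with the offsets stored in the fourth row,
// which is where the row-vector convention places a translation.
Mat4 makeTranslation(float x, float y, float z)
{
    Mat4 t = {{
        { 1.0f, 0.0f, 0.0f, 0.0f },
        { 0.0f, 1.0f, 0.0f, 0.0f },
        { 0.0f, 0.0f, 1.0f, 0.0f },
        {    x,    y,    z, 1.0f }
    }};
    return t;
}

// Transforms a point, treated as the row vector [x y z 1], by the matrix.
Vec3 transformPoint(const Vec3& p, const Mat4& a)
{
    Vec3 r;
    r.x = p.x * a.m[0][0] + p.y * a.m[1][0] + p.z * a.m[2][0] + a.m[3][0];
    r.y = p.x * a.m[0][1] + p.y * a.m[1][1] + p.z * a.m[2][1] + a.m[3][1];
    r.z = p.x * a.m[0][2] + p.y * a.m[1][2] + p.z * a.m[2][2] + a.m[3][2];
    return r;
}

// Usage: a vertex at the origin ends up 5 units down the Z axis.
void translationExample()
{
    const Vec3 origin = { 0.0f, 0.0f, 0.0f };
    const Vec3 moved = transformPoint(origin, makeTranslation(0.0f, 0.0f, 5.0f));
    assert(moved.x == 0.0f && moved.y == 0.0f && moved.z == 5.0f);
}
```

Notice that every vertex simply has the same offsets added to it, which is why a translation never changes an object’s orientation or size.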

 


Figure 2-23: Using translation, we can move the glass down the Z axis by 5 units

 

For another example, if we wanted to move the glass both away from us and up from the table’s surface, we would do this:

 

D3DXMATRIX mMyMatrix;

D3DXMatrixTranslation( &mMyMatrix, 0.0f, 4.0f, 5.0f );

g_pd3dDevice->SetTransform( D3DTS_WORLD, &mMyMatrix );

 

This will cause the glass to translate up 4 units along the Y axis and away from us 5 units along the Z axis.

 


Figure 2-24: This time we translate along two axes of movement

 

And of course, if we wanted our glass to move left or right across the table, we would simply set the X argument of D3DXMatrixTranslation to the amount of movement we desire.

 

D3DXMatrixTranslation( &mMyMatrix, 3.0f, 0.0f, 0.0f ); // Translate right

 

// Or, we could go the other way with a negative value!

 

D3DXMatrixTranslation( &mMyMatrix, -3.0f, 0.0f, 0.0f ); // Translate left

 

The following figure shows where the glass would end up if it were moved along the X axis by either 3 or -3 units.

 


Figure 2-25: Using translation, we can move the glass left or right by translating along the X axis

 

Rotation

 

At this point we can move our drinking glass all over the place using translations, but we still can’t do anything particularly interesting like tilt the glass over and pour out its contents. To pour out the glass’s contents we would need to change the glass’s orientation by rotating it with a rotation transformation. For example, if we wanted to pretend to pour out water from our drinking glass we could rotate it about the X axis by -90 degrees. This would basically pour the glass right out into our lap… not a smart thing to do, but it does make for a good demonstration. To accomplish this we’ll use the D3DXMatrixRotationX utility function to create a rotation matrix containing a -90 degree rotation around the X axis. As you might guess, there are also matching D3DXMatrixRotationY and D3DXMatrixRotationZ functions.

 

D3DXMATRIX *D3DXMatrixRotationX( D3DXMATRIX *pOut, FLOAT Angle );

D3DXMATRIX *D3DXMatrixRotationY( D3DXMATRIX *pOut, FLOAT Angle );

D3DXMATRIX *D3DXMatrixRotationZ( D3DXMATRIX *pOut, FLOAT Angle );

 

These functions all take a matrix to set and the amount of angular rotation in radians, but we’ll use the D3DXToRadian macro to convert the desired rotation from degrees to radians for us since it’s more intuitive to use degrees when learning transformations.

 

D3DXMATRIX mMyMatrix;

D3DXMatrixRotationX( &mMyMatrix, D3DXToRadian(-90.0f) );

g_pd3dDevice->SetTransform( D3DTS_WORLD, &mMyMatrix );

 

// Render drinking glass here...
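To see what this rotation does to a single vertex, here is a rough sketch in plain C++ that mimics the arithmetic without needing the D3DX headers. The toRadians and rotateAboutX helpers are hypothetical names made up for this example; they assume the same sign convention as the matrix built by D3DXMatrixRotationX, where a point (x, y, z) becomes (x, y·cos − z·sin, y·sin + z·cos).

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Hypothetical equivalent of the D3DXToRadian macro.
float toRadians(float degrees)
{
    const float pi = 3.14159265f;
    return degrees * (pi / 180.0f);
}

// Rotates a point about the X axis; the X coordinate never changes.
Vec3 rotateAboutX(const Vec3& p, float radians)
{
    const float c = std::cos(radians);
    const float s = std::sin(radians);
    Vec3 r = { p.x, p.y * c - p.z * s, p.y * s + p.z * c };
    return r;
}

// Usage: a point on the glass's rim, straight above the origin, tips
// toward the viewer (negative Z) after a -90 degree rotation.
void rotationExample()
{
    const Vec3 rim = { 0.0f, 1.0f, 0.0f };
    const Vec3 tipped = rotateAboutX(rim, toRadians(-90.0f));
    assert(std::fabs(tipped.y) < 1e-5f);
    assert(std::fabs(tipped.z + 1.0f) < 1e-5f);
}
```

A vertex sitting exactly on the X axis would not move at all, which is a handy way to remember that a rotation only spins an object; it never carries it anywhere.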

 


Figure 2-26: A simple rotation of our glass around the X axis doesn’t look right

 

Unfortunately, our first attempt at applying a rotation doesn’t look quite right as the glass now intersects the table and appears stuck in it. This is because we simply rotated the glass around the X axis without actually moving or translating it away. What we really want to do is rotate the glass and then translate it up a bit. That would look more natural.

 

D3DXMATRIX mTranslation;

D3DXMATRIX mRotationX;

D3DXMATRIX mMyMatrix;

 

D3DXMatrixTranslation( &mTranslation, 0.0f, 5.0f, 0.0f );

D3DXMatrixRotationX( &mRotationX, D3DXToRadian(-90.0f) );

 

mMyMatrix = mRotationX * mTranslation;

 

g_pd3dDevice->SetTransform( D3DTS_WORLD, &mMyMatrix );

 

This time we create two transformation matrices: a translation and a rotation. Once they’ve been set up correctly, we combine the two through multiplication and store the result in a third matrix. The multiplication of two matrices, also known as “concatenation”, is one of the most important reasons why programmers use matrices for 3D graphics programming. Concatenation allows several matrices, and the transformations they represent, to be combined into one matrix, which will perform all of the transformations at once. Instead of having to perform the calculations to rotate each vertex of our model around the X axis, and then perform a completely new set of calculations to translate each vertex along the Y axis, we simply concatenate the rotation and translation matrices together. We can then transform each vertex of the model by the single concatenated matrix and save ourselves considerable overhead. This also works regardless of the number of matrices being concatenated together, so the savings increase with each matrix added.
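The claim that one concatenated matrix does the work of two can be checked with a small sketch. The code below is plain C++ with hypothetical Vec3 and Mat4 types and helper functions of my own naming (identity, translation, rotationX, multiply, transformPoint); none of them are Direct3D calls, and they assume the row-vector convention in which a * b applies a first and b second.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Mat4 { float m[4][4]; };

Mat4 identity()
{
    Mat4 r = {};
    for (int i = 0; i < 4; ++i) r.m[i][i] = 1.0f;
    return r;
}

Mat4 translation(float x, float y, float z)
{
    Mat4 r = identity();
    r.m[3][0] = x; r.m[3][1] = y; r.m[3][2] = z;
    return r;
}

Mat4 rotationX(float radians)
{
    Mat4 r = identity();
    r.m[1][1] =  std::cos(radians); r.m[1][2] = std::sin(radians);
    r.m[2][1] = -std::sin(radians); r.m[2][2] = std::cos(radians);
    return r;
}

// Standard 4x4 product: each entry is a row of a times a column of b.
Mat4 multiply(const Mat4& a, const Mat4& b)
{
    Mat4 r = {};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

// Transforms a point, treated as the row vector [x y z 1].
Vec3 transformPoint(const Vec3& p, const Mat4& a)
{
    Vec3 r;
    r.x = p.x * a.m[0][0] + p.y * a.m[1][0] + p.z * a.m[2][0] + a.m[3][0];
    r.y = p.x * a.m[0][1] + p.y * a.m[1][1] + p.z * a.m[2][1] + a.m[3][1];
    r.z = p.x * a.m[0][2] + p.y * a.m[1][2] + p.z * a.m[2][2] + a.m[3][2];
    return r;
}

// Usage: rotating and then translating in two passes gives the same
// result as one pass through the concatenated matrix.
void concatenationExample()
{
    const Mat4 rot = rotationX(-90.0f * 3.14159265f / 180.0f);
    const Mat4 trans = translation(0.0f, 5.0f, 0.0f);
    const Mat4 combined = multiply(rot, trans);

    const Vec3 v = { 1.0f, 2.0f, 3.0f };
    const Vec3 twoSteps = transformPoint(transformPoint(v, rot), trans);
    const Vec3 oneStep = transformPoint(v, combined);
    assert(std::fabs(twoSteps.x - oneStep.x) < 1e-4f);
    assert(std::fabs(twoSteps.y - oneStep.y) < 1e-4f);
    assert(std::fabs(twoSteps.z - oneStep.z) < 1e-4f);
}
```

The matrix product is computed once per object, while the per-vertex work stays the same no matter how many transformations were folded in.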

 


Figure 2-27: A glass being rotated about the X axis and translated up the Y axis

 

Order of Rotation and Translation

 

Now, at this point you may be asking yourself, “Why not just lift the glass up and then rotate it? Wouldn’t that be more intuitive?” I agree it would be more intuitive that way since that’s more like what a real person would do, but it’s important to understand that the order used when multiplying rotation and translation matrices together is crucial. The reason for this is that matrix multiplication, unlike multiplication between regular numbers, is not commutative. If you’re a little rusty on math jargon, “commutative” means that if a × b = c, then b × a = c as well. In other words, it doesn’t matter whether we say 5 × 2 = 10 or 2 × 5 = 10. The answer is always 10 regardless of order, since multiplication between real numbers is commutative. The reason that matrix multiplication is not commutative is beyond the scope of this book, so you’ll just have to trust me on this one, but it’s fairly easy to visualize why it isn’t by examining what happens if we reverse the order of multiplication used in our last example like so:

 

D3DXMatrixTranslation( &mTranslation, 0.0f, 5.0f, 0.0f );

D3DXMatrixRotationX( &mRotationX, D3DXToRadian(-90.0f) );

 

mMyMatrix = mTranslation * mRotationX; // Oops! Translations first, then rotation!

 

g_pd3dDevice->SetTransform( D3DTS_WORLD, &mMyMatrix );

 


Figure 2-28: The same glass with the order of rotation and translation reversed

 

As you can painfully see, if we reverse the order of the multiplication so that the translation comes before the rotation, we’ll practically hit ourselves in the head, since we elevate the glass first and then rotate it around the X axis by -90 degrees. The key to understanding the difference lies with the rotation of objects about an axis. If you want an object to spin in place without actually moving from its current location, you need to perform the rotation while the object is close to the chosen axis of rotation, preferably while it’s centered on the axis with the axis going straight through it. If the object moves away from the axis before the rotation takes place, the rotation causes the object to swing around the axis like it’s orbiting.
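The two orderings can be compared numerically by following the glass’s center point through each one. This is only a sketch in plain C++ using hypothetical translate and rotateAboutX helpers of my own (not Direct3D functions), assuming the same rotation convention as D3DXMatrixRotationX.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 translate(const Vec3& p, const Vec3& offset)
{
    Vec3 r = { p.x + offset.x, p.y + offset.y, p.z + offset.z };
    return r;
}

// Rotates a point about the X axis by an angle given in degrees.
Vec3 rotateAboutX(const Vec3& p, float degrees)
{
    const float radians = degrees * 3.14159265f / 180.0f;
    const float c = std::cos(radians);
    const float s = std::sin(radians);
    Vec3 r = { p.x, p.y * c - p.z * s, p.y * s + p.z * c };
    return r;
}

void orderExample()
{
    const Vec3 center = { 0.0f, 0.0f, 0.0f };
    const Vec3 lift = { 0.0f, 5.0f, 0.0f };

    // Same order as mRotationX * mTranslation: spin in place, then lift.
    const Vec3 spunThenLifted = translate(rotateAboutX(center, -90.0f), lift);

    // Same order as mTranslation * mRotationX: lift, then swing about X.
    const Vec3 liftedThenSpun = rotateAboutX(translate(center, lift), -90.0f);

    // The first ends up 5 units above the table; the second ends up
    // 5 units toward the viewer along the negative Z axis.
    assert(std::fabs(spunThenLifted.y - 5.0f) < 1e-4f);
    assert(std::fabs(liftedThenSpun.z + 5.0f) < 1e-4f);
}
```

The center of the glass lands at roughly (0, 5, 0) in the first case and (0, 0, -5) in the second, which is exactly the orbiting effect described above.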

 

But don’t get me wrong, I’m not trying to tell you that you should never do a translation before a rotation. Maybe you really want an object to orbit about an axis of rotation instead of spinning on it. There are even cases where you’ll want to do both at the same time. A perfect example of this is the Earth. The Earth not only spins on its axis, but also orbits the Sun.

 

A more typical example of needing to translate before a rotation occurs when we want to spin an object in place, but the object is nowhere near the origin. In this case, we need to translate the model back to the origin, perform the rotation, and then translate it back out to where we want it placed.
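The translate, rotate, translate-back recipe can be sketched for a single vertex as follows. This is plain C++ with hypothetical helper names (rotateAboutY, rotateInPlaceAboutY) chosen for this example; it assumes the sign convention of the matrix built by D3DXMatrixRotationY and a pivot point that I made up for the demonstration. With D3DX matrices, the same idea would be a product of three matrices: a translation to the origin, the rotation, and a translation back, in that order.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Rotates a point about the Y axis through the origin.
Vec3 rotateAboutY(const Vec3& p, float radians)
{
    const float c = std::cos(radians);
    const float s = std::sin(radians);
    Vec3 r = { p.x * c + p.z * s, p.y, -p.x * s + p.z * c };
    return r;
}

// Spins a point about a vertical axis passing through 'pivot':
// translate to the origin, rotate, then translate back out.
Vec3 rotateInPlaceAboutY(const Vec3& p, const Vec3& pivot, float radians)
{
    const Vec3 local = { p.x - pivot.x, p.y - pivot.y, p.z - pivot.z };
    const Vec3 turned = rotateAboutY(local, radians);
    Vec3 r = { turned.x + pivot.x, turned.y + pivot.y, turned.z + pivot.z };
    return r;
}

// Usage: the pivot itself stays put, so the object spins in place
// instead of orbiting the origin.
void pivotExample()
{
    const Vec3 pivot = { 10.0f, 0.0f, 10.0f };
    const Vec3 spun = rotateInPlaceAboutY(pivot, pivot, 1.0f);
    assert(std::fabs(spun.x - 10.0f) < 1e-4f);
    assert(std::fabs(spun.z - 10.0f) < 1e-4f);
}
```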

 

Rotation about an Arbitrary Axis

 

So far we’ve been talking about rotations about the main axes of the coordinate system: X, Y, and Z. Occasionally, you’ll want to use an arbitrary axis instead. To do this, we’ll use the D3DXMatrixRotationAxis utility function, which takes a matrix to set, a vector defining the desired axis of rotation, and the angle of rotation in radians.

 

D3DXMATRIX *D3DXMatrixRotationAxis( D3DXMATRIX *pOut,

                                    CONST D3DXVECTOR3 *pV,

                                    FLOAT Angle );

 

The following code creates a rotation that uses an arbitrary axis defined by a vector called vMyAxis.

 

D3DXVECTOR3 vMyAxis( 2.0f, 6.0f, 0.0f );

D3DXMatrixRotationAxis( &mMyMatrix, &vMyAxis, D3DXToRadian( 45.0f ) );

 


Figure 2-29: Rotation about an arbitrary axis
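For the curious, one standard way to rotate a point about an arbitrary axis through the origin is Rodrigues’ rotation formula, sketched below in plain C++. I’m not claiming this is how D3DXMatrixRotationAxis is implemented internally; the rotateAboutAxis helper and its supporting functions are hypothetical, and normalizing the axis inside the function is my own choice for this sketch. The signs are arranged so that passing the X axis reproduces the X-axis rotation convention used earlier.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

float dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Vec3 cross(const Vec3& a, const Vec3& b)
{
    Vec3 r = { a.y * b.z - a.z * b.y,
               a.z * b.x - a.x * b.z,
               a.x * b.y - a.y * b.x };
    return r;
}

// Rodrigues' formula: p' = p cos(t) + (k x p) sin(t) + k (k . p)(1 - cos(t)),
// where k is the unit-length axis and t is the angle in radians.
Vec3 rotateAboutAxis(const Vec3& p, const Vec3& axis, float radians)
{
    const float len = std::sqrt(dot(axis, axis));
    assert(len > 0.0f); // a zero-length axis has no direction
    const Vec3 k = { axis.x / len, axis.y / len, axis.z / len };

    const float c = std::cos(radians);
    const float s = std::sin(radians);
    const Vec3 kxp = cross(k, p);
    const float along = dot(k, p) * (1.0f - c);

    Vec3 r = { p.x * c + kxp.x * s + k.x * along,
               p.y * c + kxp.y * s + k.y * along,
               p.z * c + kxp.z * s + k.z * along };
    return r;
}

// Usage: any point lying on the axis itself is left where it is.
void axisExample()
{
    const Vec3 axis = { 2.0f, 6.0f, 0.0f };
    const Vec3 same = rotateAboutAxis(axis, axis, 45.0f * 3.14159265f / 180.0f);
    assert(std::fabs(same.x - 2.0f) < 1e-4f);
    assert(std::fabs(same.y - 6.0f) < 1e-4f);
    assert(std::fabs(same.z) < 1e-4f);
}
```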

 

Scaling

 

The last transformation we’ll be covering is scaling. A scaling transformation basically changes the size of our geometry by either spacing the vertices further apart or squeezing them closer together. The utility function for setting up a matrix for scaling is D3DXMatrixScaling and, like translation, scaling can occur along one or more of the three main axes of the coordinate system: X, Y, or Z.

 

D3DXMATRIX *D3DXMatrixScaling( D3DXMATRIX *pOut,

                               FLOAT sx, FLOAT sy, FLOAT sz );

 

The following code demonstrates how to double the size of the drinking glass by passing in our matrix and setting the sx, sy, and sz arguments to 2.0f. Note how the scaling causes the larger version of the drinking glass depicted in the figure to intersect the table.

 

D3DXMATRIX mScale;

D3DXMatrixScaling( &mScale, 2.0f, 2.0f, 2.0f );

g_pd3dDevice->SetTransform( D3DTS_WORLD, &mScale );

 

// Render extra big drinking glass here...

 


Figure 2-30: By uniformly scaling by 2.0, we can double the size of the original glass. Unfortunately, doing this will cause it to intersect the table

 

Here’s a second example in which we shrink our glass by calling D3DXMatrixScaling and setting each axis to 0.5f. This will basically halve the object’s size.

 

D3DXMATRIX mScaleDown;

D3DXMatrixScaling( &mScaleDown, 0.5f, 0.5f, 0.5f );

g_pd3dDevice->SetTransform( D3DTS_WORLD, &mScaleDown);

 

// Render little shot glass here...

 

Now, when we call D3DXMatrixScaling and set all three arguments to the same value, we’re performing what’s called uniform scaling. The opposite of this is called non-uniform scaling. A non-uniform scaling transformation can be used to make an object appear longer, taller, or warped by scaling one or more of the axes by a different value. For example, the following code makes our glass taller by doubling its scale along the Y axis while leaving the other two axes set to 1.0, the scale factor that leaves a dimension unchanged.

 

D3DXMATRIX mNonUniformScale;

D3DXMatrixScaling( &mNonUniformScale, 1.0f, 2.0f, 1.0f );

g_pd3dDevice->SetTransform( D3DTS_WORLD, &mNonUniformScale );

 

// Render tall drinking glass here...

 


Figure 2-31: Assorted scaling examples

 

Finally, you might have noticed that the first scaling transformation we tried, which doubled the glass’s size, actually caused the glass to intersect the table’s surface, creating an unacceptable graphical anomaly. To fix this, we create a second matrix containing a translation of 2 units along the Y axis and concatenate it with our scaling matrix. This compensates for the glass’s new size by elevating it and placing it back on the table’s surface.

 

D3DXMATRIX mTranslation;

D3DXMATRIX mScaling;

D3DXMATRIX mGlassMatrix;

 

D3DXMatrixScaling( &mScaling, 2.0f, 2.0f, 2.0f );

D3DXMatrixTranslation( &mTranslation, 0.0f, 2.0f, 0.0f );

 

mGlassMatrix = mScaling * mTranslation;

 

g_pd3dDevice->SetTransform( D3DTS_WORLD, &mGlassMatrix);

 

// Render slightly elevated, 2X drinking glass here...

 


Figure 2-32: If we scale our drinking glass by 2.0, we will have to translate it up a bit to put it back on the table

 

Order of Scaling and Other Transformations

 

Again, the order of concatenation is important since scaling, like rotation, is performed with respect to the origin of the coordinate system. Of course, in many cases a change in order between scaling and translation or scaling and rotation may not reveal much difference, but it’s best to perform scaling first before all others. This is typically safe because most programmers and artists build models centered on the coordinate system’s origin, which is the ideal position for scaling transformations.
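Because scaling is measured from the origin, any translation applied beforehand gets scaled too. The short sketch below follows one vertex through both orderings; scaleUniform and translate are hypothetical helpers written in plain C++ for this illustration, and the vertex and offsets are values I picked for the example.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// Scales a point away from (or toward) the origin by the same factor
// on all three axes.
Vec3 scaleUniform(const Vec3& p, float factor)
{
    Vec3 r = { p.x * factor, p.y * factor, p.z * factor };
    return r;
}

Vec3 translate(const Vec3& p, const Vec3& offset)
{
    Vec3 r = { p.x + offset.x, p.y + offset.y, p.z + offset.z };
    return r;
}

void scalingOrderExample()
{
    const Vec3 rim = { 0.0f, 1.0f, 0.0f };  // a vertex 1 unit above the origin
    const Vec3 lift = { 0.0f, 2.0f, 0.0f };

    // Scale first, then translate: only the model itself is doubled.
    const Vec3 scaledThenMoved = translate(scaleUniform(rim, 2.0f), lift);

    // Translate first, then scale: the 2-unit lift is doubled as well.
    const Vec3 movedThenScaled = scaleUniform(translate(rim, lift), 2.0f);

    assert(scaledThenMoved.y == 4.0f);
    assert(movedThenScaled.y == 6.0f);
}
```

When the translation is zero, or the scale factor is 1.0, the two orderings agree, which is why the difference often goes unnoticed.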

 

If for some reason you do load a model from a file and it’s not centered on the origin, you may have to translate it to the origin before scaling it to get the right effect, but in general, the best order for concatenating matrices is scaling first, rotation second and translation third.

 

S * R * T = M

 

This ordering represents what 3D graphics programmers are typically aiming for when they’re building up a model’s concatenated transform.
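One nice side effect of this ordering is that the combined matrix has a simple shape: the scaled rotation fills the upper-left 3x3 block and the translation sits untouched in the fourth row. The following is a sketch in plain C++ that writes out S * R * T directly for a rotation about the Y axis. The makeWorld and transformPoint helpers and the Vec3 and Mat4 types are hypothetical and assume Direct3D’s row-vector convention; restricting the rotation to the Y axis is a simplification I chose to keep the example short.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Mat4 { float m[4][4]; };

// Writes out the product S * Ry * T in closed form: each of the first
// three rows is a row of the Y rotation multiplied by that axis's scale,
// and the fourth row is the translation.
Mat4 makeWorld(const Vec3& scale, float yawRadians, const Vec3& position)
{
    const float c = std::cos(yawRadians);
    const float s = std::sin(yawRadians);
    Mat4 w = {{
        { scale.x * c,        0.0f, -scale.x * s, 0.0f },
        {        0.0f,     scale.y,         0.0f, 0.0f },
        { scale.z * s,        0.0f,  scale.z * c, 0.0f },
        {  position.x,  position.y,   position.z, 1.0f }
    }};
    return w;
}

// Transforms a point, treated as the row vector [x y z 1].
Vec3 transformPoint(const Vec3& p, const Mat4& a)
{
    Vec3 r;
    r.x = p.x * a.m[0][0] + p.y * a.m[1][0] + p.z * a.m[2][0] + a.m[3][0];
    r.y = p.x * a.m[0][1] + p.y * a.m[1][1] + p.z * a.m[2][1] + a.m[3][1];
    r.z = p.x * a.m[0][2] + p.y * a.m[1][2] + p.z * a.m[2][2] + a.m[3][2];
    return r;
}

// Usage: the model's own origin always lands exactly at 'position',
// no matter what scale or rotation is used.
void worldExample()
{
    const Vec3 scale = { 2.0f, 2.0f, 2.0f };
    const Vec3 position = { 0.0f, 0.0f, 5.0f };
    const Vec3 origin = { 0.0f, 0.0f, 0.0f };
    const Vec3 placed = transformPoint(origin, makeWorld(scale, 1.0f, position));
    assert(placed.x == 0.0f && placed.y == 0.0f && placed.z == 5.0f);
}
```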

 

This brings our introduction to 3D math to an end. I definitely didn’t cover everything there is to know about 3D math, but we’ve covered everything we need to know in order to make use of this book. Some subjects, like matrices and transformations, will be covered again in more detail, so don’t worry if you still don’t understand how they work or how all this fits into the big picture of learning Direct3D. That will come soon enough.

 

 

© 2003 Kevin R. Harris All rights reserved.
