float vs double?

5 years ago
About float/real32 being good enough compared to double/real64, I think it really depends on the scale of your world vs the precision you want to achieve.

As an example, if the minimum absolute position precision you need is around 0.001 meter (i.e. 1 mm, or about 1/25th of an inch), then with floats you can't achieve that once you go much beyond a 10 km scale (about 6 miles).

But with doubles the same limit is 10,000,000,000 km, which is about the size of the solar system.
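To put rough numbers on those limits, here is a minimal sketch that asks the standard library for the gap between a value and the next representable one. The helper name `stepAt` is my own and not something from the stream, and positions are assumed to be in meters:

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Gap between x and the next representable value above it, i.e. the best
// absolute precision available at that magnitude.
// (stepAt is a hypothetical helper name, used only for illustration.)
template <typename T>
T stepAt(T x)
{
    assert(std::isfinite(x));
    return std::nextafter(x, std::numeric_limits<T>::infinity()) - x;
}
```

With floats, `stepAt(10000.0f)` is 2^-10 m (just under 1 mm) and `stepAt(20000.0f)` is already about 2 mm. With doubles, `stepAt(1.0e13)` (10 billion km expressed in meters) is also about 2 mm, which matches the order of magnitude quoted above.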

5 years ago
I don't know if you would reference a position in such a global manner. I think most games of such size tend to have smaller maps that share borders and are streamed in when the player is close to a border. So you would only reference a position inside the current map.

5 years ago
If I remember correctly it was discussed on the stream and we will be using integer world position so no problem with accuracy.

I think float can be used for relative position inside the tile.
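A rough sketch of what "integer tile plus float offset" could look like in one dimension. The names `TilePos` and `moveBy` and the `tileSize` parameter are assumptions made for illustration; this is not the code from the stream:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical 1D world position: an integer tile index plus a float offset
// inside that tile, kept in the range [0, tileSize).
struct TilePos
{
    int32_t tile;
    float offset;
};

// Move by delta, then carry whole tiles from the float part into the int part.
TilePos moveBy(TilePos pos, float delta, float tileSize)
{
    assert(tileSize > 0.0f);
    pos.offset += delta;
    float tilesCrossed = std::floor(pos.offset / tileSize);
    pos.tile += (int32_t)tilesCrossed;
    pos.offset -= tilesCrossed * tileSize;
    // Rounding can leave the offset a hair outside the range; pull it back in.
    if (pos.offset < 0.0f) { pos.offset = 0.0f; }
    if (pos.offset >= tileSize) { pos.offset = 0.0f; pos.tile += 1; }
    return pos;
}
```

Because the float only ever holds a value smaller than one tile, its precision is the same everywhere in the world; the integer part carries the large-scale position exactly.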

5 years ago
Right, Casey talks about this in the context of HMH in stream #30 or #31 I think.

It's true that an alternative solution is to divide the world into smaller sections, and use floats as a relative position within subsections. That works well for HMH.

But this solution is more complicated when the world is truly open and dynamic, with a mix of very big and very small objects, like a space sim where capital ships could be many miles long.

In video #32, even though floats are used to just track the position within a tile, Casey is running into a precision issue when doing a modulo on floats:

Transforming a tiny negative position around zero -dx (in the current tile) into a positive position tileWidth-dx (relative to the next tile) just doesn't work. The ratio dx/tileWidth is so small that tileWidth-dx gets stored as tileWidth.

The interesting thing is that the same issue could even happen with doubles, but it's just more likely to happen with floats (which is actually a good thing since you can catch it earlier).

Operations with floats often require expressing wanted precision explicitly with an epsilon, so that if |dx| < epsilon, the value can be forced to a clean zero.
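A minimal sketch of the failure and of the epsilon workaround. The function names and the `epsilon` parameter are my own assumptions, not the code from the stream:

```cpp
#include <cassert>
#include <cmath>

// Naive wrap of a slightly negative offset into the previous tile.
// For a tiny |offset| the float sum rounds to exactly tileWidth,
// which is outside the intended range [0, tileWidth).
float wrapNaive(float offset, float tileWidth)
{
    return (offset < 0.0f) ? offset + tileWidth : offset;
}

// Same wrap, but offsets smaller than epsilon are first forced to a clean zero.
// (epsilon is an assumed tolerance; it should be larger than the float spacing
// near tileWidth and smaller than anything the game cares about.)
float wrapWithEpsilon(float offset, float tileWidth, float epsilon)
{
    assert(epsilon > 0.0f && epsilon < tileWidth);
    if (std::fabs(offset) < epsilon) { return 0.0f; }
    return (offset < 0.0f) ? offset + tileWidth : offset;
}
```

With a tile width of 1.4f, `wrapNaive(-1e-9f, 1.4f)` returns exactly 1.4f, while `wrapWithEpsilon(-1e-9f, 1.4f, 1e-5f)` returns 0.0f and leaves the position in the current tile.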

This is what Tom Forsyth talks about in his article 'A matter of precision' (https://home.comcast.net/~tom_forsyth/blog.wiki.html). He recommends avoiding both float and double and using fixed point for space and time. It's an interesting read, and to me at least, fixed point is a more natural way of dividing up a space, since it creates constant intervals across the whole space.

So I think I have this correct: a 32-bit int could represent a 24.8 fixed-point number, giving a range of 0 to 16777215+255/256, where the fractional part provides a precision of 1/256th of a unit. That is a very reasonable division and quite a range. You can vary the position of the binary point to balance the integral range against the fractional precision.
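As a sketch of that layout (the type and function names here are hypothetical, chosen only for this example), an unsigned 24.8 value is just an integer whose low 8 bits are read as 256ths:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical unsigned 24.8 fixed-point type: the low 8 bits are the fraction.
typedef uint32_t fixed24_8;

const uint32_t kFractionBits = 8;
const uint32_t kOne = 1u << kFractionBits; // 256 steps per unit

fixed24_8 toFixed(uint32_t whole, uint32_t fraction256)
{
    assert(whole < (1u << 24) && fraction256 < kOne);
    return (whole << kFractionBits) | fraction256;
}

uint32_t wholePart(fixed24_8 v)    { return v >> kFractionBits; }
uint32_t fractionPart(fixed24_8 v) { return v & (kOne - 1); }
double   toDouble(fixed24_8 v)     { return (double)v / kOne; }
```

Addition and subtraction are plain integer operations on the raw values, and the spacing between neighbouring values is 1/256 everywhere in the range. Moving the binary point only means changing `kFractionBits`.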

5 years ago
1/256th of a unit seems OK at first but I don't think it's precise enough to handle the acceleration/friction thing.
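One way to read that concern, as a sketch under my own assumptions (positions stored in 24.8 fixed point, a fixed frame rate, and a hypothetical helper named `stepsPerFrame`): small per-frame movements get rounded to a whole number of 1/256 steps.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// How many 1/256-unit steps an object covers in one frame when its position
// is stored as 24.8 fixed point, rounded to the nearest whole step.
// (stepsPerFrame and its parameters are hypothetical, for illustration only.)
int32_t stepsPerFrame(double unitsPerSecond, double framesPerSecond)
{
    assert(framesPerSecond > 0.0);
    return (int32_t)std::lround(unitsPerSecond * 256.0 / framesPerSecond);
}
```

At 60 frames per second, a speed of 0.1 units/s rounds to 0 steps per frame, so the object never moves, and 1 unit/s rounds to 4 steps, which is really 0.9375 units/s. A slowly decaying velocity under friction would hit that floor quickly.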

Casey Muratori

5 years ago
Fred wrote: "But this solution is more complicated when the world is truly open and dynamic, with a mix of very big and very small objects, like a space sim where capital ships could be many miles long."

I use a similar (two-coord, not double) scheme for exactly such things.

- Casey

5 years ago
Fred wrote: "But this solution is more complicated when the world is truly open and dynamic, with a mix of very big and very small objects, like a space sim where capital ships could be many miles long."

cmuratori wrote: "I use a similar (two-coord, not double) scheme for exactly such things."

In this case, would you end up baking the two-coord representation (say, int32 + real32) into your vector class and rewriting your vector math functions to manipulate both parts, for example to compute the length of a very long vector?

For simple operations like add and subtract, it's probably OK to operate on the two coords directly, but for more complex math (like vector length) maybe it's best to convert to a temporary double? Hmm.

I'm torn...

Another possibility is that instead of storing world coordinates as int+float, we can use doubles:

- Both solutions are the same size in memory.

- With doubles, even if they're slower than floats, you can use standard vector math. With int+float you need to write world position manipulation that operates on the int coords and the float vec, then possibly "recanonicalize" (which requires divides and mods, etc).

Say that you want the distance between two points far apart in the game world.

With doubles it's straightforward:

(I don't use operator overloading here)

```cpp
#include <math.h>

typedef double real64;

struct v2d
{
    real64 x, y; // a pair of doubles
};

// Returns p1 - p2, component-wise.
v2d subtract(v2d p1, v2d p2)
{
    v2d res = {};
    res.x = p1.x - p2.x;
    res.y = p1.y - p2.y;
    return res;
}

double length(v2d p)
{
    double res = sqrt(p.x*p.x + p.y*p.y);
    return res;
}

double distance(v2d p1, v2d p2)
{
    v2d d = subtract(p1, p2);
    return length(d);
}
```

With int+float, you need:

```cpp
#include <stdint.h>

typedef float real32;
typedef uint32_t uint32;

struct v2
{
    real32 x, y;
};

struct coord
{
    uint32 x, y;
};

struct world_pos
{
    coord c;
    v2 p;
};

// Returns p1 - p2. The unsigned coord differences wrap modulo 2^32.
world_pos subtract(world_pos p1, world_pos p2)
{
    world_pos res = {};
    res.c.x = p1.c.x - p2.c.x;
    res.c.y = p1.c.y - p2.c.y;
    res.p.x = p1.p.x - p2.p.x;
    res.p.y = p1.p.y - p2.p.y;
    // possibly recanonicalize res???
    return res;
}

// ?? length(world_pos p) { ??? }
// Open question: both the return type and the math are unclear.
```

For a subtraction, you're roughly operating on the same number of bits in both cases (128). But if you need to recanonicalize, int+float is gonna be slower.

To compute the length of a world vector, with int/float, you would need to return a scalar int+float to maintain full precision. And the math isn't obvious to me.
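One possible way around it, offered as a sketch rather than as what Casey actually does, is the "temporary double" idea: keep int+float for storage, and only widen when a scalar length is needed. The types below and the `tileSize` parameter (world units per tile) are assumptions made for this example:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct v2 { float x, y; };
struct tile_delta { int32_t x, y; };        // signed difference of tile coords
struct world_delta { tile_delta c; v2 p; }; // difference of two world positions

// Length of a world-space difference, computed in a temporary double.
// tileSize (world units per tile) is an assumed parameter.
double worldLength(world_delta d, double tileSize)
{
    assert(tileSize > 0.0);
    double dx = (double)d.c.x * tileSize + (double)d.p.x;
    double dy = (double)d.c.y * tileSize + (double)d.p.y;
    return std::sqrt(dx * dx + dy * dy);
}
```

The result is only as precise as a double, so for very long vectors it does not keep the full int+float precision. Whether that matters depends on what the length is used for.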

With world positions as doubles, when you want to do lots of fast "local" physics, you can convert all your world double positions for objects in the local sub-space into some float vector in a local frame of reference:

```cpp
// Assumes subtract(a, b) returns a - b.
v2 convertToLocalPos(v2d localOrigin, v2d worldPosition)
{
    v2d localPos = subtract(worldPosition, localOrigin);
    v2 res = {};
    res.x = (real32)localPos.x;
    res.y = (real32)localPos.y;
    return res;
}
```

int+float does have the advantage that the precision doesn't depend on world location at all.

Also, the coord part (the ints) can be used to do some sort of broad-phase collision detection (you can easily use them as indices into a quadtree).
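For instance, assuming 32-bit unsigned tile coordinates and a quadtree that halves the world at each level (the function name and the layout of the four child slots are my own choices for this sketch), the child to descend into falls straight out of the coordinate bits:

```cpp
#include <cassert>
#include <cstdint>

// Child slot (0..3) to descend into at a given quadtree level, taken straight
// from one bit of each tile coordinate. Level 0 is the root split (top bit),
// level 31 is the finest split (lowest bit).
uint32_t quadtreeChild(uint32_t tileX, uint32_t tileY, uint32_t level)
{
    assert(level < 32);
    uint32_t bit = 31 - level;
    return (((tileY >> bit) & 1u) << 1) | ((tileX >> bit) & 1u);
}
```

Two objects can only be close if they share the same child at every level above the tile size, so no floating-point math is needed for this first filtering pass.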

1 year ago
The main difference is that Floats and Doubles are binary floating-point types, whereas a Decimal stores the value as a decimal floating-point type. So Decimals have much higher precision and are usually used in monetary (financial) applications that require a high degree of accuracy. But performance-wise, Decimals are slower than double and float types.

Float - 7 digits (32 bit)

Double - 15-16 digits (64 bit)

Decimal - 28-29 significant digits (128 bit)

Decimals are much slower (up to 20x in some tests) than a double/float. Decimals and Floats/Doubles cannot be compared without a cast, whereas Floats and Doubles can. Decimals also allow the encoding of trailing zeros.

1 year ago
What is this "Decimal" you are talking about? SQL datatype?

1 year ago
Seems they're just quoting verbatim from the web page they linked to, for whatever reason. The web page is talking about the C# decimal type, which is not relevant here.