It was cool to see that after my quick pre-stream question, you actually devoted half the episode to it :) I guess it had quite some impact, not least on the performance...
I do believe the code for loading the bitmap could be optimized by using the following code, which conforms to the online source I posted earlier. It won't be that important an optimization, but at least it was good for me to see that my thoughts were correct.
    v4 Texel = {(real32)((C & RedMask) >> RedShiftDown),
                (real32)((C & GreenMask) >> GreenShiftDown),
                (real32)((C & BlueMask) >> BlueShiftDown),
                (real32)((C & AlphaMask) >> AlphaShiftDown)};
    #if 1
    Texel.rgb *= SquareRoot(Texel.a*(1.0f/255.0f));
    #endif
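For what it's worth, here is why I think a single square root is enough. This assumes SRGB255ToLinear1 squares the normalized channel and Linear1ToSRGB255 takes the square root again (the gamma 2.0 approximation); that is my reading of those functions, not something I checked line by line. With c the channel value and a the alpha, both in 0..255:

```latex
\[
c_{\text{out}}
  = 255\,\sqrt{\left(\frac{c}{255}\right)^{2}\cdot\frac{a}{255}}
  = 255\cdot\frac{c}{255}\cdot\sqrt{\frac{a}{255}}
  = c\,\sqrt{\frac{a}{255}}
\]
```

So the round trip through linear space collapses into one multiply per channel by the square root of the normalized alpha.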
I tested this with the following code (it needs an extra Abs function in the math code) to see whether it indeed produces the same values, which it does (within a small floating point margin).
    v4 Texel = {(real32)((C & RedMask) >> RedShiftDown),
                (real32)((C & GreenMask) >> GreenShiftDown),
                (real32)((C & BlueMask) >> BlueShiftDown),
                (real32)((C & AlphaMask) >> AlphaShiftDown)};
    v4 TexelSlow = SRGB255ToLinear1(Texel);
    #if 1
    Texel.rgb *= SquareRoot(Texel.a*(1.0f/255.0f));
    TexelSlow.rgb *= TexelSlow.a;
    #endif
    TexelSlow = Linear1ToSRGB255(TexelSlow);
    real32 Epsilon = 0.001;
    Assert(Abs(Texel.r - TexelSlow.r) < Epsilon && Abs(Texel.g - TexelSlow.g) < Epsilon && Abs(Texel.b - TexelSlow.b) < Epsilon);
If only the actual pixel operations in the Draw calls could be optimized like this :P But I guess -O2 or #ifdefs could come in handy there.
But thanks for spending the time going over this! And I do love seeing you speed through the pixel operations cleaning up the code and everything.
I will try to think about scenarios to test whether your (dare I say "our") solution now works correctly. But I guess it is not as easy as in 3D space, where diffuse shading, for example, looks very different when done in linear color space.
Mox
PS Sorry for slowing down the main drawing routine :oops:
But I guess that was inevitable