StretchDIBits performance issues

I just got to Day 180, where we got some of the debug display working, and I noticed a bit of a difference between my timing chart and Casey's. In particular, I was surprised to see that in the win32 layer the step that copies the buffer to the screen takes up a significant portion of the frame time (5-10%). Since it happens after the wait to hit the frame rate, I always miss my frame rate by that margin. I could move the copy to before the wait, but that feels like a hack, given that the intent of the code is to make sure the frame is sent to the screen at the right time.

I put in some additional debug code to print timings for some of the specific functions and noticed that StretchDIBits takes about 2ms to run. It seems odd that copying our buffer to the screen buffer would take that long, given how often we do similar copies in our own code without them taking that long. I read somewhere that StretchDIBits is now deprecated in Windows 10 and wondered whether that could be the cause. Casey is running HH on Windows 7; is it possible that the way this works changed in Windows 10?
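
For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of timing a single call with the high-resolution counter. It is not the actual Handmade Hero code: the helper names `ticks_now`, `ticks_per_second` and `ms_between` are made up for illustration, and the non-Windows branch is only there so the snippet also builds on other platforms.

```c
/* Ask for clock_gettime on strict non-Windows builds. */
#ifndef _WIN32
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#endif

#include <assert.h>
#include <stdint.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <time.h>
#endif

/* Hypothetical helper: current value of a monotonic high-resolution counter. */
static int64_t ticks_now(void)
{
#ifdef _WIN32
    LARGE_INTEGER counter;
    QueryPerformanceCounter(&counter);
    return (int64_t)counter.QuadPart;
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000 + (int64_t)ts.tv_nsec;
#endif
}

/* Hypothetical helper: how many counter ticks make up one second. */
static int64_t ticks_per_second(void)
{
#ifdef _WIN32
    LARGE_INTEGER frequency;
    QueryPerformanceFrequency(&frequency);
    return (int64_t)frequency.QuadPart;
#else
    return 1000000000; /* the fallback above counts nanoseconds */
#endif
}

/* Milliseconds between two counter readings. */
static double ms_between(int64_t start, int64_t end)
{
    int64_t frequency = ticks_per_second();
    assert(frequency > 0);
    return 1000.0 * (double)(end - start) / (double)frequency;
}

/* Usage around the call being measured (Win32 only, arguments elided):
 *
 *   int64_t t0 = ticks_now();
 *   StretchDIBits(...);
 *   int64_t t1 = ticks_now();
 *   printf("StretchDIBits: %.3f ms\n", ms_between(t0, t1));
 */
```

Taking both readings right around the one call keeps the rest of the frame out of the measurement, which is how a single function's cost can be separated from the rest of the display step.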

This isn't a significant issue but it irks me a bit. Should I just accept that Windows can't do this faster, or are there alternatives? I know that HH eventually switches to OpenGL, so this is a bit of a moot point, but I'd still like to know whether there are better ways to display a bitmap from a buffer in a window.


Edited by Gops on Reason: Initial post
Hey Gops,

I think the best test would be to download Day 180 from Casey's downloads, build it, and run the timings to see if it's related to your machine. Like you say, the series eventually moves on to OpenGL, but this way you could compare his source to yours and see whether it's something in your version of the code or it's machine based.

Btw, noticed your signature, where do you work at?

-Scott
Modern Windows uses GPU-accelerated rendering when compositing the contents of windows to the screen, so any GDI call will eventually require uploading the window contents to GPU memory. So it is "normal" to have bad performance when using GDI functionality. A better way to display a bitmap is to go directly through the OpenGL or D3D API, even if it is just a full texture upload every frame. Not sure if this is the slowness you see, but it should be possible to get pixels to the screen much faster than with a StretchDIBits call.

Edited by Mārtiņš Možeiko on
Thanks for the suggestion Scott, I should've thought of that.

It turns out that the difference came from a minor bug in Casey's code: he reset LastCounter to EndCounter and then compared the two to set the timer, so the result would always be 0.

It looks like this results in the frame rate always being slightly lower than it should be, because the timing doesn't take into account the time it takes Windows to display the buffer. It does look like Casey changes the win32 layer and fixes this at some point (I had a peek at a more recent file), so all is good.
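
To make the ordering problem concrete, here is a small sketch of the bug as I understand it. It is not Casey's actual code: the function names and the plain `int64_t` counters are stand-ins for illustration, and only the order of the assignment and the comparison matters.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the platform layer's seconds-elapsed helper. */
static double seconds_elapsed(int64_t start, int64_t end, int64_t counts_per_second)
{
    assert(counts_per_second > 0);
    return (double)(end - start) / (double)counts_per_second;
}

/* Buggy order: LastCounter is overwritten before it is compared,
 * so start == end and the measured time is always 0. */
static double frame_seconds_buggy(int64_t *last_counter, int64_t end_counter,
                                  int64_t counts_per_second)
{
    *last_counter = end_counter;
    return seconds_elapsed(*last_counter, end_counter, counts_per_second);
}

/* Fixed order: measure against the previous value first, then update it. */
static double frame_seconds_fixed(int64_t *last_counter, int64_t end_counter,
                                  int64_t counts_per_second)
{
    double seconds = seconds_elapsed(*last_counter, end_counter, counts_per_second);
    *last_counter = end_counter;
    return seconds;
}
```

With a counter that ticks 1000 times per second, a previous reading of 1000 and a new reading of 3000, the buggy version reports 0 seconds while the fixed version reports 2 seconds. Both leave LastCounter at 3000 for the next frame.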

As for your question, I work for a small consultancy called WEC UK, which has me working for the Airbus fuel systems design office at the moment. Not as exciting as NASA, I would think.