Software rendering problem

Hi,

I've had a relatively harmless problem with my current graphics card since I bought it. Recently, I reached the part in which Casey adds SIMD and multi-threaded rendering to his engine, which lets it run smoothly at the maximum resolution of 1920x1080 at 60fps.

At this point I noticed that the problem mentioned above appears when running the code in fullscreen at 60fps. You can see it in the following images:

https://www.dropbox.com/sh/bnvznv...0h/AACNJcHtWeJeOeZCL8FOaZ5ra?dl=0

Now I will try to give more details about this issue. I am running the code under 64-bit Windows 7 with a Gigabyte 1050 Windforce GPU. This "bug" manifests itself in two variants when I use the Windows+D (show desktop) shortcut to switch repeatedly between a window and the desktop. First, when Aero is disabled, some horizontal lines are visible on the screen, as if there were holes in the window while it is being rendered. Second, when Aero is enabled, the patterns shown in the sample images above can be seen on the screen.

In the Handmade Hero code I've managed to reproduce the problem more reliably by changing the function Win32DisplayBufferInWindow to clear the whole buffer with PatBlt before the StretchDIBits call, once with WHITENESS and once with BLACKNESS, as can be seen in pictures 6 to 16. Initially Casey only uses PatBlt to clear the area around what is actually being rendered (I noticed that even Casey had an annoying screen flicker when he tried to PatBlt the whole buffer).

This is the second GPU of this type that I've tested, and both had this issue. It only happens during desktop activity; in games (2D, 3D) I have no problems at all. The temperature stays at a maximum of 63-64 degrees, so I don't think it's a hardware problem. I've also tried different drivers, but that didn't help.

I've managed to bypass this by doing one of the following:

  • limit the fps to 30;
  • run everything at a lower resolution;
  • force the code to disable the windows desktop composition with the function DwmEnableComposition(DWM_EC_DISABLECOMPOSITION).


Also, when I take a screenshot with Print Screen and paste the image into Gimp, it comes out clean, without the artifacts visible in the photos.

From my point of view it seems like a synchronization problem (the first 5 images look more or less like the traditional image tearing case), but since I have no experience with Windows programming or with how Windows communicates with the GPU, I wanted to present this problem here, hoping that someone has encountered something similar.

Thank you in advance.

You could try to use DwmFlush to synchronize with the window manager. You should call it after StretchDIBits (I think; I'm not an expert), and it needs the compositor to be on.

Some of the screenshots look like there is a transparent window or overlay and Windows (I guess?) has trouble deciding which one is on top, like when the compositor is off and Windows doesn't redraw the whole screen. Do you have any overlay activated (performance monitor, screen capture...)?

Since it's software rendering, I doubt your GPU is the problem here.

You could try to output the frame buffer as a bitmap file before sending it to StretchDIBits to make sure you are producing correct images.

Did you try to strip down your code to get a simple reproduction case? It could help you figure out where the problem comes from (it could be caused by issues in the multi-threading or SIMD code), and you could share it with us, because it's hard to figure out an issue based on screenshots.

On which day of Handmade Hero did you manage to reproduce the issue, and what are the exact modifications?
I used DwmFlush and it solved the problem, but it lowered the frame rate a lot. I get a better result by just setting the frame limit to 30 in the code.

Do you have any overlay activated (performance monitor, screen capture...)?


I don't think so; I am keeping HWMonitor open, just to periodically check for overall temperatures and I am using EVGA Precision for manual gpu fan control. Closing them does not solve the problem.

Did you try to strip down your code to get a simple reproduction case? It could help you figure out where the problem comes from (it could be caused by issues in the multi-threading or SIMD code), and you could share it with us, because it's hard to figure out an issue based on screenshots.


I am testing on the code from day 135 right now, which is right after multi-threaded rendering was added and the frame rate can reach 60. If I limit the frame rate to 30, the problem disappears. All I've added to the code to produce the results in images 6 to 16 is a single line inside the Win32DisplayBufferInWindow function in win32_handmade.cpp:

        PatBlt(DeviceContext, 0, 0, WindowWidth, WindowHeight, BLACKNESS); // added code - for the black background
        // PatBlt(DeviceContext, 0, 0, WindowWidth, WindowHeight, WHITENESS); // added code - for the white background

        // NOTE(casey): For prototyping purposes, we're going to always blit
        // 1-to-1 pixels to make sure we don't introduce artifacts with
        // stretching while we are learning to code the renderer!
        StretchDIBits(DeviceContext,
                      OffsetX, OffsetY, Buffer->Width, Buffer->Height,
                      0, 0, Buffer->Width, Buffer->Height,
                      Buffer->Memory,
                      &Buffer->Info,
                      DIB_RGB_COLORS, SRCCOPY);


You could try to output the frame buffer as a bitmap file before sending it to StretchDIBits to make sure you are producing correct images.


I haven't done this yet. I've tried taking a simple Print Screen capture, or using Fraps' screenshot function, and pasting the result into Gimp. The resulting image was clean, with no sign of the problem.

I wasn't able to reproduce the issue on my machine (Windows 7 64-bit) with or without the compositor (apart from tearing when the compositor is off). But for some reason on my machine (i7 from 2009), Handmade Hero seems to use only 25% of the CPU (2 logical cores) with the compositor on and 50% (4 logical cores) with the compositor off, and the frame time is between 20 and 25 ms.

I do get an issue when I add PatBlt before StretchDIBits, with or without the compositor, but I think it's a separate one. On my machine it's clearly a flicker between a fully black frame and the game frame when the compositor is on, and slices of black and actual frame when the compositor is off (expected from tearing). My guess (I have no expertise on the subject) is that this is caused by the compositor grabbing the DIB between the PatBlt and StretchDIBits calls. It mostly displays the game because PatBlt takes 0.05% of the time taken by StretchDIBits.

When you use DwmFlush you should disable the code that tries to wait for the target frame rate.
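
To illustrate what I mean, here is a rough sketch, not tested against the actual day 135 code. WaitForCompositorIfEnabled is a name I made up; DwmIsCompositionEnabled and DwmFlush are the real dwmapi calls (link with dwmapi.lib). The idea is to let DwmFlush pace the frame when the compositor is on, and only fall back to the existing sleep/spin wait when it is off. The non-Windows branch is only there so the snippet compiles elsewhere.

```cpp
#if defined(_WIN32)
#include <windows.h>
#include <dwmapi.h> // DwmIsCompositionEnabled, DwmFlush (link with dwmapi.lib)
#endif

// Hypothetical helper, not part of Handmade Hero.
// Blocks until the compositor's next present when DWM composition is enabled.
// Returns true if the compositor paced this frame; false means the caller
// should run its own frame-rate wait instead.
static bool WaitForCompositorIfEnabled()
{
#if defined(_WIN32)
    BOOL CompositionEnabled = FALSE;
    if (SUCCEEDED(DwmIsCompositionEnabled(&CompositionEnabled)) && CompositionEnabled)
    {
        return SUCCEEDED(DwmFlush());
    }
#endif
    return false;
}

// Intended use in the frame loop, right after Win32DisplayBufferInWindow(...):
//
//     if (!WaitForCompositorIfEnabled())
//     {
//         // existing code that sleeps/spins until the target frame time
//     }
```

If both waits run in the same frame, the two delays can add up, which might explain the big frame rate drop you saw.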

2bytes
I haven't done this yet. I've tried taking a simple Print Screen capture, or using Fraps' screenshot function, and pasting the result into Gimp. The resulting image was clean, with no sign of the problem.


If you use Print Screen, the result comes from the "Windows" frame buffer. If you output your memory buffer, you get the result of the software renderer, and if there are artifacts there, then something is wrong with your code.
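
Dumping the memory buffer could look something like the sketch below. DebugWriteBufferToBMP is a hypothetical helper, not something from the Handmade Hero code. It assumes 32-bit pixels stored as BB GG RR xx in memory (what the game uses on a little-endian machine), and the TopDown flag is there because I'm not sure which row order the buffer has on day 135. It only uses the C standard library, so it doesn't depend on any Windows call.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Write little-endian values into the header being built.
static void PutU16(uint8_t *&At, uint16_t Value)
{
    *At++ = (uint8_t)(Value & 0xFF);
    *At++ = (uint8_t)((Value >> 8) & 0xFF);
}

static void PutU32(uint8_t *&At, uint32_t Value)
{
    for (int Byte = 0; Byte < 4; ++Byte)
    {
        *At++ = (uint8_t)((Value >> (8*Byte)) & 0xFF);
    }
}

// Hypothetical debug helper: writes a 32-bit pixel buffer to an uncompressed
// bottom-up BMP file. TopDown says whether the first row in Memory is the top
// row of the image (assumption: the caller knows the buffer's row order).
static bool DebugWriteBufferToBMP(const char *FileName, const void *Memory,
                                  int Width, int Height, int Pitch, bool TopDown)
{
    assert(Memory && Width > 0 && Height > 0 && Pitch >= Width*4);

    const uint32_t HeaderSize = 14 + 40;         // file header + info header
    const uint32_t RowSize = (uint32_t)Width*4;  // 32bpp rows need no padding
    const uint32_t PixelBytes = RowSize*(uint32_t)Height;

    uint8_t Header[54] = {};                     // unused fields stay zero
    uint8_t *At = Header;
    PutU16(At, 0x4D42);                          // "BM" signature
    PutU32(At, HeaderSize + PixelBytes);         // total file size
    PutU32(At, 0);                               // reserved
    PutU32(At, HeaderSize);                      // offset to pixel data
    PutU32(At, 40);                              // info header size
    PutU32(At, (uint32_t)Width);
    PutU32(At, (uint32_t)Height);                // positive height = bottom-up
    PutU16(At, 1);                               // planes
    PutU16(At, 32);                              // bits per pixel
    PutU32(At, 0);                               // BI_RGB (no compression)
    PutU32(At, PixelBytes);                      // image size in bytes

    FILE *File = std::fopen(FileName, "wb");
    if (!File)
    {
        return false;
    }

    bool Ok = (std::fwrite(Header, 1, sizeof(Header), File) == sizeof(Header));
    const uint8_t *Rows = (const uint8_t *)Memory;
    for (int Index = 0; Ok && (Index < Height); ++Index)
    {
        // The file stores the bottom row first, so a top-down buffer is
        // walked from its last row to its first.
        int Y = TopDown ? (Height - 1 - Index) : Index;
        const uint8_t *Row = Rows + (size_t)Y*(size_t)Pitch;
        Ok = (std::fwrite(Row, 1, RowSize, File) == RowSize);
    }

    Ok = (std::fclose(File) == 0) && Ok;
    return Ok;
}
```

You would call it just before StretchDIBits (for example on a key press, so it doesn't write a file every frame), passing Buffer->Memory, Buffer->Width, Buffer->Height and the buffer's pitch. If the saved image comes out upside down, flip the TopDown argument.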