Well, there are two reasons, really.
One is that Sleep() is not a soft-real-time capable call, so the operating system is under no obligation to _actually_ wake you up at anything close to the time you pass. It's purely a hint to the scheduler. So, in a high-performance scenario, you never really want to sleep this way. You want to sleep on the GPU, and have it wait on the monitor refresh, so that you only yield time to the system when you are far enough ahead of the monitor to be sure you don't need that time, and so that the OS knows you are not "sleeping" idly.
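To make the "it's only a hint" point concrete, here is a minimal sketch that measures how long a sleep really takes compared to what was asked for. It is not from the Handmade Hero code: it assumes std::this_thread::sleep_for as a portable stand-in for Sleep(), and the names measure_sleep_ms and worst_overshoot_ms are made up for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Sleeps for requested_ms and returns how long the call really took, in
// milliseconds. std::this_thread::sleep_for stands in for Win32 Sleep()
// here; that substitution is an assumption made to keep the sketch portable.
double measure_sleep_ms(int requested_ms)
{
    assert(requested_ms >= 0);
    using clock = std::chrono::steady_clock;

    clock::time_point start = clock::now();
    std::this_thread::sleep_for(std::chrono::milliseconds(requested_ms));
    clock::time_point end = clock::now();

    return std::chrono::duration<double, std::milli>(end - start).count();
}

// Largest overshoot (actual minus requested) seen across `trials` sleeps.
// The result starts at zero, so it is never reported as negative.
double worst_overshoot_ms(int requested_ms, int trials)
{
    double worst = 0.0;
    for (int i = 0; i < trials; ++i)
    {
        double overshoot = measure_sleep_ms(requested_ms) - requested_ms;
        if (overshoot > worst)
        {
            worst = overshoot;
        }
    }
    return worst;
}
```

Calling something like worst_overshoot_ms(2, 100) and printing the result gives a rough idea of how much slack you would have to budget for. The number depends on the OS, the timer resolution, and whatever else is running, which is exactly why you can't build consistent frame timing on top of it.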
The other is that we are going through Windows' compositor to show our graphics, and we don't really know if we are synchronized with it or not. So, depending on where we "hit" its processing when we try to upload our bitmap, we may be ahead of its work for the next frame, and we'll get shown, or behind it, and we will get pushed to the frame after. So you really don't want to be going through something like StretchDIBits these days, because it's simply not suitable for consistent frame output. We're only using it right now because it is very simple and easy to understand. That way I can show people how the pixels that we write go directly out to the OS, and I don't have to explain a huge 3D API pipeline until later on in the process, when everyone is more comfortable and has been programming Handmade Hero for many hours.
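The "ahead of it or behind it" effect can be sketched with a toy model. This is an assumption for illustration, not how the Windows compositor is actually implemented: pretend the compositor picks up whatever has been submitted at fixed ticks, with tick k happening at k * period_us, and that a bitmap shows up on the first tick at or after the moment it was submitted. The function name compose_tick_for is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Toy model (an assumption, not the real compositor): tick k fires at
// k * period_us, and a submission is picked up by the first tick at or
// after submit_us. Returns the index of that tick.
int64_t compose_tick_for(int64_t submit_us, int64_t period_us)
{
    assert(period_us > 0 && submit_us >= 0);
    // Integer ceiling division: round submit_us up to the next tick boundary.
    return (submit_us + period_us - 1) / period_us;
}
```

With a period of 16667 microseconds (roughly 60Hz), a bitmap submitted at 16000 lands on tick 1, while one submitted at 17000 lands on tick 2. A 1ms difference in when we happened to finish turns into a whole refresh of difference in when the frame is shown, and since we can't see where the ticks are, we can't control which side we land on.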
Hope that makes sense...
- Casey