I don't know exactly how DwmFlush() works. I thought it returned after the flip, in which case you would actually present the image with one frame of lag (there is always more lag than you think). Maybe Martin can enlighten us?
The graph not lining up correctly might be caused by the fact that, even if the vblank happens every 16.66ms, the moment you start timing things is not exactly at the vblank. So it may be that the top of the graph actually lines up, but the bottom doesn't. This is speculation on my part, don't take it as fact.
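To make the offset idea concrete, here is a rough sketch (not anything from the actual code being discussed) that computes how far into the refresh period a timestamp falls. The function name, the microsecond units, and the idea that you have a known vblank timestamp to compare against are all assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how far into the current refresh period a timestamp
   falls, in microseconds. vblank_us is assumed to be the time of some known
   vblank, and period_us the refresh period (about 16667 at 60Hz). */
int64_t phase_in_refresh_us(int64_t timestamp_us, int64_t vblank_us, int64_t period_us)
{
    assert(period_us > 0);
    int64_t phase = (timestamp_us - vblank_us) % period_us;
    if (phase < 0) {
        phase += period_us; /* C's % can be negative; wrap into [0, period) */
    }
    return phase;
}
```

If the timer starts at a non-zero phase, every bar in the graph is shifted by that amount relative to the vblank, even though the period itself is still 16.66ms.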
There are always small variations in frame time, because you are sharing the CPU with other applications, and the operating system decides which application runs and for how long before it has to yield.
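If you want to see how big those variations are, one option is to record each frame's duration and summarize them. This is only a minimal sketch: the FrameStats struct and summarize_frame_times function are names I made up, and it assumes you already have the frame durations in milliseconds (for example from QueryPerformanceCounter deltas).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of measured frame durations, in milliseconds. */
typedef struct {
    double min_ms;
    double max_ms;
    double mean_ms;
} FrameStats;

/* Computes min, max and mean over count frame durations (count must be > 0). */
FrameStats summarize_frame_times(const double *frame_ms, size_t count)
{
    assert(count > 0);
    FrameStats stats = { frame_ms[0], frame_ms[0], 0.0 };
    double sum = 0.0;
    for (size_t i = 0; i < count; ++i) {
        if (frame_ms[i] < stats.min_ms) stats.min_ms = frame_ms[i];
        if (frame_ms[i] > stats.max_ms) stats.max_ms = frame_ms[i];
        sum += frame_ms[i];
    }
    stats.mean_ms = sum / (double)count;
    return stats;
}
```

The gap between min_ms and max_ms gives a rough idea of the jitter; the mean should stay close to the refresh period if vsync is working.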
There was a link to an article posted on the handmade.network forums (I believe) that gave an in-depth description of what happens in the Windows 8/10 Desktop Window Manager. I remember that it explained how Windows handles vsync and how many frames of lag to expect, depending on whether the application is windowed, exclusive fullscreen, or a fullscreen window. I can't find it. By any chance, does anyone have that link?