Fixed vs Variable Game Loop Update & Frame Rate Independence

Zachary

#7918

August 7, 2016

This is something I've been thinking about ever since watching:

Jonathan Blow: Q&A: frame-rate-independence:

You would think that the fundamental notion of the Game Loop would be "solved" at this point - but, at least personally, I can't find an objective argument as to what the "right" thing to do is (and perhaps there is no universal "right way").

FYI: Many sources point back to this article: Glenn Fiedler: Fix Your Timestep!:

In HH - we use a rather odd loop where we integrate at a fixed timestep every frame, regardless of what our game update Hz is (which is display-dependent when we turn VSYNC on). (@Casey: Is this a permanent solution?)

The overall picture is very confusing to me, especially when trying to construct a solution that behaves reasonably under (a) varying CPU specs, and (b) varying display refresh rates, which, at least in the PC market, can vary quite a bit. Every solution I can think of just trades problem A for problem B.

It's a rather convoluted issue (at least for me!), and so I have no *specific* question - just wanted to get people's thoughts on various setups.

Edited by Zachary on August 7, 2016, 3:53pm

Casey Muratori

#7927

August 7, 2016

In HH we haven't really looped back around to finalize our platform layer, and probably won't for a long time. But the idea is that yes, we actually will use a fixed timestep, it's just that the timestep will be chosen based on the machine and monitor refresh. The reason for this is that I find variable timesteps to feel wrong, and I can sense the jitter, and I much prefer a constant update each frame. So locked 60hz is the goal for HH, but if the monitor is 50hz we'll do locked 50, or if it's a slow machine we'll do locked 30.
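
A minimal sketch of how such a rate could be picked, assuming the locked rate is always the monitor refresh divided by a whole number so that every frame lands on a vblank. The helper name `choose_locked_hz` and the idea of feeding it a measured worst-case frame cost are assumptions for illustration, not anything from the HH codebase:

```c
#include <assert.h>

/* Hypothetical helper: pick a locked update rate (in Hz) for a display.
   refresh_hz    - monitor refresh rate, e.g. 60 or 50
   frame_cost_us - measured worst-case time to produce one frame, in microseconds
   Returns refresh_hz / n for the smallest n >= 1 such that n refresh
   intervals are long enough to cover frame_cost_us. */
int choose_locked_hz(int refresh_hz, int frame_cost_us)
{
    assert(refresh_hz > 0 && frame_cost_us >= 0);
    int refresh_period_us = 1000000 / refresh_hz;
    int n = 1;
    /* Stop at n == refresh_hz so the result never drops below 1 Hz. */
    while (n < refresh_hz && n * refresh_period_us < frame_cost_us) {
        ++n;
    }
    return refresh_hz / n; /* integer division, e.g. 60 / 2 = 30 */
}
```

With a 10ms frame cost this gives 60 on a 60hz monitor and 50 on a 50hz monitor; with a 20ms frame cost on a 60hz monitor it falls back to 30.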

- Casey

Zachary

#7930

August 7, 2016

Hey Casey,

Thanks for your response!

The reason for this is that I fin...er a constant update each frame.

I find this interesting, since variable-step is the only solution that actually sims the game world up to the current frame boundary (when VSYNC is on). The only source of jitter I can see in this case is if the update->display latency is changing. But I don't think fixed-steps address this (in fact, it probably adds variance since we may be doing a variable number of simulations)? (Somewhat aside, as it's not directly related to update time-steps, but how can we control this latency? For example, with a 60Hz display, our game update may complete in 10ms, but how can we know whether or not our submitted render commands to the driver will make their way to the GPU, be processed, and rendered into the back buffer before the next buffer swap? We really only control the first part, and a change from 10ms -> 11ms could cause latency to increase by 1/60s.)

With fixed-step, in order to render a representation of the game world at some time, we will have to either interpolate or extrapolate results from our sims. I've never tried this myself, but I can imagine visual artifacts when this is done with relatively fast moving objects undergoing non-constant acceleration. Or, going the route of Mr. Blow, do tiny non-gameplay sims to fill the time gap and "catch-up" - but then throw those simulations away.
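
For reference, the interpolation route can be sketched roughly as below, in the style of the accumulator loop from the "Fix Your Timestep!" article. The names `StepPlan`, `plan_fixed_steps` and `interpolate` are made up for this sketch, and times are kept in integer microseconds purely to keep the arithmetic exact:

```c
#include <assert.h>

typedef struct {
    int steps;        /* fixed simulation steps to run this frame */
    int leftover_us;  /* unsimulated time carried into the next frame */
} StepPlan;

/* Add this frame's elapsed time to the accumulator and work out how many
   whole fixed steps fit into it. */
StepPlan plan_fixed_steps(int accumulator_us, int frame_time_us, int step_us)
{
    assert(step_us > 0 && accumulator_us >= 0 && frame_time_us >= 0);
    StepPlan plan;
    int total_us = accumulator_us + frame_time_us;
    plan.steps = total_us / step_us;
    plan.leftover_us = total_us % step_us;
    return plan;
}

/* Blend the previous and current simulation states for rendering.
   alpha is the fraction of a step that has elapsed but not been simulated. */
float interpolate(float prev, float curr, int leftover_us, int step_us)
{
    assert(step_us > 0 && leftover_us >= 0 && leftover_us < step_us);
    float alpha = (float)leftover_us / (float)step_us;
    return prev + (curr - prev) * alpha;
}
```

Note that this renders a state up to one step behind the newest simulated state, which is where the artifacts with fast, non-linearly accelerating objects would come from.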

Perhaps I'll have to wait until we do a round-trip to the platform layer, but how do we plan to deal with the "self-propagating-frame-miss" problem if we go fixed-step? I.e. if we have VSYNC on and miss a frame, then we potentially have to do 2 simulations after the next VSYNC. But having to do 2 simulations takes 2x the time to compute than 1 sim, and so we miss the next frame, etc.
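
A toy model of that feedback loop, under some loud assumptions: simulation is the only cost (rendering is ignored), every step costs the same, the fixed step equals one refresh interval, and the game always waits for the next vblank. `steps_after_miss` is a hypothetical name:

```c
#include <assert.h>

/* Returns how many fixed steps the game owes per frame, `frames` frames
   after a single missed vblank. */
int steps_after_miss(int refresh_period_us, int sim_cost_us, int frames)
{
    assert(refresh_period_us > 0 && sim_cost_us > 0 && frames >= 0);
    int steps = 2; /* one vblank was missed, so two steps are owed */
    for (int i = 0; i < frames; ++i) {
        int work_us = steps * sim_cost_us;
        /* Refresh intervals that pass before this frame can be shown
           (rounded up, since we wait for the next vblank). */
        int intervals = (work_us + refresh_period_us - 1) / refresh_period_us;
        steps = intervals; /* the next frame owes one step per interval */
    }
    return steps;
}
```

In this model a 10ms sim on a 60Hz display never recovers from a single miss (2 sims = 20ms, which misses again, so it stays at 2 steps per frame), while a 6ms sim recovers on the next frame because 2 sims still fit inside one refresh interval.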

-Zach

Edited by Zachary on August 7, 2016, 7:22pm

Casey Muratori

#7943

August 8, 2016

I can't say for sure since I may be misunderstanding you, but it sounds like you are perhaps confused about what is going on in the simulation step.

If you always hit your frame rate (which is the goal of any action game, certainly), then you know that you are always completing frame computations in less than one monitor refresh interval. So a 60hz monitor refresh with a game that never misses its frame budget is always simulating a 60hz timestep and is always presenting a completely consistent 60hz series of images to the user, and this is the best possible experience (I contend).

Having a _variable_ timestep means that you will _never_ present a consistent series of images to the user except by accident if there are any effects which involve a duration-relative effect (for example, motion blur). These images will always be presenting a timestep that is _one frame off_ from correct, using, for example, the previous frame's duration to compute the blur for a given frame, and if that frame takes a different amount of time (which it obviously does, or we wouldn't be calling it a variable timestep), then it is wrong.

So as far as I'm concerned, there's really only one right answer to how framerates should work - you should be locked to a single framerate, but you should be flexible enough to make that framerate be whatever the monitor's natural refresh is so that you do not have tearing. This is why I say "variable fixed timestep", for lack of a better term, is the only real solution at the limit.

- Casey

Zachary

#7946

August 8, 2016

Sorry in advance.

"...but it sounds like you are perhaps confused about what is going on..."

Nailed it.

If you always hit your frame rate (which is the goal of any action game, certainly), then you know that you are always completing frame computations in less than one monitor refresh interval. So a 60hz monitor refresh with a game that never misses its frame budget is always simulating a 60hz timestep and is always presenting a completely consistent 60hz series of images to the user, and this is the best possible experience (I contend).

So I somewhat agree here. (Disclaimer: I'm not very knowledgeable about graphics, so feel free to correct me if I'm wrong). I think this only works if our latency is constant (or, if by "complete frame computations", you refer to completing everything up to the back buffer being rendered). For example:

- You have a 60Hz monitor
- VSYNC enabled
- You have a fixed-update of 1/60s
- It takes ~6ms from when the user-space graphics driver receives commands to when those commands are actually processed and rendered into the back buffer by the GPU.
- It takes ~10ms computation time to simulate a 1/60s game-update (well under the 16.67ms budget)

Consider 2 cases:

+ (Case A) We start with our 10ms of CPU time to compute a 1/60s game timestep. We send the render commands to the driver, and ~6ms later, the GPU renders our frame into the back buffer JUST BEFORE the device refresh.
+ (Case B) We start with our 10ms of CPU time to compute a 1/60s game timestep. We send the render commands to the driver, and ~6ms later, the GPU renders our frame into the back buffer JUST AFTER the device refresh.

* If Case A happens ALL the time - boy are we lucky. We render at 60Hz with 1 frame of latency. Life is good. :) (Perhaps this is what you are referring to?)
* If Case B happens ALL the time, it depends. If we are triple-buffering, then we are able to pipeline the GPU commands, and are able to render at 60Hz, but now have 2 frames of latency. If we are double-buffering, then my understanding is we just cut our framerate in half (30Hz), and have 2 frames of latency.
* If sometimes Case A happens, and sometimes Case B happens, then there is no way we are presenting a fresh frame to the user every device refresh.

"Having a _variable_ timestep means that you will _never_ present a consistent series of images to the user except by accident if there are any effects which involve a duration-relative effect (for example, motion blur). These images will always be presenting a timestep that is _one frame off_ from correct, using, for example, the previous frame's duration to compute the blur for a given frame, and if that frame takes a different amount of time (which it obviously does, or we wouldn't be calling it a variable timestep), then it is wrong."

So, my understanding is that with variable-timestep, you accept an additional frame of latency. You compute how long the _last_ frame took, and simulate up until the _start_ of the current frame. So at the start of every frame, your goal is to compute "what is the game state _right now_". Sure, by the time that frame is presented to the user, it's already in the past - but I wouldn't expect any inconsistencies or jitter effects. And this is auto-correcting when the user starts streaming Netflix, causing you to miss a frame. If 40ms have passed since the last update .. you just sim those 40ms and you're back to current :)
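
Roughly what I mean, as a sketch. The `Body` struct and `update_variable` are made-up names, and the timestamps are assumed to come from whatever high-resolution clock the platform layer provides:

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    float position;  /* units */
    float velocity;  /* units per second */
} Body;

/* Advance the body by however much wall-clock time passed since the last
   update. Timestamps are in microseconds; *last_us is updated in place. */
void update_variable(Body *body, int64_t now_us, int64_t *last_us)
{
    assert(body && last_us && now_us >= *last_us);
    float dt_s = (float)(now_us - *last_us) / 1000000.0f;
    body->position += body->velocity * dt_s;
    *last_us = now_us;
}
```

So a normal ~16.7ms frame followed by a 40ms hiccup just results in two calls with different dt values, and the position ends up where it would have been anyway (for constant velocity, at least).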

So as far as I'm concerned, there's really only one right answer to how framerates should work - you should be locked to a single framerate, but you should be flexible enough to make that framerate be whatever the monitor's natural refresh is so that you do not have tearing. This is why I say "variable fixed timestep", for lack of a better term, is the only real solution at the limit.

If we can somehow guarantee we won't miss a frame, and that our latency is constant, then I agree this is ideal. I have literally no idea how to do this.

- Zach

Edited by Zachary on August 8, 2016, 5:06am

Casey Muratori

#7950

August 8, 2016

I think there is a misunderstanding here about how the CPU/GPU pipeline works. The CPU and GPU are separate processors, so it is not the total time taken between them that determines the frame rate, but the maximum of the two individual times, barring fences (which are obviously a performance concern but that's a totally separate topic).

So in your examples, where you are thinking in terms of 10ms of CPU time plus 6ms of GPU time equals a tight 16ms frame time that might hit or miss the VSYNC, you're thinking of these things as happening sequentially inside a single frame. But that's not what usually happens. More typically the 10ms of CPU time is overlapped with the 6ms of GPU time for the previous frame, because the GPU is processing the data generated by the CPU asynchronously.

So the requirement for meeting VSYNC is that the CPU takes no more than 16ms to generate the GPU data for a frame, and the GPU takes no more than 16ms to render that frame. Nobody typically sits around waiting for VSYNC to happen - the CPU and GPU are plowing forwards on future frames. It's only if they're already a frame ahead that they actually wait on the vsync, or two frames ahead if you're doing "triple buffering".
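
As a rough sketch of the difference between the two mental models (function names are made up for illustration, times are in microseconds, and fences or other sync points are ignored):

```c
#include <assert.h>

/* Sequential model: CPU work, then GPU work, all inside one frame. */
int serial_frame_us(int cpu_us, int gpu_us)
{
    assert(cpu_us >= 0 && gpu_us >= 0);
    return cpu_us + gpu_us;
}

/* Pipelined model: the GPU renders frame N-1 while the CPU builds frame N,
   so the steady-state frame interval is set by the slower of the two. */
int pipelined_frame_us(int cpu_us, int gpu_us)
{
    assert(cpu_us >= 0 && gpu_us >= 0);
    return cpu_us > gpu_us ? cpu_us : gpu_us;
}

/* Under the pipelined model, vsync is met when both processors
   individually fit inside one refresh interval. Returns 1 or 0. */
int makes_vsync(int cpu_us, int gpu_us, int refresh_period_us)
{
    return pipelined_frame_us(cpu_us, gpu_us) <= refresh_period_us;
}
```

For the 10ms CPU / 6ms GPU example, the sequential model says 16ms per frame, while the pipelined model says 10ms, which leaves plenty of room inside a 16.67ms refresh interval.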

Typically this means that as long as you're never going to miss your frame rate by too much, you even have one frame to respond to a missed frame budget without ever having missed a frame. This is because, since the pipeline has more than one frame in it, time saved on the previous frame is "banked" for the next frame, effectively. So if the CPU took only 14ms to produce the previous frame, it actually has the 16ms from this frame _and the 2 ms from the previous frame_ to finish. This means it can take 18ms to produce the data for the next frame and still make rate, and now it knows that whatever it was doing was too expensive and it should pare back for the next frame.
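
The banking arithmetic can be sketched as a toy model. This assumes the CPU can get at most one frame ahead before it blocks on vsync, and `first_missed_frame` is a hypothetical helper, not a real API:

```c
#include <assert.h>

/* Walk a list of per-frame CPU costs (microseconds) and return the index of
   the first frame that misses its deadline, or -1 if none does. Time saved
   on earlier frames is banked, capped at one full frame budget. */
int first_missed_frame(const int *cost_us, int count, int budget_us)
{
    assert(cost_us && count >= 0 && budget_us > 0);
    int banked_us = 0;
    for (int i = 0; i < count; ++i) {
        banked_us += budget_us - cost_us[i];
        if (banked_us < 0) {
            return i; /* spent more than budget plus banked time */
        }
        if (banked_us > budget_us) {
            banked_us = budget_us; /* already a frame ahead: wait on vsync */
        }
    }
    return -1;
}
```

With a 16ms budget, a 14ms frame followed by an 18ms frame still makes rate, but a second 18ms frame right after that does not.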

Does that make sense?

- Casey

Zachary

#7986

August 9, 2016

I feel like we've gone off-road here - but that's OK since it's (embarrassingly) pointed out a major flaw in my mental graphics model.

I feel like I've seen byproducts of what you are explaining before - but I can't put the full picture together.

At the risk of looking (more) stupid, I've drawn up something akin to how I perceive the pipeline to work under VSYNC. Just by drawing it out, I recognize there are flaws - I just don't know specifically where.

https://drive.google.com/foldervi...mnR1tgUjNiRGZwcmNwSTg&usp=sharing

I think my confusion is centered around:
a) Not knowing when the driver blocks, and how this controls framerate under VSYNC
b) Not knowing the degree to which render commands can be queued on the driver/GPU. Can the GPU wait on VSYNC while simultaneously getting commands from the driver? If the driver is forced to hold back sending commands, does it necessarily mean it has to block our game? Can it block on any call submitting commands?

Lastly, since helping people understand this stuff isn't your job, can you recommend any place where I can find a definitive answer to how this stuff works? I haven't found a graphics text that describes this.

Thanks,
-Zach

Edited by Zachary on August 9, 2016, 8:26pm

Mārtiņš Možeiko

#7987

August 9, 2016

This is a good presentation on how Windows deals with vsync: https://www.youtube.com/watch?v=E3wTajGZOsA
It focuses on D3D11 and D3D12 and Windows 10 improvements, but I'm pretty sure D3D11 concepts map to GL.

Edited by Mārtiņš Možeiko on August 11, 2016, 11:19pm

Zachary

#8013

August 11, 2016

@mmozeiko

Thanks for the link! Seems to be answering very similar questions - good find! Though I can't say my understanding is cemented.

Zachary

#9745

December 10, 2016

All of this just "clicked" now - you are right