Let's also keep in mind that audio and visuals are not equal: a frame or two of audio lag is very common in everyday life, since sound travels only about 34 cm per millisecond. So in reality audio always lags visuals, and perception has evolved to cope with this. One can experiment with this by playing some talking-head video in VLC, for example, and fiddling with the synchronization settings (Tools > Synchronization): even a slight audio lead feels very awkward, while a considerable lag (a few tens of ms) can be tolerated; you get used to it and stop noticing after a while. Not that lag is preferable either, but it's not nearly as bad.
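Just to make the arithmetic behind that 34 cm/ms figure concrete (this is my own back-of-the-envelope check, not from any reference), here's how many meters of natural listening distance each frame of audio lag corresponds to:

```c
/* Back-of-the-envelope: sound covers ~0.343 m per ms, so an audio lag
   of N ms is what you'd hear naturally standing ~0.343*N meters from
   the source. Frame rate of 30 fps assumed here for illustration. */
#include <stdio.h>

int main(void)
{
    double speed_of_sound_m_per_ms = 0.343;  /* ~343 m/s at room temperature */
    double frame_time_ms = 1000.0 / 30.0;    /* one frame at 30 fps */

    for (int frames = 1; frames <= 3; ++frames)
    {
        double lag_ms = frames * frame_time_ms;
        printf("%d frame(s) of lag = %5.1f ms, like hearing the source from %4.1f m away\n",
               frames, lag_ms, lag_ms * speed_of_sound_m_per_ms);
    }
    return 0;
}
```

So even a full frame of lag at 30 fps is roughly what you'd experience listening to someone across a large room; nothing perception hasn't dealt with before.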
Casey's current scheme, avoiding any audio lead before the frame is shown (on as-yet hypothetical minimal-lag systems/libraries), and otherwise adding minimal extra lag, seems nice from this perspective. Especially if this kind of synchronization helps keep the game code simple; but we haven't gotten to the game code yet, so no idea about its implications so far. I'm afraid there will be glitches once one starts testing this, though: try simulating a minimal-lag system by lowering the target frame rate (so that the frame time grows larger than the current lag), also test with variable simulated loads, and see whether the game code really doesn't need to care... (a sketch of what I mean below).
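A rough sketch of the kind of experiment I have in mind, with entirely hypothetical names and numbers (this is not Casey's actual code, just the shape of the decision his scheme has to make): turn the audio latency into a tunable variable, crank the target frame time above and below it, and watch which path the write-ahead logic takes.

```c
/* Hypothetical test harness: compare a simulated audio latency against
   the target frame time and pick the write-ahead strategy accordingly.
   Names and constants are made up for illustration. */
#include <stdio.h>
#include <stdbool.h>

#define SAMPLES_PER_SECOND 48000
#define BYTES_PER_SAMPLE   4            /* 16-bit stereo */

int main(void)
{
    /* Tunables for the experiment */
    double simulated_latency_ms = 10.0;  /* pretend the card needs this long */
    double target_frame_ms      = 33.3;  /* lowered frame rate: 30 fps */

    int latency_bytes = (int)(SAMPLES_PER_SECOND * BYTES_PER_SAMPLE
                              * simulated_latency_ms / 1000.0);
    int frame_bytes   = (int)(SAMPLES_PER_SECOND * BYTES_PER_SAMPLE
                              * target_frame_ms / 1000.0);

    bool low_latency = (latency_bytes < frame_bytes);
    if (low_latency)
    {
        /* "Minimal lag" path: audio for the next frame can start right
           at the flip, so write exactly one frame's worth of samples
           from the frame boundary -- no audio lead before it's shown. */
        printf("low-latency path: write %d bytes from the frame boundary\n",
               frame_bytes);
    }
    else
    {
        /* Fallback path: accept extra lag and pad with the latency as a
           safety margin so the play cursor never overruns our writes. */
        printf("fallback path: write %d bytes (frame + latency margin)\n",
               frame_bytes + latency_bytes);
    }
    return 0;
}
```

Flipping simulated_latency_ms above and below target_frame_ms (and jittering it per frame, to stand in for variable load) is exactly where I'd expect the seams to show, since the scheme switches strategies at that boundary.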