Latency

I always thought latency was the time from when you call "play" until sound is heard, and nothing else. Is this wrong?
The general term "latency" refers to the amount of time between when something is initiated and when it is completed. "Audio latency" usually refers to what you just said.

- Casey
cmuratori
The general term "latency" refers to the amount of time between when something is initiated and when it is completed. "Audio latency" usually refers to what you just said.

- Casey


I was just thinking that after you hear sound (the initial buffer has started playing), you aren't then measuring DirectSound latency in your game loop. In fact, whatever you measure there, like clipped sound, has nothing to do with DirectSound latency, but with not filling enough data into the buffer. As long as you CAN fill enough data to avoid clipped sound, there cannot be anything wrong with latency, since playback time is a constant.

EDIT: Ah yes, maybe I see now. What you want is to CHANGE the sound on a frame boundary, but you have to wait 3 frames? Is that correct? Is that why you say there is latency?

Sorry, I may have completely misunderstood what you were saying in the video. I think I am going to have to rewatch those a few more times.

Edited by Livet Ersomen Strøm on
It is a very complicated topic that is hard to understand, and I think I even said something on the stream about how I've worked with it many times and still often get confused :)

Basically the problem is that there are two clocks - one for the video frames (the vertical retrace, which we do not yet have access to, but which we will be using once we switch to OpenGL/D3D/etc.), and one for the audio. The problem is that you have some amount of latency between when you write a sound and when it comes out of the speakers, and there is also latency between when you draw a frame and when it is displayed on the monitor.

Your goal with the game timing loop, in my opinion, is to try to create something flexible enough that those two latencies line up on machines with low enough audio latency, and in the case where the audio latency is too high for that, it should be as close as possible to when the video frame is shown.

Does that make sense?

- Casey

Edited by Casey Muratori on
cmuratori
The problem is that you have some amount of latency between when you write a sound and when it comes out of the speakers,


And if this amounts to, say, 3 frames at 30Hz, does that mean 6 frames at 60Hz? And more at 75, 90, 120 and so on?
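(Just to spell out the arithmetic I am assuming here, as a sketch; it presumes the audio latency is a fixed amount of wall-clock time, which is exactly what I am asking about:)

    // Converting a fixed wall-clock audio latency into a number of video frames.
    // 3 frames at 30Hz = 3 * (1/30)s = 100ms; the same 100ms is 6 frames at 60Hz,
    // 12 frames at 120Hz, and so on.
    static int LatencyInFrames(double LatencySeconds, double RefreshHz)
    {
        return (int)(LatencySeconds * RefreshHz + 0.5);
    }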

cmuratori

and there is also latency between when you draw a frame and when it is displayed on the monitor.


Yes. About that. Is this time unaccounted for, since the GPU does not have a TSC counter? Or does Windows include GPU timers inside QPC?

I was also wondering: when I "timed" 3500 frames with DirectX 9, for a basically simple 2D build, were those numbers lying? I mean, are most of those draw calls effectively no-ops, so that IF the monitor had actually been able to display every draw call, the time for each would go up somewhat, because it was then actually doing something with them? And so the real fps in that case would have been considerably lower? Or is that wrong?

cmuratori

Your goal with the game timing loop, in my opinion, is to try to create something flexible enough that those two latencies line up on machines with low enough audio latency, and in the case where the audio latency is too high for that, it should be as close as possible to when the video frame is shown.

Does that make sense?

- Casey


Yes. Well. But considering how much memory is in a framebuffer, and how relatively little memory (6400 bytes or so) it takes to fill a single frame's worth of sound (at 30Hz), I would assume it really should be possible to line up at least the sound to pretty darn near a frame of latency.
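(Here is the arithmetic I am using for that 6400-byte figure, as a quick sketch; it assumes 48000 samples per second and 4 bytes per sample pair, i.e. 16-bit stereo, as on the stream:)

    #include <stdint.h>

    // Bytes of sound needed to cover exactly one video frame.
    // SoundBytesPerVideoFrame(48000, 4, 30) == 1600 * 4 == 6400 bytes
    // SoundBytesPerVideoFrame(48000, 4, 60) ==  800 * 4 == 3200 bytes
    static uint32_t SoundBytesPerVideoFrame(uint32_t SamplesPerSecond,
                                            uint32_t BytesPerSample,
                                            uint32_t FramesPerSecond)
    {
        return (SamplesPerSecond / FramesPerSecond) * BytesPerSample;
    }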

The way I am thinking about it:
At the time we re-enter our game loop, our previous frame is showing. The play cursor has not caught up with us, because it is just now playing, entering the first of those 6400 bytes for the current frame, plus or minus the time that passed after the flip call until we got control again. I don't know how much that is, but it's some fraction of those 6400 bytes?

______playcursor______________________byteToLock
[fraction|....unplayed sound..........................]| < Byte to lock at the end of the 6400 bytes we filled last time.

I find it hard to believe the write cursor needs to be 3 whole frames ahead here (33.33ms * 3). I just don't get it. I find it more likely that if the user gave some input this frame, we should be able to write new sound fast enough for it to change by the time the play cursor reaches the end of those 6400 bytes.

But if you say that this latency is real, and that there's nothing we can do about it, then I just have to accept it, I guess. But I find it very hard to believe that it could be lagging this many frames.

Could we not divide the buffer into frame-time segments and index them as well, so that we know what part of the buffer we are in, and force a write since we know that we are ahead of the play cursor?
What I am trying to say is that as long as our timing is right, we don't have to ask where the PlayCursor is, because it cannot have advanced more than the elapsed time, since the playback rate is a constant.
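(In code, the idea I have in mind would look something like this. It is only a sketch with names I made up, and it leans entirely on the assumption that the audio clock and our frame clock agree:)

    #include <stdint.h>

    // Hypothetical sketch: treat the one-second ring buffer as frame-sized
    // segments and compute the next write offset purely from how many frames
    // we have already filled, without asking DirectSound for the PlayCursor.
    typedef struct
    {
        uint32_t BufferSize;    // e.g. 48000 * 4 = one second of 16-bit stereo
        uint32_t BytesPerFrame; // e.g. 6400 at 30Hz
        uint64_t FramesWritten; // frame-segments filled so far
    } frame_segmented_buffer;

    static uint32_t NextWriteOffset(frame_segmented_buffer *Buffer)
    {
        uint64_t ByteOffset = Buffer->FramesWritten * Buffer->BytesPerFrame;
        return (uint32_t)(ByteOffset % Buffer->BufferSize);
    }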

Or did I just confuse myself? :)
No - we do not know that the audio clock is at all synchronized with the other clocks, so it may drift. You cannot assume that, say, one second's worth of time passing on QueryPerformanceCounter equals one second's worth of samples elapsing on the PlayCursor.

- Casey
cmuratori
No - we do not know that the audio clock is at all synchronized with the other clocks, so it may drift. You cannot assume that, say, one second's worth of time passing on QueryPerformanceCounter equals one second's worth of samples elapsing on the PlayCursor.

- Casey


That's right. But I can "assume" that one second's worth of play buffer really is one second, right? So 33.33333ms worth of the DirectSound play buffer really amounts to that much time?

I have now checked it for several days. And it now seems 99.99% clear to me, unless I have made mistakes, that the only latency to care about is not coming from DirectSound, but from the code we write in the game loop.

My timing code is simply wrong (off), and it needs to be corrected on the fly. This is the "only" thing the play cursor is useful for: to tell us which bytes in the buffer are to be avoided. In other words, to tell us to correct our timing, which is what is actually off. We can even use the play cursor for timing! It is that accurate.

I used your trick of drawing a visual graph to confirm this, and of course the sound works now. (Very useful debugging strategy, thanks for teaching me that.)

DirectSound is not off. If it were off, it would be impossible to play music on the thing, right? And even if it were off, which it's not, we would still want to synchronize with it.

As long as we stay one frame's worth of time ahead of the play cursor in the buffer we write to, we can write exactly one frame's worth of data (samples for 33.33333ms) without ever asking, and without ever having to correct how many bytes we are writing (which also saves us from writing too much). This is because we know from our last write that the play cursor is still in the previous frame's worth of sample data.


______playcursor______________________byteToLock/mywritecursor
[fraction|....unplayed sound..........................]| < Byte to lock at the end of the 6400 bytes we filled last time.

< CURRENT 1 frame's worth of samples -> | Next frame

So what needs to be done to update the buffer on a frame boundary is to make sure that our own timing code always stays one frame ahead of the play cursor. That frame "ahead" is the frame that will play when the next screen frame is shown. And this is pretty much guaranteed, if our sound card can actually play back 48000 samples in one second.

In addition, we must carry the wave's period across writes, since the wave we play will generally not fit perfectly into the first one second of buffer. And if we CHANGE the wavelength we insert, we must start the new samples at the beginning of a new period.
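(What I mean by carrying the period across writes, as a sketch for a plain sine tone like the one on the stream; the names are mine:)

    #include <math.h>
    #include <stdint.h>

    // Generate one frame's worth of a sine tone while carrying the phase across
    // calls, so consecutive frame-sized writes join up without a click. If the
    // tone frequency changes, the phase restarts at the beginning of a period.
    typedef struct
    {
        float Phase;            // position within the current period, in radians
        int ToneHz;             // frequency currently being generated
        int SamplesPerSecond;   // e.g. 48000
        int16_t ToneVolume;     // e.g. 3000
    } tone_state;

    static void GenerateFrameOfSine(tone_state *State, int NewToneHz,
                                    int16_t *Samples, int SampleCount)
    {
        float TwoPi = 2.0f*3.14159265f;
        if(NewToneHz != State->ToneHz)
        {
            State->ToneHz = NewToneHz;
            State->Phase = 0.0f; // new wavelength starts at the beginning of a period
        }
        float PhaseStep = TwoPi*(float)State->ToneHz / (float)State->SamplesPerSecond;
        for(int SampleIndex = 0; SampleIndex < SampleCount; ++SampleIndex)
        {
            int16_t SampleValue = (int16_t)(sinf(State->Phase)*State->ToneVolume);
            *Samples++ = SampleValue; // left
            *Samples++ = SampleValue; // right
            State->Phase += PhaseStep;
            if(State->Phase > TwoPi)
            {
                State->Phase -= TwoPi;
            }
        }
    }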

I found it better to do this by generating samples into a private buffer of my own. I then always generate exactly one frame's worth of samples: at 30Hz this is 6400 bytes, at 60Hz it is 3200 bytes, and so on. I am not sure, but I pretty much guess that I can use this buffer as a mixer, so I called it "mixer".
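(Roughly what my "mixer" looks like, just as a sketch; the names are mine and not from the stream:)

    #include <stdint.h>
    #include <stdlib.h>

    // A private buffer that always holds exactly one video frame's worth of
    // interleaved 16-bit stereo samples, generated first and then copied into
    // the DirectSound ring buffer in one go.
    typedef struct
    {
        int16_t *Samples;       // interleaved L/R samples for one video frame
        uint32_t SampleCount;   // samples per channel in one video frame
        uint32_t BytesPerFrame; // SampleCount * 2 channels * sizeof(int16_t)
    } mixer_buffer;

    static mixer_buffer AllocateMixer(uint32_t SamplesPerSecond, uint32_t FramesPerSecond)
    {
        mixer_buffer Mixer;
        Mixer.SampleCount = SamplesPerSecond / FramesPerSecond;                  // 1600 at 30Hz
        Mixer.BytesPerFrame = (uint32_t)(Mixer.SampleCount*2*sizeof(int16_t));   // 6400 at 30Hz
        Mixer.Samples = (int16_t *)malloc(Mixer.BytesPerFrame);
        return Mixer;
    }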

If I now fill a frame's worth into the DirectSound buffer and it starts to play, then as long as I can time my own code correctly, so that on each frame update it writes exactly a frame's worth of data at exactly the right boundary, one frame ahead of the play cursor (roughly), or one frame ahead of the previous frame written, then DirectSound will never be a problem. DS is so fast that it will let you update almost just a couple of bytes, as long as you stay away from the play cursor.
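(Writing a frame's worth from the mixer into the DirectSound buffer at a given byte offset, as a sketch; Lock can hand back two regions because the buffer is circular. The surrounding names are mine:)

    #include <windows.h>
    #include <dsound.h>
    #include <stdint.h>
    #include <string.h>

    // Copy BytesToWrite bytes from a private mixer buffer into the DirectSound
    // ring buffer starting at ByteToLock. Lock() may return two regions when the
    // write wraps around the end of the circular buffer.
    static void FillDirectSoundBuffer(LPDIRECTSOUNDBUFFER SecondaryBuffer,
                                      DWORD ByteToLock, DWORD BytesToWrite,
                                      void *SourceSamples)
    {
        VOID *Region1; DWORD Region1Size;
        VOID *Region2; DWORD Region2Size;
        if(SUCCEEDED(SecondaryBuffer->Lock(ByteToLock, BytesToWrite,
                                           &Region1, &Region1Size,
                                           &Region2, &Region2Size, 0)))
        {
            uint8_t *Source = (uint8_t *)SourceSamples;
            memcpy(Region1, Source, Region1Size);
            if(Region2Size)
            {
                memcpy(Region2, Source + Region1Size, Region2Size);
            }
            SecondaryBuffer->Unlock(Region1, Region1Size, Region2, Region2Size);
        }
    }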

The only problem is that we need to correct our own timing, because what typically happens is that we fall behind. It is we who have the latencies... Therefore we must check the play cursor and offset our update of the next frame's worth of data. If we fall behind, the play cursor will catch up and cause noise. But if we make sure that it does not, and also that we do not move too far ahead, sound will play without any problems whatsoever.

I am only 99.99% sure of this, so it would be nice to get some comments. But everything seems fine, except for my timing, which is a little off, a little behind each frame. I fixed this by comparing my projection of when to update the buffer with the play cursor: if the play cursor is close to MyWriteCursor, I update immediately and do not wait for the next frame, but if the play cursor is staying safely behind (a frame or so), I wait until the frame time to do the update. As long as I can manage to control my timing, this works well for all frame rates, so that's why I say DirectSound is not lagging :)
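(The correction I describe, as a rough sketch; the names and the safety margin are mine:)

    #include <windows.h>
    #include <dsound.h>

    // If the PlayCursor has crept up to within SafetyBytes of where we plan to
    // write next, write immediately instead of waiting for the next frame boundary.
    static BOOL ShouldWriteImmediately(LPDIRECTSOUNDBUFFER SecondaryBuffer,
                                       DWORD MyWriteCursor, DWORD BufferSize,
                                       DWORD SafetyBytes)
    {
        DWORD PlayCursor, WriteCursor;
        if(SUCCEEDED(SecondaryBuffer->GetCurrentPosition(&PlayCursor, &WriteCursor)))
        {
            // Distance from the PlayCursor forward to our planned write position,
            // taking the circular buffer into account.
            DWORD BytesAhead = (MyWriteCursor >= PlayCursor)
                ? (MyWriteCursor - PlayCursor)
                : (BufferSize - PlayCursor + MyWriteCursor);
            return (BytesAhead < SafetyBytes);
        }
        return TRUE; // if we cannot ask, err on the side of writing now
    }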

From what I understood, you will be revisiting this later, and maybe the way I am doing it now will not work for a game once it becomes real? I don't know. I just wanted to say that as far as I can tell, DS is not laggy. In fact it might even be better to use it as the clock, since it is almost bound to be a better clock, given that it must be able to play those 48000 samples a second.

Right. Sorry for the long post, and sorry if I have jumped to conclusions. I just thought it could be an important topic, because if I don't get any feedback to the contrary, I am going to go with this and use it for my own projects.

Edited by Livet Ersomen Strøm on Reason: illustration.... whatever!;D