Question about sound latency on Day009

On Day009, we start to use LatencySampleCount. I assume that means we're writing a little further ahead of where the play cursor is.

I don't understand why this reduces the latency. Why doesn't the sound pitch change immediately when we move the stick if we use only the play cursor? Since we write to the sound buffer every frame, it seems to me the pitch should change immediately.

I'm sure I don't understand how the sound card works at all, and that's the root cause of my misunderstanding, but I don't know where to start looking. Any comments would be appreciated.

Thanks in advance (and sorry about my English.)

In day 8, Casey tries to always have the sound buffer filled with audio.

On the first frame, since no audio is playing, BytesToWrite equals the size of the buffer. So he fills the complete buffer with samples, and RunningSampleIndex will be a full buffer's worth of samples. He then starts playing the buffer.

On the second frame, Casey computes ByteToLock from RunningSampleIndex. It will be zero, since we filled the complete buffer and it's a circular buffer (we do a % SecondaryBufferSize). ByteToLock is therefore 1 past the last byte we wrote into the buffer on the previous frame. He compares that to the current value of the PlayCursor to get BytesToWrite. So BytesToWrite here is how much of the audio has already been played, and thus the region that we need to replace with new audio. That also means that he always adds audio at the "end" (in quotes because it's a circular buffer) of the buffer, so there is always about 1 second of audio latency.
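
A rough sketch of that calculation in C, in case it helps. This is not the code from the stream: the struct and function names are made up for illustration, the variable names follow the ones used above, and the special case of the very first frame (where the whole buffer is filled) is left out.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper type (not from the stream): the region one frame writes.
typedef struct {
    uint32_t ByteToLock;   // first byte this frame writes to
    uint32_t BytesToWrite; // number of bytes this frame writes
} day8_write_region;

// Day 8 style: start right after the last byte we wrote and
// write everything up to the play cursor.
static day8_write_region
Day8WriteRegion(uint32_t RunningSampleIndex, uint32_t BytesPerSample,
                uint32_t SecondaryBufferSize, uint32_t PlayCursor)
{
    assert(SecondaryBufferSize > 0);

    day8_write_region Result;

    // One past the last byte written so far, wrapped into the circular buffer.
    Result.ByteToLock =
        (RunningSampleIndex * BytesPerSample) % SecondaryBufferSize;

    if (Result.ByteToLock > PlayCursor) {
        // The region wraps: from ByteToLock to the end of the buffer,
        // then from the start of the buffer up to the play cursor.
        Result.BytesToWrite =
            (SecondaryBufferSize - Result.ByteToLock) + PlayCursor;
    } else {
        // The region is contiguous: from ByteToLock up to the play cursor.
        Result.BytesToWrite = PlayCursor - Result.ByteToLock;
    }

    return Result;
}
```

With a 192000 byte buffer and 4 bytes per sample (assumed values), a RunningSampleIndex of 48000 wraps to ByteToLock 0, and if the PlayCursor is at 3200, the frame writes the 3200 bytes that were just played. Those bytes will only be heard once the play cursor has gone all the way around the buffer.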

On day 9, Casey changes the code so that instead of writing at the "end" of the buffer, he writes at the "start" of the buffer, but still a little ahead (LatencySampleCount) of the PlayCursor. That means not writing a full buffer's worth of samples (both at startup and during gameplay), so that RunningSampleIndex never gets far ahead of the PlayCursor. In later episodes, he will improve the latency by overwriting some samples to be closer to the PlayCursor.
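
The day 9 idea could be sketched like this. Again, this is only an illustration under my own assumptions: the function name is made up, and the parameter names simply follow the variables mentioned above.

```c
#include <assert.h>
#include <stdint.h>

// Day 9 style (hypothetical helper, not the code from the stream):
// only write up to a target that sits LatencySampleCount samples
// ahead of the play cursor, instead of all the way around the buffer.
static uint32_t
Day9BytesToWrite(uint32_t ByteToLock, uint32_t PlayCursor,
                 uint32_t LatencySampleCount, uint32_t BytesPerSample,
                 uint32_t SecondaryBufferSize)
{
    assert(SecondaryBufferSize > 0);

    // Where we want valid audio to end, wrapped into the circular buffer.
    uint32_t TargetCursor =
        (PlayCursor + LatencySampleCount * BytesPerSample) % SecondaryBufferSize;

    if (ByteToLock > TargetCursor) {
        // Wraps: from ByteToLock to the end, then from the start to the target.
        return (SecondaryBufferSize - ByteToLock) + TargetCursor;
    }

    // Contiguous: from ByteToLock up to the target.
    return TargetCursor - ByteToLock;
}
```

The only real difference from the day 8 calculation is that the end of the write region is the target cursor rather than the play cursor, so new audio lands at most LatencySampleCount samples ahead of what is currently playing.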

In theory you shouldn't write samples between the PlayCursor and the WriteCursor; you should only write after the WriteCursor. This is because there is some sort of memory copy from the secondary buffer to the actual sound card memory, and if you write between the PlayCursor and the WriteCursor, some of that memory might already have been copied. I say "in theory" because DirectSound on Windows Vista and later is emulated on top of other Windows APIs (the Core Audio APIs, such as the Windows Audio Session API (WASAPI)). So the best latency you could get using DirectSound would be WriteCursor - PlayCursor bytes, and if I remember correctly, on modern Windows this is always 30ms.

If you search these forums you'll find some implementations of the audio code using WASAPI, and there are also a lot of threads about audio latency in Handmade Hero.
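
To put a number on that gap, here is a small sketch that converts WriteCursor - PlayCursor into milliseconds. The helper names are hypothetical, and the 48000 samples per second and 4 bytes per sample (16-bit stereo) used in the example are assumptions on my part.

```c
#include <assert.h>
#include <stdint.h>

// Distance in bytes from the play cursor forward to the write cursor,
// taking the wrap of the circular buffer into account.
static uint32_t
CursorGapBytes(uint32_t PlayCursor, uint32_t WriteCursor,
               uint32_t SecondaryBufferSize)
{
    assert(SecondaryBufferSize > 0);

    if (WriteCursor >= PlayCursor) {
        return WriteCursor - PlayCursor;
    }
    // The write cursor has wrapped past the end of the buffer.
    return (SecondaryBufferSize - PlayCursor) + WriteCursor;
}

// Converts a byte count into milliseconds of audio.
static double
BytesToMilliseconds(uint32_t Bytes, uint32_t SamplesPerSecond,
                    uint32_t BytesPerSample)
{
    assert(SamplesPerSecond > 0 && BytesPerSample > 0);

    return 1000.0 * (double)Bytes /
           ((double)SamplesPerSecond * (double)BytesPerSample);
}
```

Under those assumptions, a gap of 5760 bytes is 1440 samples, which is 30 ms of audio. You could log this value every frame to check what the gap actually is on your machine.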


Edited by Simon Anciaux on

Hello, mrmixer

Thanks so much for replying.
After reading your comment several times, I think I have a rough understanding of the situation.
This is how I understand it:

image.png

Also, thanks for the comments about WASAPI and later episodes.
I will definitely check those, and I'm going to reread the comments in this thread from time to time.


Replying to mrmixer (#25140)

The "fill" part of the first drawing is correct. The rest is valid data that will be played, but since it was written long in advance, it has a lot of latency. I assume that's what you meant by "play" and "latent".

- is samples in the buffer (valid data)
* is samples that were played in the last frame
^ is the play cursor
` is ByteToLock

At frame start we have
|----***-----|
     `  ^
The frame will write new data in place of the *.

The second part isn't correct. What we try to do is write as little data as possible in front of the play cursor. The rest of the buffer contains old data, so if for some reason you stop updating the buffer, you will hear old data, with a click where the last valid data transitions to old data.

* is old data
- is valid data
^ is the play cursor
` is ByteToLock
T is the target cursor

At frame start we have
|****--****|
     ^ ` T

After the frame update we have filled between ByteToLock and TargetCursor.

|****----**|
     ^ ` T
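
The same frame update can be played out on a tiny 10-cell buffer like the one in the drawing. This is just a toy sketch with a made-up function name, where each cell stands for a chunk of samples and cells are counted from 0 after the opening `|`.

```c
#include <assert.h>
#include <stddef.h>

// Marks the cells from ByteToLock up to (but not including) TargetCursor
// as valid data ('-'), wrapping around the end of the circular buffer.
static void
FillToTarget(char *Cells, size_t CellCount,
             size_t ByteToLock, size_t TargetCursor)
{
    assert(CellCount > 0);
    assert(ByteToLock < CellCount && TargetCursor < CellCount);

    for (size_t Cell = ByteToLock;
         Cell != TargetCursor;
         Cell = (Cell + 1) % CellCount) {
        Cells[Cell] = '-';
    }
}
```

Starting from `****--****` with ByteToLock at cell 6 and the target cursor at cell 8, the call turns cells 6 and 7 into valid data and leaves everything from the target onward as old data.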

mrmixer wrote:
"In later episodes, he will improve the latency by overwriting some samples to be closer to the PlayCursor."
After reading previous topics, I don't think Casey ever does any "overwriting", but he does improve the latency. Just wanted to correct that.

Hi, mrmixer

I apologize for the delay in replying. I had a fever after getting the vaccine and was sick in bed for a while. After that, I got busy at work... But this can't be an excuse. Sorry.

I took a closer look at the diagram you wrote and I think I get it.
On Day08, we fill the entire buffer the first time, and then ByteToLock always follows the PlayCursor.
The reason for the delay on Day08 is that ByteToLock is behind the PlayCursor, so it takes time for the PlayCursor to reach the newly filled data.

In contrast, on Day09 we fill only a little of the buffer before playing, so ByteToLock is always just ahead of the PlayCursor.
This is the point I understood the least.
On Day09, we constantly fill the part of the buffer that is about to be played, so there is less delay.

As before, I apologize for my poor English. But thanks to your detailed explanation, I think I understand it better.

Thank you very much!


Replying to mrmixer (#25149)

There is no need to apologize for that. I'm glad it helped.

And your English is pretty good. It's not my mother tongue either, but I couldn't tell it's not yours.