I think we can clamp to TargetVolume a bit early if both channels end during the same loop, but on different samples.
To pick numbers out of thin air, assume SamplesToMix == 800, channel 0's VolumeSampleCount is 700, and channel 1's VolumeSampleCount is 100. My claim is that SamplesToMix will end up at 100, but VolumeEnded[0] == VolumeEnded[1] == 1. That would clamp channel 0's volume to TargetVolume 600 samples earlier than it should be. Of course, it's late and I could be missing something.
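
To make the scenario concrete, here is a minimal sketch of how I imagine the clamp step works. This is a guess at the loop's shape, not the actual mixer code: the function names, CHANNEL_COUNT, and the reading of VolumeSampleCount as "samples left until the channel reaches TargetVolume" are all my assumptions. The second function is just one possible way to avoid the early flag.

```c
#include <assert.h>
#include <stdbool.h>

#define CHANNEL_COUNT 2 /* assumption: two channels, as in the example */

/* Hypothetical reduction of the suspected behavior: every channel whose
   ramp ends before the *current* SamplesToMix shortens the mix and gets
   flagged, even if a later channel shortens the mix further. */
static int
ClampSamplesToMix(int SamplesToMix,
                  const int VolumeSampleCount[CHANNEL_COUNT],
                  bool VolumeEnded[CHANNEL_COUNT])
{
    assert(SamplesToMix >= 0);
    for(int ChannelIndex = 0; ChannelIndex < CHANNEL_COUNT; ++ChannelIndex)
    {
        VolumeEnded[ChannelIndex] = false;
        if(VolumeSampleCount[ChannelIndex] < SamplesToMix)
        {
            SamplesToMix = VolumeSampleCount[ChannelIndex];
            VolumeEnded[ChannelIndex] = true; /* stays set after later clamps */
        }
    }
    return SamplesToMix;
}

/* One possible fix (also an assumption): find the shortest ramp first,
   then flag only the channels that really finish within that many samples. */
static int
ClampSamplesToMixFixed(int SamplesToMix,
                       const int VolumeSampleCount[CHANNEL_COUNT],
                       bool VolumeEnded[CHANNEL_COUNT])
{
    assert(SamplesToMix >= 0);
    for(int ChannelIndex = 0; ChannelIndex < CHANNEL_COUNT; ++ChannelIndex)
    {
        if(VolumeSampleCount[ChannelIndex] < SamplesToMix)
        {
            SamplesToMix = VolumeSampleCount[ChannelIndex];
        }
    }
    for(int ChannelIndex = 0; ChannelIndex < CHANNEL_COUNT; ++ChannelIndex)
    {
        VolumeEnded[ChannelIndex] =
            (VolumeSampleCount[ChannelIndex] <= SamplesToMix);
    }
    return SamplesToMix;
}
```

With SamplesToMix == 800 and VolumeSampleCount == {700, 100}, the first function returns 100 with both VolumeEnded flags set, so channel 0 still has 600 samples of ramp left when it gets clamped. The second returns 100 with only VolumeEnded[1] set.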