The problem is that you're not supposed to write into the part of the buffer between the play cursor and the write cursor. That means the minimum latency is equal to writeCursor - playCursor. (On Windows Vista and later, DirectSound is emulated, so you never get your real sound card latency; I believe the latency is fixed at around 30 ms.) To get the minimum latency you would always write at the write cursor, but you won't get better than that 30 ms or so using DirectSound.
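To put a number on that gap, here is a minimal sketch of the arithmetic. It assumes the two cursors are byte offsets into a circular buffer, which is how `IDirectSoundBuffer::GetCurrentPosition` reports them. The helper names `cursorGapBytes` and `bytesToMilliseconds` are made up for this example.

```cpp
#include <cstdint>

// Distance in bytes from the play cursor to the write cursor in a circular
// buffer of bufferBytes bytes. The write cursor may already have wrapped
// around to the start of the buffer while the play cursor has not.
std::uint32_t cursorGapBytes(std::uint32_t playCursor,
                             std::uint32_t writeCursor,
                             std::uint32_t bufferBytes)
{
    return (writeCursor >= playCursor)
               ? writeCursor - playCursor
               : bufferBytes - playCursor + writeCursor;
}

// Convert a byte count to milliseconds of audio.
// bytesPerFrame = channels * bytes per sample (for example 4 for 16-bit stereo).
double bytesToMilliseconds(std::uint32_t bytes,
                           std::uint32_t sampleRate,
                           std::uint32_t bytesPerFrame)
{
    return 1000.0 * bytes / (static_cast<double>(sampleRate) * bytesPerFrame);
}
```

With 16-bit stereo at 48 kHz, a gap of 5760 bytes is 1440 sample frames, which works out to 30 ms.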
Another issue is that if you overwrite sound you have already written, you need to re-mix sounds and effects that you already mixed before. That costs time, and it could make you late, since you read the cursors before mixing to know how much to overwrite. Unless the game really needs low audio latency, the change will be imperceptible.
So to summarize: it could work, but without much benefit.
A way I have used to reduce latency is to track how far the write cursor advances each frame over a period of time, for example the last 10 seconds. If in most frames the advance is 960 samples, try to have 960 samples queued after the write cursor each frame. If for one frame the advance is only 480, then write only 480 samples (960 - (960 - 480)), which tops the buffer back up to 960 samples ahead of the write cursor.
frame n + 0: write cursor = 0, advance 0, write 960 samples, buffer filled up to 960
frame n + 1: write cursor = 960, advance 960, write 960 samples (960 - (960-advance)), buffer filled up to 1920
frame n + 2: write cursor = 1440, advance 480, write 480 samples (960 - (960-advance)), buffer filled up to 2400
frame n + 3: write cursor = 2400, advance 960, write 960 samples (960 - (960-advance)), buffer filled up to 3360
You still have a latency of writeCursor - playCursor, but this keeps the extra latency you add on top of it (to avoid clicking sounds) as small as possible.
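The bookkeeping in the frame trace above could be sketched roughly like this. It is only an illustration under a few assumptions: the write cursor is passed in as an absolute (unwrapped) sample position, the "most frames" value is taken as the most common advance in the history window (ties go to the larger value, to be safe), and the class name `WriteScheduler` and its methods are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <map>

class WriteScheduler {
public:
    // initialTarget: samples to keep queued before any history exists.
    // historyFrames: how many recent frames to remember (600 is 10 s at 60 fps).
    WriteScheduler(std::uint64_t initialTarget, std::size_t historyFrames)
        : target_(initialTarget), historyFrames_(historyFrames) {}

    // Call once per frame with the unwrapped write cursor, in samples.
    // Returns how many samples to mix and append this frame.
    std::uint64_t samplesToWrite(std::uint64_t writeCursor) {
        if (hasPrevious_) recordAdvance(writeCursor - previousCursor_);
        previousCursor_ = writeCursor;
        hasPrevious_ = true;

        // Underrun: the cursor passed our data, so restart at the cursor.
        if (filledUpTo_ < writeCursor) filledUpTo_ = writeCursor;

        // Top the buffer up to target_ samples past the write cursor.
        const std::uint64_t desiredEnd = writeCursor + target_;
        const std::uint64_t toWrite =
            desiredEnd > filledUpTo_ ? desiredEnd - filledUpTo_ : 0;
        filledUpTo_ += toWrite;
        return toWrite;
    }

    std::uint64_t filledUpTo() const { return filledUpTo_; }

private:
    void recordAdvance(std::uint64_t advance) {
        if (advance == 0) return;  // cursor did not move; nothing to learn
        history_.push_back(advance);
        if (history_.size() > historyFrames_) history_.pop_front();

        // Use the most common advance as the target. The map iterates in
        // ascending order, so >= lets the larger value win a tie.
        std::map<std::uint64_t, std::size_t> counts;
        for (std::uint64_t a : history_) ++counts[a];
        std::size_t best = 0;
        for (const auto& entry : counts) {
            if (entry.second >= best) {
                best = entry.second;
                target_ = entry.first;
            }
        }
    }

    std::uint64_t target_;
    std::size_t historyFrames_;
    std::deque<std::uint64_t> history_;
    std::uint64_t previousCursor_ = 0;
    bool hasPrevious_ = false;
    std::uint64_t filledUpTo_ = 0;
};
```

Feeding it the write cursors 0, 960, 1440 and 2400 with an initial target of 960 gives writes of 960, 960, 480 and 960 samples, the same as in the trace. Recomputing the mode every frame is wasteful but keeps the sketch short; a running histogram would do the same job.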