Plan for today: SIMDizing the mixer
Aligning the temporary buffer
Making sure the temporary sound buffers are big enough to fit all samples
Explanation of Align16
Alignment macro for any power of two: AlignPow2
Clamping samples to the signed 16-bit integer range
(intermission) Two's complement
Back to SIMD
Rounding the samples
Downconverting from 32-bit to 16-bit integers. No clamping necessary!
Looking for intrinsics that interleave 16-bit values
Interleaving the samples before packing them
Making sure we don't write out of bounds
Debugging output using structured input
Padding the buffer in the platform layer to make sure we always have space for overwrites
Casey remembers that the horizontal mouse position was linked to music panning
Getting rid of unnecessary clamping operations
Using aligned loads and stores
Plan for next episode
More two's complement: a full example
cubercaleb Q: Why isn't 2's complement used for floating-point numbers if it makes signed arithmetic easy?
poohshoes Q: Are you not going to profile it to see how much faster it gets?
dr_s80 Q: When you implemented streaming in chunks of audio, I believe the code actually loads the entire file (with a platform layer VirtualAlloc) for each chunk. Is this just an artifact of the debug nature of that code?
ishytarus Q: Does the audio make the framerate in debug mode?
cubercaleb Q: If 1111 (-1) is supposed to be less than 0000 (0) then how do number comparisons work on the CPU level?
marumoto Q: Do you have any tips for speeding up compile time when using multiple translation units?
sssmcgrath Q: It's movsx for signed
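
For readers following the alignment and clamping markers above, here is a minimal C sketch of the two ideas. It is not taken from the episode's source: the macro bodies are an assumption about how Align16 and AlignPow2 are usually written, and ClampToInt16 is a hypothetical scalar helper illustrating the signed 16-bit saturation that the clamping markers refer to.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed definitions: round Value up to the next multiple of Alignment.
   Only valid when Alignment is a power of two, because the mask
   ~(Alignment - 1) then clears exactly the low bits. */
#define AlignPow2(Value, Alignment) \
    (((Value) + ((Alignment) - 1)) & ~((Alignment) - 1))
#define Align16(Value) AlignPow2(Value, 16)

/* Hypothetical helper: clamp a 32-bit sample to the signed 16-bit range
   [-32768, 32767] before narrowing it. */
static int16_t ClampToInt16(int32_t Sample)
{
    if (Sample > INT16_MAX) return INT16_MAX;
    if (Sample < INT16_MIN) return INT16_MIN;
    return (int16_t)Sample;
}

/* Hypothetical helper: number of bytes to reserve for SampleCount 16-bit
   samples so that the buffer size is a multiple of 16 bytes. */
static uint32_t PaddedSampleBytes(uint32_t SampleCount)
{
    uint32_t Bytes = SampleCount * (uint32_t)sizeof(int16_t);
    uint32_t Result = Align16(Bytes);
    assert((Result % 16) == 0);
    return Result;
}
```

Rounding the buffer size up this way is what lets a loop that writes 16 bytes at a time overshoot the requested sample count without writing out of bounds.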