ZeroSize and memset

In episode 65 Casey wrote the functions ZeroSize/ZeroStruct to initialize all the bytes in a contiguous memory area to zero using a while loop.
Could memset be used to replace the loop? Would that buy us anything performance-wise?
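To make the question concrete, here is a minimal sketch of the two options. The signature and names are assumptions reconstructed for illustration (ZeroSizeLoop and ZeroSizeMemset are hypothetical names), so they may not match the episode's code exactly:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-at-a-time clear, in the spirit of the loop from the episode.
   The exact signature is an assumption. */
static void ZeroSizeLoop(size_t Size, void *Ptr)
{
    uint8_t *Byte = (uint8_t *)Ptr;
    while (Size--)
    {
        *Byte++ = 0;
    }
}

/* Same contract, forwarded to the C library's memset. */
static void ZeroSizeMemset(size_t Size, void *Ptr)
{
    memset(Ptr, 0, Size);
}

/* Hypothetical ZeroStruct-style convenience macro built on the loop version. */
#define ZeroStruct(Instance) ZeroSizeLoop(sizeof(Instance), &(Instance))
```

Both versions leave the memory in the same state, so the only open question is which one is faster for a given size and alignment.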

thanks!
It would buy us a lot of performance. System memset() functions are far better tuned to take advantage of the capabilities of modern CPUs.

But it wouldn't be Handmade. No doubt memset() or something like it will be what ships.
Actually, the answer is usually "it depends" :) Sometimes the system memset is actually slower because they stick a bunch of preamble in there to see what kind of clear is being done (alignment, size, etc.).

But yes, the right thing to do if speed is an issue is to align your memory regions and clear with full-width writes, so you don't need any preamble like memset would have. Compilers like Clang actually have memset built in, but I've never tested them to see if they do the right thing on memory that is marked with an alignment qualifier, etc. I would assume so?

- Casey
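As a rough illustration of the "aligned region, no preamble" idea, here is a minimal sketch. ZeroAligned64 is a hypothetical helper, and 8-byte stores stand in for "full-width" writes; a tuned version would presumably use the widest SIMD stores the target supports. The caller is assumed to guarantee both the alignment and that the size is a multiple of the store width:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: clears Size bytes using only 8-byte stores.
   Assumes Ptr is 8-byte aligned and Size is a multiple of 8, so there is
   no head/tail handling of the kind a general-purpose memset needs. */
static void ZeroAligned64(size_t Size, void *Ptr)
{
    /* The asserts document the contract; they compile out with NDEBUG. */
    assert(((uintptr_t)Ptr % sizeof(uint64_t)) == 0);
    assert((Size % sizeof(uint64_t)) == 0);

    uint64_t *Word = (uint64_t *)Ptr;
    size_t Count = Size / sizeof(uint64_t);
    while (Count--)
    {
        *Word++ = 0;
    }
}
```

Whether this actually beats the system memset depends on the block size and the platform, so it would need to be measured rather than assumed.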
Good point. I didn't think about small blocks.
Also, some compilers actually optimize zero-setting for-loops into memset calls.

Googling about this, I found several people complaining about that behavior in Intel C and MSVC, and I recall reading about the same behavior with Clang and/or GCC as well, so I believe it's a pretty common optimization.
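For reference, this is the kind of loop I mean. ClearBytes is just a hypothetical name for the example. With optimizations enabled, a compiler may recognize the pattern and emit a call to memset instead of the loop; whether it does depends on the compiler, version, and flags (GCC, for instance, has -fno-tree-loop-distribute-patterns to turn this off), so checking the disassembly is the only way to be sure:

```c
#include <stddef.h>

/* A plain zero-setting loop. An optimizing compiler may replace the whole
   loop body with a memset call; that is an assumption to verify per build. */
static void ClearBytes(unsigned char *Dest, size_t Count)
{
    for (size_t Index = 0; Index < Count; ++Index)
    {
        Dest[Index] = 0;
    }
}
```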

Here are someone's benchmarks of for-loops vs memset (with no info about what exactly was benchmarked, or on what, but anyway): https://cs50.harvard.edu/resource...ference.com/stdstring/memset.html