Hi Guys,
I am on Episode 3 of Handmade Hero, but I'm embarrassed to say that I have already encountered something that confused me.
https://youtu.be/hNKU8Jiza2g?t=342
Casey says (in the link above): "In general on the x86 architecture often times there is a penalty for doing what's called unaligned accessing."
He goes on to explain that an unaligned access is when you operate on, say, a 32-bit value at an address that is not a multiple of 4 bytes. However, I do not see how this relates to him allocating an extra byte of padding for the bitmap's RGB pixels. I would understand if he said he allocated the extra pad byte for something like SIMD, or so that each pixel can be accessed as a single 32-bit integer as opposed to multiple chars. I would also understand if he said the OS had some faster path for DWORD-sized data, because it can access each value with a single 32-bit variable.
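To make the layout question concrete, here is a minimal sketch of how I picture it. This is my own illustration, not Casey's code: the function name `aligned_pixel_count` is made up, and it assumes the bitmap memory itself starts on a 4-byte boundary. It counts how many pixels begin at an offset that is a multiple of 4 for a given pixel size.

```c
#include <assert.h>

/* Hypothetical helper: counts how many of the first `pixel_count` pixels
   start at an offset that is a multiple of 4 bytes.
   Assumption: the start of the bitmap memory is itself 4-byte aligned. */
static int aligned_pixel_count(int bytes_per_pixel, int pixel_count)
{
    assert(bytes_per_pixel > 0 && pixel_count >= 0);

    int aligned = 0;
    for (int i = 0; i < pixel_count; ++i) {
        int offset = i * bytes_per_pixel; /* byte offset of pixel i */
        if (offset % 4 == 0) {
            ++aligned;
        }
    }
    return aligned;
}
```

With 3 bytes per pixel, the first 8 pixels start at offsets 0, 3, 6, 9, 12, 15, 18 and 21, so only 2 of them land on a 4-byte boundary. With 4 bytes per pixel, all 8 do. That only matters, as far as I can tell, if each pixel is actually read or written as one 32-bit value.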
Is unaligned access really relevant here? I don't see how it makes sense, unless Casey's plan is to access these pixels as 32-bit values on 4-byte boundaries. Also, can someone explain why unaligned access is slower for the CPU at all? My only thought is that the data may straddle a cache line, so the CPU needs to fetch it a byte at a time and then combine everything together to make sure it can actually read all the data.
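Here is a small sketch of the straddling case I have in mind. It is only my guess at the mechanism, not a description of what the hardware really does. The 64-byte line size is an assumption (it is common on x86 parts but not guaranteed), and `straddles_cache_line` is a name I made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines; the real size depends on the CPU. */
#define CACHE_LINE_SIZE 64u

/* Hypothetical helper: returns true if an access of `size` bytes starting
   at `address` touches two different cache lines. */
static bool straddles_cache_line(uintptr_t address, size_t size)
{
    assert(size > 0 && size <= CACHE_LINE_SIZE);

    uintptr_t first_line = address / CACHE_LINE_SIZE;
    uintptr_t last_line = (address + size - 1) / CACHE_LINE_SIZE;
    return first_line != last_line;
}
```

Under that assumption, a 4-byte read at address 62 covers bytes 62 through 65 and crosses from one line into the next, while a 4-byte read at address 60 or 64 stays inside a single line. A 4-byte value at an address that is a multiple of 4 can never cross a line, because 64 is itself a multiple of 4.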