6 posts
Question about cache memory speed
Edited by Tony on Reason: Initial post
I'm new to C++ and I'm trying to understand what difference there is between these two examples.

X, Y;

X = 5;
((float *)&Y)[0] = 5;

If both of them are in cache, is there a big difference between these two?

((real32 *)&(Array))[I] is the code that Casey used in the SSE optimization episode, and it got me wondering whether there is a huge drop in performance when you have to cast it down to a real32.
Mārtiņš Možeiko
2198 posts / 1 project
I think this is less about memory speed and more about compiler optimizations.

Modern compilers on modern architectures (64-bit x86) will optimize this code to use only register moves, with no memory operations.

Here's an example: https://godbolt.org/z/1lq4bZ
The float is passed in the xmm0 register, and the integer return value goes in the eax register. The compiler simply moves the bits from one register to the other. No memory operations.

Same for arm64: https://godbolt.org/z/uo2H4o
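
For readers who cannot open the links, here is a minimal sketch of the kind of function being discussed. It is not the exact code behind the godbolt links; FloatBits is a hypothetical name, and it uses memcpy rather than a pointer cast because that is the portable way to reinterpret bits. With optimizations enabled, compilers typically reduce it to a single register-to-register move.

```cpp
#include <cassert>  // only needed for the sanity checks in the usage note
#include <cstdint>
#include <cstring>

// Hypothetical helper: return the raw 32 bits of a float as an integer.
// The memcpy is normally optimized away, so no load or store remains
// in the generated code when compiling with -O2.
std::uint32_t FloatBits(float Value)
{
    static_assert(sizeof(std::uint32_t) == sizeof(float),
                  "this sketch assumes a 32-bit float");
    std::uint32_t Bits;
    std::memcpy(&Bits, &Value, sizeof(Bits));
    return Bits;
}
```

Assuming IEEE-754 floats, assert(FloatBits(1.0f) == 0x3f800000u) holds. Pasting the function into Compiler Explorer with -O2 is a quick way to see what the compiler actually emits for your target.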

Accessing an SSE register as individual floats looks like this: https://godbolt.org/z/xEWs1x
Again, no memory operations - just some register shuffling.
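
As a rough illustration of the same idea (again, not the code from the link), the sketch below reads one lane of an __m128 in two ways. GetLane and GetLaneRuntime are hypothetical names; the sketch assumes an x86 target with SSE and uses only standard intrinsics from xmmintrin.h.

```cpp
#include <cassert>      // only needed for the sanity checks in the usage note
#include <xmmintrin.h>  // SSE intrinsics, x86 only

// Compile-time lane index: shuffle the wanted lane into position 0,
// then read it out. This stays entirely in registers.
template <int Lane>
float GetLane(__m128 Value)
{
    static_assert(Lane >= 0 && Lane < 4, "an __m128 has four float lanes");
    __m128 Shuffled = _mm_shuffle_ps(Value, Value,
                                     _MM_SHUFFLE(Lane, Lane, Lane, Lane));
    return _mm_cvtss_f32(Shuffled);
}

// Runtime lane index: copy the register to a small array and index it.
// With a constant index the compiler can usually remove the store;
// with a truly variable index it may go through the stack.
float GetLaneRuntime(__m128 Value, int Index)
{
    float Lanes[4];
    _mm_storeu_ps(Lanes, Value);
    return Lanes[Index];
}
```

For example, with _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f), lane 0 holds 1.0f and lane 3 holds 4.0f, so assert(GetLane<0>(V) == 1.0f) and assert(GetLaneRuntime(V, 3) == 4.0f) both hold for that V.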
6 posts
Thanks for the reply. I think I understand it a bit better now, and if I'm wondering about something like this, I can just check the assembly code, just like you showed me.