Handmade Hero»Forums»Code»Question about cache memory speed
Tony
2 posts
Question about cache memory speed
7 months, 2 weeks ago Edited by Tony on May 1, 2019, 10:48 a.m. Reason: Initial post
I'm new to C++ and I'm trying to understand what difference there is between these two examples.

X, Y;

X = 5;
((float *)&Y)[0] = 5;

If both of them are in cache, is there a big difference between these two?

((real32 *)&(Array))[I] is the code that Casey used in the SSE optimization episode, and it got me wondering whether there is a big drop in performance if you have to cast it down to a real32.
Mārtiņš Možeiko
1991 posts / 1 project
7 months, 2 weeks ago Edited by Mārtiņš Možeiko on May 1, 2019, 5:57 p.m.
I think this is less about memory speed and more about compiler optimizations.

Modern compilers on modern architectures (64-bit x86) will optimize this code down to register moves only, with no memory operations.

Here's an example: https://godbolt.org/z/1lq4bZ
The float is passed in the xmm0 register, and the integer return value goes in the eax register. The compiler simply moves the bits from one register to the other. No memory operations.
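For reference, a minimal sketch of the kind of function being described, assuming a float-to-integer bit reinterpretation. The function names are hypothetical and this is not necessarily the exact code behind the link. The first version uses the pointer-cast style from the question (technically a strict-aliasing violation in standard C++, though compilers commonly accept it); the second uses memcpy, which optimizing compilers typically reduce to the same single register move.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical name: pointer-cast style, as in the question.
// Reads the float's storage as a 32-bit integer.
static std::uint32_t BitsViaCast(float Value)
{
    return *(std::uint32_t *)&Value;
}

// Hypothetical name: the same reinterpretation through memcpy,
// which avoids the aliasing problem. With optimizations enabled
// the copy is normally elided and no memory access remains.
static std::uint32_t BitsViaMemcpy(float Value)
{
    std::uint32_t Result;
    std::memcpy(&Result, &Value, sizeof(Result));
    return Result;
}
```

For example, BitsViaMemcpy(5.0f) yields 0x40A00000, the IEEE 754 bit pattern of 5.0f.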

Same for arm64: https://godbolt.org/z/uo2H4o

Accessing an SSE register as individual floats looks like this: https://godbolt.org/z/xEWs1x
Again, no memory operations - just some register shuffling.
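A minimal sketch of that lane access, assuming an x86 target with SSE. The helper names are hypothetical and this is not necessarily the code behind the link. The first helper mirrors the ((real32 *)&(Array))[I] pattern from the question; the second stores the register to a small array first, which compilers can usually turn back into shuffles when the index is a compile-time constant.

```cpp
#include <xmmintrin.h> // SSE intrinsics (x86 / x86-64 only)

// Hypothetical name: read lane Index (0..3) through a pointer cast,
// the same pattern as ((real32 *)&(Array))[I].
static float LaneViaCast(__m128 Value, int Index)
{
    return ((float *)&Value)[Index];
}

// Hypothetical name: read lane Index (0..3) by storing all four
// lanes to a local array and indexing it.
static float LaneViaStore(__m128 Value, int Index)
{
    float Lanes[4];
    _mm_storeu_ps(Lanes, Value); // unaligned store of the 4 floats
    return Lanes[Index];
}
```

With V = _mm_setr_ps(1.0f, 2.0f, 3.0f, 4.0f), lane 0 holds 1.0f and lane 3 holds 4.0f in both helpers.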
Tony
2 posts
7 months, 2 weeks ago
Thanks for the reply. I think I understand it a bit better now, and if I'm wondering about something like this, I can just check the assembly code, just like you showed me.