rdtsc gcc and asm

WARNING VERY SCARY ASM CODE:

I am in the process of moving code from my old computer to a new one, and on this computer I am using GCC to compile instead of MSVC.

Of course GCC does not allow the use of __rdtsc, so I searched the internets looking for a way to get the same functionality with GCC. I found the code in question and took my best shot at understanding it, but I could not fully understand it.

I would like someone to explain this to me. I do not feel good just copying the code and being done with it. I need to know how it works to be fully satisfied with myself.

Here is the scawy code:

#if defined(__i386__)
static __inline__ unsigned long long rdtsc(void)
{
    unsigned long long int x;
    __asm__ volatile (".byte 0x0f, 0x31" : "=A" (x));
    return x;
}

#elif defined(__x86_64__)

static __inline__ unsigned long long rdtsc(void)
{
    unsigned hi, lo;
    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
    return ( (unsigned long long)lo)|( ((unsigned long long)hi)<<32 );
}
#endif

Thanks

~ Connor

Non-ancient versions of GCC support __rdtsc just fine. You include "x86intrin.h" and __rdtsc() will be available.
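
As a rough illustration of that, here is a minimal sketch that times a small piece of work with __rdtsc(). It assumes GCC or Clang targeting x86 or x86-64; sum_to and ticks_for_sum are hypothetical helper names made up for this example.

```c
#include <x86intrin.h>  /* provides __rdtsc() on GCC and Clang for x86 targets */

/* Hypothetical workload, used only to have something to time. */
static unsigned long long sum_to(unsigned n)
{
    volatile unsigned long long total = 0;  /* volatile keeps the loop from being optimized away */
    for (unsigned i = 0; i < n; i++)
        total += i;
    return total;
}

/* Returns the raw TSC tick difference around one call of sum_to(n).
   The computed sum is written to *result. */
static unsigned long long ticks_for_sum(unsigned n, unsigned long long *result)
{
    unsigned long long start = __rdtsc();
    *result = sum_to(n);
    unsigned long long end = __rdtsc();
    return end - start;
}
```

The returned value is in raw counter ticks, not nanoseconds, and a single measurement can be noisy, so in practice you would probably repeat it a few times.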

If you are forced to use a very old GCC version, you can implement rdtsc with inline assembly. I would recommend code like this rather than the code you pasted (there is no reason to use .byte to encode the rdtsc instruction):

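static inline unsigned long long __rdtsc(void)
{
#if defined(__i386__)
    /* "=A" binds the EDX:EAX register pair to one 64-bit variable. */
    unsigned long long tick;
    __asm__ __volatile__("rdtsc" : "=A"(tick));
    return tick;
#elif defined(__x86_64__)
    /* "=a" receives EAX (low 32 bits), "=d" receives EDX (high 32 bits). */
    unsigned int tickl, tickh;
    __asm__ __volatile__("rdtsc" : "=a"(tickl), "=d"(tickh));
    return ((unsigned long long)tickh << 32) | tickl;
#else
#error "rdtsc is only available on x86 and x86-64 targets"
#endif
}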

This code contains two slightly different inline assembly statements, depending on whether you are compiling as 32-bit or 64-bit. They execute exactly the same instruction, but GCC inline assembly needs a slightly different specification of the output operands in each case.

rdtsc stores its result in the EDX:EAX register pair: a 64-bit value split into two 32-bit halves.
For 32-bit code you can express that with the "A" output constraint. It tells the compiler to treat the EDX:EAX pair as one 64-bit variable (tick).

For 64-bit code, "A" won't work, because the compiler will simply allocate a single full 64-bit register (such as RAX) for the tick variable. So you need to name each register individually: the "a" constraint puts the value of EAX into the tickl variable (low bits), and "d" does the same for EDX and the tickh variable (high bits). After that you simply shift the high bits into their proper place, OR in the low bits, and return the result.
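
To make the last step concrete, here is a small sketch in plain C of just the "shift and OR" part, with no assembly involved. The function name combine_halves is hypothetical and only there for illustration.

```c
#include <stdint.h>

/* Joins two 32-bit halves into one 64-bit value, the same way the
   64-bit rdtsc wrapper joins EDX (high) and EAX (low). */
static uint64_t combine_halves(uint32_t hi, uint32_t lo)
{
    /* The cast must happen before the shift: shifting a 32-bit value
       left by 32 would not produce the intended result. */
    return ((uint64_t)hi << 32) | lo;
}
```

For example, combine_halves(0x1, 0x2) gives 0x0000000100000002.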

But as I said at the beginning, simply include x86intrin.h and you'll have the __rdtsc function.

Actually it is even better! This __rdtsc function is implemented using a special "builtin" function called __builtin_ia32_rdtsc. You don't need to include anything to use it; the compiler provides it automatically. So simply calling __builtin_ia32_rdtsc() will work just fine with GCC (and with Clang too).

As far as I can tell, GCC has had the __builtin_ia32_rdtsc builtin since at least version 4.6, and maybe even earlier.
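
A minimal sketch of the builtin approach, assuming GCC or Clang targeting x86 or x86-64. The wrapper name read_tsc is hypothetical; only __builtin_ia32_rdtsc comes from the compiler.

```c
/* No #include is needed for the builtin itself. */
static unsigned long long read_tsc(void)
{
    /* Returns the full 64-bit time stamp counter value. */
    return __builtin_ia32_rdtsc();
}
```

Wrapping it like this keeps the compiler-specific name in one place, so code that has to build with MSVC as well could switch to __rdtsc from intrin.h there.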