After reading the articles linked in this thread, I wanted to try to use only rdtsc to get time information.
I had previously looked at the disassembly of QueryPerformanceCounter (QPC) and knew that it uses rdtsc and modifies the value before returning it. I know it's not a good idea to rely on rdtsc directly, since QueryPerformanceCounter does different things depending on the hardware, BIOS and OS version, so my goal was just to make it work on my machine (at least at first). My reason for wanting this is that in my profiling code I use both QPC and rdtsc: two sequential calls to QPC can return the same value, and I wanted a way to represent events with finer granularity, which I can more or less do with rdtsc (as it never returns the same value). Also, if I could rely on the rdtsc value alone, it would save 8 bytes per event (events are 16 bytes at the moment).
I failed at what I wanted to do (using rdtsc and applying the transformation that QPC does myself), so I wanted to ask whether anybody knows how to achieve it.
| #include <windows.h>
int main( int argc, char** argv ) {
    LARGE_INTEGER t;
    QueryPerformanceCounter( &t );
    return 0;
}
|
So the assembly for QPC is:
|  1  sub rsp, 0x28
   2  test byte ptr [0x7ffe02ed], 0x1
   3  mov r9, rcx
   4  jz 0x7719b930
   5  mov r8, qword ptr [0x7ffe03b8]
   6  rdtsc
   7  movzx ecx, byte ptr [0x7ffe02ed]
   8  shl rdx, 0x20
   9  or rax, rdx
  10  shr ecx, 0x2
  11  add rax, r8
  12  shr rax, cl
  13  mov qword ptr [r9], rax
  14  mov eax, 0x1
  15  add rsp, 0x28
  16  ret
|
The jump on line 4 was never taken on my machine.
The code reads rdtsc, adds the 64-bit value stored at address 0x7ffe03b8, and then divides by 1024 (shr rax, cl, where cl is 10).
The value at 0x7ffe03b8 is 353662723 ( 0x15147703 ). I'm not sure if the value is the same after rebooting my machine.
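Written out in C, the arithmetic of the listing looks roughly like the sketch below. It only mirrors the disassembly shown above and is not a documented Windows interface: `tsc_to_qpc` and its parameter names are made up for illustration, `flags` stands for the byte read from 0x7ffe02ed and `bias` for the 64-bit value read from 0x7ffe03b8.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the fast path of the listing above.
 * flags: byte read from 0x7ffe02ed, bias: qword read from 0x7ffe03b8.
 * Returns 1 and stores the result on the fast path, 0 if the jz on
 * line 4 would have been taken (that path is not reproduced here). */
static int tsc_to_qpc(uint64_t tsc, uint8_t flags, uint64_t bias, uint64_t *out)
{
    if ((flags & 0x1) == 0)          /* test byte ptr [...], 0x1 / jz */
        return 0;

    unsigned shift = flags >> 2;     /* movzx ecx, byte ptr [...] / shr ecx, 0x2 */
    *out = (tsc + bias) >> shift;    /* add rax, r8 / shr rax, cl */
    return 1;
}
```

With a shift of 10, the result only changes once every 1024 TSC ticks, which is consistent with two sequential QPC calls returning the same value.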
My first question is: where does the 0x7ffexxxx memory range come from? The debugger doesn't list any module that contains it in its address range. How can I find out what it is? I tried stopping on the very first instruction of the program, and the values at those addresses are already set.
Since I couldn't figure out where those values came from, I tried reading the Intel documentation:
- Time stamp counter, in Volume 3, chapter 17.17;
- Counting clocks, in Volume 3, chapter 18.7;
- CPUID instruction, in Volume 2, page 300 (3-198);
It seems that by using CPUID with eax set to 0x15 I could compute the TSC frequency, but my CPU (i7 860) doesn't support CPUID leaves above 0xb. Chapter 18.7.3.2 specifies that Nehalem-based processors should use MSR_PLATFORM_INFO to get the value, but reading an MSR with rdmsr requires the code to run in kernel mode. I think I could write a kernel driver to do that, but I don't want to do it at the moment (I was thinking about writing one to be able to query cache misses, but I don't actually know what writing a kernel driver involves or how to use one).
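For reference, the two computations described in those chapters can be sketched as plain arithmetic on the raw register values. This is a minimal sketch under assumptions: the function names are hypothetical, the inputs are expected to come from CPUID leaf 0x15 (EAX = ratio denominator, EBX = ratio numerator, ECX = crystal frequency in Hz) and from a kernel-mode read of MSR_PLATFORM_INFO, and the 133.33 MHz base clock applies to Nehalem only.

```c
#include <stdint.h>

/* CPUID leaf 0x15: TSC Hz = crystal Hz * numerator / denominator.
 * Returns 0 when any field is zero, meaning it is not enumerated. */
static uint64_t tsc_hz_from_cpuid15(uint32_t eax, uint32_t ebx, uint32_t ecx)
{
    if (eax == 0 || ebx == 0 || ecx == 0)
        return 0;
    return (uint64_t)ecx * ebx / eax;
}

/* Nehalem: bits 15:8 of MSR_PLATFORM_INFO hold the maximum non-turbo
 * ratio, which is multiplied by the 133.33 MHz base clock (400 MHz / 3). */
static uint64_t tsc_hz_from_platform_info(uint64_t msr_value)
{
    uint64_t ratio = (msr_value >> 8) & 0xff;
    return ratio * 400000000ull / 3;
}
```

As an illustration, a ratio of 21 would give 2.8 GHz. Neither function obtains the inputs itself; the second one still needs the MSR value from kernel mode.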
So does anyone know how to get the "transformation" needed to convert rdtsc values to either seconds or QPC-compatible values?
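One user-mode fallback that avoids both CPUID leaf 0x15 and MSRs is to calibrate rdtsc against QPC: take a paired (rdtsc, QPC) sample, wait a while, take a second pair, and derive the TSC frequency from the two deltas. The sketch below shows only that arithmetic; the function names are hypothetical, and it assumes the TSC runs at a constant rate and that each pair of samples is taken back to back.

```c
#include <stdint.h>

/* Estimate the TSC frequency from two paired samples.
 * qpc_hz is the value reported by QueryPerformanceFrequency.
 * Returns 0.0 if the QPC interval is empty or the frequency is invalid. */
static double estimate_tsc_hz(uint64_t tsc0, uint64_t tsc1,
                              int64_t qpc0, int64_t qpc1, int64_t qpc_hz)
{
    if (qpc1 <= qpc0 || qpc_hz <= 0)
        return 0.0;
    double elapsed_s = (double)(qpc1 - qpc0) / (double)qpc_hz;
    return (double)(tsc1 - tsc0) / elapsed_s;
}

/* Convert a TSC delta to seconds using the estimated frequency. */
static double tsc_to_seconds(uint64_t tsc_delta, double tsc_hz)
{
    return (double)tsc_delta / tsc_hz;
}
```

On Windows the samples would come from the `__rdtsc()` intrinsic and from QueryPerformanceCounter / QueryPerformanceFrequency. The longer the interval between the two pairs, the smaller the error introduced by the gap between reading rdtsc and reading QPC. This yields seconds, not values that match QPC bit for bit.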