I have been doing my own experiments using __rdtsc() after learning about it on HMH (thanks Casey!), and recently discovered that XInputGetState() appears to be terribly expensive to call: on the order of several milliseconds.
I googled and found this:
http://n-cpcom.googlecode.com/svn...der/input/xinput/xinputgamepad.cc
I checked whether the call is less expensive for controllers that are actually connected, and that does indeed seem to be the case.
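For anyone who wants to cross-check the numbers, here is a minimal sketch of a timing harness. It uses std::chrono::steady_clock rather than __rdtsc(), so it is an independent measurement and not the code that produced the timings above. The name averageMicroseconds is a hypothetical helper, not part of any library.

```cpp
#include <chrono>

// Hypothetical helper: runs `fn` `calls` times (calls must be > 0) and
// returns the average wall-clock cost of one call in microseconds.
template <typename Fn>
double averageMicroseconds(Fn&& fn, int calls) {
    using Clock = std::chrono::steady_clock;  // monotonic, unaffected by clock changes
    const auto start = Clock::now();
    for (int i = 0; i < calls; ++i) {
        fn();
    }
    const auto end = Clock::now();
    const std::chrono::duration<double, std::micro> elapsed = end - start;
    return elapsed.count() / calls;
}
```

On Windows, with the XInput headers and library available, something like `averageMicroseconds([&] { XINPUT_STATE s = {}; XInputGetState(1, &s); }, 100)` for an empty slot versus a connected one should show whether the difference is real.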
It would be great if someone could confirm this so it isn't just a case of my timing code being broken. This all kinds of freaks me out since I have shipped software that polls all 4 possible controllers every frame (60 Hz), and I'm worried that I'm giving up several milliseconds of my precious 16.6 ms per frame for nothing...
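If the cost really is tied to disconnected slots, one possible workaround is to poll connected controllers every frame and retry empty slots only occasionally. The following is a minimal sketch under that assumption. PadPoller, PollFn, and the 120-frame retry interval are all hypothetical choices for illustration; the poll callback stands in for a wrapper around XInputGetState() that returns true when the call reports a connected controller.

```cpp
#include <array>
#include <functional>

// Hypothetical callback: returns true if the controller in `slot` is connected.
// On Windows this would wrap XInputGetState(slot, &state) == ERROR_SUCCESS.
using PollFn = std::function<bool(int slot)>;

struct PadPoller {
    static constexpr int kSlots = 4;          // XInput exposes up to 4 controllers
    static constexpr int kRetryFrames = 120;  // assumed interval, ~2 s at 60 Hz

    std::array<bool, kSlots> connected{};        // last known state per slot
    std::array<int, kSlots> framesUntilRetry{};  // countdown for empty slots

    // Call once per frame. Returns how many poll calls were made this frame.
    int update(const PollFn& poll) {
        int calls = 0;
        for (int slot = 0; slot < kSlots; ++slot) {
            // Skip empty slots until their countdown expires.
            if (!connected[slot] && framesUntilRetry[slot] > 0) {
                --framesUntilRetry[slot];
                continue;
            }
            connected[slot] = poll(slot);
            ++calls;
            if (!connected[slot]) {
                framesUntilRetry[slot] = kRetryFrames;
            }
        }
        return calls;
    }
};
```

The trade-off is that a newly plugged-in controller can take up to kRetryFrames frames to be noticed, which is probably acceptable for a hot-plug event but worth tuning.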