Win32 timeBeginPeriod() alternative for OS X

Hey guys!

In the Win32 platform layer, Casey uses the timeBeginPeriod() function (https://msdn.microsoft.com/en-us/library/dd757624(v=vs.85).aspx) to tell the Windows scheduler to be more precise when it wakes the HMH application up from Sleep().

Do we have something like this on OS X? I've found something called "real-time threads" (https://developer.apple.com/libra...gramming/scheduler/scheduler.html) and it sounds nice, but the document itself is more about kernel-level programming, hardware drivers and such, so I am not sure whether this stuff is actually usable by normal applications and whether I should try to use it or not.
As far as I know, OS X and Linux don't allow changing the kernel scheduler granularity from user code. That could make the system less stable. You need to recompile the kernel to change the scheduler granularity.

You can check the resolution of the scheduler by calling sysconf(_SC_CLK_TCK). It returns the tick rate in Hz.
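For example, a minimal sketch of that check:

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    long hz = sysconf(_SC_CLK_TCK);   // scheduler ticks per second
    printf("scheduler tick rate: %ld Hz\n", hz);
    return 0;
}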
I've been wondering about this too, and came across the same info on thread priority. I briefly tried setting the thread priority for the game loop, but it didn't seem to have any effect.

Currently I'm using mach_absolute_time() and mach_wait_until() for the timing, and this is a sampling of the frame times I'm getting on my machine:
33.373106ms/frame
33.752043ms/frame
33.407167ms/frame
33.768781ms/frame
33.811562ms/frame
33.737440ms/frame
33.974245ms/frame
34.046318ms/frame
33.900373ms/frame
33.344060ms/frame


So I would be interested to know if there is a way to tighten up the timing on OS X even further to hit that 33.33ms time consistently. However, as long as we take dt into account (which we currently do), these tiny deviations may not be that big of a deal?
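For reference, here is roughly how that loop can be structured (a sketch, not my exact code; the 30Hz target and the empty frame body are placeholders):

#include <mach/mach_time.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    mach_timebase_info_data_t timebase;
    mach_timebase_info(&timebase);  // ratio converting mach ticks to nanoseconds

    // Target 30Hz: convert 33.33ms into mach absolute-time ticks.
    uint64_t target_ticks = (uint64_t)((1e9 / 30.0) * timebase.denom / timebase.numer);

    uint64_t frame_start = mach_absolute_time();
    for (int frame = 0; frame < 10; ++frame)
    {
        // ... update and render the frame here ...

        // Sleep until this frame's absolute deadline.
        mach_wait_until(frame_start + target_ticks);

        uint64_t frame_end = mach_absolute_time();
        double ms = (double)(frame_end - frame_start) * timebase.numer / timebase.denom / 1e6;
        printf("%fms/frame\n", ms);
        frame_start = frame_end;
    }
    return 0;
}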
Raising priority will help only if our application is missing its target frame time because of other applications running on the system. It won't help if the kernel scheduler granularity is too coarse.

I would suggest not worrying much about this for now. Casey will switch to OpenGL at some point to use hardware-accelerated rendering. And OpenGL can synchronize with vsync, so all the current sleeping code will go away on platforms with OpenGL hardware acceleration (like Windows, Linux, OS X, Android, iOS, even Raspberry Pi).
Flyingsand, thanks for mach_wait_until(); I missed it for some reason. I am using plain usleep() right now and getting the same timings as you.

mmozeiko, I am already using OpenGL with vsync enabled. But I want to also sleep after the synced flush, for two reasons:
  1. If I understand correctly, vsync could be unavailable on some hardware. Is that really so?
  2. What if vsync syncs at 120Hz but I want to always run at 30Hz and keep my frame on the screen for 4 refreshes?
I'm pretty sure any hardware from at least the last 10 years will let you enable vsync.

As for waiting several vsyncs: I'm not sure about OS X, but on Linux and Windows you can specify how many refreshes to wait before displaying the frame. So basically you measure how long your frame took, and if it took <8.3ms you wait 4 refreshes, if <16.7ms you wait 3 refreshes, if <25ms you wait 2 refreshes, else you wait 1 refresh.
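A sketch of that selection logic (the thresholds assume a 120Hz display and a 30Hz target; how the interval actually gets applied is platform-specific, e.g. wglSwapIntervalEXT on Windows or glXSwapIntervalEXT on Linux):

// Pick how many vertical refreshes to wait based on how long the frame took.
static int SwapIntervalForFrameTime(double frame_ms)
{
    if (frame_ms < 8.3)  return 4;  // finished within 1 refresh: wait 4
    if (frame_ms < 16.7) return 3;  // within 2 refreshes: wait 3
    if (frame_ms < 25.0) return 2;  // within 3 refreshes: wait 2
    return 1;                       // running late: wait only 1
}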
@Flyingsand, so I've just tried the CPU-melting way for the first time =) And I am getting a quite solid 33.33ms. I can't get a solid 16.6ms no matter what I do; I get peaks of 27-30ms from time to time.

So I want to investigate all of this a bit more. Could you please share your brief experiment with setting thread priority so I can play with it a bit more? Or could you just explain a bit how to get started with all of this scheduler configuration stuff?
It could be that you simply cannot get 16.6ms with the current bitmap drawing code; it is terribly inefficient. Try commenting out the drawing code and see if that helps.
mmozeiko, I don't think that's the case. I am only drawing the WeirdGradient right now, which is nearly instantaneous. I am doing some OpenGL to put the result on the screen, but I get these peaks even if I comment that out.

Actually, I've profiled this a bit, and I see that sometimes it's my sleep that overshoots and sometimes it's my OpenGL buffer flush. Both are blocking calls, so I think it's the scheduler that's slowing me down.

vbo
@Flyingsand, so I've just tried the CPU-melting way for the first time =) And I am getting a quite solid 33.33ms. I can't get a solid 16.6ms no matter what I do; I get peaks of 27-30ms from time to time.

So I want to investigate all of this a bit more. Could you please share your brief experiment with setting thread priority so I can play with it a bit more? Or could you just explain a bit how to get started with all of this scheduler configuration stuff?


Cool. What CPU-melting way do you speak of?

The extent of my experiment with thread priority was basically this:
#include <mach/mach.h>
#include <mach/thread_policy.h>

// Raise this thread's precedence relative to the other threads in the task.
thread_precedence_policy_data_t policyData;
policyData.importance = 8;
kern_return_t result = thread_policy_set(mach_thread_self(),
                                         THREAD_PRECEDENCE_POLICY,
                                         (thread_policy_t)&policyData,
                                         THREAD_PRECEDENCE_POLICY_COUNT);
if (result != KERN_SUCCESS)
{
    // Error...
}
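The "real-time threads" from the Apple doc you linked appear to map to THREAD_TIME_CONSTRAINT_POLICY in the same API, which tells the scheduler how much CPU time your thread needs per period. I haven't tried it myself, so treat this as an untested sketch (the millisecond numbers are made up placeholders):

#include <mach/mach.h>
#include <mach/mach_time.h>
#include <mach/thread_policy.h>

// Untested sketch: ask the scheduler for real-time guarantees.
static void MakeThreadRealtime(void)
{
    mach_timebase_info_data_t timebase;
    mach_timebase_info(&timebase);

    // Convert milliseconds to mach absolute-time ticks.
    double ticks_per_ms = 1e6 * (double)timebase.denom / (double)timebase.numer;

    thread_time_constraint_policy_data_t policy;
    policy.period      = (uint32_t)(33.33 * ticks_per_ms); // frame interval
    policy.computation = (uint32_t)(2.0 * ticks_per_ms);   // CPU time needed per period
    policy.constraint  = (uint32_t)(4.0 * ticks_per_ms);   // deadline window for that work
    policy.preemptible = 1;

    kern_return_t result = thread_policy_set(mach_thread_self(),
                                             THREAD_TIME_CONSTRAINT_POLICY,
                                             (thread_policy_t)&policy,
                                             THREAD_TIME_CONSTRAINT_POLICY_COUNT);
    if (result != KERN_SUCCESS)
    {
        // Error...
    }
}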

I'm not really the person to ask about scheduling/timing stuff like this, though, as I haven't had much experience with it, and I don't really know the inner workings of OS X/Unix at the kernel level.
Cool. What CPU-melting way do you speak of?

I was speaking about a closed loop after the OpenGL buffer flush with no sleeping at all. So-called busy waiting.

Anyway, after some work I've achieved (at least on my machine, with the day 32 game sources) quite solid 60 FPS timing even without CPU-melting. It looks like there was just a bug somewhere in my previous implementation, and we don't actually need to tweak scheduler priorities!

It looks like this:

0.016667s/frame
0.016667s/frame
0.016667s/frame
0.016667s/frame
0.016894s/frame
0.016667s/frame
0.016667s/frame
0.016774s/frame
0.016667s/frame
0.016667s/frame

Actually, mach_wait_until() precision without any tweaking is around 500-750 microseconds, and Apple officially suggests treating anything worse as a hardware bug (https://developer.apple.com/library/ios/technotes/tn2169/_index.html). I don't know whether we can believe that, though. Better to test all of this on some older Macs.
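If worse precision does show up on older hardware, a common fallback (just a sketch, not something from the HMH sources) is to mach_wait_until() a bit before the deadline and busy-wait the remainder:

#include <mach/mach_time.h>
#include <stdint.h>

// Sketch: sleep until shortly before the deadline, then spin the rest.
// The margin is a guess to cover the 500-750us wake-up slop mentioned above.
static void WaitUntilDeadline(uint64_t deadline_ticks, uint64_t margin_ticks)
{
    if (mach_absolute_time() + margin_ticks < deadline_ticks)
    {
        mach_wait_until(deadline_ticks - margin_ticks);
    }
    while (mach_absolute_time() < deadline_ticks)
    {
        // spin (the CPU-melting part, but only for the last fraction of a millisecond)
    }
}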
vbo
Cool. What CPU-melting way do you speak of?

I was speaking about a closed loop after the OpenGL buffer flush with no sleeping at all. So-called busy waiting.

Anyway, after some work I've achieved (at least on my machine, with the day 32 game sources) quite solid 60 FPS timing even without CPU-melting. It looks like there was just a bug somewhere in my previous implementation, and we don't actually need to tweak scheduler priorities!

It looks like this:

0.016667s/frame
0.016667s/frame
0.016667s/frame
0.016667s/frame
0.016894s/frame
0.016667s/frame
0.016667s/frame
0.016774s/frame
0.016667s/frame
0.016667s/frame

Actually, mach_wait_until() precision without any tweaking is around 500-750 microseconds, and Apple officially suggests treating anything worse as a hardware bug (https://developer.apple.com/library/ios/technotes/tn2169/_index.html). I don't know whether we can believe that, though. Better to test all of this on some older Macs.


Yes, I saw that tech note from Apple as well. My Mac is actually coming up on 7 years old, so maybe that's my problem with the timing... :unsure: