Using AVX-512 instructions when you are compiling 32-bit code seems very wrong... :)
Typically you implement _ftoui3 like this:
```c
const float next_after_max_signed_int = 2147483648.0f; // 2^31, represented exactly in float
if (float_value >= next_after_max_signed_int)
{
    // too large for a signed conversion: shift down by 2^31, convert, then set the top bit back
    int result = (int)(float_value - next_after_max_signed_int);
    return (unsigned int)result ^ 0x80000000u;
}
else
{
    // fits in a signed int, so the ordinary signed conversion is enough
    int result = (int)float_value;
    return (unsigned int)result;
}
```
The if statement does not need to be a real branch; it can be a conditional move.
This is what it would look like with SSE2:
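```asm
// assumes float input is in xmm0 register
movss xmm1, [value_of_2147483648_constant]  // xmm1 = 2147483648.0f
movss xmm2, xmm0
subss xmm2, xmm1                            // xmm2 = input - 2147483648.0f
cvttss2si eax, xmm2                         // eax = (int)(input - 2147483648.0f)
cvttss2si edx, xmm0                         // edx = (int)input
xor eax, 0x80000000                         // set the top bit back for the large case
ucomiss xmm1, xmm0                          // compare 2147483648.0f with input
cmova eax, edx                              // if 2147483648.0f > input, use the direct conversion
// unsigned int value now is in eax register
```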
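The same select idea can be sketched in plain C. This is only an illustration, not what any particular runtime does: the name `float_to_u32_select` is made up, and it assumes the input is not NaN and lies in the range [0, 4294967296). It does a single conversion on a value that always fits in a signed int, so the compiler is free to turn the ternary into a conditional move or a mask instead of a branch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
   Assumes value is not NaN and 0 <= value < 4294967296.0f. */
static uint32_t float_to_u32_select(float value)
{
    const float two_pow_31 = 2147483648.0f;    /* exactly representable in float */
    assert(value >= 0.0f && value < 4294967296.0f);

    int is_big = value >= two_pow_31;           /* 1 when the result needs bit 31 set */
    float bias = is_big ? two_pow_31 : 0.0f;    /* select, no branch required */
    int32_t low = (int32_t)(value - bias);      /* always within signed int range */

    /* put bit 31 back for the large case */
    return (uint32_t)low | ((uint32_t)is_big << 31);
}
```

Calling `float_to_u32_select(3000000000.0f)` goes through the "big" path: 2147483648 is subtracted, the remainder is converted as a signed int, and bit 31 is OR-ed back in.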
Not sure if this will handle all the Infinity or NaN floats exactly the same as your regular C runtime cast, but otherwise it should work exactly the same.
A similar approach works for doubles in 32-bit code.
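For doubles, a minimal sketch could look like the following. The function name `double_to_u32` is made up for this example, and it assumes the input is not NaN and lies in [0, 4294967296). In assembly the corresponding scalar double instructions would be `subsd`, `cvttsd2si` and `ucomisd`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
   Assumes value is not NaN and 0 <= value < 4294967296.0. */
static uint32_t double_to_u32(double value)
{
    const double two_pow_31 = 2147483648.0;
    assert(value >= 0.0 && value < 4294967296.0);

    if (value >= two_pow_31)
    {
        /* shift into signed range, convert, then set the top bit back */
        int32_t low = (int32_t)(value - two_pow_31);
        return (uint32_t)low ^ 0x80000000u;
    }

    /* already fits in a signed int */
    return (uint32_t)(int32_t)value;
}
```

Unlike float, a double can hold every 32-bit unsigned integer exactly, so inputs such as 4294967295.0 round-trip without loss.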
If you want to avoid SSE instructions, you can probably use x87 FPU instructions in a similar way, or alternatively you can extract the mantissa and exponent bits and use them to calculate the actual value yourself. Check the code in the LLVM compiler-rt library that does this:
https://github.com/llvm-mirror/co.../builtins/fp_fixuint_impl.inc#L17
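To give an idea of what the bit-extraction route involves, here is a rough sketch for IEEE 754 single precision. It is loosely modeled on that style of code but is not a copy of it; the name `float_to_u32_bits` is made up, and returning 0 for negative inputs and `UINT32_MAX` for anything too large (including infinity and positive NaN) is a choice made for this example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, for illustration only. Assumes IEEE 754 binary32 floats. */
static uint32_t float_to_u32_bits(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);          /* reinterpret the float as raw bits */

    uint32_t sign = bits >> 31;
    int32_t exponent = (int32_t)((bits >> 23) & 0xFF) - 127;     /* remove the bias */
    uint32_t significand = (bits & 0x007FFFFFu) | 0x00800000u;   /* add implicit leading 1 */

    if (sign || exponent < 0)
        return 0;                    /* negative, or magnitude below 1 (zero, subnormals) */
    if (exponent >= 32)
        return UINT32_MAX;           /* does not fit: saturate (assumption of this sketch) */

    /* the significand holds 1.fraction scaled by 2^23, so shift by the difference */
    if (exponent < 23)
        return significand >> (23 - exponent);
    return significand << (exponent - 23);
}
```

This uses only integer operations, so it needs neither SSE nor x87, at the cost of a few more instructions than a hardware conversion.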