Dejan
25 posts
Guide - How to avoid C/C++ runtime on Windows
FYI: I've set this up with VS2015 and it looks like memset() doesn't need to be defined anymore. The BigArray works fine without it on an x64 build.

Thanks for the detailed guide, very handy.
Andrew Kelley
7 posts
Open-source software, electronic music production, video game development
Guide - How to avoid C/C++ runtime on Windows
Thanks for this very informative post!

(6) No new/delete C++ operators; they use global new/delete functions. You'll need to either override new/delete for each class, or implement global new/delete functions yourself.

You can use the "placement new" operator to call C++ constructors without depending on any external symbols or libstdc++. Example:

#include <new>
// create<MyClass>(a, b) is equivalent to: new MyClass(a, b)
template<typename T, typename... Args>
__attribute__((malloc)) static inline T * create(Args... args) {
    T * ptr = reinterpret_cast<T*>(malloc(sizeof(T)));
    if (!ptr)
        panic("create: out of memory");
    new (ptr) T(args...);
    return ptr;
}

// allocate<MyClass>(10) is equivalent to: new MyClass[10]
// calls the default constructor for each item in the array.
template<typename T>
__attribute__((malloc)) static inline T * allocate(size_t count) {
    T * ptr = reinterpret_cast<T*>(malloc(count * sizeof(T)));
    if (!ptr)
        panic("allocate: out of memory");
    for (size_t i = 0; i < count; i++)
        new (&ptr[i]) T;
    return ptr;
}

// calls destructors and frees the memory.
// the count parameter is only used to call destructors of array elements.
// provide a count of 1 if this is not an array,
// or a count of 0 to skip the destructors.
template<typename T>
static inline void destroy(T * ptr, size_t count) {
    if (ptr) {
        for (size_t i = 0; i < count; i += 1)
            ptr[i].~T();
    }
    free(ptr);
}


Obviously this does depend on malloc, free, and panic, left to be implemented as an exercise for the reader. But it does not introduce any dependencies on anything else.
Mārtiņš Možeiko
2559 posts / 2 projects
Guide - How to avoid C/C++ runtime on Windows
Why would you want something like that instead of global new/delete operators? Defining them doesn't introduce a dependency on libstdc++.
Andrew Kelley
7 posts
Open-source software, electronic music production, video game development
Guide - How to avoid C/C++ runtime on Windows
What would that look like? I must admit that when you say "global new/delete operators" I don't have a clear understanding of what that entails.
Mārtiņš Možeiko
2559 posts / 2 projects
Guide - How to avoid C/C++ runtime on Windows
It would look like this: https://gist.github.com/mmozeiko/...25210cbe61bff49b6375762976acc866e

You'll also need to override the array forms, new[] and delete[]. See here:
http://en.cppreference.com/w/cpp/memory/new/operator_new
http://en.cppreference.com/w/cpp/memory/new/operator_delete
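
For readers who cannot open the gist, here is a minimal sketch of what global replacement operators can look like. It is not the gist's code. malloc/free and abort are stand-ins (assumptions) for whatever allocator and panic routine a CRT-free build provides, for example a thin wrapper over the Win32 heap functions, and Tracked is a hypothetical class that exists only to make the constructor and destructor calls observable.

```cpp
#include <cstddef>
#include <cstdlib>

// Assumption: malloc/free stand in for the allocator of the CRT-free build.
void* operator new(std::size_t size)
{
    void* p = std::malloc(size ? size : 1); // never return null for size 0
    if (!p)
        std::abort(); // stand-in for a panic() routine
    return p;
}

void* operator new[](std::size_t size) { return operator new(size); }

void operator delete(void* p) noexcept { std::free(p); }
void operator delete[](void* p) noexcept { std::free(p); }

// Sized forms (C++14); some compilers call these instead of the ones above.
void operator delete(void* p, std::size_t) noexcept { std::free(p); }
void operator delete[](void* p, std::size_t) noexcept { std::free(p); }

// Hypothetical class used only to observe constructor/destructor calls.
struct Tracked
{
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;
```

With these definitions in one translation unit, ordinary expressions such as `new Tracked`, `new Tracked[3]`, `delete` and `delete[]` go through the functions above, and the compiler still emits the constructor and destructor calls itself.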
Andrew Kelley
7 posts
Open-source software, electronic music production, video game development
Guide - How to avoid C/C++ runtime on Windows
I see, so we could merge our two approaches in order to make new and delete look and act like normal, yet not depend on libstdc++.
Mārtiņš Možeiko
2559 posts / 2 projects
Guide - How to avoid C/C++ runtime on Windows
You don't need your approach at all.
With my approach the compiler generates the constructor and destructor calls automatically.

Try this (either run it and check the output, or step through with debugger): https://gist.github.com/mmozeiko/...3097cb3ba77370105fd5870c8a6c7367a
Andrew Kelley
7 posts
Open-source software, electronic music production, video game development
Guide - How to avoid C/C++ runtime on Windows
Aha. Thanks, this has been very informative for me.
Fred Harris
11 posts
Guide - How to avoid C/C++ runtime on Windows
Thanks very much for this mmozeiko! I have been working on this since almost the beginning of 2016. Here is where I found out how to fix the...

_fltused

thingie!

Just yesterday I got stuck on the _dtoi3 unresolved external linker error, and am going to try to incorporate your code to fix that for x86 builds involving numbers.

What I am trying to accomplish is a complete application development system in C++ without anything from the C++ Standard Libraries, which I view as completely unrestrained bloatware. For many years I have used my own String Class, which compiles much smaller than the C++ Std. Lib. string class, but my builds are still too large for my taste with the standard MSVC compiles. The system you have described here remedies that situation, and I have my String Class compiled down to the point where I'm getting 5 to 6 KB x64 ASCII/Unicode exes.

My String Class is different from the C++ Std. Lib. String Class in that it also has some of the formatting functionality of iostream. That's where I ran into the unresolved external linker errors with _dtoi3 in attempting to get it to work in x86.

This whole business has been like pulling teeth! One problem after another. Your postings have really helped though! Thanks a lot!!!

Fred
Mārtiņš Možeiko
2559 posts / 2 projects
Guide - How to avoid C/C++ runtime on Windows
I'm not aware of a _dtoi3 function. The MSVC C runtime library (msvcrt.lib) doesn't have one. Are you sure it is _dtoi3 and not _dtol3? The compiler generates a call to _dtol3 when you cast a double to int64. You need to implement that function yourself.

But if it really is _dtoi3, please show me the code that produces it. You can get assembly output by passing the /Fa compiler switch for the .c/.cpp file that produces the call to this function.
Fred Harris
11 posts
Guide - How to avoid C/C++ runtime on Windows
It really is _dtoui3 Mmozeiko. I’ll see shortly about getting the asm output. Here’s the console output though showing the error…

TCLib.lib(FltToCh.obj) : error LNK2019: unresolved external symbol __dtoui3 referenced in function _FltToCh
Test1.exe : fatal error LNK1120: 1 unresolved externals


I’m using MS VC from Visual Studio 2015 (version 19 I believe).

In terms of either x86 or x64, it’s my understanding that a long and an int are basically the same thing. I mostly use Windows, but I’m guessing there must be other OSs where they are different.

Basically, for three months I’ve been going back and forth between x86 and x64 code trying to put together a system that will work for x86/x64 and wide/ASCII. It’s just one problem after another, with no end in sight. And now this in x86, because of the numeric issue in handling 64-bit numbers. I might be able to do without those if it solves my problem, i.e., just limit x86 to dealing with 32-bit numbers. I haven’t decided yet. I’d like to see first if I can get this working with the code and techniques you’ve provided.

Basically, all my issues in trying to develop this have revolved around the fact that without msvcrt I have no way of converting floating point numbers to character strings. I originally started with Matt Pietrek’s (from Microsoft Systems Journal) LibCTiny.lib….

http://www.wheaty.net/

He originally did this work in the late 90s and updated it in 2003 or so. He provided a rendition of printf and sprintf which worked, save for not being able to convert floating point numbers, which I consider to be a big problem. To solve this for x86 I found some asm code in Lib format at…

www.masm32.com

…written by Raymond Filiatreault. I got that to work for x86 with some inline asm, but I can’t call x86 libs from x64, obviously. I wanted a solution for both. As a quick hack I decided to do a LoadLibrary/GetProcAddress on msvcrt.dll, which did indeed solve my problem with printf/sprintf without adding anything to program size. However, that hack wasn’t satisfactory to me in the long term. I wanted to free myself of both the C Runtime and anything in the C++ Standard Libraries. In that sense, doing a LoadLibrary/GetProcAddress is kind of cheating. So I wrote my own code to do the conversion from floating point to ASCII/wide character strings. It also does rounding, left/right justification in a field, and specification of the number of places after the decimal point, and it accepts the character to be used for the decimal point, i.e., European countries tend to use the comma for a decimal point, I am told. Here is a standard C++ program using the FltToCh function just mentioned to convert 999.99999 to 1000.00 if two decimal places are specified…

// cl Problem1.cpp
#include <windows.h>
#include <string.h>
#include <stdlib.h>
#include <memory.h>
#include <stdio.h>


size_t __cdecl FltToCh(char* p, double x, size_t iFldWthTChrs, int iDecPlaces, char cDecimalSeperator, bool blnRightJustified)
{
 bool blnPositive,blnAbsValLessThanOne;
 bool blnRoundUpSuccessful=false;
 bool blnNeedToRoundUp=false;
 int iRoundingDigitLocation;
 size_t n,i=0,k=0;
 char* p1;
 
 p1=(char*)malloc(32);
 if(!p1)
    return 0;
 memset(p1,0,32);
 p1[0]=32, p1[1]=32;
 p1=p1+2;
 if(x>=0.0L)
    blnPositive=true;
 else
 { 
    blnPositive=false;
    x=x*-1.0L;
 }
 if(x<1.00000000000000L)
    blnAbsValLessThanOne=true;
 else
    blnAbsValLessThanOne=false;
 n=(size_t)x;
 while(n>0)
 {
   x=x/10;
   n=(size_t)x;
   i++;
 }
 *(p1+i) = cDecimalSeperator;
 x=x*10;
 n = (size_t)x;
 x = x-n;
 while(k<=17)
 {
   if(k == i)
      k++;
   *(p1+k)=48+(char)n;
   x=x*10;
   n = (size_t)x;
   x = x-n;
   k++;
 }
 *(p1+k) = '\0';
 iRoundingDigitLocation=(int)i+iDecPlaces+1;
 if(p1[iRoundingDigitLocation]>53)
    blnNeedToRoundUp=true;
 else
 {
    if(p1[iRoundingDigitLocation]==53  && p1[iRoundingDigitLocation-1] % 2)
       blnNeedToRoundUp=true;
 }
 p1[iRoundingDigitLocation]=0;
 if(blnNeedToRoundUp)
 { 
    int iStart=iRoundingDigitLocation-1;
    for(int h=iStart; h>=0; h--)
    {
        if(h==i)
           continue;
        else
        {
           if(p1[h]!=57 && p1[h]!=cDecimalSeperator)
           {
              p1[h]=p1[h]+1;
              blnRoundUpSuccessful=true;
              break;
           }
           else
              p1[h]=48;
        }
    }
 }
 if(blnPositive)
 {
    if(blnRoundUpSuccessful==false && blnNeedToRoundUp==true)
    { 
       p1[-1]=49;
       if(blnAbsValLessThanOne)
          blnAbsValLessThanOne=false;
    }   
    if(iDecPlaces==0)
       p1[i]='\0';     
    if(blnAbsValLessThanOne)
       p1[-1]=48;
 }
 else
 {
    if(blnRoundUpSuccessful==false && blnNeedToRoundUp==true)
    {
       p1[-1]=49;
       p1[-2]='-';
       if(blnAbsValLessThanOne)
          blnAbsValLessThanOne=false;
    }   
    if(iDecPlaces==0)
       p1[i]=0;     
    if(blnAbsValLessThanOne)
    {
       p1[-1]=48;
       p1[-2]='-';
    }
    else
    {
       if(p1[-1]==32)     
          p1[-1]='-';
    }       
 }
 
 int iSpaces=0;
 p1=p1-2;
 for(int i=0; i<18; i++)
 {
     if(p1[i]==32)
        iSpaces++;
 } 
 for(int i=0; i<18; i++) 
     p1[i]=p1[i+iSpaces];
 size_t iLen=strlen(p1);
 if(iLen>(iFldWthTChrs-1))
 {
    memset(p,'F',iFldWthTChrs-1);
    p[iFldWthTChrs-1]=0;
    free(p1);
    return 0;
 }   
 char* pField=(char*)malloc(iFldWthTChrs);
 if(!pField)
 {
    free(p1);
    return 0;
 }
 size_t iDiff = iFldWthTChrs - iLen -1;
 if(blnRightJustified)
 {
    for(size_t i=0; i<iDiff; i++)
        pField[i]=32;
    pField[iDiff]=0;  
    strcat(pField,p1);
 }
 else
 {
    strcpy(pField,p1);
    memset(pField+iLen,' ',iDiff);
    pField[iFldWthTChrs-1]=0;
 }
 strcpy(p,pField);
 free(p1);
 free(pField);
 
 return iFldWthTChrs-1;
}

int main()
{
 char szBuffer[16];
 double dblNumber;
 size_t iLen=0;
 
 memset(szBuffer, 0, 16);
 dblNumber=999.9999999;
 iLen=FltToCh(szBuffer, dblNumber, 16, 2, '.', false);
 printf("iLen     = %zu\n",iLen);   // %zu matches size_t on both x86 and x64
 printf("szBuffer = %s\n",szBuffer);
 getchar();
 
 return 0;
}


The command line and output, with a program name of Problem1.cpp, are simply this…

C:\Code\VStudio\VC15\LibCTiny\x86\Test2>cl Problem1.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 19.00.23506 for x86
Copyright (C) Microsoft Corporation.  All rights reserved.

Problem1.cpp
Microsoft (R) Incremental Linker Version 14.00.23506.0
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:Problem1.exe
Problem1.obj

C:\Code\VStudio\VC15\LibCTiny\x86\Test2>problem1
iLen     = 15
szBuffer = 1000.00


But when I compile my FltToCh and FltToWch into an obj file and add it to my Lib, I get the __dtoui3 error I previously posted.

In looking at the code and reading your reply above stating that _dtoul3 is used for converting doubles to 32 bit ints, I suspect the source of the call involves my variable ‘n’ above where there are several lines like so…

n = (size_t)x;

Where ‘x’ is a double. Right now I’m in the process of studying your win32_crt_math.cpp file and trying to incorporate it into my Lib to see if I can overcome yet another hurdle in this never-ending battle. As far as I’m concerned I’ve finally got it completely beaten in x64. All that’s left now is x86. I just tested your win32_crt_math.cpp and it compiles fine. I added it to my library TCLib.lib, which contains the following *.obj files…

crt_con_a.obj          // start up code for console mode ASCII app
crt_con_w.obj          // start up code for console mode wide app
crt_win_a.obj          // start up code for windows gui ASCII app
crt_win_w.obj          // start up code for windows gui wide app
memset.obj
newdel.obj
printf.obj
sprintf.obj
_strnicmp.obj
strncpy.obj
strncmp.obj
_strrev.obj
strcat.obj
strcmp.obj
strcpy.obj
strlen.obj
getchar.obj
alloc.obj
alloc2.obj
allocsup.obj
FltToCh.obj           // my floating point to ASCII/Unicode characters
atol.obj
_atoi64.obj
abs.obj
win32_crt_math.obj    // mmozeiko's code


I’m testing with this…

// cl Test1.cpp /O1 /Os /GS- /Zc:sizedDealloc- /link TCLib.lib kernel32.lib               // doesn't work
// cl Test1.cpp /O1 /Os /GS- /link TCLib.lib kernel32.lib                                 // doesn't work
// cl Test1.cpp /O1 /Os /GS- /Zc:sizedDealloc- /link TCLib.lib FltToCh.obj kernel32.lib   // doesn't work
// 5,632 bytes with TCLib.lib  22 times smaller than with C Std. Lib.
//#define  UNICODE
//#define  _UNICODE
#include <windows.h>
#include "stdio.h"
#include "tchar.h"
extern "C" int _fltused=1;

int _tmain()
{
 TCHAR szBuffer[16];
 double dblNumber;
 
 dblNumber=999.9999999;
 FltToTChar(szBuffer, dblNumber, 12, 2, _T('.'),false);
 _tprintf(_T("%s\n"),szBuffer);
 getchar();

 return 0;
}


Here’s the command line compilation results showing the _dtoui3 linker error…

C:\Code\VStudio\VC15\LibCTiny\x86\Test2>cl Test1.cpp /O1 /Os /GS- /Zc:sizedDealloc- /link TCLib.lib kernel32.lib
Microsoft (R) C/C++ Optimizing Compiler Version 19.00.23506 for x86
Copyright (C) Microsoft Corporation.  All rights reserved.

Test1.cpp
Microsoft (R) Incremental Linker Version 14.00.23506.0
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:Test1.exe
TCLib.lib
kernel32.lib
Test1.obj
TCLib.lib(FltToCh.obj) : error LNK2019: unresolved external symbol __dtoui3 referenced in function _FltToCh
Test1.exe : fatal error LNK1120: 1 unresolved externals


Any thoughts? How would I go about creating a _dtoui3 function?
Fred Harris
11 posts
Guide - How to avoid C/C++ runtime on Windows
Just re-read your post, mmozeiko. It's casting a double to int64 then. Well, it's gotta be those lines I mentioned where I have this...

size_t n;
double x;
n=(size_t)x;


That's needed in my algorithm to separate the digits of the input double so as to convert them to ASCII/wide chars. Recommendations to beat this issue???
Mārtiņš Možeiko
2559 posts / 2 projects
Guide - How to avoid C/C++ runtime on Windows
First of all, you initially said you were missing the _dtoi3 function, not _dtoui3. _dtoi3 definitely doesn't exist :)

But the _dtoui3 function does exist. As you discovered yourself, it converts a double to unsigned int (or any equivalent type; in 32-bit Windows code that includes size_t, unsigned long and others).

You have two options:
1) Instead of using casts, you implement your own casting function.
So instead of
size_t n;
double x;
n=(size_t)x;

you would write:
unsigned int DoubleToU32(double x)
{
    unsigned int r;
    ...
    return r;
}

size_t n;
double x;
n=DoubleToU32(x);

This is what Casey is doing in HandmadeHero. He is not implementing these functions yet, but that can be done later.

2) The second option is to implement the _dtoui3 function yourself (and others, if you need them):
unsigned int _dtoui3(double x) { ... }

Make the function externally visible and implement it in one file.

Now the question is how to implement it. There are several options here. The first is just to take some open-source implementation and copy it. Here's the LLVM one (MIT licence): https://llvm.org/svn/llvm-project.../lib/builtins/fp_fixuint_impl.inc
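
To give an idea of what such a software conversion involves, here is a minimal sketch in the same spirit. It is not a copy of LLVM's code. The name SoftDoubleToU32 is hypothetical, the clamping of negative and out-of-range inputs is a choice made for this sketch, and std::memcpy is only used to read the bits of the double (a CRT-free build would need its own memcpy or a compiler intrinsic for that).

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: converts an IEEE-754 double to a 32-bit unsigned
// integer using integer operations only, truncating toward zero.
std::uint32_t SoftDoubleToU32(double x)
{
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits); // reinterpret the double as raw bits

    const bool negative = (bits >> 63) != 0;
    // Unbiased exponent: value = significand * 2^(exponent - 52)
    const int exponent = static_cast<int>((bits >> 52) & 0x7FF) - 1023;
    // 52 stored fraction bits plus the implicit leading 1
    const std::uint64_t significand =
        (bits & 0x000FFFFFFFFFFFFFull) | 0x0010000000000000ull;

    if (negative || exponent < 0)
        return 0;               // negative input or |x| < 1 (sketch's choice)
    if (exponent >= 32)
        return 0xFFFFFFFFu;     // too large, infinity or NaN: clamp (sketch's choice)

    // exponent is in [0, 31], so the shift count is in [21, 52]
    return static_cast<std::uint32_t>(significand >> (52 - exponent));
}
```

For example, SoftDoubleToU32(999.9999999) yields 999, which is the truncating behaviour a digit-extraction loop relies on.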

But you can do better. If you know that the range of the double is limited, for example if double x is in the [-2^31 .. 2^31) interval, you can do it with just one SSE2 instruction, cvttsd2si, available as the _mm_cvttsd_si32 intrinsic. Use the truncating form: _mm_cvtsd_si32 rounds according to the current rounding mode (to nearest by default), which does not match a C cast. It is OK to use SSE2 instructions, because the error about the missing _dtoui3 function only appears when you compile with the SSE2 instruction set enabled.

The conversion function will look like this:
#include <emmintrin.h> // SSE2 intrinsics

unsigned int DoubleToU32(double x)
{
    // cvttsd2si truncates toward zero, like a C cast
    return (unsigned int)_mm_cvttsd_si32(_mm_set_sd(x));
}


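If you need to cover the whole [0..2^32) range for the unsigned int conversion, you can add a bit more code to get it to work:
#include <emmintrin.h> // SSE2 intrinsics

unsigned int DoubleToU32(double x)
{
    const double two31 = 2147483648.0; // 2^31
    if (x >= two31)
    {
        // shift into the signed range, convert, then add 2^31 back
        return 0x80000000u + (unsigned int)_mm_cvttsd_si32(_mm_set_sd(x - two31));
    }
    return (unsigned int)_mm_cvttsd_si32(_mm_set_sd(x));
}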

Of course, for values outside the unsigned int range this will return garbage. You can add extra checks to return whatever value you want. This can also be implemented completely branch-free with SSE2 intrinsics: the comparison returns a mask, which you can use to select between two values with the and, andnot, and or instructions.
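
A minimal sketch of the branch-free variant follows, assuming an x86/x64 target with SSE2. The name DoubleToU32Branchless is hypothetical. Because one of the two values being selected is zero, a single and with the mask is enough here; a general two-value select would use the and + andnot + or combination.

```cpp
#include <emmintrin.h> // SSE2 intrinsics

// Hypothetical helper: double -> unsigned int over [0, 2^32) without branches.
// Values outside that range give unspecified results.
unsigned int DoubleToU32Branchless(double x)
{
    const __m128d v     = _mm_set_sd(x);
    const __m128d two31 = _mm_set_sd(2147483648.0); // 2^31

    // Low lane is all ones if x >= 2^31, all zeros otherwise.
    const __m128d mask = _mm_cmpge_sd(v, two31);

    // Select 2^31 or 0.0 as the amount to subtract before converting.
    const __m128d adjust = _mm_and_pd(mask, two31);
    const int low = _mm_cvttsd_si32(_mm_sub_sd(v, adjust)); // truncating convert

    // Turn the same mask into the integer bias 0x80000000 or 0.
    const unsigned int bias =
        (unsigned int)_mm_cvtsi128_si32(_mm_castpd_si128(mask)) & 0x80000000u;

    return (unsigned int)low + bias;
}
```

The subtraction keeps the value inside the signed 32-bit range that cvttsd2si can handle, and the bias restores the top bit afterwards.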

And I have a comment about wide characters: don't use them. They are bad. The type is compiler-specific, and on Windows wchar_t does not do what it is supposed to do (cover all Unicode characters), so surrogate pairs were introduced and the whole advantage of a fixed-width Unicode character type is lost. You cannot randomly access the n-th character of a string, and you cannot split a string at an arbitrary position. Doing so leads to rendering bugs or to more serious security issues.
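
To make the surrogate pair point concrete, here is a small sketch of how a single Unicode code point maps to UTF-16 code units. The function name EncodeUtf16 is hypothetical and not part of any Windows API. Code points above U+FFFF take two 16-bit units, which is why indexing a UTF-16 string by unit does not give you the n-th character.

```cpp
#include <cstddef>

// Hypothetical helper: encodes one code point as UTF-16.
// Returns the number of 16-bit units written (1 or 2), or 0 if cp is invalid.
std::size_t EncodeUtf16(char32_t cp, char16_t out[2])
{
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0; // reserved for surrogates, not a valid code point
    if (cp > 0x10FFFF)
        return 0; // beyond the Unicode range

    if (cp < 0x10000)
    {
        out[0] = static_cast<char16_t>(cp); // fits in a single unit
        return 1;
    }

    cp -= 0x10000; // 20 bits remain, split into two 10-bit halves
    out[0] = static_cast<char16_t>(0xD800 + (cp >> 10));   // high surrogate
    out[1] = static_cast<char16_t>(0xDC00 + (cp & 0x3FF)); // low surrogate
    return 2;
}
```

Splitting a string between the two units of such a pair leaves two invalid halves, which is the kind of bug described above.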

Just use UTF-8 and you'll be fine. Most string functions will be exactly the same for ASCII and UTF-8. You can actually drop ASCII and use only UTF-8 strings, because UTF-8 is a superset of ASCII, so all ASCII strings are automatically valid UTF-8 strings. If you need to interact with the Windows wide-char API, convert from UTF-8 to wide char just before the call, and convert back to UTF-8 immediately after it. This is much better than using the wchar_t type everywhere.

If UTF-8 is really, really unacceptable (why?), then at least use the char32_t type. It is a C++11 type, but it covers the whole range of Unicode characters.
511 posts
Guide - How to avoid C/C++ runtime on Windows
mmozeiko
Just use UTF-8 and you'll be fine. Most of string functions will be exactly the same for ascii and utf-8 types. You can actually drop ascii and use just utf-8 strings, because utf-8 is superset of ascii, so all ascii strings automatically are valid utf-8 strings. If you need to interact with Windows wide char API, convert to wide char from utf-8 just before that call, and convert back to utf-8 immediately after the call. This will be much better way than using wchar_t type.

If utf-8 is really really unacceptable (why?), then at least use the char32_t type. It is C++11 type, but it covers all possible unicode char range.


Windows usually has both UTF-8 and UTF-16 versions of the string-receiving functions. You can usually just append A to ensure you always use the UTF-8 version.
Mārtiņš Možeiko
2559 posts / 2 projects
Guide - How to avoid C/C++ runtime on Windows
That is not correct.

The A suffix stands for ANSI, and those functions do not support UTF-8. Any byte in the string with a value >= 128 (part of a UTF-8 multibyte character) will be interpreted according to the system-specific code page, meaning it will produce different characters on differently configured Windows installations. https://msdn.microsoft.com/en-us/...ary/windows/desktop/dd317752.aspx
Characters represented by the remaining codes, 0x80 through 0xff, vary among character sets. Each character set includes different special characters, typically customized for a language or group of languages. Windows code page 1252 and OEM code page 437 are generally used in the United States.
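
As an illustration of why this matters, here is a sketch of a UTF-8 encoder for a single code point. The name EncodeUtf8 is hypothetical. ASCII characters stay as one byte below 0x80, but every non-ASCII character becomes a sequence in which all bytes are 0x80 or higher, which is exactly the range an A function reinterprets through the active code page.

```cpp
#include <cstddef>

// Hypothetical helper: encodes one code point as UTF-8.
// Returns the number of bytes written (1 to 4), or 0 if cp is invalid.
std::size_t EncodeUtf8(char32_t cp, unsigned char out[4])
{
    if (cp < 0x80)
    {
        out[0] = static_cast<unsigned char>(cp); // plain ASCII, unchanged
        return 1;
    }
    if (cp < 0x800)
    {
        out[0] = static_cast<unsigned char>(0xC0 | (cp >> 6));
        out[1] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000)
    {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0; // surrogates are not valid code points
        out[0] = static_cast<unsigned char>(0xE0 | (cp >> 12));
        out[1] = static_cast<unsigned char>(0x80 | ((cp >> 6) & 0x3F));
        out[2] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF)
    {
        out[0] = static_cast<unsigned char>(0xF0 | (cp >> 18));
        out[1] = static_cast<unsigned char>(0x80 | ((cp >> 12) & 0x3F));
        out[2] = static_cast<unsigned char>(0x80 | ((cp >> 6) & 0x3F));
        out[3] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0; // beyond the Unicode range
}
```

For instance, U+0101 (the letter ā) becomes the two bytes 0xC4 0x81. Passed to an A function, those two bytes are read as two separate characters of whatever code page is active.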

The behavior of the W functions depends on the Windows version. On Windows versions older than 2000 they interpret strings as UCS-2. On Windows 2000 and later it is UTF-16. Of course, nobody cares about Windows versions older than 2000 anymore, so it is UTF-16 pretty much everywhere.