Guide - How to avoid C/C++ runtime on Windows

WARNING: This text is old and somewhat outdated. Some of the information is no longer relevant, and many things can be done in a better way (read the new guide).

A couple of times Casey mentioned on stream that it would be nice to avoid the C/C++ runtime, but that explaining and doing it would take too much time. So I made a guide on how to do that. With these instructions your executable will contain only the code you write; no hidden code from the runtime will be added (we'll add the necessary pieces ourselves).

First of all, let's look at an empty Windows application:

#include <windows.h>

int CALLBACK
WinMain(HINSTANCE Instance,
        HINSTANCE PrevInstance,
        LPSTR CommandLine,
        int ShowCode)
{
    return(0);
}

Let's compile this as a 64-bit application and look at the size of the executable (all further builds are 64-bit unless a 32-bit build is mentioned explicitly):

C:\handmade>call "%VS120COMNTOOLS%..\..\VC\vcvarsall.bat" amd64

C:\handmade>cl.exe -nologo -Gm- -GR- -EHa- -Oi win32_handmade.cpp
win32_handmade.cpp

C:\handmade>dir win32_handmade.exe
...
12/12/2014  05:00 PM            68,096 win32_handmade.exe

You can see it produces a ~68KB executable.

The main switch for disabling the C runtime is the /NODEFAULTLIB linker argument. With it, the linker no longer pulls in any of the libraries it considers "default" for an application. This includes "msvcrt.lib" and also "kernel32.lib". We also want to specify what kind of application we are creating (GUI or console) with the /SUBSYSTEM argument.

If we try to do that now we'll get an error:

C:\handmade>cl.exe -nologo -Gm- -GR- -EHa- -Oi win32_handmade.cpp -link -nodefaultlib -subsystem:windows
win32_handmade.cpp
LINK : error LNK2001: unresolved external symbol _WinMainCRTStartup
win32_handmade.exe : fatal error LNK1120: 1 unresolved externals
This is because WinMain is not the entry point of the executable; WinMainCRTStartup is. Previously the Visual C runtime provided it for us and called our WinMain function (Casey talked about this in one of the C intro streams). So now we need to implement our own WinMainCRTStartup. Let's do it like this:
void __stdcall WinMainCRTStartup()
{
    int Result = WinMain(GetModuleHandle(0), 0, 0, 0);
    ExitProcess(Result);
}
Note that I'm passing a NULL pointer for the CommandLine argument, because Handmade Hero doesn't use it. If you want to use it, you can call the GetCommandLineA/W() functions to get the command line. As an alternative to calling ExitProcess, you can simply return the result value from the function as an int:
int __stdcall WinMainCRTStartup()
{
    int Result = WinMain(GetModuleHandle(0), 0, 0, 0);
    return Result;
}
The return value (or the one passed to ExitProcess) will be set as the process exit code.

Of course you can simplify the code a bit by not calling the WinMain function at all and writing your code directly in the WinMainCRTStartup function.

Now if we compile the code (adding the kernel32.lib import library for the GetModuleHandle and ExitProcess functions), it will succeed and the executable will be much smaller:

C:\handmade>cl.exe -nologo -Gm- -GR- -EHa- -Oi win32_handmade.cpp -link -nodefaultlib -subsystem:windows kernel32.lib
win32_handmade.cpp

C:\handmade>dir win32_handmade.exe
...
12/12/2014  05:09 PM             2,560 win32_handmade.exe

Just 2.5KB. Now that is very nice! No C runtime overhead!

But we are not quite done. If you apply what we've done so far to bigger programs, you'll notice that you cannot use several features of C/C++, like:

  1. allocating large arrays/structure on stack (>4KB)
  2. some calculations with 64-bit integers in 32-bit code
  3. using floating point
  4. casting floating point to integer and back in 32-bit code
  5. initialization and assignment of large arrays/structures

Let's fix these issues.

1. allocating large arrays/structure on stack (>4KB)

If you allocate an array or structure on the stack that is larger than ~4KB, something like this:

#include <windows.h>

void __stdcall WinMainCRTStartup()
{
    char BigArray[4096];
    BigArray[0] = 0;

    ExitProcess(0);
}

Then you'll get the following linker errors:

C:\handmade>cl.exe -nologo -Gm- -GR- -EHa- -Oi win32_handmade.cpp -link -subsystem:windows -nodefaultlib kernel32.lib
win32_handmade.cpp
win32_handmade.obj : error LNK2019: unresolved external symbol ___report_rangecheckfailure referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.obj : error LNK2019: unresolved external symbol @__security_check_cookie@4 referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.obj : error LNK2019: unresolved external symbol __chkstk referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.obj : error LNK2019: unresolved external symbol ___security_cookie referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.exe : fatal error LNK1120: 4 unresolved externals
There are two issues here. One is a security feature that could help in debug builds, so maybe you want to keep linking to the C runtime in debug builds, but for the final shipping executable you don't want any additional overhead inserted in your functions. The other issue is the way the stack is allocated for Windows executables. Without going into more detail, simply add the /GS- /Gs9999999 arguments to the command line. Additionally, you need to add "/STACK:0x100000,0x100000" to the linker options so the executable has the full 1MiB of stack available to it. By default the OS reserves 1MiB of stack but commits only a few 4KiB pages. Then, for each larger function, the compiler inserts code (the __chkstk call seen in the errors above) that checks whether more space is needed and commits new pages. But assuming we don't care about the extra static 1MiB allocation, let's just commit it all at startup and be done with it.

C:\handmade>cl.exe -nologo -Gm- -GR- -EHa- -Oi -GS- -Gs9999999 win32_handmade.cpp -link -subsystem:windows -nodefaultlib kernel32.lib -stack:0x100000,0x100000
win32_handmade.cpp

2. Some calculations with 64-bit integers in 32-bit code.

Now this is a bit trickier. A 64-bit executable can use 64-bit registers to perform calculations on 64-bit values (int64_t and uint64_t). But if you compile 32-bit code, all general purpose registers are only 32 bits wide. How can the compiler perform these operations then? Let's create test code that does various operations:

#include <stdint.h>
#include <windows.h>
#include "win32_crt_float.cpp"

void __stdcall WinMainCRTStartup()
{
    volatile int64_t s = 1;
    volatile uint64_t u = 1;

    s += s;
    s -= s;
    s *= s;
    s /= s;
    s %= s;
    s >>= 33;
    s <<= 33;

    u += u;
    u -= u;
    u *= u;
    u /= u;
    u %= u;
    u >>= 33;
    u <<= 33;

    ExitProcess(0);
}

Now run the compiler:

C:\handmade>call "%VS120COMNTOOLS%..\..\VC\vcvarsall.bat" x86

C:\handmade>cl.exe -nologo -Gm- -GR- -EHa- -Oi -GS- -Gs9999999 win32_handmade.cpp -link -subsystem:windows -nodefaultlib kernel32.lib -stack:0x100000,0x100000
win32_handmade.cpp
win32_handmade.obj : error LNK2019: unresolved external symbol __alldiv referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.obj : error LNK2019: unresolved external symbol __allmul referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.obj : error LNK2019: unresolved external symbol __allrem referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.obj : error LNK2019: unresolved external symbol __allshl referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.obj : error LNK2019: unresolved external symbol __allshr referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.obj : error LNK2019: unresolved external symbol __aulldiv referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.obj : error LNK2019: unresolved external symbol __aullrem referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.obj : error LNK2019: unresolved external symbol __aullshr referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.exe : fatal error LNK1120: 8 unresolved externals

As you can see, the compiler uses a bunch of functions to perform these operations. The source for these functions can be found in asm files under the "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\crt\src\intel" folder. We probably don't want to compile asm files now. Theoretically you could create C versions of these functions, but let's just copy & paste the assembly implementations into naked inline assembly functions (feel free to optimize them later :) Let's create a win32_crt_math.cpp file. Then put #include "win32_crt_math.cpp" in your win32_handmade.cpp file and compile it:

C:\handmade>cl.exe -nologo -Gm- -GR- -EHa- -Oi -GS- -Gs9999999 win32_handmade.cpp -link -subsystem:windows -nodefaultlib kernel32.lib
win32_handmade.cpp
And success! Now you can use 64-bit types in 32-bit code.

Alternatively, you can take the implementation of these functions from the SDL library: http://hg.libsdl.org/SDL/file/5c894fec85b9/src/stdlib/SDL_stdlib.c
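To give an idea of what a C version of one of these helpers involves, here is a sketch of a 64-bit multiply built from 32-bit halves, which is roughly the job _allmul performs. Mul64ViaHalves is a hypothetical name and this is not the CRT's actual code. It only illustrates the arithmetic; in a real 32-bit MSVC build the widening multiply would have to come from something like the __emulu intrinsic, and the 64-bit shifts may themselves turn into helper calls.

```cpp
#include <stdint.h>

// Hypothetical helper: 64x64 -> 64-bit multiply using 32-bit halves.
// Writing a = aHi*2^32 + aLo and b = bHi*2^32 + bLo, the product modulo 2^64 is
//   aLo*bLo + 2^32 * (aLo*bHi + aHi*bLo)
// because the aHi*bHi term is multiplied by 2^64 and drops out completely.
static uint64_t Mul64ViaHalves(uint64_t a, uint64_t b)
{
    uint32_t aLo = (uint32_t)a;
    uint32_t aHi = (uint32_t)(a >> 32);
    uint32_t bLo = (uint32_t)b;
    uint32_t bHi = (uint32_t)(b >> 32);

    // Full 32x32 -> 64 product of the low halves.
    uint64_t low = (uint64_t)aLo * bLo;

    // Cross terms only affect the upper 32 bits, so 32-bit wraparound is intended.
    uint32_t cross = aLo * bHi + aHi * bLo;

    return low + ((uint64_t)cross << 32);
}
```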

3. using floating point

If you are using floating point in your code, you'll get the following linker error:

C:\handmade>cl.exe -nologo -Gm- -GR- -EHa- -Oi -GS- -Gs9999999 win32_handmade.cpp -link -subsystem:windows -nodefaultlib kernel32.lib -stack:0x100000,0x100000
win32_handmade.cpp
win32_handmade.obj : error LNK2001: unresolved external symbol __fltused
win32_handmade.exe : fatal error LNK1120: 1 unresolved externals

In this case the linker wants to see the _fltused symbol. It needs just the symbol; it doesn't care about its value. So let's provide it in a win32_crt_float.cpp file:

extern "C" int _fltused = 0; // the initializer makes this a definition, not just a declaration
And include this file in our win32_handmade.cpp:
#include <windows.h>
#include "win32_crt_float.cpp"

void __stdcall WinMainCRTStartup()
{
    float f;
    f = 0.0f;

    ExitProcess(0);
}

Let's run the compiler and we'll see that everything works:

C:\handmade>cl.exe -nologo -Gm- -GR- -EHa- -Oi -GS- -Gs9999999 win32_handmade.cpp -link -subsystem:windows -nodefaultlib kernel32.lib -stack:0x100000,0x100000
win32_handmade.cpp

4. Casting floating point to integer and back in 32-bit code

The following code, compiled with the SSE instruction set (the VS2013 default), will produce linker errors:

#include <stdint.h>
#include <windows.h>
#include "win32_crt_float.cpp"
#include "win32_crt_math.cpp"

void __stdcall WinMainCRTStartup()
{
    float f = 1000.0f;
    double d = 1000000000.0;

    int32_t i32f = (int32_t)f;
    int32_t i32d = (int32_t)d;
    uint32_t u32f = (uint32_t)f;
    uint32_t u32d = (uint32_t)d;

    int64_t i64f = (int64_t)f;
    int64_t i64d = (int64_t)d;
    uint64_t u64f = (uint64_t)f;
    uint64_t u64d = (uint64_t)d;

    f = (float)i32f;
    d = (double)i32d;
    f = (float)u32f;
    d = (double)u32d;

    f = (float)i64f;
    d = (double)i64d;
    f = (float)u64f;
    d = (double)u64d;

    ExitProcess(0);
}

Compile it to see the linker errors:

c:\handmade>cl.exe -Zi -nologo -Gm- -GR- -EHa- -Oi -GS- -Gs9999999 win32_handmade.cpp -link -subsystem:windows -nodefaultlib kernel32.lib -stack:0x100000,0x100000
win32_handmade.cpp
win32_handmade.obj : error LNK2019: unresolved external symbol __dtol3 referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.obj : error LNK2019: unresolved external symbol __dtoui3 referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.obj : error LNK2019: unresolved external symbol __dtoul3 referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.obj : error LNK2019: unresolved external symbol __ftol3 referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.obj : error LNK2019: unresolved external symbol __ftoui3 referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.obj : error LNK2019: unresolved external symbol __ftoul3 referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.obj : error LNK2019: unresolved external symbol __ltod3 referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.obj : error LNK2019: unresolved external symbol __ultod3 referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.exe : fatal error LNK1120: 8 unresolved externals
If you don't want to depend on the SSE instruction set and you specify the /arch:IA32 compiler argument, the errors will be a bit different:
c:\handmade>cl.exe -arch:IA32 -nologo -Gm- -GR- -EHa- -Oi -GS- -Gs9999999 win32_handmade.cpp -link -subsystem:windows -nodefaultlib kernel32.lib
win32_handmade.cpp
win32_handmade.obj : error LNK2019: unresolved external symbol __ftol2 referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.obj : error LNK2019: unresolved external symbol __ftol2_sse referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.exe : fatal error LNK1120: 2 unresolved externals

I implemented only the functions needed for the FPU path, with no SSE stuff for 32-bit code (which you probably want anyway). See the win32_crt_float.cpp file:

extern "C"
{
    int _fltused;

#ifdef _M_IX86 // following functions are needed only for 32-bit architecture

    __declspec(naked) void _ftol2()
    {
        __asm
        {
            fistp qword ptr [esp-8]
            mov   edx,[esp-4]
            mov   eax,[esp-8]
            ret
        }
    }

    __declspec(naked) void _ftol2_sse()
    {
        __asm
        {
            fistp dword ptr [esp-4]
            mov   eax,[esp-4]
            ret
        }
    }

#endif
}

Warning! These functions do not behave exactly like regular casts. They round differently (round to nearest instead of truncating). They also don't handle NaNs correctly and don't raise floating point exceptions. But if you are OK with that, then this is good enough. For a more accurate implementation, consult the SDL source I mentioned above. Alternatively, you could create your own casting function and avoid the C cast. You'll need to do that anyway for all trigonometry and other functions from the math.h header.

When that is done your code will compile fine.

c:\handmade>cl.exe -arch:IA32 -nologo -Gm- -GR- -EHa- -Oi -GS- -Gs9999999 win32_handmade.cpp -link -subsystem:windows -nodefaultlib kernel32.lib -stack:0x100000,0x100000
win32_handmade.cpp

So remember to use -arch:IA32 for 32-bit builds.
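As for creating your own casting function, one possible shape is to decode the float's bits by hand, which truncates toward zero like a regular C cast. This is only a sketch: TruncateFloatToInt32 is a hypothetical name, it assumes IEEE-754 single precision floats, it relies on union type punning (which MSVC, GCC and Clang all accept), and clamping out-of-range values is an arbitrary choice made here.

```cpp
#include <stdint.h>

// Hypothetical helper: float -> int32_t conversion that truncates toward zero.
// Assumes IEEE-754 binary32: 1 sign bit, 8 exponent bits, 23 mantissa bits.
static int32_t TruncateFloatToInt32(float value)
{
    union { float f; uint32_t u; } bits;
    bits.f = value;

    int32_t exponent = (int32_t)((bits.u >> 23) & 0xFF) - 127;   // remove the bias
    uint32_t mantissa = (bits.u & 0x007FFFFF) | 0x00800000;      // add the implicit leading 1
    int negative = (bits.u >> 31) != 0;

    if (exponent < 0)
    {
        return 0; // magnitude below 1 (this also covers zero and denormals)
    }
    if (exponent >= 31)
    {
        // Out of range, infinity or NaN: clamp (arbitrary choice for this sketch).
        return negative ? INT32_MIN : INT32_MAX;
    }

    // The mantissa holds the value scaled by 2^23; shift it to scale 2^0.
    uint32_t magnitude = (exponent >= 23) ? (mantissa << (exponent - 23))
                                          : (mantissa >> (23 - exponent));

    return negative ? -(int32_t)magnitude : (int32_t)magnitude;
}
```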

5. initialization and assignment of large arrays/structures

If you initialize big arrays or structures with 0, the compiler will assume it can call memset to clear that space on the stack. For example, this code:

#include <stdint.h>
#include <windows.h>
#include "win32_crt_float.cpp"
#include "win32_crt_math.cpp"

void __stdcall WinMainCRTStartup()
{
    char BigArray[100] = {};

    ExitProcess(0);
}


Will call memset:

c:\handmade>cl.exe -Zi -nologo -Gm- -GR- -EHa- -Oi -GS- -Gs9999999 win32_handmade.cpp -link -subsystem:windows -nodefaultlib kernel32.lib
win32_handmade.cpp
win32_handmade.obj : error LNK2019: unresolved external symbol _memset referenced in function "void __stdcall WinMainCRTStartup(void)" (?WinMainCRTStartup@@YGXXZ)
win32_handmade.exe : fatal error LNK1120: 1 unresolved externals

To fix that, create a win32_crt_memory.cpp file and #include it in win32_handmade.cpp. The #pragmas before the functions tell the compiler that these two functions won't be treated as intrinsics, so they can be compiled with the "-Oi" option.

extern "C"
{
    #pragma function(memset)
    void *memset(void *dest, int c, size_t count)
    {
        char *bytes = (char *)dest;
        while (count--)
        {
            *bytes++ = (char)c;
        }
        return dest;
    }

    #pragma function(memcpy)
    void *memcpy(void *dest, const void *src, size_t count)
    {
        char *dest8 = (char *)dest;
        const char *src8 = (const char *)src;
        while (count--)
        {
            *dest8++ = *src8++;
        }
        return dest;
    }
}


Let's check if BigArray now compiles fine:

c:\handmade>cl.exe -Zi -nologo -Gm- -GR- -EHa- -Oi -GS- -Gs9999999 win32_handmade.cpp -link -subsystem:windows -nodefaultlib kernel32.lib -stack:0x100000,0x100000
win32_handmade.cpp
And we're good!

Now you can write pretty much any reasonable code you wish!

Remember that you are not allowed to use the following features:

  1. C++ RTTI (it's turned off by -GR- anyway)
  2. C++ exceptions - try/catch/throw (these are turned off by -EHa-)
  3. SEH exceptions - you could use them if you implement the __C_specific_handler (for 64-bit code) and _except_handler3 (for 32-bit code) functions. See a simple example of how to do that by calling the original C runtime functions in the win32_crt_seh.cpp file.
  4. Global objects with C++ constructors and destructors - it's possible to implement this, but it's not needed for us.
  5. Pure virtual functions in C++ classes - for that you'll need to implement the "__purecall" function, but we are also not interested in this.
  6. new/delete C++ operators - they use the global new/delete functions. You'll need to either override new/delete for each class, or implement global new/delete functions yourself.
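As an illustration of the last point, here is a sketch of overriding new/delete for a single class so that it never reaches the global operators. Everything in it is hypothetical (GlobalPool, PoolAllocate and Entity are made-up names), the fixed static buffer stands in for memory you would obtain from the OS yourself, and it assumes a compiler with C++11 alignas and noexcept support.

```cpp
#include <stddef.h>

// Hypothetical fixed-size pool; a real program might get this memory from the OS instead.
alignas(16) static unsigned char GlobalPool[1024];
static size_t GlobalPoolUsed = 0;

// Simple bump allocator: hands out 16-byte aligned blocks, never reuses them.
static void *PoolAllocate(size_t size)
{
    size_t aligned = (size + 15) & ~(size_t)15;
    if (aligned > sizeof(GlobalPool) - GlobalPoolUsed)
    {
        return 0; // out of space
    }
    void *result = GlobalPool + GlobalPoolUsed;
    GlobalPoolUsed += aligned;
    return result;
}

struct Entity
{
    int X, Y;
    Entity() : X(1), Y(2) {}

    // Class-specific overrides: "new Entity" and "delete ptr" use these
    // instead of the global operators. noexcept tells the compiler that
    // a null result is possible and must be checked before construction.
    static void *operator new(size_t size) noexcept { return PoolAllocate(size); }
    static void operator delete(void *) noexcept { /* bump allocator: nothing to free */ }
};
```

The same PoolAllocate idea could back global operator new/delete definitions instead, if you prefer to cover every class at once.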

Of course you cannot use any function from the standard C or C++ runtime (stdlib.h, stdio.h, string.h, math.h, etc. headers). The only safe headers are the ones that provide compiler-specific functionality, such as:

  stddef.h - if you want size_t and NULL
  stdint.h - various intXX_t and uintXX_t typedefs
  stdarg.h - va_start, va_arg, va_end intrinsics
  intrin.h and a few other headers for intrinsic functions (rdtsc, cpuid, SSE, SSE2, etc.)


Edited by Mārtiņš Možeiko on
Awesome! Thanks for writing this up for people. I will put a link to it on the "Coding Resources" page.

- Casey
Wow! Very cool, thanks for this. I'm looking forward to trying it at some point.
Great Post and thank you for making it!

I used your guide to set up my project in Visual Studio 2013 to have four builds:
Win32 Debug
Win32 Release
x64 Debug
x64 Release

The debug builds use the standard C runtime library and security checks. The release builds are set up as you described above.

They all built successfully until I added a call to StretchDIBits.

Adding that caused the Win32 Release build to give the following error:
"unresolved external symbol _memset referenced in function "void __cdecl Win32PresentDisplayBuffer"
So apparently the 32-bit version of StretchDIBits in gdi32.dll calls _memset defined in the CRT. What's the best way to work around this?
That does not sound right to me - I guess it is theoretically possible that the _import library_ site for StretchDIBits had a memset in it somehow, but it's not possible that gdi32.dll having a memset call would cause the _linker_ to emit an undefined symbol, because it doesn't work that way.

More likely is that the compiler is using memset for some reason to clear something, or something. I would define your own memset so it links, and then look at the assembly to see where the call sites are on your side.

- Casey
Yeah, forgot about this "optimization".

Probably because of optimizations the compiler removed some code when the call to StretchDIBits was not in the code. But when it is present, it needed some additional code.

And yes, it doesn't matter what gdi uses, because it calls the CRT on its own. What happens here is that the compiler replaces large initializations of structures/arrays with a call to memset, because memset is optimized for larger sizes. But in this code we don't care about performance, so we can implement a pretty simple loop.

I believe the same situation can happen with memcpy - the compiler will replace assignment of larger structures with a call to memcpy.

I put these two functions in win32_crt_memory.cpp file.

So include win32_crt_memory.cpp and this error will disappear.

I've updated the first post with these instructions. I also made a small update to the instructions regarding floating point casts and SSE for the 32-bit arch.

EDIT: with info from next post (from aidtopia) I updated this info to not remove "-Oi" argument from commandline.

Edited by Mārtiņš Možeiko on
In the past, I've built executables without any CRT, and I've also run into the memset problem--not only when calling a Windows API, but also from instances where the compiler implicitly adds a memset call to zero out a static struct.

You can provide your own memset implementation, but you'll have compile-time diagnostics because memset can also be an intrinsic function, and the compiler doesn't want you to re-implement intrinsic functions. My solution.

One drawback to my solution is that you can't use whole program optimization (link-time code generation). I'm not sure why, but if you're building your entire program as a single translation unit, as Casey is doing for Handmade Hero, then this might not matter.
Not necessarily relevant to MSVC, but according to this old thread gcc and llvm require at least memcpy, memmove, memset and memcmp.

And be careful to use -ffreestanding when defining those functions in those compilers, as otherwise they will recognize the loop idioms you write and replace them with calls to the library functions you are trying to replace! (I know at least LLVM at one time did this).
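For reference, the two functions from that list that the guide does not cover, memmove and memcmp, could be sketched in the same simple byte-loop style as the memset/memcpy above. The CrtMemMove and CrtMemCmp names are hypothetical and chosen to avoid clashing with a hosted C library; in a freestanding build they would be named memmove and memcmp and placed inside extern "C".

```cpp
#include <stddef.h>

// Hypothetical stand-in for memmove: copies count bytes and handles overlapping ranges.
static void *CrtMemMove(void *dest, const void *src, size_t count)
{
    unsigned char *d = (unsigned char *)dest;
    const unsigned char *s = (const unsigned char *)src;

    if (d < s)
    {
        // Destination starts before source: a forward copy is safe.
        while (count--) *d++ = *s++;
    }
    else if (d > s)
    {
        // Destination starts after source: copy backward so overlapping
        // source bytes are read before they are overwritten.
        d += count;
        s += count;
        while (count--) *--d = *--s;
    }
    return dest;
}

// Hypothetical stand-in for memcmp: returns <0, 0 or >0 like the standard function.
static int CrtMemCmp(const void *a, const void *b, size_t count)
{
    const unsigned char *pa = (const unsigned char *)a;
    const unsigned char *pb = (const unsigned char *)b;

    while (count--)
    {
        if (*pa != *pb)
        {
            return (int)*pa - (int)*pb;
        }
        ++pa;
        ++pb;
    }
    return 0;
}
```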
Great work.
This is a great resource! A bit unrelated but in every version of Windows there is also msvcrt.dll, a C standard library implementation that is present on all releases of Windows (since 95 iirc) in the System32 directory. Programs like cmd.exe, notepad.exe, etc link to it. The thing is, more recent versions of msvc don't allow you to do it by default. Not a problem if you create the lib files manually. Be sure to run vcvarsall.bat/vcvars32.bat and then:
dumpbin /exports C:\Windows\System32\msvcrt.dll > msvcrt_exports.txt

Then you can select the function names you want and build a separate file with EXPORTS on the first line:
EXPORTS
toupper
strcmp
sscanf
fprintf

Just add new lines with function names you want. Save this as msvcrt.def then build a lib file that you can link to:
lib /def:msvcrt.def /out:msvcrt.lib

This avoids the standard library dll hell while also allowing your code size to be small by dynamically linking to a c runtime library (as well as working without redistributables across all Windows versions).

It's not something that Microsoft recommends though, as the dll is subject to change between releases. If you're seriously shipping something with it you would want to use functions that are present in your earliest target OS version. A lot of new binaries that ship with recent Windows are being linked against the System32 msvcrt.dll though and there are a ton which do already, so it's not out of the question.

Just a useful tip. It's nice if you want to use some convenience functions even though you don't get all the features you'd get if you were to statically link.

Edited by a_null on
Hey Mārtiņš,

IIRC, one more thing that msvcrt does for us is catching stack page access exceptions and committing pages as needed until all reserved space is committed. By default, 1 MiB of stack is reserved, but only the first couple of 4 KiB pages are actually committed. If I make BigArray in your example bigger, say 40000 elements big, then after I rebuild and run the program, I get a pause and then a crash. To fix this, you can add /STACK:0x100000,0x100000 to your command line, which basically tells the linker to record in the resulting PE file that the whole 1 MiB should be committed at startup.
You're right. "/Gs" argument is what performs this check and committing of memory. I'll update the first post.

Edited by Mārtiņš Možeiko on
Absolutely amazing, I will definitely follow that.

Cleaning the executable!! Yes! Yes! Yes!
I would love to see more of this kind of low-level thing.

Actually my interest is not in game, but I am following this series because I am interested in the low level things that a game programmer has to discover and deal with.
This is great! Are there any special considerations for threading?
No, native Windows threading will work fine - it doesn't depend on C runtime, it's an OS feature.