mmozeiko
4) Global objects with C++ constructors and destructors - it's possible to implement it, but it's not needed for us.
    int atexit( void (__cdecl *func )( void ) );
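For anyone curious what implementing it might involve: at its core, atexit() is a table of function pointers that gets walked in reverse order at shutdown. Below is a minimal sketch in plain C, assuming a fixed-size table and no thread safety. The names `my_atexit`, `run_exit_handlers` and `MAX_EXIT_HANDLERS` are made up for illustration, and the two demo handlers exist only to show the call order. Invoking the global constructors themselves is compiler-specific and is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical; nothing below comes from the original post. */
#define MAX_EXIT_HANDLERS 32

typedef void (*exit_handler)(void);

static exit_handler g_exit_handlers[MAX_EXIT_HANDLERS];
static size_t g_exit_handler_count = 0;

/* Same contract as atexit(): returns 0 on success, nonzero if the table is full. */
int my_atexit(exit_handler fn)
{
    /* assert() keeps the sketch short; a CRT-free build would need its own check. */
    assert(fn != NULL);
    if (g_exit_handler_count == MAX_EXIT_HANDLERS) {
        return 1;
    }
    g_exit_handlers[g_exit_handler_count++] = fn;
    return 0;
}

/* Runs the registered handlers in reverse order of registration. */
void run_exit_handlers(void)
{
    while (g_exit_handler_count > 0) {
        g_exit_handlers[--g_exit_handler_count]();
    }
}

/* Demo handlers that record the order in which they ran. */
char g_trace[8];
size_t g_trace_len = 0;
void handler_a(void) { g_trace[g_trace_len++] = 'a'; }
void handler_b(void) { g_trace[g_trace_len++] = 'b'; }
```

In a CRT-free program, `run_exit_handlers()` would be called from the custom entry point after `main` returns and before `ExitProcess`.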
I know this is an old post, but I thought this might be a helpful addition. Let me know if I can improve any of the formatting as I didn't see a format guide, so I just started throwing Markdown at it when I noticed it worked. :)
I have been playing around with no MSVCRT/UCRT using clang on Windows (LLVM for Windows, not under MinGW64) with the Windows 10 SDK. I'm not sure if this is because I am using a pure C compiler and not a C++ compiler, or clang vs MSVC, but I do not run into any of the issues mentioned in this thread (using floats or large arrays on stack). As such I didn't need to include any of the helper `.cpp` files provided.
Executable size is around 4KB when compiled using `-Os`.
    Mode   LastWriteTime      Length  Name
    ----   -------------      ------  ----
    -a---  2022-03-01 17:07   3072    win32.exe
Command-line:
clang win32.c -o"build/win32.exe" -nostdlib -ffreestanding -fuse-ld=lld -Xlinker /SUBSYSTEM:windows -Xlinker /STACK:0x100000,0x100000 -fno-stack-check -fno-stack-protector -mno-stack-arg-probe -std=c2x -Wpedantic -Wall -Wextra -Os -lkernel32 -lshell32 -luser32
This is using the LLVM lld-link.exe interface to actually link the executable, so I'm passing the `/SUBSYSTEM` arg via `-Xlinker`.
I have to manually include `shellapi.h` because `WIN32_LEAN_AND_MEAN` removes the include from `Windows.h`.
The meat of `WinMainCRTStartup` is converting the Windows UTF-16 command line into UTF-8 for `main` to consume. The reason I'm passing to main is that, ideally, I would use it for entry on every other OS and `WinMainCRTStartup` is just for bootstrapping Windows to provide anything `main` would be provided by the other OSs.
Any constructive criticism is encouraged. :D I know I went a little nuts with the `NO.....` defines. :D
My quick and dirty testing code, compiles with no errors or warnings and runs without issue on Windows 10 21H2 `19044.1566`, YMMV.
    #define WIN32_LEAN_AND_MEAN
    #define UNICODE
    #define NOMINMAX
    #define NOCOMM
    #define NOMCX
    #define NODRAWTEXT
    #define NOHELP
    #define NOMENUS
    #define NOCTLMGR
    #define NOKANJI
    #include <Windows.h>
    #include <shellapi.h>

    typedef unsigned long long usize;
    #define nullptr ((void*) 0)

    int main(int argc, char** argv);

    void __stdcall WinMainCRTStartup(void)
    {
        int argc = 0;
        wchar_t** argv_w = CommandLineToArgvW(GetCommandLineW(), &argc);
        if (argv_w == nullptr) {
            ExitProcess(GetLastError());
        }

        char** argv = {0};
        argv = HeapAlloc(GetProcessHeap(), 0, argc * sizeof(char*));
        if (argv == nullptr) {
            ExitProcess(GetLastError());
        }

        for (usize i = 0; i < (usize) argc; ++i) {
            int bufferSize = WideCharToMultiByte(CP_UTF8, WC_ERR_INVALID_CHARS, argv_w[i], -1, nullptr, 0, nullptr, nullptr);
            if (bufferSize == 0) {
                MessageBox(nullptr, L"Unable to get buffer size", L"Win32 Error", MB_OK);
                ExitProcess(GetLastError());
            }

            argv[i] = HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, bufferSize + 1);
            if (argv[i] == nullptr) {
                MessageBox(nullptr, L"Unable to allocate memory for item in argv", L"Win32 Error", MB_OK);
                ExitProcess(GetLastError());
            }

            int bytesWritten = WideCharToMultiByte(CP_UTF8, WC_ERR_INVALID_CHARS, argv_w[i], -1, argv[i], bufferSize, nullptr, nullptr);
            if (bytesWritten == 0) {
                MessageBox(nullptr, L"Unable to convert argv", L"Win32 Error", MB_OK);
                ExitProcess(GetLastError());
            }
        }

        LocalFree(argv_w);

        // NOTE(ddouglas - 2022-03-01): Pass to main()
        int result = main(argc, argv);

        // Exit with the result from main
        ExitProcess(result);
    }

    int main(int argc, char** argv)
    {
        for (usize i = 0; i < (usize) argc; ++i) {
            MessageBoxA(nullptr, argv[i], "Win32", MB_OK);
        }
        return 0;
    }
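The core of the conversion loop is the usual two-call pattern: ask for the required size with a zero-length output buffer, allocate, then convert for real. Here is a rough portable sketch of that pattern, assuming a NUL-terminated input like the -1 length case. `utf16_to_utf8` is a hypothetical stand-in written for illustration, not a Windows API, and it only approximates what `WideCharToMultiByte` does with `CP_UTF8` and `WC_ERR_INVALID_CHARS`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper. src must be NUL-terminated. With cap == 0 nothing is
   written and the required size in bytes (terminator included) is returned.
   Returns 0 on an unpaired surrogate or when dst is too small. */
int utf16_to_utf8(const uint16_t* src, char* dst, int cap)
{
    assert(src != NULL);
    int n = 0; /* bytes produced (or counted) so far */

    for (;;) {
        uint32_t cp = *src++;

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: it must be followed by a low surrogate. */
            uint32_t lo = *src;
            if (lo < 0xDC00 || lo > 0xDFFF) {
                return 0;
            }
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            ++src;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return 0; /* low surrogate without a preceding high surrogate */
        }

        /* Encode the code point as 1 to 4 UTF-8 bytes. */
        unsigned char bytes[4];
        int len;
        if (cp < 0x80) {
            bytes[0] = (unsigned char) cp;
            len = 1;
        } else if (cp < 0x800) {
            bytes[0] = (unsigned char) (0xC0 | (cp >> 6));
            bytes[1] = (unsigned char) (0x80 | (cp & 0x3F));
            len = 2;
        } else if (cp < 0x10000) {
            bytes[0] = (unsigned char) (0xE0 | (cp >> 12));
            bytes[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
            bytes[2] = (unsigned char) (0x80 | (cp & 0x3F));
            len = 3;
        } else {
            bytes[0] = (unsigned char) (0xF0 | (cp >> 18));
            bytes[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
            bytes[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
            bytes[3] = (unsigned char) (0x80 | (cp & 0x3F));
            len = 4;
        }

        if (cap != 0) {
            if (n + len > cap) {
                return 0; /* output buffer too small */
            }
            for (int i = 0; i < len; ++i) {
                dst[n + i] = (char) bytes[i];
            }
        }
        n += len;

        if (cp == 0) {
            return n; /* the terminator has been counted or written */
        }
    }
}
```

Calling it once with `cap == 0` gives the allocation size, and a second call with that size fills the buffer. As with `WideCharToMultiByte` when the input length is -1, the returned size already counts the terminating NUL, so the buffer does not need an extra byte.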
> I'm not sure if this is because I am using a pure C compiler and not a C++ compiler, or clang vs MSVC, but I do not run into any of the issues mentioned in this thread (using floats or large arrays on stack).

The `_fltused` thing is a cl.exe-only issue. For clang it is not a problem.
And for large arrays it is not a problem for you because you use the `-mno-stack-arg-probe` argument, which disables stack probes and so prevents chkstk() function calls. But remember to set the stack commit size large enough in the linker arguments, otherwise the default 4KB commit will not be enough.
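To make the stack probe part concrete, here is a rough sketch of the idea behind a probe routine, assuming 4KB pages: before a function uses a large frame, one byte in each page is touched from the top down, so every access lands on the current guard page and the OS can commit the next one. `probe_frame` and `PROBE_PAGE_SIZE` are illustrative names, not the real chkstk implementation, and the sketch runs against an ordinary buffer rather than the actual stack.

```c
#include <assert.h>
#include <stddef.h>

#define PROBE_PAGE_SIZE 4096u /* assumed page size */

/* Touch one byte in every page of a would-be stack frame, highest address
   first. frame_top points one past the highest byte of the frame.
   Returns the number of pages touched. */
size_t probe_frame(volatile char* frame_top, size_t frame_size)
{
    assert(frame_top != NULL);
    size_t probes = 0;

    /* Walk down one full page at a time. */
    while (frame_size >= PROBE_PAGE_SIZE) {
        frame_top -= PROBE_PAGE_SIZE;
        frame_size -= PROBE_PAGE_SIZE;
        (void) *frame_top; /* volatile read: this is the actual probe */
        ++probes;
    }

    /* Touch the lowest byte of the remaining partial page, if any. */
    if (frame_size > 0) {
        frame_top -= frame_size;
        (void) *frame_top;
        ++probes;
    }

    return probes;
}
```

With probes disabled, nothing performs these touches, so a large frame can skip past the guard page. Committing the whole stack up front, as `/STACK:0x100000,0x100000` does in the command line above, avoids relying on them.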
In your example there's some useless code: HeapFree() after main returns. There's no reason to run extra code when the process is terminating; it's just a waste of time.
strlen_w is not really needed in your code, because `WideCharToMultiByte` accepts -1 for the length, in which case it relies on the string being zero-terminated.

> This is using the LLVM lld-link.exe interface to actually link the executable, so I'm passing the /SUBSYSTEM arg via -Xlinker.

It is not. For clang to actually use the lld linker you need to pass the `-fuse-ld=lld` argument. Otherwise it will use link.exe from MSVC for Windows builds. You can see what you're using by passing the `-v` argument to clang.
> The _fltused thing is a cl.exe-only issue. For clang it is not a problem.
> And for large arrays it is not a problem for you because you use the -mno-stack-arg-probe argument, which disables stack probes and so prevents chkstk() function calls. But remember to set the stack commit size large enough in the linker arguments, otherwise the default 4KB commit will not be enough.

Thank you for the clarification! I'll adjust to make sure my stack is set properly.

> In your example there's some useless code: HeapFree() after main returns. There's no reason to run extra code when the process is terminating; it's just a waste of time.

I agree that the OS will clean it up on process exit, but if you are running it through an analyzer (e.g. an equivalent to valgrind on Windows) it should complain about leaked memory at exit. I guess that's more of an OCD thing than a real concern, as the user won't care about leaked memory on exit. ;)

> strlen_w is not really needed in your code, because WideCharToMultiByte accepts -1 for the length, in which case it relies on the string being zero-terminated.

That was my misunderstanding. I was just using the -1 to find the buffer size (edit: I lied, I was not, even though the docs said you could - fixed) and thought I would need the actual length on the subsequent call. Obviously if the -1 works for determining the buffer size, it should work in the other call. :D Thanks!

> It is not. For clang to actually use the lld linker you need to pass the -fuse-ld=lld argument. Otherwise it will use link.exe from MSVC for Windows builds. You can see what you're using by passing the -v argument to clang.

Crap, you are 100% correct. I'll edit my post above to include the `-fuse-ld=lld` in the command. I ran it with `-v` this time to confirm:
"C:\\Program Files\\LLVM\\bin\\lld-link" -out:build/win32.exe
> WARNING: This text is old and a bit outdated. Some of the information is not relevant anymore, and many things can be done in a better way (read the new guide).

@mmozeiko, I searched around for a new guide on this website, but didn't find one. Where is the new guide you mentioned?
In general I would recommend using clang on Windows, as it can produce native debug info in pdb format, so you will have a much better time debugging, profiling & using other tools on Windows. And for clang-cl the same logic applies as for cl.exe to not depend on CRT runtime code.
But for gcc it really depends on what kind of gcc build you have. Different gcc builds are made differently, so they may depend on different runtime dll files. In general, running `gcc -nostdlib file.c`, where file.c has a `_start` function as the entry point, will prevent any CRT runtime code from being included. But as I mentioned earlier, it will depend on the actual gcc build; you might need extras there.