Casey's breakdown of the keyword "static"

In one of the vids Casey mentions that static in C++ really has more than one meaning. Then in handmade_types.h he writes them down, like so:

#if !defined(internal)
#define internal static
#endif
#define local_persist static
#define global static

I've found only one function with local_persist

internal void
Win32VerifyMemoryListIntegrity(void)
{
    BeginTicketMutex(&GlobalWin32State.MemoryMutex);
    local_persist u32 FailCounter;
    win32_memory_block *Sentinel = &GlobalWin32State.MemorySentinel;
    for(win32_memory_block *SourceBlock = Sentinel->Next;
        SourceBlock != Sentinel;
        SourceBlock = SourceBlock->Next)
    {
        Assert(SourceBlock->Block.Size <= U32Max);
    }
    ++FailCounter;
    if(FailCounter == 35)
    {
        int BreakHere = 3;
    }
    EndTicketMutex(&GlobalWin32State.MemoryMutex);
}

I believe that means that all calls of that function will have the same FailCounter variable as if it was a sort of extern shared by different threads?


I know global we can define something like

struct foo { static int a; int b; };

and a will change value for every constructed foo, so there's effectively a global inside a struct, which I believe is the same idea as the local_persist above.

But take now handmade_config.h

global b32 Global_Renderer_Camera_UseDebug = false;

At least for me, this doesn't work, I guess because different uses of it will be local. So I actually declare those as extern, or alternatively I wrap them in a struct with the static keyword. If I insist on compiling this as is, my variables remain uninitialized, so I need to extern them and initialize them in some cpp.

I believe that means that all calls of that function will have the same FailCounter variable as if it was a sort of extern shared by different threads?

Yes, when you declare a variable inside a function as static, it is effectively global to the function: every call sees the same variable. So if Win32VerifyMemoryListIntegrity() is invoked on 3 separate occasions, the FailCounter variable will be incremented 3 times and have a value of 3 (static variables are zero-initialized, so it starts at 0). This is because, when specified with the keyword static in C/C++, the variable no longer has automatic storage duration, meaning it no longer follows the stack allocation/deallocation rules that are the default for variables defined within a function.
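To make the persistence concrete, here is a minimal sketch. CountCalls, CountCallsAutomatic and DemoCountCalls are hypothetical names I made up for illustration, not anything from the Handmade Hero source.

```cpp
#include <cassert>

// Hypothetical helper (not from the Handmade Hero source).
int CountCalls(void)
{
    static int CallCount; // static storage duration: zero-initialized, keeps its value between calls
    ++CallCount;
    return CallCount;
}

// Same body with an automatic (stack) variable, for contrast.
int CountCallsAutomatic(void)
{
    int CallCount = 0; // automatic storage: re-created on every call
    ++CallCount;
    return CallCount;
}

// Usage sketch: three calls see the same persistent counter.
void DemoCountCalls(void)
{
    int First = CountCalls();
    CountCalls();
    int Third = CountCalls();
    assert(Third == First + 2);         // the counter carried over between calls
    assert(CountCallsAutomatic() == 1); // the automatic version always restarts
    assert(CountCallsAutomatic() == 1);
}
```

The name CallCount is only visible inside CountCalls, which is the one thing that separates it from a file-level global.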

It is also true that declaring a static variable within a struct makes that variable shared by all instances of that struct in your program (though note this is only allowed in C++ and not C).
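A small sketch of the struct case, reusing the foo from the question. SharedMemberDemo is a hypothetical function added only for illustration. Note that the static member is only declared inside the struct and still needs one definition somewhere.

```cpp
#include <cassert>

struct foo
{
    static int a; // declaration only: one 'a' shared by every foo
    int b;        // one 'b' per foo object
};

// The static member needs exactly one definition in some translation unit
// (C++17 also allows 'inline static int a = 0;' inside the struct).
int foo::a = 0;

// Hypothetical demo function, not part of any real codebase.
int SharedMemberDemo(void)
{
    foo First = {};
    foo Second = {};
    First.a = 5;   // writes foo::a, so Second.a changes too
    First.b = 1;
    Second.b = 2;
    assert(&First.a == &Second.a);        // same object behind both names
    return Second.a + Second.b + First.b; // 5 + 2 + 1
}
```

Forgetting the out-of-struct definition typically shows up as an unresolved symbol at link time rather than as a compile error.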

I don't always recommend Stack Overflow answers, but answer #2 and a couple of others after it give a pretty good idea of what exactly you can expect from the static/extern keywords in C/C++.

https://stackoverflow.com/questions/3684450/what-is-the-difference-between-static-and-extern-in-c

Yeah that part I think is clearer to me, but how can he declare/init a static loose in a header file and make it work like an extern? This outright fails for me.

Just contrasting this with "internal": the same function has a different copy of itself in each cpp it is called from, right? So for functions he is literally saying he does NOT want them inlined when he marks them internal, right? Hence the "internal", a copy internal to the translation unit.

That's how I'm reading his semantics.


Replying to boagz57 (#25826)
global b32 Global_Renderer_Camera_UseDebug = false;

This works because Casey uses this variable in a single translation unit. Handmade hero is compiled with only 3 translation units: the platform layer win32_handmade.cpp, the renderer win32_handmade_opengl.cpp and the game handmade.cpp. Only the game translation unit uses that variable. [EDIT] It's not even 3 translation units as they produce different exe or dlls.

The static keyword has nothing to do with inlining.

internal will only be used in front of functions. It means that the function will only exist in the current translation unit and the compiler will not generate export symbols for it. If you use the function in several translation units, you'll get different "copies" of the function.

local_persist is used to declare a variable whose scope is the scope of the function, but whose lifetime is the lifetime of the program (its value persists after the function exits).

global is used to declare a variable whose scope is the entire translation unit and whose lifetime is the lifetime of the program. Each translation unit will get a different "instance" of the variable.

If you need a variable across different translation units, you'll need to declare it with the extern keyword.
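As a rough sketch of the difference, here is the extern pattern next to the static one. Everything is squeezed into a single file so it compiles on its own; the comments mark which part would normally sit in a header and which in one .cpp. SharedFrameCount, PrivateFrameCount and AdvanceFrame are hypothetical names.

```cpp
#include <cassert>

// --- what would live in a shared header (hypothetical names) ---
extern int SharedFrameCount;   // declaration only: no storage is created here

// --- what would live in exactly one .cpp file ---
int SharedFrameCount = 0;      // the single definition every translation unit links against

// --- by contrast, a static in a header gives each including translation unit its own copy ---
static int PrivateFrameCount = 0;

int AdvanceFrame(void)
{
    ++SharedFrameCount;
    ++PrivateFrameCount;
    // Only guaranteed here because this sketch is a single translation unit;
    // with several, each one would have its own PrivateFrameCount.
    assert(SharedFrameCount == PrivateFrameCount);
    return SharedFrameCount;
}
```

With the static version, a second translation unit that includes the same header would start from its own fresh copy, which would explain a variable that seems to "lose" the value another file assigned to it.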

You also need to take into account that with an internal function containing a local_persist variable you'll get a different instance of the variable in each translation unit.

Using threads doesn't change any of this. If you need a separate variable in each thread, you need to use thread-local storage. Here is some code from one of my projects, but I believe the original code was from Sean Barrett.

#if defined( _MSC_VER )
#define thread_local_storage __declspec( thread )
#elif defined( __clang__ ) || defined( __GNUC__ )
#define thread_local_storage __thread
#elif __cplusplus >= 201103L
#define thread_local_storage thread_local
#elif __STDC_VERSION__ >= 201112L
#define thread_local_storage _Thread_local
#else
# error Unsupported platform.
#endif
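For what it's worth, here is a minimal sketch of what thread-local storage changes. It assumes a C++11 compiler and uses the standard thread_local keyword and std::thread directly instead of the macro above. PerThreadCounter, SharedCounter, Worker and RunThreadLocalDemo are hypothetical names, and on some toolchains you may need to link with the threading library (for example -pthread).

```cpp
#include <cassert>
#include <thread>

thread_local int PerThreadCounter = 0; // each thread gets its own instance
static int SharedCounter = 0;          // one instance for the whole translation unit

// Hypothetical worker: bumps both counters and reports the thread-local one.
static void Worker(int *SeenPerThread)
{
    ++PerThreadCounter;
    ++SharedCounter;
    *SeenPerThread = PerThreadCounter;
}

int RunThreadLocalDemo(void)
{
    int SeenA = 0;
    int SeenB = 0;
    // Run the threads one after another so the shared counter needs no lock in this sketch.
    std::thread A(Worker, &SeenA);
    A.join();
    std::thread B(Worker, &SeenB);
    B.join();
    assert(SeenA == 1 && SeenB == 1); // each thread started from its own zero
    assert(PerThreadCounter == 0);    // the calling thread's copy was never touched
    return SharedCounter;             // both threads incremented the same variable
}
```

A plain static (or local_persist) behaves like SharedCounter here: all threads hit the same memory, so concurrent access would need a mutex or atomics.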

Edited by Simon Anciaux on Reason: Note on the exe and dlls

Was typing out a response but Mrmixer already beat me to it lol.

ahah all good no problem


Replying to boagz57 (#25829)

This works because Casey uses this variable in a single translation unit.

Hmm ok, but then global is a bit confusing naming -_- ...

If you need a variable across different translation units, you'll need to declare it with the extern keyword.

If you wrap struct xxx{} around it you "trick" c++ to treat it like an extern in this sense, but only that it is visible only if you include the proper header, which is a weird way to make you conscious of where you are calling that thing I guess like it was an extern with a namespace around it?

The static keyword has nothing to do with inlining.

Ok, here I'm talking a bit out of my b##, but wouldn't it be a fair assessment to say that if you force every translation unit to have its own copy of that function, then in practice you cannot inline it? Or can you inline it in one source but not the other? (it looks like it can be done in my linter here)

ps: hey how do you insert all the colors in the code block?


Replying to mrmixer (#25828)

A static variable in a class or struct is accessible as long as you know the definition of the class/struct. I don't know the details, but I think it's just defined like that in the C++ specification. I don't think it's a good idea to try to relate it to the other uses of static or extern.

[Edit] Hopefully mmozeiko could answer about inlining. I removed my reply because it was pure guess on my part.

For the colors, you just need to type the language name just after the 3 back quotes without space between the name and the quotes.


Edited by Simon Anciaux on
Replying to da447m (#25831)

For the colors, you just need to type the language name just after the 3 back quotes without space between the name and the quotes.

printf("thank you!");

Which means that the function could be inlined in a translation unit and not in another.

This is how I think I understood this some time ago. As a noob I avoid using such keywords most of the time, but I'm trying to understand everything Casey does.

IIRC inline is a request that may not even be honored, while some things are inlined by default, like operator overloads defined inside a class/struct.

I always assumed any "free" function is static by default even if you don't mark as such. Certainly any function that pertains to a cpp only is static, same way the variables there.

EDIT: which means that they are inserted exactly there, that's why I confused it with inline.


Edited by da447m on Reason: clarification on inline confusion
Replying to mrmixer (#25832)

Whether to inline a function or not is something the compiler decides on its own, regardless of the inline or static keywords. As long as the compiler sees the function definition in the translation unit it is compiling, it can decide to inline it, regardless of where it is defined: regular C function, function in a C++ namespace, member function in a C++ class, operator overloads, template functions/classes, etc.

First of all, you should forget about "header" files or #includes. For the compiler they do not exist. Those are only preprocessor things. The preprocessor does a "copy & paste": it takes the #include content and puts it in the place where it is included. Only then does the compiler actually compile the source file, producing one object file. This is called a translation unit. All these static/inline/extern keywords mean something only in the context of a translation unit; headers do not mean anything.

Also, by the word "symbol" I mean a global function or variable everywhere below. They both work the same with regard to the static & extern keywords.

What the static/inline/extern keywords do is define symbol linkage. By default all global symbols in C are extern, meaning they produce a symbol in the object file, and other translation units can call the function or use the variable by name. This means you cannot have the same named symbol in different translation units. Sometimes you want that, sometimes not. This also slows down the compilation & linking process, since a lot of symbol names need to be processed this way.
There are minor differences in C++: for example, symbols inside anonymous namespaces do not produce a "public" symbol, so other TUs won't be able to access them.

This is avoided by the static keyword. What this keyword does is make the symbol "private", only visible to the current translation unit. This is called "internal linkage". This is great: the compiler compiles faster (no need to put the symbol in the public symbol table), and the linker links faster (fewer symbols to process). If you place the same named function/variable in multiple TUs, the compiler will optimize & codegen each place with its own copy of the function/variable. Linkers have a special step where they collect duplicated function bytes to generate only one copy in the output binary, but that's an implementation detail.
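A short sketch of the linkage cases described above, as they would look inside a single translation unit in C++. DoubleIt, TripleIt and Combine are hypothetical names picked for illustration.

```cpp
#include <cassert>

// Internal linkage via 'static' (this is what the 'internal' macro expands to).
static int DoubleIt(int Value)
{
    return Value * 2;
}

// C++ alternative: names in an unnamed namespace also have internal linkage.
namespace
{
    int TripleIt(int Value)
    {
        return Value * 3;
    }
}

// External linkage (the default): other translation units can call this by name,
// so the name must be unique across the whole program.
int Combine(int Value)
{
    int Result = DoubleIt(Value) + TripleIt(Value);
    assert(Result == Value * 5);
    return Result;
}
```

Another translation unit could define its own DoubleIt or TripleIt without a clash, while a second Combine would produce a duplicate symbol error at link time.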

The third option is to put the inline keyword in front of the function. This is kind of a legacy from old C compilers and in my opinion should not be used in modern codebases. What it does is generate the function code the same as in the static case inside the translation unit, but still place its name in the public symbol table with a special marker saying it is "inline". This can happen in multiple translation units, so when the linker links everything together, it won't error out with a duplicate symbol error like in the "extern" case; it will see this marker and simply take any copy it wants. This means every single translation unit must provide exactly the same function as inline. Otherwise it is a violation of the "one definition rule", and the compiler is allowed to produce garbage output as a result.

So technically, with an inline function, some other TU could still access it by having an extern declaration of it.

There are some other minor differences between these static/inline/extern things; I just gave a high-level overview. You should read about them in detail in the documentation: https://en.cppreference.com/w/c/language/storage_duration https://en.cppreference.com/w/c/language/inline

And if you have not seen the Chat 13 episode, watch it, because it goes into the details of translation units and how compilers & linkers work: https://guide.handmadehero.org/chat/chat013/


Edited by Mārtiņš Možeiko on
Replying to da447m (#25833)

First of all, you should forget about "header" files or #includes. For the compiler they do not exist. Those are only preprocessor things. The preprocessor does a "copy & paste": it takes the #include content and puts it in the place where it is included.

Thanks for the explanations Martins.

For guys like you and Casey you are like a live linter, but for me it is very painful not to have the linter and code profiler finding the stuff, bringing me to the functions/structs definitions.

It makes it so much easier to read the code directly with the includes there, so I at least manually include handmade.h etc.

I guess that will influence how I'm reading all this stuff about statics etc. So I think I get what you are saying about storage, but if Casey basically just calls everything in the platform unit (win32_handmade.cpp for now), then why the need to call funcs static/internal IF that's really the only place they are called?

Idk, let's take handmade_box.h/cpp: if I'm only using its functions in a single TU, and if I don't even care whether some other place will call them, then why not just leave the thing without the keyword? Is it because you want to ensure the compilation follows the path you described for static funcs, in case you actually call those functions from another TU later on?


Edited by da447m on
Replying to mmozeiko (#25835)

It's fine to use headers if you want to use headers. There's nothing wrong with it.

What I was saying there is just the way you should think about what happens during the compilation process. Having or not having a header file does not matter for things like static/inline/etc. You should think about what the compiler sees (the translation unit) and then figure out what will happen with your statics/inlines/etc.

then why the need to call funcs static/internal IF that's really the only place they are called?

For faster compile & link times. As I wrote in my previous message, having things static makes the compiler produce a smaller symbol table. It has fewer things to put there. And then the linker has fewer things to read & process. Everything compiles/links faster, because there's no need to do work that's useless anyway.

I don't know any numbers from a modern machine/compiler, but in MSVC2008 times ~12 years ago I have seen build times improve from many minutes to ten+ seconds in large projects, just because everything that is not supposed to be visible to other translation units was made static.


Edited by Mārtiņš Možeiko on
Replying to da447m (#25838)

OK makes sense.

And what about the performance? Would having copies of the same functions be actually beneficial to performance somehow instead of going somewhere several times to call the same functions?

I know the question is very broad, but is there a rule-of-thumb just like "pass by value small stuff"?


Replying to mmozeiko (#25840)

Theoretically, having a single copy of a function should be faster, because that uses less space in the code cache, so you can fit more code bytes in there.

In practice it won't matter in almost all situations, because of multiple reasons:

  1. if the code you call is really performance critical, you actually don't want any call at all; you want the function to be inlined anyway.

  2. an often-called function (or one called in a loop) will be in cache already, so it's not really a problem.

  3. if a function is really duplicated in different translation units, and compiles to the exact same bytes, the linker has an option to simply keep only one copy of the function in the final executable, deduplicating it across translation units. In MSVC this is called ICF optimization: https://docs.microsoft.com/en-us/cpp/build/reference/opt-optimizations?view=msvc-170


Replying to da447m (#25844)

if code you call is really performance critical, you actually don't want any call at all, you want function to be inlined anyway.

I don't understand how the concept of cache locality/coherence/etc. applies to functions. I mean, yes, functions are just a stream of bytes as well, read in a special way, but while I can see why the CPU would prefetch data because a called function uses it, I can't reason about how the "fetch these funcs here cuz" happens.

So I suspect inline has to do with forcing something related to cached data somehow? That functions have their own special cache, but that still doesn't explain the whys and why-nots of keeping stuff prefetched there.


Replying to mmozeiko (#25845)