File organization & unity build

Casey, I know you touched on this a bit when setting up the build, but I'd like to hear some more about how you're organizing the project into multiple files, and especially how you're choosing what to put in .h vs .cpp, when to #include what, etc.

I'm familiar with the common practice of breaking everything up into a bunch of different compilation units, each with a .h full of forward declarations, macros, and struct definitions, and then a .cpp file with the function definitions separately. But that doesn't jibe with the unity build. I understand the basic benefits and rationale of the unity build and I'm on board with that, but I'm having trouble discerning the pattern for how the code is organized so far.
In principle, you can do a strict scheme with declarations in headers and definitions in source files and still do a unity build. Just create a single file "unity.cpp" where you include all other source files, then give the compiler only this file. I'm not sure if there are any (compile-time) performance implications for that structure, though.
Please excuse my ignorance, but what is a "unity build"?
A unity build refers to "unifying" all the code into a single translation unit. In practice this means that you end up #including .cpp files to construct that single translation unit. It makes full builds fast since header files don't get compiled over and over again. On the flip side, it makes encapsulation harder since "file local" is effectively global.
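To make the "file local is effectively global" point concrete, here is a minimal sketch of what the compiler ends up seeing in a unity build. The file names (unity.cpp, tile_map.cpp, player.cpp) and every identifier are made up for illustration and are not taken from the Handmade Hero source; the two "files" are pasted inline so the example stands on its own.

```cpp
// What the compiler sees for a hypothetical unity.cpp containing:
//     #include "tile_map.cpp"
//     #include "player.cpp"
// Both "files" are pasted inline here so the sketch is self-contained.

// ---- begin tile_map.cpp (hypothetical) ----
static int TileCount = 0;   // would be file-local in a multi-unit build

static void AddTile()
{
    ++TileCount;
}
// ---- end tile_map.cpp ----

// ---- begin player.cpp (hypothetical) ----
// Everything shares one translation unit, so this "file" can see
// TileCount and AddTile even though both are marked static. For the
// same reason, a second `static int TileCount` defined here would be
// a redefinition error.
static int AddTileForPlayer()
{
    AddTile();
    return TileCount;
}
// ---- end player.cpp ----
```

In a conventional multi-unit build, player.cpp could not reach TileCount at all; here it can, which is the encapsulation cost mentioned above.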
ok, thanks.
Sure, I understand how unity builds work and how you *can* do them, but what I'm interested in is what Casey *is* doing and why, regarding how he organizes the code.
The way I think about things is:

1) I build everything in one go, so the only time I think about splitting up files for compilation purposes is when I have a special technical requirement, like the EXE/DLL split we have in Handmade Hero. Obviously that has to be obeyed.

2) I generally assume that everything is going to be in one giant translation unit logistically, so I don't care about what goes where and thus don't have things like "file-local globals" or "file local functions".

3) I make decisions about what goes into different files based on personal preference that has to do with what I want to think about as a logical unit. So I know, for example, that I want to think about the tile map storage as one thing, so I'll put it in its own file. I will often split that file into an .h and a .cpp based on what I want to be able to view side by side in Emacs - i.e., I often want to look at all my structs for the tile map at the same time I'm looking at a function, so I put the structs in the .h and the functions in the .cpp.

4) I #include stuff wherever it works.

- Casey
cmuratori
The way I think about things is:

3) I make decisions about what goes into different files based on personal preference that has to do with what I want to think about as a logical unit. So I know, for example, that I want to think about the tile map storage as one thing, so I'll put it in its own file. I will often split that file into an .h and a .cpp based on what I want to be able to view side by side in Emacs - ie., I often want to look at all my structs for the tile map at the same time I'm looking at a function, so I put the structs in the .h and the functions in the .cpp.



So it seems that you're not against header/implementation files? Some programmers (including Jonathan Blow I believe) don't like the added maintenance of header files and would prefer to have all the code together in one file.

I, personally, find this cumbersome (in, for example, C#). I much prefer to separate interface from implementation most of the time for just the reason you pointed out. It also provides a convenient way to separate doxygen documentation from implementation-specific comments.

Edited by Flyingsand on
Is there actually a technical difference between .h and .c/.cpp files, or is it just for humans? The only case I can think of where header files would make sense is for API libraries. Otherwise I could just always #include the .c/.cpp files, couldn't I?

I don't know...are there other reasons for header files?
3) Can't you just open two separate views on the same file? I'm sure such a nice OS as Emacs has that feature.

But when I recently programmed a little game in C, I discovered a situation where separating .cpp and .h files would come in handy. However, I'm not sure it isn't just bad design on my side. Basically it's something along these lines:

// in main.cpp
#include "player.cpp"

struct gameState
{
    player Player;
    // plus all the other stuff
};

int main()
{
    // do stuff
}

----------------
// in player.cpp
struct player
{
    // some player data
};

void updatePlayer(gameState *GameState)
{
    // do stuff
}


Whether you include player first or declare the gameState struct first, you get an error. So I could imagine that splitting player.cpp into a .cpp and a .h could help here, as I could load all the header files first. But I'm not sure whether structuring the code the way I did is an utterly stupid idea, so please correct me on that if you can.
So, basically everyone who is asking questions about this is correct in their assumptions :) But to summarize:

My file organization, beyond the DLL/EXE split, is just about readability and navigability for myself and would be 100% unnecessary if editors were actually better about showing me the contextual things. For example, if "IntelliSense" was less about completing field names and more about showing me the relevant structures and functions related to the things that I was typing, then I would just never need to think about files at all and everything would just be in one giant file.

Which would be much better.

If I ever get to making my own editor, it would work this way.

- Casey
BlueWolf
Can't you just open two separate views on the same file?


You can, but that is not as convenient as having the files split up, because you'd still have to hunt for where in the file you wanted to view. So I'd have to teach Emacs about, like, some known comment structure or something that it could "jump to" when I wanted to open a view to the implementation part of things, etc. So having two files is easier than writing that, since I suck at eLisp :P

Whether you include player first or declare the gameState struct first, you get an error. So I could imagine that splitting player.cpp into a .cpp and a .h could help here
Yes, if you need to interleave structs and functions for declaration order, then you need to split files up that way, definitely. But again, that's just because most C++ compilers are stupid - there have already been compilers where you don't need to have declarations sorted, because the compiler sorts them for you (see http://dl.acm.org/citation.cfm?id=288284 for an example). Unfortunately, that does not seem to be the norm :(

- Casey
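As a rough sketch of the split discussed above, here is BlueWolf's example reordered so that all struct definitions come before any function that uses them. The file names in the comments (player.h, game.h) and the X/Y fields are hypothetical, added only so the example compiles; everything is pasted inline in the order a unity build would include it.

```cpp
// ---- player.h (hypothetical): struct definitions only ----
struct player
{
    int X;   // placeholder fields, invented for this sketch
    int Y;
};

// ---- game.h (hypothetical): can now embed player by value ----
struct gameState
{
    player Player;
    // plus all the other stuff
};

// ---- player.cpp: functions, included after all the structs ----
// gameState is fully defined by this point, so the body may
// dereference it freely.
void updatePlayer(gameState *GameState)
{
    GameState->Player.X += 1;
}
```

The ordering is the whole trick: structs first, functions second. Since player.cpp no longer defines anything that gameState needs, the circular dependency disappears.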
Got it - though I'm not 100% sure why not just build everything like .h files (with include guards) at which point you could #include them wherever you wanted and you wouldn't have to worry about double-includes once you get enough files that the dependency graph might get hairy.

It does seem like editors should have a "context" pane that shows the declaration of things related to your current position.
bhollis
Got it - though I'm not 100% sure why not just build everything like .h files (with include guards) at which point you could #include them wherever you wanted

That would be totally reasonable in my opinion. I've never actually had a problem with doubly-including my .cpp's, for whatever reason, so it was never an issue, but if you found that happening then include guarding everywhere seems like a nice cheap way to fix it!

- Casey
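For anyone who wants to try the include-guard idea, here is a minimal sketch of how a guarded .cpp behaves when it gets #included twice. The file name tile_map.cpp, the guard macro TILE_MAP_CPP, and the struct and function are all invented for illustration; the file's contents are pasted twice inline to mimic what the preprocessor would produce.

```cpp
// ---- first #include "tile_map.cpp" (hypothetical) ----
#ifndef TILE_MAP_CPP
#define TILE_MAP_CPP

struct tile_map
{
    int Width;
    int Height;
};

static int TileMapArea(const tile_map *Map)
{
    return Map->Width * Map->Height;
}

#endif // TILE_MAP_CPP

// ---- second #include "tile_map.cpp" ----
// TILE_MAP_CPP is already defined, so the preprocessor drops this whole
// copy and nothing is defined twice.
#ifndef TILE_MAP_CPP
#define TILE_MAP_CPP

struct tile_map
{
    int Width;
    int Height;
};

static int TileMapArea(const tile_map *Map)
{
    return Map->Width * Map->Height;
}

#endif // TILE_MAP_CPP
```

Without the guard, the second copy would fail to compile with redefinition errors for both tile_map and TileMapArea.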