Why use a unity build?

MrCelsius88

#26788

September 6, 2022

I’m aware of the basic concept of a unity build and how it’s supposed to decrease compilation time by reducing the number of files compiled, by #including everything into one .c file. I’m also aware that Casey follows this approach. However, I’m still a beginner trying to learn C, and the book I read said it is best practice to separate data abstractions into a public interface (.h) and a private implementation (.c). So my question is: is there any other benefit to using a unity build, and is it worth using one over organizing the files the way the book recommends?

Mārtiņš Možeiko

#26790

September 6, 2022

That's the only goal of doing a unity build: increased compilation speed.

There are some other advantages for optimized builds. For example, the compiler can now inline across all functions, which would not be possible if they were in different TUs (translation units). This is not a very important reason, though, as there are ways to do this in normal builds too, such as inline functions or LTO (link-time optimization).

The bigger disadvantage of unity builds is slower incremental builds. If you include larger third-party libraries such as stb_image, Dear ImGui, or SQLite, your incremental build times can become much higher than when building with separate TUs. With separate TUs, those third-party source files are built just once, and their .obj files are reused when you change only your own code.
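To make the idea concrete, here is a minimal sketch of what a unity build file might look like. The file names (unity.c, math_utils.c, game.c) and the functions are hypothetical, and the contents of the two source files are pasted inline so the sketch compiles on its own; in a real project they would be pulled in with #include.

```c
/* unity.c: a hypothetical single translation unit.
 *
 * In a real project this file would contain only:
 *     #include "math_utils.c"
 *     #include "game.c"
 * and it would be the only file passed to the compiler.
 * The contents are pasted inline here so the sketch is self-contained.
 */

/* ---- contents of math_utils.c (hypothetical) ---- */
static int square(int x)
{
    return x * x;
}

/* ---- contents of game.c (hypothetical) ---- */
/* Because everything is in one translation unit, this code can call
 * square() without a header, and the compiler sees its body and may
 * inline it. */
static int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}
```

A large third-party library could still be compiled as its own translation unit next to such a file, so that its .obj file is reused across incremental builds.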

longtran2904

#26795

September 8, 2022

Although this is not a very important reason - as there are ways to do this in normal builds too, like inline functions or LTO.

What did you mean by "inline functions"? You just said it's not possible to inline across all functions.

Also, what does a ".obj" file usually contain that makes it un-inlinable?

Replying to mmozeiko (#26790)

Mārtiņš Možeiko

#26796

September 8, 2022

If you put functions in a .h file as "inline" functions, then they can be inlined. Of course, you should not do that for all functions, otherwise there's no point in using multiple TUs. But for a few selected functions that are important to performance, you could put them in a .h file.

Inlining is part of the compiler's codegen process. For inlining to pay off, the compiler needs to perform many optimizations afterwards, such as constant folding and dead code elimination. You cannot really do that on a .obj file, because a .obj file contains generated machine code. (Technically you could decompile it, inline, and generate code again, but that would reduce how many optimizations you can apply, because a lot of information about the original code has been lost.)
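As a rough illustration of why the follow-up optimizations matter, consider the sketch below. The header name vec_math.h and the functions scale and double_it are made up for this example, and whether the compiler actually inlines and simplifies the call depends on the compiler and optimization settings.

```c
/* ---- vec_math.h (hypothetical header, pasted inline here) ---- */
/* The full body is visible to every translation unit that includes it. */
static inline int scale(int value, int factor)
{
    if (factor == 0) {
        return 0;   /* this branch can disappear when factor is a known constant */
    }
    return value * factor;
}

/* ---- a caller in any translation unit that includes the header ---- */
int double_it(int value)
{
    /* With the body visible, an optimizing compiler may inline the call,
     * fold the constant 2 into the expression, and drop the dead
     * factor == 0 branch. With only a prototype, it has to emit a call. */
    return scale(value, 2);
}
```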

Edited by Mārtiņš Možeiko on September 8, 2022, 9:59pm

Replying to longtran2904 (#26795)

longtran2904

#26798

September 9, 2022

Does the compiler still generate .obj files for the "inline" functions?

Replying to mmozeiko (#26796)

Mārtiņš Možeiko

#26799

September 9, 2022

The compiler generates a .obj file for every translation unit you compile. It puts everything from that translation unit into the .obj file, including inline functions (as long as they are not static). The linker is then free to choose any one copy from all of them to put into the output file, in case the function was not inlined and is required for some calls. This is known as the one definition rule.

Replying to longtran2904 (#26798)

longtran2904

#26800

September 9, 2022

Does a header file count as a .obj file? How does the "put all your inline functions in a .h file" example work?

Replying to mmozeiko (#26799)

Simon Anciaux

#26803

September 9, 2022

Each translation unit produces a .obj file. This means that if you pass 3 files to the compiler (like cl main.c a.c b.c), it will output 3 .obj files.

If you have some code that you want to be inlined, each translation unit needs access to the source code of the function so that its instructions can be placed in each .obj file; having only the function prototype is not sufficient.

So if you want a function to be (considered for being) inlined in several translation units, the file you include (.h or .c, it doesn't matter) needs to contain the full function body.

Also remember that the inline keyword is only a hint to the compiler and will most likely be ignored (I didn't include any in the examples).

Possible to inline:

/* possible_to_inline.h */
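/* static gives each translation unit its own copy and avoids a duplicate-symbol link error */
static int function_to_inline( int x ) {
    int result = x * x;
    return result;
}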

/* Translation unit 1 */
#include "possible_to_inline.h"
int main( void ) {
    /* ... */
    int result = function_to_inline( 3 );
    /* ... */
}

/* Translation unit 2 */
#include "possible_to_inline.h"
void tu_1( void ) {
    function_to_inline( 3 );
}

Impossible to inline in translation unit 1 (without LTO):

/* impossible_to_inline.h */
int function_to_inline( int x );

/* Translation unit 1 */
#include "impossible_to_inline.h"
int main( void ) {
    /* ... */
    int result = function_to_inline( 3 );
    /* ... */
}

/* Translation unit 2 */
#include "impossible_to_inline.h"

int function_to_inline( int x ) {
    int result = x * x;
    return result;
}

void tu_1( void ) {
    function_to_inline( 3 );
}
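Since the inline keyword is only a hint, compilers also offer stronger, non-standard ways to request inlining. The sketch below assumes MSVC's __forceinline and the always_inline attribute of GCC and Clang; the macro name FORCE_INLINE and the function square_fast are hypothetical, and even these requests can fail in some situations (for example with recursive functions).

```c
/* Pick the compiler-specific "please really inline this" spelling.
 * FORCE_INLINE is a made-up name for this example. */
#if defined(_MSC_VER)
#define FORCE_INLINE static __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#define FORCE_INLINE static inline __attribute__((always_inline))
#else
#define FORCE_INLINE static inline   /* fall back to the plain hint */
#endif

/* As before, the full body must be visible in every translation unit
 * that calls the function, so this would normally live in a header. */
FORCE_INLINE int square_fast(int x)
{
    return x * x;
}
```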